Transport Survey Methods : Best Practice for Decision Making 9781781902882, 9781781902875

Every three years, researchers with interest and expertise in transport survey methods meet to improve and influence the

English Pages 821 Year 2013

TRANSPORT SURVEY METHODS: BEST PRACTICE FOR DECISION MAKING

EDITED BY

JOHANNA ZMUD RAND Corporation, VA, USA

MARTIN LEE-GOSSELIN Laval University, Quebec, Canada

MARCELA MUNIZAGA University of Chile, Santiago, Chile

JUAN ANTONIO CARRASCO University of Concepcion, Concepcion, Chile

United Kingdom – North America – Japan – India – Malaysia – China

Emerald Group Publishing Limited
Howard House, Wagon Lane, Bingley BD16 1WA, UK

First edition 2013

Copyright © 2013 Emerald Group Publishing Limited

Reprints and permission service
Contact: [email protected]

No part of this book may be reproduced, stored in a retrieval system, transmitted in any form or by any means electronic, mechanical, photocopying, recording or otherwise without either the prior written permission of the publisher or a licence permitting restricted copying issued in the UK by The Copyright Licensing Agency and in the USA by The Copyright Clearance Center. Any opinions expressed in the chapters are those of the authors. Whilst Emerald makes every effort to ensure the quality and accuracy of its content, Emerald makes no representation implied or otherwise, as to the chapters’ suitability and application and disclaims any warranties, express or implied, to their use.

British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library

ISBN: 978-1-78190-287-5

ISOQAR certified Management System, awarded to Emerald for adherence to Environmental standard ISO 14001:2004. Certificate Number 1985 ISO 14001

Contents

List of Contributors
Preface
Acknowledgements

PART I. SETTING THE CONTEXT

1. Transport Surveys: Considerations for Decision Makers and Decision Making
Johanna Zmud, Martin Lee-Gosselin, Marcela Munizaga and Juan Antonio Carrasco

2. Keynote: Total Design Data Needs for the New Generation Large-Scale Activity Microsimulation Models
Konstadinos G. Goulias, Ram M. Pendyala and Chandra Bhat

PART II. FOCUS ON IMPROVED METHODS: THEMES 1 TO 5

THEME 1: MAINSTREAMING MOBILITY-AWARE AND ON-LINE TECHNOLOGIES

3. Cell Phone Enabled Travel Surveys: The Medium Moves the Message
Jane Gould

4. A Case Study: Multiple Data Collection Methods and the NY/NJ/CT Regional Travel Survey
Jean Wolf, Jeremy Wilhelm, Jesse Casas and Sudeshna Sen

5. Conducting a GPS-Only Household Travel Survey
Peter R. Stopher, Christine Prasad, Laurie Wargelin and Jason Minser

6. The Role of Web Interviews as Part of a National Travel Survey
Linda Christensen

7. Using Accelerometer Equipped GPS Devices in Place of Paper Travel Diaries to Reduce Respondent Burden in a National Travel Survey
Abby Sneade

8. Workshop Synthesis: Validating Shifts in the Total Design of Travel Surveys
Anthony J. Richardson and T. Keith Lawton

9. Workshop Synthesis: Multi-Method Data Collection to Support Integrated Regional Models
Eric J. Miller and Caitlin Cottrill

THEME 2: IMPROVING RESPONDENT INTERFACES

10. Web-Based Travel Survey: A Demo
Pierre-Léo Bourbonnais and Catherine Morency

11. Web versus Pencil-and-Paper Surveys of Weekly Mobility: Conviviality, Technical and Privacy Issues
Marius Thériault, Martin Lee-Gosselin, Louis Alexandre, François Théberge and Louis Dieumegarde

12. Workshop Synthesis: Designing New Survey Interfaces
Marcelo G. Simas Oliveira and Mark Freedman

13. Shipper/Carrier Interactions Data Collection: Web-Based Respondent Customized Stated Preference (WRCSP) Survey
Rinaldo A. Cavalcante and Matthew J. Roorda

14. Workshop Synthesis: Alternative Approaches to Freight Surveys
Jesse Casas and Matthew J. Roorda

THEME 3: COMPARING SURVEY MODES AND METHODS

15. Analysis of PAPI, CATI, and CAWI Methods for a Multiday Household Travel Survey
Martin Kagerbauer, Wilko Manz and Dirk Zumkeller

16. Comparing Trip Diaries with GPS Tracking: Results of a Comprehensive Austrian Study
Birgit Kohla and Michael Meschik

17. Correcting Biographic Survey Data Biases to Compare with Cross-Section Travel Surveys
Francis Papon

18. Workshop Synthesis: Comparative Research into Travel Survey Methods
Jimmy Armoogum and Marco Diana

THEME 4: FACING UP TO SAMPLE ATTRITION IN LONGITUDINAL SURVEYS

19. Optimal Sampling Designs for Multi-Day and Multi-Period Panel Surveys
Makoto Chikaraishi, Akimasa Fujiwara, Junyi Zhang and Dirk Zumkeller

20. Data Quality and Completeness Issues in Multiday or Panel Surveys
Bastian Chlond, Matthias Wirtz and Dirk Zumkeller

21. Workshop Synthesis: Longitudinal Methods: Overcoming Challenges and Exploiting Benefits
Elizabeth Ampt

THEME 5: UNDERSTANDING THE SOCIAL CONTEXT OF DATA COLLECTION

22. Affective Personal Networks versus Daily Contacts: Analyzing Different Name Generators in a Social Activity-Travel Behavior Context
Juan Antonio Carrasco, Cristián Bustos and Beatriz Cid-Aguayo

23. Qualitative Methods in Transport Research: The ‘Action Research’ Approach
Karen Lucas

24. Workshop Synthesis: Collecting Qualitative and Quantitative Data on the Social Context of Travel Behaviour
Kelly J. Clifton

PART III: FOCUS ON NEW METHODS AND DATA SOURCES: THEMES 6 TO 8

THEME 6: NEW CHALLENGES IN DEALING WITH TIME: ENVIRONMENTAL PEAKS AND PLANNING HORIZONS

25. Empirically Constrained Efficiency in a Strategic-Tactical Stated Choice Survey of the Usage Patterns of Emerging Carsharing Services
Scott Le Vine, Aruna Sivakumar, Martin Lee-Gosselin and John Polak

26. Workshop Synthesis: Methods for Capturing Multi-Horizon Choices
Chandra Bhat and Matthew Roorda

27. Survey Data to Model Time-of-Day Choice: Methodology and Findings
Julián Arellana, Juan de Dios Ortúzar and Luis Ignacio Rizzi

28. Collection of Time-Dependent Data Using Audio-Visual Stated Choice
Chester Wilmot and Ravindra Gudishala

29. Workshop Synthesis: Survey Methods to Inform Policy Makers on Energy, Environment, Climate and Natural Disasters
Gerd Sammer and Juan de Dios Ortúzar

THEME 7: NEW PERSPECTIVES ON OBSERVING CHOICE PROCESSES: PSYCHOLOGICAL FACTORS

30. Factors Affecting Respondents’ Engagement with Survey Tasks
Peter Bonsall, Jens Schade, Lars Roessger and Bill Lythgoe

31. A Stated Adaptation Approach to Surveying Activity Scheduling Decisions
Claude Weis, Christoph Dobler and Kay W. Axhausen

32. Workshop Synthesis: Cognitive and Decision Processes Underlying Engagement in Stated Response Surveys
Peter Bonsall

33. Measuring User Satisfaction in Transport Services: Methodology and Application
Pedro Donoso, Marcela Munizaga and Jorge Rivera

34. Semantic Approach to Capture Psychological Factors Affecting Mode Choice: Comparative Results from Canada and Chile
Alejandro Tudela, Khandker M. Nurul Habib and Ahmed Osman Idris

35. Workshop Synthesis: Measuring the Influence of Attitudes and Perceptions
Mark Bradley

THEME 8: NEW TYPES OF DATA STREAMS: OPPORTUNITIES AND CHALLENGES

36. Smart Card Validation Data as a Multi-Day Transit Panel Survey to Investigate Individual and Aggregate Variation in Travel Behaviour
Ka Kee Alfred Chu and Robert Chapleau

37. Indirect Measurement of Level of Service Variables for the Public Transport System of Santiago Using Passive Data
Pablo Beltrán, Antonio Gschwender, Marcela Munizaga, Meisy Ortega and Carolina Palma

38. Towards a Reliable Origin-Destination Matrix from Massive Amounts of Smart Card and GPS Data: Application to Santiago
Flavio Devillaine, Marcela Munizaga, Carolina Palma and Mauricio Zúñiga

39. Workshop Synthesis: Exploiting and Merging Passive Public Transportation Data Streams
Catherine Morency

40. A GPS/Web-Based Solution for Multi-Day Travel Surveys: Processing Requirements and Participant Reaction
Stephen Greaves and Richard Ellison

41. Spatiotemporal Data from Mobile Phones for Personal Mobility Assessment
Zbigniew Smoreda, Ana-Maria Olteanu-Raimond and Thomas Couronné

42. Workshop Synthesis: Post-Processing of Spatio-Temporal Data
Peter Stopher and Abby Sneade

Index

List of Contributors

Louis Alexandre

Hydro-Québec, Québec, Canada

Elizabeth Ampt

Sinclair Knight Merz (SKM), Adelaide, Australia

Julián Arellana

Departamento de Ingeniería Civil y Ambiental, Universidad del Norte, Barranquilla, Colombia

Jimmy Armoogum

Department of Transport, Economics and Sociology (DEST), IFSTTAR, Noisy le Grand, France

Kay W. Axhausen

Institute for Transport Planning and Systems (IVT), ETH Zurich, Switzerland

Pablo Beltrán

Cityplanning, Santiago, Chile

Chandra R. Bhat

Department of Civil, Architectural and Environmental Engineering, University of Texas at Austin, Austin, TX, USA

Peter Bonsall

Institute for Transport Studies, University of Leeds, Leeds, UK

Pierre-Léo Bourbonnais

Department of Civil, Geological and Mining Engineering, Polytechnique Montréal, Montréal, Québec

Mark Bradley

Resource Systems Group, Santa Barbara, CA, USA

Cristián Bustos

Solutiva Consultores, Concepción, Chile

Juan Antonio Carrasco

Department of Civil Engineering, Universidad de Concepción, Chile

Jesse Casas

Westat, Rockville, Montgomery County, MD, USA

Rinaldo A. Cavalcante

Department of Civil Engineering, University of Toronto, Ontario, Canada

Robert Chapleau

École Polytechnique de Montréal, Montréal, Québec, Canada

Makoto Chikaraishi

Department of Urban Engineering, The University of Tokyo, Tokyo, Japan

Bastian Chlond

Institute for Transport Studies, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany

Linda Christensen

DTU Transport, Danish Technical University, Lyngby, Denmark

Ka Kee Alfred Chu

Agence métropolitaine de transport, Montréal, Québec, Canada

Beatriz Cid-Aguayo

Department of Sociology and Anthropology, Universidad de Concepción, Chile

Kelly J. Clifton

Portland State University, Portland, OR, USA

Caitlin Cottrill

Smart-FM, Singapore

Thomas Couronné

Sociology and Economics of Networks and Services Department, Orange Labs R&D, Paris, France

Flavio Devillaine

Coordinación Transantiago, Santiago, Chile

Marco Diana

Department of Environmental, Land and Infrastructures Engineering (DIATI), Politecnico di Torino, Torino, Italy

Louis Dieumegarde

Université Laval, CRAD, Québec, Canada

Christoph Dobler

Institute for Transport Planning and Systems (IVT), ETH Zurich, Zurich, Switzerland

Pedro Donoso

Laboratorio de Transporte y Uso de Suelo, Universidad de Chile, Santiago, Chile

Richard Ellison

Institute of Transport & Logistics Studies, University of Sydney, NSW, Australia

Mark Freedman

Westat, Rockville, Montgomery County, MD, USA

Akimasa Fujiwara

Graduate School for International Development and Cooperation, Hiroshima University, Higashi-Hiroshima, Japan

Jane Gould

UCLA Transportation, LA, CA, USA

Konstadinos G. Goulias

Geography Department, University of California Santa Barbara, Santa Barbara, CA, USA

Stephen Greaves

Institute of Transport & Logistics Studies, University of Sydney, NSW, Australia

Antonio Gschwender

Coordinación Transantiago, Santiago, Chile

Ravindra Gudishala

Department of Civil and Environmental Engineering, Louisiana State University, Baton Rouge, LA, USA

Martin Kagerbauer

Institute for Transport Studies, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany

Birgit Kohla

Institute for Transport Studies, University of Natural Resources and Life Sciences, Vienna, Austria

T. Keith Lawton

Keith Lawton Consulting Inc., Newberg, OR, USA

Scott Le Vine

Department of Civil and Environmental Engineering, Centre for Transport Studies, Imperial College London, South Kensington, UK

Martin Lee-Gosselin

ESAD-CRAD, Université Laval, Québec City, Canada

Karen Lucas

Transport Studies Unit, Oxford University, Oxford, UK

Bill Lythgoe (deceased)

Institute for Transport Studies, University of Leeds, Leeds, UK

Wilko Manz

INOVAPLAN GmbH, Ettlingen, Baden-Wuerttemberg, Germany

Michael Meschik

Institute for Transport Studies, University of Natural Resources and Life Sciences, Vienna, Austria

Eric J. Miller

Department of Civil Engineering, University of Toronto, Ontario, Canada

Jason Minser

Abt SRBI, Savannah, GA, USA

Catherine Morency

Department of Civil, Geological and Mining Engineering, Polytechnique Montréal, Montréal, Québec, Canada

Marcela Munizaga

Departamento de Ingeniería Civil, Universidad de Chile, Santiago, Chile

Khandker M. Nurul Habib

Civil Engineering Department, University of Toronto, Ontario, Canada

Ana-Maria Olteanu-Raimond

Sociology and Economics of Networks and Services Department, Orange Labs R&D, Paris, France

Meisy Ortega

MIT, Boston, MA, USA

Juan de Dios Ortúzar

Departamento de Ingeniería de Transporte y Logística, Pontificia Universidad Católica de Chile, Santiago, Chile

Ahmed Osman Idris

Civil Engineering Department, University of Toronto, Ontario, Canada

Carolina Palma

Cityplanning, Santiago, Chile

Francis Papon

IFSTTAR, Noisy-le-Grand, France

Ram M. Pendyala

Department of Civil, Environmental and Sustainable Engineering, Arizona State University, Tempe, AZ, USA

John Polak

Department of Civil and Environmental Engineering, Centre for Transport Studies, Imperial College London, UK

Christine Prasad

Institute of Transport and Logistics Studies, The University of Sydney, NSW, Australia

Anthony J. Richardson

The Urban Transport Institute (TUTI), Alexandra, Victoria, Australia

Jorge Rivera

Facultad de Economía y Negocios, Universidad de Chile, Santiago, Chile

Luis Ignacio Rizzi

Departamento de Ingeniería de Transporte y Logística, Pontificia Universidad Católica de Chile, Santiago, Chile

Lars Roessger

Department of Traffic and Transportation Psychology, Technische Universität Dresden, Dresden, Germany

Matthew Roorda

Department of Civil Engineering, University of Toronto, Ontario, Canada

Gerd Sammer

Institute for Transport Studies, University of Natural Resources and Life Science Vienna, Vienna, Austria

Jens Schade

Department of Traffic and Transportation Psychology, Technische Universität Dresden, Dresden, Germany

Sudeshna Sen

Merkle, Oak Brook, IL, USA

Marcelo G. Simas Oliveira

GeoStats LP, Atlanta, GA, USA

Aruna Sivakumar

Department of Civil and Environmental Engineering, Centre for Transport Studies, Imperial College London, UK

Zbigniew Smoreda

Sociology and Economics of Networks and Services Department, Orange Labs R&D, Paris, France

Abby Sneade

Department for Transport, London, UK

Peter R. Stopher

Institute of Transport and Logistics Studies, The University of Sydney, NSW, Australia

François Théberge

Faculty of Planning, Architecture and Visual Arts, Université Laval, Québec, Canada

Marius Thériault

ESAD-CRAD, Université Laval, Québec, Canada

Alejandro Tudela

Civil Engineering Department, Universidad de Concepción, Chile

Laurie Wargelin

SRBI, New York, NY, USA

Claude Weis

Institute for Transport Planning and Systems (IVT), ETH Zurich, Zurich, Switzerland

Jeremy Wilhelm

GeoStats LP, Atlanta, GA, USA

Chester Wilmot

Department of Civil and Environmental Engineering, Louisiana State University, Baton Rouge, LA, USA

Matthias Wirtz

Institute for Transport Studies, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany

Jean Wolf

GeoStats LP, Atlanta, GA, USA

Junyi Zhang

Graduate School for International Development and Cooperation, Hiroshima University, Higashi-Hiroshima, Japan

Johanna Zmud

Transportation, Space and Technology Program, RAND Corporation, St Arlington, VA, USA

Dirk Zumkeller

Institute for Transport Studies, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany

Mauricio Zúñiga

Departamento de Ingeniería Civil, Universidad de Chile, Santiago, Chile

Preface

The objective of this book is to communicate the state of “good” practice in transport survey methods from around the world. It identifies the progress made toward methodological solutions and the challenges that remain ahead. Most importantly, it consolidates an international perspective on improving data and information to support transportation decision-making. One of the central conclusions to be drawn from its contents is that innovation to improve our international data and information infrastructure, and the thorough testing of innovations, should be a primary preoccupation for the immediate future.

This book brings together a selection of peer-reviewed papers and workshop syntheses from the 9th international conference held in Puyehue, Chile in November 2011. It is not a proceedings volume, as this role was fulfilled by the provision, to all participants, of unpublished electronic pre-prints of papers and posters that were accepted by a review board for presentation at the Conference.

Each ISCTSC conference is organized around key themes. The Chile Conference focused on “Scoping the Future While Staying on Track.” It sought a strategic balance between, on one hand, the anticipation of changing data needs that result from important ongoing shifts in major transport policy issues and, on the other, the imperative to ensure that benchmark data are stable and consistent enough for comparisons to be made across time. This is the contemporary context for identifying and researching developments in survey methods, in concert with other data sources, which can meet the resultant challenges. Of course, data collection methods are also subject to challenge and change, most notably in the current era from technological developments.

Our overarching goal is thus to review, critique, and update the body of knowledge about survey methodologies, in order to enhance the quality, value, and utility of the data that surveys provide for shaping transport practice, policy, and programs. This publication should be of substantial interest to analysts, planners, and researchers who provide information and knowledge to transportation policy makers and decision-makers. It should also be of interest to students and their teachers, because it considers the current condition of transport data and information to support good decision-making, and identifies where future improvements are needed. Finally, it has relevance for transportation policy makers and decision-makers who rely on good information and intelligence for the soundness of the work that they do.

The book’s international perspective would not be possible without the institutional infrastructure of the International Steering Committee for Travel Survey Conferences (ISCTSC). This sentiment, which is usually placed in the acknowledgments, resides here because of the significance of this organization for good practice in transport survey methods. The mission of the organization is to organize periodic international conferences on research into the conduct of transport surveys that support planning, policy, modeling, monitoring, and related issues for urban, rural, regional, intercity, and international person, vehicle, and commodity movements. The ISCTSC vision is the continuous improvement of transport survey methods, and of the information they provide to decision-makers, in both developed and developing countries. With respect to developing countries, it supports its mission through a scholarship fund that subsidizes conference attendance, and through the donation of conference publications, such as this, to developing-country libraries.

A professional volunteer organization, the ISCTSC has a rich legacy of past conferences and publications that began in the late 1970s with a small invitational conference in Eibsee, Germany, organized by Werner Brög and colleagues. Subsequent conferences were held in Hungerford Hill, Australia in 1983 (Ampt, Richardson, & Brög, 1985), Washington, DC in 1990 (Ampt, Richardson, & Meyburg, 1992), Steeple Aston, England in 1996 (Bonsall & Ampt, 1996), Eibsee, Germany in 1997 (TRB, 2000), Kruger Park, South Africa in 2001 (Stopher & Jones, 2003), Playa Herradura, Costa Rica in 2004 (Stopher & Stecher, 2006), and Annecy, France in 2008 (Bonnel, Lee-Gosselin, Zmud, & Madre, 2009).

With the publication of this book, and under a recent constitutional change that provides for overlapping cochair terms, we pass the ISCTSC baton to the recently elected cochairs. They are Marcela Munizaga, who so ably co-led the Local Organizing Committee for the Chile Conference (for a two-conference term), and the returning Tony Richardson, who provided much inspiration for the series as one of its “founding mothers and fathers” (for one conference). We sign off knowing that they will benefit, as have we, from the extraordinary goodwill that sustains the ISCTSC around the world, and confident that the future of the series is in excellent hands.

Johanna Zmud
Martin Lee-Gosselin
October 2012

References

Ampt, E. S., Richardson, A. J., & Brög, W. (Eds.). (1985). New survey methods in transport. Utrecht, The Netherlands: VNU Science Press.

Ampt, E. S., Richardson, A. J., & Meyburg, A. H. (Eds.). (1992). Selected readings in transport survey methodology. Melbourne, Australia: Eucalyptus Press.

Bonnel, P., Lee-Gosselin, M. E. H., Zmud, J., & Madre, J. L. (Eds.). (2009). Transport survey methods: Keeping up with a changing world. Bingley, UK: Emerald Group Publishing Limited.

Bonsall, P., & Ampt, E. S. (Eds.). (1996). Conference proceedings, 4th international conference on Survey Methods in Transport, Steeple Aston, Oxford, UK.

Stopher, P. R., & Jones, P. M. (Eds.). (2003). Transport survey quality and innovation. Oxford, UK: Pergamon.

Stopher, P. R., & Stecher, C. (Eds.). (2006). Travel survey methods: Quality and future directions. Oxford, UK: Elsevier.

Transportation Research Board. (2000). Transport surveys: Raising the standard. Transportation Research Circular E-C008, National Research Council, Washington, DC.

Acknowledgements

The conference in Chile was conceived and directed by the International Steering Committee for Travel Survey Conferences (ISCTSC), under the co-chairmanship of Martin Lee-Gosselin and Johanna Zmud. The conference was held in collaboration with the Institute for Complex Engineering Systems (ISCI) in Santiago, Chile. A Local Organizing Committee (LOC) in Chile deserves much credit for the success of the Chile Conference, both for its active role in the technical program and for its coordination of logistical arrangements.

For the period leading up to the conference through to the completion of the book the ISCTSC members were:

Carlos Arce, ArceZmud, LLC, USA
Tom Adler, Resource Systems Group, USA
Jimmy Armoogum, IFSTTAR, France
Patrick Bonnel, ENTPE, France
Chandra Bhat, The University of Texas at Austin, USA
Werner Brög, Socialdata, Germany
Kelly Clifton, Portland State University, USA
Martin Lee-Gosselin, Laval University, Canada (co-chair)
Jeff Guo, Beijing Transportation Research Center, China
Stephan Krygsman, University of Stellenbosch, South Africa
Peter Jones, University College London, United Kingdom
Jean-Loup Madre, IFSTTAR, France
Arnim Meyburg, Cornell University, USA
Catherine Morency, Polytechnique Montreal, Canada
Juan de Dios Ortúzar, Pontificia Universidad Católica de Chile, Chile (also on LOC)
Alan Pisarski, Consultant, USA
Tony Richardson, The Urban Transport Institute, Australia
Gerd Sammer, University of Natural Resources and Life Sciences, Vienna, Austria
Peter Stopher, University of Sydney, Australia
Orlando Strambi, Escola Politécnica de USP, Brazil
Harry Timmermans, Eindhoven University of Technology, Netherlands
Chester Wilmot, Louisiana State University, USA
Toshiyuki Yamamoto, Nagoya University, Japan
Johanna Zmud, RAND Corporation, USA (co-chair)
Dirk Zumkeller, Karlsruhe University, Germany

These two LOC co-chairs also served as members of the ISCTSC:

Juan Antonio Carrasco, Universidad de Concepción, Chile
Marcela Munizaga, Universidad de Chile, Chile

Other LOC members were:

Esteban Godoy, SECTRA, Chile
Luis Rizzi, Pontificia Universidad Católica de Chile, Chile
Alan Thomas, SECTRA, Chile
Alejandro Tudela, Universidad de Concepción, Chile

We would also like to acknowledge the workshop chairs and rapporteurs, who contributed substantially to the success of the conference, and Karla Jaramillo and Natalia Rivas, who assisted the LOC and also staffed the conference registration desk throughout the conference.

We are grateful to the following organizations that provided sponsorships for the conference:

• PTV NuStats, USA
• The Complex Engineering Systems Institute (ISCI), Chile

The sponsorship of these organizations enabled a number of scholarships to be awarded to delegates from countries throughout Latin America and elsewhere, none of whom would have been able to attend without this assistance.

The conference was also organized under the auspices of:

• Sociedad Chilena de Ingeniería de Transporte (SOCHITRAN)
• Subsecretaría de Transportes, Gobierno de Chile

The preparation of this book was greatly aided by the guidance of Cristina Irving and Claire Swift of Emerald Group Publishing Limited, and the work of ISCTSC’s publication coordinator Ana Arce Casas. We thank them for their attention to detail.

PART I SETTING THE CONTEXT

Chapter 1

Transport Surveys: Considerations for Decision Makers and Decision Making

Johanna Zmud, Martin Lee-Gosselin, Marcela Munizaga and Juan Antonio Carrasco

Abstract

This book provides an international perspective on improving information to support transportation decision making. It comprises a selection of papers plus workshop syntheses from the 9th International Conference on Transport Survey Methods in Chile in November 2011. The conference was organized into 14 workshops, and the paper presentations and discussions within those workshops formed the majority of the conference activity. The papers reported primarily on research pertaining to continuous improvement in transport survey methods, the backbone of the transportation data pipeline in most countries. But some papers also addressed the new ways in which innovation, notably technological innovation, is being applied to the capture and analysis of data to produce necessary information faster, better, and less expensively. The conference program built on a rich legacy of intellectual pursuits spanning the past two decades, and it is anticipated that the conference will continue into the future. Thus, the contents of this book represent a 5–10 year view through a moving window on the international state of the practice and concerns in transport survey methods.

Keywords: Location-aware technologies; decision processes; respondent interfaces; social context; new data streams; multi-horizon choices

1.1. Introduction

In today’s difficult global economy, governments are struggling with demands to increase basic services and to do so with fewer available resources. Governments must ask themselves where the marginal dollar of expenditure will have maximum impact. This is true across all sectors of private and public economies but particularly so in the transportation sector. Regardless of country, transportation infrastructure is in a critical state. Bridges are load-restricted, closed, or falling down from lack of maintenance; transit systems endure unending cycles of maintenance interruptions; congestion wastes commuters’ time and impedes logistics. Most governments’ statistics have shown that transport has become worse for just about everyone. In most major cities, journey times by all modes of transport have lengthened.

On the other hand, there are efficiencies to be gained. As the Chilean Transport Deputy Minister Gloria Hutt observed in the opening plenary, the last decade has seen an enormous change in access to communication technologies (e.g., satellite TV, mobile phones, the Internet). Such technologies have affected not only the functioning of the transportation system and the ways in which individuals organize their travel, but also the ways in which data are obtained and analyzed.

In hard economic times, policy makers are looking for every opportunity to spend less and get more “bang for the buck.” It is a time for smarter decisions, especially transportation investment and policy choices based on objective information. That takes data, and good data are getting increasingly harder to come by. How do we get the robust data needed for sound decision making? The answer lies in the collection of good passenger and freight transportation data on volumes, origin and destination flows, costs of travel, impacts of travel, influencers on demand, and substitutes for travel.
The papers in this book illustrate how travel behavior research is addressing the need for good data and better information about passenger and freight travel to support decision making. The papers span 14 different topics that contend with stepwise improvements in mainstream transport survey methods, technology applications that support new types of data collection and analysis methods, or innovative methods to address new policy or planning challenges. The ground covered (see Table 1.1) is worth noting, as these themes were not predetermined but were derived from more than 140 extended abstracts that were submitted to the ISCTSC in response to the Call for Papers — almost double the number submitted to the previous conference. The themes are thus in themselves an expression of contemporary priorities from the international community of transport survey researchers, and of the lively current interest in the field. As Table 1.1 indicates, the types of improvements deal with survey and sampling design, the use of non-survey data sources, data processing, and interpretation. While household travel behavior surveys were often the methodological focus of the papers in this book, the primary objective of most papers is to provide approaches to improved measurements of critical data regardless of the specific type of survey or other method employed. The terms ‘‘data,’’, ‘‘information,’’, and ‘‘knowledge’’ are frequently used in this book, often interchangeably. But in reality, there are distinctions that are important

Considerations for Decision Makers and Decision Making

5

Table 1.1: Workshop Topics at the 9th International Conference on Transport Survey Methods, 2011. 1. Bringing Location-Aware Technologies into the Travel Survey Mainstream: Complement or Stand-Alone? 2. Cognitive and Decision Processes Underlying Engagement in Stated Response Surveys 3. Methods for Capturing Multi-Horizon Choices 4. Designing New Survey Interfaces and Front-End Software 5. Exploring and Merging Passive Public Transport Data Streams 6. Validating Shifts in the Total Design of Travel Surveys 7. Survey Methods to Inform Policy: Environment, Energy, Climate and Natural Disasters 8. Measuring the Influence of Attitudes and Perceptions 9. Longitudinal Methods: Overcoming Challenges and Exploiting Benefits 10. Post-Processing of Spatio-Temporal Data 11. Comparative Research into Survey Methods 12. Multi-Method Data Collection to Support Integrated Regional Models 13. Alternative Approaches to Freight Surveys 14. Collecting Qualitative and Quantitative Data on the Social Context of Travel Behavior

to make and keep. Most of the papers in this book focus on data; that is, numbers, words, and images that have yet to be analyzed to produce information or statistics. Information is produced through processing, manipulating, and organizing data. Transport statistics are an important subset of information. Knowledge is attained by interpreting information received. All three concepts (data, information, knowledge) are prerequisites for good decision making. Taken as a whole, the papers in this book touch on improvements in all three areas and offer an international perspective on best practice for decision making. This introductory chapter explains the relevance of, and the context for, the papers that make up this book. The relevance is discussed in the next section, Considerations for Decision Makers, which addresses why decision makers should be concerned about data for decision making. The following section, Considerations for Good Decisions, provides an overview of the cross-cutting themes and key issues represented by the papers in this book.

1.2. Considerations for Decision Makers

When the international transport survey community gathered in Chile in November 2011, more than four years after the beginning of the financial crisis, “uncertainty” remained high. The United States was limping out of an


Johanna Zmud et al.

economic recession, the Euro Zone was dealing with several member economies near collapse, and financial distress was becoming increasingly widespread throughout the world. Regardless of country, conference delegates faced the same concerns about budget cuts for transportation survey programs. This “state of the world” was seen as unfortunate because modern economies run on statistics. Businesses, governments, and households base their decisions on them. All developed countries have some method of generating timely statistics on basic social, economic, and demographic attributes. Most have the same capability for generating transport statistics. In this era of tight budgets, such data are often undervalued and not considered to be the information assets that they really are. Transport data can inform decision makers about what really works; for example, how best to relieve congestion and improve supply-chain connectivity to make freight transportation more competitive. Good data can enable people and businesses to use the transportation system more efficiently and so contribute to a goal of universal mobility. Freight transportation volumes can be an early indicator of the state of the economy. There is plenty that decision makers can learn from good data thoughtfully used. As the 2003 Special Report 277 of the Transportation Research Board, Measuring Personal Travel and Goods Movement, noted, “without good data, decisions will be arbitrary, options overlooked, and solutions misguided” (National Research Council, 2003). A balance has always been needed between the importance of the information needs and the cost of collecting and supporting data with the necessary accuracy, detail, and timeliness. To achieve that balance, decision makers need to determine their information priorities and put in place arrangements to secure the quality of the data to satisfy these needs.
Undervaluing data, and the surveys and other methods used to obtain them, is imprudent at a time when our world needs good information and rich statistics. The issues that politicians and policy makers face have become more and more challenging and nuanced. In this context, policy decisions need to be based on careful and rigorous analysis using sound and transparent data. Such data are essential to issue recognition, program design, policy choice, and accurate forecasting, as well as to monitoring and evaluation. Transport surveys collect data that can be processed into information to make decisions. Data can be descriptive (i.e., the “what is” condition of the system) or diagnostic (i.e., the “what is wrong” condition, where “what is wrong” is measured as the disparity between “what is” and “what ought to be”). The processes by which information is derived from these data may take many forms, both qualitative and quantitative: research; analysis of data; economic and statistical modeling; cost/benefit analysis; and the aggregation of opinions and beliefs. The methodologies that are used to gather and synthesize the information are just as significant because they affect the quality of the information. Transport surveys are not one-time expenditures. Data are dynamic, not static. Updating of information is required as people’s understanding changes, as new research produces new results, as issues intrinsically change, and as new methods, approaches, or technologies become available to obtain necessary data. This is particularly important to ensuring adequate coverage of current and emerging data needs while maintaining comparable


indicators of transport demand and other measures over long periods of time. The conference organizers recognized the tension that is often present between methodological innovation on the one hand and protecting the comparability of survey results over time on the other, and adopted “scoping the future while staying on track” as the conference theme. Research such as that contained in this book is important for future decision making. There is a clear imperative for the development of increasingly creative and complex approaches to survey design, execution, and analysis. Smarter transportation decisions require comprehensive, accurate, and timely data about travel demand, infrastructure condition, travel time reliability, the equity of access, and environmental impacts. With such information, decision makers can better understand where and what the needs are, what works and what does not, and where the payoffs are greatest.

1.3. Considerations for Good Decisions

We argue above that good decisions require quality data and sound information, yet fiscal constraints limit the ability of public agencies to fund quality data collection efforts. As a result, transportation data users and suppliers are consistently pressed to find better, faster, and cheaper ways of collecting data. Thus, continuous improvement in transport survey data methods, procedures, and tools is an imperative, not a luxury. The first paper in the book, the keynote paper by Goulias, Pendyala, and Bhat (2011), goes beyond the notion of continuous improvement to present a new conceptual model for data to inform decision making. It presents an unabashed exploration of a “data collection paradise” that enumerates and explains the types of data needed for travel demand modeling and simulation related to the new generation of models for large-scale regional policy analysis. In doing so, the authors describe an “ideal” total design data collection method that seeks to observe individual and group behaviors embedded within their spatial, temporal, and social contexts. This is done through an approach that uses core and satellite survey components that can inform current and future model building. The remaining papers contained in this book address this concept of continuous improvement of existing data collection methods, while blazing a trail toward an ideal of total design that is sensitive to the future. They are organized into eight themes:

• Mainstreaming mobility-aware and online technologies
• Improving respondent interfaces
• Comparing survey modes and methods
• Facing up to sample attrition in longitudinal surveys
• Understanding the social context of data collection
• New challenges in dealing with time: environmental peaks and planning horizons
• New perspectives on observing choice processes: psychological factors
• New types of data streams: opportunities and challenges

These eight themes do not represent the contents of all of the papers presented at the conference or of all of the discussions. But taken as a whole they represent a well-balanced treatment of the state of practice in transport survey methods.

1.3.1.

Mainstreaming Mobility-Aware and Online Technologies

Three workshops and six of the papers addressed the mainstreaming of new technology supports for transport surveys. These papers relate to the use of mobile phones, the Global Positioning System (GPS), and the web to support interviews, and provide a good representation of the types of new technologies that are considered state-of-practice for travel surveys. The growth of interest in them has been substantial as the weaknesses and limitations of conventional survey practices, such as low response rates due to high respondent burden, have been well documented. Best practice and professional protocols dictate that survey managers and developers continually seek to reduce respondent burden through many mechanisms, including technology applications. The current financial climate also requires survey managers and developers to identify cost savings wherever possible. So the discussions in many, if not all, of the workshops took technological applications into account. The papers under this theme focused on extending and documenting the mainstreaming of mobility-aware devices. The Gould (2011) paper on the use of cell phones explored the types of travel information that are likely to be inferred from text surveys and from cell phone traces, recognizing that passive GPS traces might change the level of measurement and the inferences that can be made about travel behaviors. The Wolf, Wilhelm, Casas, and Sen (2011), Stopher, Prasad, Wargelin, and Minser (2011), and Sneade (2011) papers all looked at the mainstreaming of GPS technology. The first two of these GPS-focused papers discussed the opportunities and challenges of conducting GPS-only household travel surveys, which both sets of researchers point to as a likely future direction for household travel surveys. The Sneade (2011) paper, on the other hand, considered the ramifications of using GPS technology for a long-standing national travel survey in place of the travel diary.
The conclusion was not to replace the conventional methodology at this time. The Christensen (2011) paper likewise analyzed the effect of adding a web survey to a traditional telephone-based national travel survey by asking the respondents to check in on the web and answer the questions there. In this case, the conclusion was more optimistic. In related workshop discussions, one criticism leveled at the current state of the practice was that, in the application and use of GPS and other devices, there is a tendency to “replicate old methods with technology” rather than seek new designs that make the most of the advances of the new technologies. This tendency is driven, in part,


by fear of rupture in the comparability over time of established repeated cross-sectional, continuous, or longitudinal surveys. This is certainly the case in the research conducted by Sneade and Christensen (Christensen, 2011; Sneade, 2011). However, as the Wolf and Stopher papers (Wolf et al., 2011; Stopher et al., 2011) highlight, there is a desire to push the envelope in terms of how much of the data are collected via technology devices: from small subsamples, to larger subsamples, to 100 percent of data collected. Also seen was the promise of using mobility-aware devices to observe travel behavior over longer periods of time than are covered by most established travel surveys. These longer periods may extend to weeks rather than days. Some improvements in what might be called “intelligent passive” observation show promise for reducing the respondent burden associated with seven-day diary methods, while the major use of web-based tools in such surveys may possibly shift from validation (as in prompted recall) to keeping respondents engaged and interested. Web-based prompted recall techniques are also being used more readily. This development has much to do with the availability of ancillary data, such as GIS layers of road and transit network characteristics, and the ability of survey designers to manipulate and integrate these data into their survey approaches so as to collect more accurate information on the amount of travel as well as travel origins and destinations. Notwithstanding these examples of continuous improvement, the papers and workshop syntheses point to numerous examples of survey methods work that remains to be done in technology-based travel surveys. There are new classes of selection bias and response bias associated with such surveys that have yet to be suitably addressed. What will be the future trade-offs between achieving probability samples and attaining coverage of the survey population?
Should social networks be used to implement snowball sampling? The workshop syntheses focus on important issues in understanding the implications of implementing changes in survey design, such as the use of GPS devices or the development of online survey systems. What are the implications in terms of the validity and reliability of the resulting information and its utility for transportation planning and policy making?

1.3.2.

Improving Respondent Interfaces

Improving transport survey data depends to a large degree on attracting respondents to participate in transport survey activities and maintaining their interest over the course of the survey period. Transport survey methodologists are increasingly turning to information technologies and geomatics to change the way in which respondents interact with a survey, to enhance respondent interest in surveys, to decrease respondent burden, to lower costs and, eventually, to design continuous self-administered surveys that are predominantly passive. There are still considerable challenges to understanding the usability and relevance of these new survey interfaces. Good interfaces depend on a strong understanding of web technologies and an excellent sense of graphic design, layout, and style to build high-performance


front-end user-interface components that engage the users. The conception of survey user interfaces does not seem to be going through a sudden paradigm shift, but rather a steady growth of the role of technology through the addition of multiple modes and their continuous evolution. Three papers and two workshop syntheses examined the current state of the practice for improving respondent interfaces. In Thériault, Lee-Gosselin, Alexandre, Théberge, and Dieumegarde (2011), researchers evaluated a new set of functionalities deployed in a web survey interface to collect personal travel behavior data. This interface used applets developed in Java, together with Google Maps, to assist the recording of activity places (geocoding) and the reporting of actual trips into a relational database, while using e-mail to recruit and support respondents. In Bourbonnais and Morency (2011), researchers demonstrated the usability of the web to conduct a large-scale household travel survey in metropolitan areas and for large trip generators like universities. The paper presented the development and implementation process for a web-based tool as well as various statistics on the way respondents interacted with the tool. Cavalcante and Roorda (2011) investigated the impact of a new survey interface in the context of a web-based stated preference (SP) survey to estimate a modeling system of shipper-carrier interactions in the logistics services market. These papers and the discussion on this topic at the conference pointed to the need for a better understanding of the potential for negative impact of new user interfaces on bias as well as possible positive impacts on response rates and accessibility. The challenge is to add new interactive platforms while making an effort to stay compatible, or equivalent, with previous survey efforts.
This latter need is important in order to generate datasets that are comparable with historical data, which enable longitudinal analyses and the understanding of changes that occur over time.

1.3.3.

Comparing Survey Modes and Methods

Surveys are currently the main method for collecting essential transport data. Survey methods, however, evolve constantly. In the 1970s, the debate among survey researchers was over the acceptability of random digit dial phone surveys, compared with the much more expensive face-to-face interviews of randomly selected households and mail surveys. In the 1990s and 2000s, the debate was over the acceptability of computer-administered interviews. Today, advances in communication technology continually alter the most effective ways to reach people, requiring researchers to decide which approaches to sample selection and survey administration will yield data appropriate to answer important questions. Meanwhile, advances in information technology have altered the most effective ways to obtain and process geo-located information. As already noted, there is considerable interest in and uptake of new technologies in the conduct of transport surveys. Given the potential practical benefits associated with technology-supported surveys, as well as the expected wider application of these technologies in future survey research, it is important and indeed necessary to understand the benefits and limitations these newer methods bring to transport data.


The workshop discussion on this topic at the conference noted that comparative research is needed because the best method for collecting any kind of data also depends on the purpose of the study, namely the way the resulting data will be used. In some cases, data are collected only to study the availability and use of transport; in other cases they may feed trip-based, activity-based, or micro-simulation models. Three papers reported on research that compares survey modes and methods. Kagerbauer, Manz, and Zumkeller (2011) compared three household travel survey methods, PAPI (paper and pencil interview), CATI (computer-assisted telephone interview), and CAWI (computer-assisted web interview), on survey participation rates. Kohla and Meschik (2011) compared PAPI, passive GPS tracking, active GPS tracking, and prompted recall interviews in terms of the accuracy of the data reports. Papon (2011) compared biographic surveys with traditional cross-section travel surveys to assess impact on response rates and survival bias. All three papers conclude that impacts differ by survey mode. Today the challenge is how to conduct surveys in a world where the modes of communication have proliferated, where cell phones are as prevalent as land lines, where market research is common over the Internet, but where no one mode is likely to cover all people in the population equally well and no two modes can be said to have comparable impacts on data reports. Accordingly, the synthesis of the workshop on this theme focuses on the need for additional survey research.

1.3.4.

Facing Up to Sample Attrition in Longitudinal Surveys

In past decades, we have observed continuous increases in travel demand along with economic growth. Under such circumstances, infrequent travel surveys were often sufficient for monitoring travel demand. Since the 1990s, as per capita growth of everyday travel has leveled off significantly in many industrialized countries, we have observed very heterogeneous development in travel demand among different population segments, such as continuing growth of car use among elderly people alongside signs of decreasing car use among the young. In addition, intrapersonal behavioral variation is growing, with escalating variation in mode use. In light of these new developments, the requirements for data on personal travel have changed. As a way to meet these new travel data requirements, there is rising interest in longitudinal, continuous, and panel surveys. A special problem of longitudinal surveys is sample attrition. Generally there is a relationship between the complexity of a survey, the resulting respondent burden, and the effect on response rates. Two papers in this book address this challenge. Chikaraishi, Fujiwara, Zhang, and Zumkeller (2011) examine how to design smaller surveys while minimizing the loss of necessary information. The study extends previous studies on sampling designs for travel diary surveys by dealing with statistical relations between sample size, survey duration for each wave, and frequency of observation, and provides numerical and empirical results to show how the proposed method works. Chlond, Wirtz, and Zumkeller (2011) then look at how to assess the completeness of


reported mobility in longitudinal surveys. They find that reporting behaviors are different depending on the number of repetitions. These effects positively influence the quality and completeness and therefore the reliability of recorded mobility figures in multi-period mobility surveys.

1.3.5.

Understanding the Social Context of Data Collection

Interest in understanding the social context of travel behavior is recent, surfacing only within the past decade. Interest stems both from questions about the quality of data that have been collected and from the policy concerns that have prompted new data collection activities. The scope of these questions requires research using innovative techniques that are derived from the diversity of methods employed in the social sciences, both qualitative and quantitative. Understanding the social context of the data collection enables the designer of the survey or the user of the data to understand the inherent challenges in eliciting participation and the problems that might arise in using the resulting data. A discussion of the papers in this book that focus on the issue sheds additional light on the topic. Carrasco, Bustos, and Cid-Aguayo (2011) investigated the role of social networks in travel behavior through a new data collection effort that uses social networks to collect a wide array of information about the social, urban, and temporal context in which social activity-travel behavior occurs. A special focus was on how these techniques help to understand the role of income and access to amenities in those spatial and temporal patterns. Lucas (2011) took a different approach to research on the social context of data collection. She explored how “action research” can be used in transport research to resolve major transport policy challenges, such as the mitigation of climate change and environmental impacts, transport-related social exclusion, and intergenerational equity issues. The method is specifically designed to support and actively engineer behavior change as an integral part of the research process. A distinctive feature is that the process is inherently collaborative and involves repeated exchanges between the researcher and the “researched.” As the workshop report on this topic illustrates, much future research is needed on this topic.
The workshop discussion raised more questions than could be answered about what social context is, the utility of including it in transport research, and the best approaches for collecting information about it.

1.3.6.

New Challenges in Dealing with Time: Environmental Peaks and Planning Horizons

Many of the topics of the conference workshops were extensions of issues and discussions from previous conferences. But two topics raised new methodological interests and questions, both arising in the context of specific policy questions. One looked at multi-method survey packages and the other at collecting


data on the interaction between day-to-day tactical decisions (such as destination and travel mode) and longer-term strategic decisions (such as residence location or vehicle access). A finding of the multi-method workshop was that transport and land use models will be progressively embedded in a comprehensive system of integrated urban models that also includes population demographics, markets for education, jobs, and houses, the demographics of firms, and flows of energy, water, waste, and pollution, much like that advanced in the keynote paper by Goulias et al. (2011). This will bring both challenges (e.g., the interoperability of data methods) and opportunities for the exchange of data. As these challenges are in the context of specific policy questions, the methods brought to bear must be at their most flexible and creative. Three papers provided a current snapshot of representative research in terms of these new challenges. Le Vine, Sivakumar, Lee-Gosselin, and Polak (2011) examined methods for capturing choice preferences that had different time horizons but were linked in a strategic-tactical structure: purchasing “mobility resources,” which include commitments such as car ownership and subscriptions to car-sharing services, and choosing a mode of transport for a particular instance of travel. Methodological innovation was brought to the task in that respondents were asked to indicate their choices in the context of giving advice to a demographically similar “avatar.” Arellana, Ortúzar, and Rizzi (2011) also focused on innovations in the capture of choice data, specifically departure time choice. Departure time choice depends on the desire to carry out activities at certain times and places, influenced by travel conditions, congestion levels, activity schedules, and external trip factors. The paper reports on a complex data collection procedure allowing the researchers to obtain detailed input data from different sources and at different time periods.
Wilmot and Gudishala (2011) also look at time-dependent stated choice. Here they develop and present hypothetical storms in a video, employing a sequence of scenarios showing prevailing conditions at discrete points in time as each storm approaches land in order to develop more accurate evacuation demand models. As the workshop synthesis covering this topic explains, multi-horizon choices are made within a context that changes over time. Thus, representation of context is crucial in multi-horizon decisions because many choices are highly constrained. Some of the critical dimensions of the context identified in the workshop include economic, time and space constraints and considerations. The context of decision making develops as an interaction between the larger environment (built environment, regional economy, culture, technology) and the state of the individual decision maker (their own economic and physical resources, social network). It is within this context that processes and outcomes then interact and survey methodologies must adapt and change.

1.3.7.

New Perspectives on Observing Choice Processes: Psychological Factors

Surveys sometimes fail to meet expectations because respondents are uninterested or disengaged. This can be easy to explain. For example, respondents are often expected


to distinguish between genuine telephone surveys and sales calls that are disguised as surveys. Having been deceived once, a respondent may refuse to respond to any call that announces their selection to participate in a survey. In other circumstances, respondents may have agreed to take part in a survey, and be totally convinced of its bona fides, but their pattern of responses defies any reasonable logic, and this may (or may not) indicate disengagement with the survey task. This is particularly problematic in stated response (SR) surveys, i.e., those used to assess expected behavior in hypothetical situations, and especially those employing stated choice experiments in which respondents are expected to trade off alternatives whose attributes (e.g., travel time, travel cost, and comfort) are varied in accordance with an experimental design. Apparent disengagement may manifest itself in a variety of ways, such as over-rapid responses to questions that require some deliberation, high levels of incomplete responses, or unlikely patterns of responses. For example, is a respondent who simply picks the cheapest alternative in every question telling us that they are bored, that they did not understand the instructions, or that they use that simple heuristic to make choices in real life? Although these problems have long been discussed in relation to decision theory, we picked this issue as a priority because survey designers are increasingly looking for tools to explain dubious response patterns and modify survey designs accordingly. This is in fact quite a complex problem area, and it was addressed in the conference in three ways. The first way was through a workshop that dealt specifically with cognitive processes underlying disengagement in SR surveys. It concluded that a sweeping assumption that any inexplicable response pattern simply replicates the respondent’s real-life approach to decision making, tempting as it may be, does not survive serious scrutiny.
A series of experiments was recommended to improve the detection of low engagement with the SR survey tasks, including background logging of response times and patterns in the case of computerized choice experiments (Weis, Dobler, & Axhausen, 2011), and to identify causes of low engagement and explore correlates of those causes with personal and contextual characteristics (Bonsall, Schade, Roessger, & Lythgoe, 2011). The second way was through a review of the measurement of perceptions and attitudes, treating such variables both as influences on various dimensions of travel choice and as potential explanations of aspects of the choice process, as noted in the papers of Donoso, Munizaga, and Rivera (2011) and Tudela, Nurul Habib, and Idris (2011). The third way was part of the deliberations of the workshop on the social context of data collection, already discussed in Section 1.3.5, in particular regarding an iterative procedure in which qualitative data provide relatively simple stories that explain quantitative findings and lead to a more complex analytical understanding. This too throws light on the psychology of responses to surveys, notably because the greatest challenge to instrument designers is dealing with framing, which translates to the context that respondents assume is “behind” the questions posed. Once again, this is particularly problematic for SR surveys, especially if some of the hypothetical choices are perceived to be socially desirable or undesirable.
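The three signs of apparent disengagement mentioned in this section (over-rapid responses, high levels of incomplete responses, and always picking the cheapest alternative) can be illustrated with a minimal Python sketch that works on logged response times. It is not the procedure of any paper cited here: the record layout, the function name, and all thresholds are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ChoiceTask:
    """One stated choice question as logged by a hypothetical survey tool."""
    costs: List[float]      # cost attribute of each alternative shown
    chosen: Optional[int]   # index of the chosen alternative, None if skipped
    seconds: float          # logged response time for this question


def engagement_flags(
    tasks: List[ChoiceTask],
    min_seconds: float = 2.0,         # assumed cut-off for an "over-rapid" answer
    max_rapid_share: float = 0.5,     # assumed tolerated share of rapid answers
    max_missing_share: float = 0.25,  # assumed tolerated share of skipped questions
) -> Dict[str, bool]:
    """Return descriptive flags for one respondent; none of them proves disengagement."""
    answered = [t for t in tasks if t.chosen is not None]
    missing_share = 1.0 - len(answered) / len(tasks) if tasks else 0.0
    rapid_share = (
        sum(t.seconds < min_seconds for t in answered) / len(answered)
        if answered else 0.0
    )
    # True only if every answered question went to the lowest-cost alternative.
    always_cheapest = bool(answered) and all(
        t.costs[t.chosen] == min(t.costs) for t in answered
    )
    return {
        "over_rapid": rapid_share > max_rapid_share,
        "incomplete": missing_share > max_missing_share,
        "always_cheapest": always_cheapest,
    }
```

A raised flag is only a prompt for closer inspection. As noted above, a respondent who always picks the cheapest alternative may be bored, may have misunderstood the instructions, or may genuinely use that heuristic in real life.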

1.3.8.


New Types of Data Streams: Opportunities and Challenges

Six of the papers and two workshop syntheses address new data streams, both in the realm of public transport surveys and for travel behavior capture. In the area of public transport data we find several papers that examined the alignment of surveys with administrative data. Administrative data take the form of information coming from fare systems, online travel planners, network inventories, or financial transactions. Chu and Chapleau (2011) used data from transit smart card automatic fare collection (AFC) systems to synthesize individual-level attributes of users by summarizing multi-day validation records from each card. The new dimensions were then transposed to various levels of aggregation and studied simultaneously in multivariate analysis. They discuss the limitations, biases, and strategies of doing so. Beltrán, Gschwender, Munizaga, Ortega, and Palma (2011) explored the possibility of automatically generating level of service indicators by processing the raw automatic vehicle location (AVL) and AFC data that are used for operational planning and monitoring of the public transport system of Santiago, Chile. The advantage of doing this was that these measurements and estimates were found to be reliable because they were obtained from large samples, and cost-effective because the analysis was executed at nearly no cost. Likewise, Devillaine, Munizaga, Palma, and Zúñiga (2011) used AFC data to estimate highly representative, although not bias-free, origin-destination (OD) matrices. The researchers applied two methods of validation: endogenous and exogenous. As the workshop synthesis pertaining to this theme suggests, smart card systems and other passive data streams offer promising avenues for operational, tactical, and strategic planning of transportation networks, meeting the criteria of obtaining data faster, better, and at lower cost.
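To give a sense of what summarizing multi-day validation records from each card can look like in practice, the following Python sketch derives a few card-level attributes from raw fare validations. It is an illustration only and does not reproduce the method of Chu and Chapleau (2011): the record layout (card identifier, ISO timestamp, stop identifier), the function name, and the three summary attributes are all assumptions.

```python
from collections import Counter, defaultdict
from datetime import datetime
from typing import Dict, Iterable, Tuple

# Hypothetical record layout: (card_id, ISO 8601 timestamp, stop_id)
Validation = Tuple[str, str, str]


def summarize_cards(validations: Iterable[Validation]) -> Dict[str, dict]:
    """Collapse multi-day validation records into one summary per card."""
    by_card = defaultdict(list)
    for card_id, timestamp, stop_id in validations:
        by_card[card_id].append((datetime.fromisoformat(timestamp), stop_id))

    summary = {}
    for card_id, records in by_card.items():
        active_days = {when.date() for when, _ in records}
        stop_counts = Counter(stop for _, stop in records)
        summary[card_id] = {
            # Number of distinct calendar days on which the card was used.
            "active_days": len(active_days),
            # Average number of validations on the days the card was used.
            "validations_per_active_day": len(records) / len(active_days),
            # Stop with the most validations (ties broken by first appearance).
            "most_used_stop": stop_counts.most_common(1)[0][0],
        }
    return summary


records = [
    ("A", "2011-11-14T07:30:00", "S1"),
    ("A", "2011-11-14T17:45:00", "S9"),
    ("A", "2011-11-15T07:32:00", "S1"),
    ("B", "2011-11-14T12:00:00", "S4"),
]
card_summary = summarize_cards(records)
```

Attributes of this kind describe the card rather than the person: a card may be shared or a traveller may hold several, which is one reason the limitations and biases of such data need to be discussed alongside any results.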
In the area of travel behavior capture, an important element was best practice for post-processing of spatio-temporal micro-data. Although there is much work left to do, important strides are being made in areas such as automatic mode detection and the interpretation of short dwell periods (including mode transfers). Interestingly, some of these strides are being made without the provision of ancillary data, such as GIS layers of road and transit network characteristics, a promising development in parts of the world where such data are incomplete, inaccurate, or absent. Greaves and Ellison (2011) described the system setup and processing requirements for a long-duration longitudinal GPS/prompted recall survey conducted in Sydney, Australia, using an in-car GPS device together with a prompted recall interface accessed over the Internet by participants. Their approach includes an important assessment of participant burden and cognition, made by analyzing each respondent’s prompted recall activity and comparing his or her responses to information inferred from the GPS data. Smoreda, Olteanu-Raimond, and Couronné (2011), on the other hand, tested several alternative methods of collecting data (active and passive localization) from mobile phones for personal mobility analysis. They define active localization as being akin to a personal travel diary, while passive localization is based solely on phone network data, which are automatically recorded for technical or billing purposes. Smoreda et al.’s work begins to fulfill the promise of future directions that surfaced in the workshop discussion related to this theme. Workshop members expressed the hope that longer-term research will lead to mode detection that is independent of user feedback. It was also hoped that it would become possible to link spatio-temporal data to smart card and other data sources. It is probably fair to say that a majority of the international transport survey methods community assumes that advanced technology supports will inevitably play an increasing role in the next 5–10 years, especially in surveys of personal travel, and that they may well radically transform some stages of the transport data collection “supply chain”.

1.4. Summary

The editors of this volume observe, from successive conferences in the ISCTSC series, that differences around the world in mainstream transport survey methods are slowly diminishing. For example, the recent developments and experiments in technology-aided surveys, including web-based methods, have widened a debate that not so long ago was confined to the relative merits of personal interviews, telephone interviews, and postal questionnaires, each of which had its national champions. Most of the major country players in that debate have seen response rates decline, and all have been engaged in survey research that includes technologies. At the same time, Dillman’s notion of “quality at every stage” has become orthodox. However, so has the imperative to be efficient in the wake of the international financial crisis that accelerated dramatically just a few months after the previous conference in Annecy in May 2008. The emphasis in most of the Annecy workshop discussions was on developing practical, achievable, and affordable strategies for the collection of essential transport data that would be less contingent on shifting political and funding priorities. At the conclusion of the 2008 conference, cross-cutting goals were identified focusing on stable, continuous national surveys that take full advantage of technological developments in collection, analysis, and visualization.

The 2011 Chile conference ended on a rather different consensus. While recognizing the methodological progress consolidated in the conference, the consensus was summed up by co-chair Johanna Zmud and (for the LOC) by Juan de Dios Ortúzar as a need for serious self-examination. They translated this into the following eight questions, for which the conference outputs could provide some, but not necessarily complete, responses:

1. Are we doing our job properly?
2. Can we really capture “the universe”?
3. Are we generalizing about new methods from biased information?
4. Are we still too focused on “what” or “how” and not enough on “why”?
5. Are we exploiting our understanding of decision making processes?
6. Are we asking the right questions?
7. Where are we in understanding what we are trying to improve?
8. Are we chasing response at the expense of scientific rigor?

Accordingly, one of the central messages of the Chile Conference is that innovation, and the thorough testing of innovations, should be our main preoccupation for the immediate future if we expect to produce the data that wise transport planning decisions and policies require.

References

Arellana, J. A., Ortúzar, J. de D., & Rizzi, L. I. (2011). Survey data to model time-of-day choice. Paper presented at the 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile, November 14–18.
Beltrán, P., Gschwender, A., Munizaga, M., Ortega, M., & Palma, C. (2011). Indirect measurement of level of service variables for the public transport system of Santiago using passive data. Paper presented at the 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile, November 14–18.
Bonsall, P. W., Schade, J., Roessger, L., & Lythgoe, B. (2011). Can we believe what they tell us? Factors affecting people’s engagement with survey task. Paper presented at the 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile, November 14–18.
Bourbonnais, P., & Morency, C. (2011). Web-based personal travel survey: A demo. Paper presented at the 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile, November 14–18.
Carrasco, J. A., Bustos, C., & Cid-Aguayo, B. (2011). Affective personal networks versus daily contacts: Analysing different name generators in an activity-travel behaviour context. Paper presented at the 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile, November 14–18.
Cavalcante, R., & Roorda, M. (2011). Shipper/carrier interactions data collection: Web-based respondent customized stated preference (WRCSP) survey. Paper presented at the 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile, November 14–18.
Chikaraishi, M., Fujiwara, A., Zhang, J., & Zumkeller, D. (2011). Optimal sampling designs for multi-day and multi-period panel surveys. Paper presented at the 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile, November 14–18.
Chlond, B., Wirtz, M., & Zumkeller, D. (2011). Do dropouts really hurt? – Considerations about data quality and completeness in combined multiday and panel surveys. Paper presented at the 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile, November 14–18.
Christensen, L. (2011). The role of web interviews as part of a national travel survey. Paper presented at the 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile, November 14–18.
Chu, K. K. A., & Chapleau, R. (2011). Smart card validation data as a multi-day transit panel survey to investigate individual and aggregate variation in travel behaviour. Paper presented at the 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile, November 14–18.
Devillaine, F., Munizaga, M., Palma, C., & Zúñiga, M. (2011). Obtaining a reliable origin destination matrix from massive smartcard and GPS data: Application to Santiago. Paper presented at the 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile, November 14–18.
Donoso, P., Munizaga, M., & Rivera, J. (2011). Measuring user satisfaction in transport services: Methodology and application. Paper presented at the 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile, November 14–18.
Gould, J. (2011). Cell phone enabled travel surveys: The medium moves the message. Paper presented at the 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile, November 14–18.
Goulias, K. W., Pendyala, R. M., & Bhat, C. R. (2011). Total design data needs for large scale activity microsimulation models. Paper presented at the 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile, November 14–18.
Greaves, S. P., & Ellison, R. (2011). A GPS/web-based solution for multi-day travel surveys: Processing requirements and participant reaction. Paper presented at the 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile, November 14–18.
Kagerbauer, M., Manz, W., & Zumkeller, D. (2011). Methodological analysis of different methods within one multi-day household travel survey – PAPI, CATI, and CAWI in comparison. Paper presented at the 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile, November 14–18.
Kohla, B., & Meschik, M. (2011). Comparing trip diaries with GPS tracking – Results of a comprehensive Austrian study. Paper presented at the 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile, November 14–18.
Le Vine, S., Sivakumar, A., Lee-Gosselin, M., & Polak, J. (2011). Empirically-constrained efficiency in a strategic-tactical stated choice survey of the usage of patterns of emerging carsharing services. Paper presented at the 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile, November 14–18.
Lucas, K. (2011). Qualitative methods in transport research: Is ‘action research’ a methodology too far? Paper presented at the 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile, November 14–18.
National Research Council. (2003). Measuring personal travel and goods movement: A review of the Bureau of Transportation Statistics’ surveys – special report 277. Washington, DC: The National Academies Press.
Papon, F. (2011). Correcting biographic survey data biases to compare with cross-sectional travel surveys. Paper presented at the 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile, November 14–18.
Smoreda, Z., Olteanu-Raimond, A. M., & Couronné, T. (2011). Spatio-temporal data from mobile phones for personal mobility assessment. Paper presented at the 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile, November 14–18.
Sneade, A. (2011). Using accelerometer equipped GPS devices in place of paper travel diaries to reduce respondent burden in a national travel survey. Paper presented at the 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile, November 14–18.
Stopher, P. R., Prasad, C., Wargelin, L., & Minser, J. (2011). Conducting a GPS-only household travel survey. Paper presented at the 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile, November 14–18.
Thériault, M., Lee-Gosselin, M., Alexandre, L., Théberge, F., & Dieumegarde, L. (2011). Web versus pencil-and-paper surveys of weekly mobility: Conviviality, technical and private issues. Paper presented at the 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile, November 14–18.
Tudela, A., Nurul Habib, K. M., & Idris, A. O. (2011). Semantic approach to capture psychological factors affecting mode choice: Comparative results from Canada and Chile. Paper presented at the 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile, November 14–18.
Weis, C., Dobler, C., & Axhausen, K. W. (2011). A stated adaptation approach to surveying and modelling household activity. Paper presented at the 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile, November 14–18.
Wilmot, C., & Gudishala, R. (2011). Collection of time-dependent data using audio-visual stated choice. Paper presented at the 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile, November 14–18.
Wolf, J., Wilhelm, J., Casas, J., & Sen, S. (2011). A case study: Multiple data collection methods and the NY/NJ/CT regional travel survey. Paper presented at the 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile, November 14–18.

Chapter 2

Keynote: Total Design Data Needs for the New Generation Large-Scale Activity Microsimulation Models

Konstadinos G. Goulias, Ram M. Pendyala and Chandra R. Bhat

Abstract

Purpose: In this paper we describe a total design data collection method (expanding the definition of the usual “total design” terminology used in typical household travel surveys) to emphasize the need to describe individual and group behaviors embedded within their spatial, temporal, and social contexts.

Methodology/approach: We first offer an overview of recently developed modeling and simulation applications, predominantly in North America, followed by a summary of the data needs in typical modeling and simulation modules for statewide and regional travel demand forecasting. We then proceed to describe an ideal data collection scheme with core and satellite survey components that can inform current and future model building. Mention is also made of the currently implemented California Household Travel Survey, which brings together multiple agencies, modeling goals, and data collection component surveys.

Findings: The preparation of this paper involved reviewing emerging transportation modeling approaches and paradigms, policy questions, and behavioral issues and considerations that are important in the multimodal transportation planning context. It was found that many of the questions being asked of policy makers in the transportation domain require a deep understanding of the interactions and constraints under which individuals make activity-travel choices, the learning processes at play, and the attitudes and perceptions that shape ways in which people adjust their travel behavior in response to policy interventions. Based on the work, it was found that many of the traditional travel survey designs are not able to provide the comprehensive data needed to estimate activity-based model systems that truly capture the full range of behavioral considerations and phenomena of importance.

Originality/value of paper: This paper offers a review of the emerging transportation modeling approaches and behavioral paradigms of importance in activity-based travel demand forecasting. The paper discusses how traditional travel survey designs are inadequate to meet the data needs of emerging modeling approaches. Based on a review of all of the data needs and new data collection methods that are making it possible to observe a full range of human behaviors, the paper offers a total survey data collection design that brings together many different surveys and data collection protocols. The core household travel survey is augmented by a full slate of special purpose surveys that together yield a rich behavioral database for activity-based microsimulation modeling. The paper is a valuable reference for transportation planners and modelers interested in developing data collection enterprises that will feed the next generation of transportation models.

Keywords: Data; surveys; diaries; time-use; activity-based; forecasting

Transport Survey Methods: Best Practice for Decision Making. Copyright © 2013 by Emerald Group Publishing Limited. All rights of reproduction in any form reserved. ISBN: 978-1-78190-287-5

2.1. Introduction and Background

Our recent experience testing a new generation of models for large-scale policy analysis in a substantially heterogeneous region in the United States motivates this paper, which enumerates and explains the type of data needed for travel demand modeling and simulation. The fundamental motivation for our travel model system emerged from the policies examined, which include the following:

1. land use changes to increase density and diversity of development (e.g., the California Senate Bill 375 that requires coordination of policies on land use with transportation and the creation of sustainability plans; see Appendix 2.A.1 for an excerpt of the requirements);
2. impacts of changing demographics due to aging of population, in- and out-migration, and changes in fertility, mortality, as well as changes in educational attainment, family formation and dissolution, and changes in employment achievement and prospects;
3. impacts of market introduction and penetration of new types of vehicles including flexible fuel, electric and hybrid vehicles, as well as the resulting changes in the composition and use of household vehicle fleets;
4. new development and the addition of roadway and transit infrastructure components;
5. pricing of services (including parking and toll roads) and access restrictions to part of a city; and
6. spatio-temporal changes through policy interventions in the housing, school, and employment patterns.

A modular simulation model system that we have developed to support the above policy needs includes the following components: (1) synthetic population generation to recreate the resident population of a study area; (2) household evolution and spatial location choice models (e.g., for residence, work, and school) to represent socio-demographic changes at the most elementary level of a person and a household, and concomitant changes in the ability to participate in activities; (3) highway and transit accessibility to capture changes in the activity opportunities and the transportation systems that serve them; (4) activity scheduling and daily simulation to represent individual and household activity and travel; (5) vehicle utilization and allocation within the households; (6) interfaces with other travel demand models, including network assignment; and (7) energy consumption and emission estimation. These components are strung together either in a sequential manner or with important “feedbacks” to ensure coherency in the overall simulation. It should be noted that these components are also found in other modeling and simulation applications (Donnelly, Erhardt, Moeckel, & Davidson, 2010). In addition, a variety of data exchange and model interfaces are also designed to convert data from one form to another, and provide databases and maps for verification, validation, and visualization of policy scenarios.
The core data used by many of these components are from household activity-travel surveys that collect information on (1) household-level characteristics (e.g., household location, household structure, life cycle stage, income, vehicle ownership, bicycle ownership, housing characteristics, and house ownership); (2) individual-level characteristics (e.g., age, gender, race, ethnicity, education, student status and school location, employment status, employment location, work hours, and possession of a driver’s license); and (3) the travel undertaken by individuals, which includes how, why, when, and where they traveled. Time use and activity surveys (and their place-based survey variants) may also collect additional information on the activities in which individuals participate, such as the timing and duration of in-home, at-work, and other-place activities. Many household travel surveys also include information about the fleet of household vehicles owned, including the make, model, year of manufacture, and additional information about each vehicle and related transactions. Surrounding these core data elements are land use and infrastructure data that include information on the spatial residential characteristics of households, employment locations, school and activity opportunity locations (often aggregated at the level of traffic analysis zones), and transportation network data that include highway network data (roadway functional class, distance, direction, number of lanes, hourly capacity, posted speed limit, and so forth) and transit network data (routes followed by buses and trains, the frequency of service, travel speeds, distance, and travel time among nodes in the network).

These data elements are used by major public agencies to develop and test simulation models. However, such data and information are not always sufficient to estimate models that assess policies at the level of individual decision makers. For this reason, supplemental data are obtained from secondary sources (e.g., the US census and other ad hoc social and demographic surveys) or from specifically designed surveys to answer policy questions (e.g., stated preference surveys). The joint use of different surveys for the same model development task requires data adjustments and courageous assumptions to make the surveys coherent in time, space, and the social milieu.

We describe in this paper a total design data collection method (expanding the definition of the usual “total design” terminology used in typical household travel surveys) to emphasize the need to describe individual and group behaviors embedded within their spatial, temporal, and social contexts. We first offer a summary of the data needs in a typical modeling and simulation application for regional travel demand forecasting in the United States (see also Bhat et al., 2012; Goulias et al., 2011; Pendyala et al., 2011; Vyas et al., 2011), and point out important data gaps in key descriptors of behavior as well as other external data we need for model development and verification tasks. We then proceed to create a staged development approach with core and satellite surveys to sketch out an ideal data collection program that can inform current and future model building. We also describe briefly a new generation of data collection efforts that contain additional core questions and survey modules to inform the immediate next generation of activity-travel simulation models.

Many of the ideas discussed in this paper are extracted by merging our experience with modeling and simulation reviews by developers of activity-based models in the United States (see Bowman, 2009; Ferdous, Sener, Bhat, & Reeder, 2009; Rossi, Bowman, Vovsha, Goulias, & Pendyala, 2010; SimTRAVEL Research Initiative, 2008–2011), Europe (see SustainCity, 2012), and less developed countries (Yagi & Mohammadian, 2010). There are important variants to these modeling and simulation approaches (some of them currently being developed by the authors of this paper) that are not included here, both to keep the presentation tractable and to focus on developments that have a stronger connection to the applications and practice that we see emerging in regional travel models in the United States.

2.2. Travel Demand Forecasting Model Design Framework

Figure 2.1 shows a schema of the cascading form of model components for a typical activity-based microsimulation travel demand forecasting system. Although the sketch is from a recently developed model for a mega-region, it is representative of many models developed in the United States to address “green” policies that aim at motivating people to adopt strategies that decrease fuel consumption and mobile source emissions. The set of blocks on the left side represents groups of models that are designed for the first year (baseline) of the simulation that usually corresponds to a baseline year used in a regional or statewide transportation plan. Each block of the figure represents a set of techniques and statistical models developed to replicate the resident population activity and activity-travel decision making in a region.

[Figure 2.1: A schema of a typical model. The figure shows three blocks. Left, Baseline Year (t=1): Synthetic Population; Accessibility by Time-of-Day; Long Term Choices; Car Ownership and Type; Activity and Travel Scheduling; Routes & Assignment; Energy Consumption & Emissions. Middle, Agent and Environment Evolution: Population Evolution; Urban Landscape Evolution; Infrastructural Changes; Scenario Databases; Information Fusion; Accessibility Computation. Right, Simulation Year (t=t+1): the same seven components as the baseline year.]

In particular, the entire first set of models on the left side of Figure 2.1 recreates the resident population and attributes a daily activity-travel schedule to each individual in the population. The individual activity-travel schedules are then translated into vehicular travel patterns and assigned to the network to compute energy consumption and emissions for a baseline year. The middle block evolves the region’s economic and demographic landscape over a pre-specified time increment. This computerized evolution is done using land use simulation models, an evolutionary engine of households and persons, and data fusion of information from cities and subareas within cities to build scenarios of macro changes. The right-hand-side block of models repeats the daily activity and travel pattern models for the next and all subsequent years of the simulation. The key difference, however, is that the synthetic population does not need to be regenerated in the same fashion as for the baseline year; instead, it is used to guide the demographic microsimulation of the middle block.

The model system above can represent the activity-travel impacts of a whole host of land use and transportation policies. For instance, land use policies of increased density and land use mix can be reflected in shifts in spatial distribution of economic activity (middle block), location decisions, car ownership and use, and activity participation and destination choices (including decisions to participate in activities and to travel alone or with others) (right block). But before being able to do so, we need to estimate and validate the model components represented in Figure 2.1, which leads to the issue of data needs, as described below.
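The control flow described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`synthesize_population`, `evolve`, `run_daily_models`, `simulate`) and the toy person records are hypothetical, and each stub stands in for an entire block of Figure 2.1. The point it illustrates is that the population is synthesized once, for the baseline year, and is then evolved rather than re-synthesized in later years.

```python
def synthesize_population(n_persons):
    """Baseline block stand-in: create toy person records (hypothetical)."""
    return [{"id": i, "age": 20 + i} for i in range(n_persons)]


def evolve(population):
    """Middle block stand-in: age every person by one year.

    A real system would also handle births, deaths, migration,
    land use change, and infrastructure change.
    """
    return [{**person, "age": person["age"] + 1} for person in population]


def run_daily_models(population, year):
    """Left/right block stand-in: summarize instead of simulating schedules."""
    mean_age = sum(p["age"] for p in population) / len(population)
    return {"year": year, "persons": len(population), "mean_age": mean_age}


def simulate(base_year, n_years, n_persons):
    """Run the baseline year, then evolve and re-run for each later year."""
    population = synthesize_population(n_persons)   # done once, at t = 1
    results = [run_daily_models(population, base_year)]
    for year in range(base_year + 1, base_year + n_years + 1):
        population = evolve(population)             # evolved, not re-synthesized
        results.append(run_daily_models(population, year))
    return results


results = simulate(base_year=2010, n_years=3, n_persons=4)
for row in results:
    print(row)
```

In a production system each stub would be replaced by the corresponding group of models, and feedback between blocks (for example, from network assignment back to accessibility) would be added inside the yearly loop.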

2.2.1. Population Synthesis

The process starts with synthetically generating/recreating the entire resident population person-by-person and household-by-household (see http://urbanmodel.asu.edu/popgen.html). The input to this software and block of methods is the spatial organization of the simulated area in the form of zone-specific target univariate distributions of resident person and household characteristics, as provided by agencies that track the demographic characteristics of a region. As the intent of population synthesis is to recreate the population on a person-by-person and household-by-household basis, the target univariate distributions are used as the control totals for each spatial unit of analysis in an iterative algorithm that starts from a multivariate set of relationships (in essence a cross-tabulation) among the person and household variables used as seed information. More specifically, there are two types of data needed (Pendyala et al., 2011): (a) one or more contingency tables (cross-classification) from micro-data to capture the relationships among variables at the household and person levels (these data provide the seed information) and (b) univariate distributions that the synthetic population should satisfy to represent the resident population in each geographical subdivision, such as a block, a block group, a traffic analysis zone, or a tract (these data serve as the target).
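One common way to reconcile seed information with zone-level targets is iterative proportional fitting (IPF). The sketch below is a minimal two-variable illustration and should not be read as the procedure of the cited software, which may use a more elaborate algorithm that also balances person-level and household-level controls. The seed table, the target margins, and the function name `ipf` are all hypothetical.

```python
def ipf(seed, row_targets, col_targets, tol=1e-9, max_iter=1000):
    """Scale a 2-D seed table until its margins match the target totals."""
    table = [row[:] for row in seed]              # work on a copy of the seed
    for _ in range(max_iter):
        # Row step: rescale each row to its target total.
        for i, target in enumerate(row_targets):
            total = sum(table[i])
            if total > 0:
                table[i] = [cell * target / total for cell in table[i]]
        # Column step: rescale each column to its target total.
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in table)
            if total > 0:
                for row in table:
                    row[j] *= target / total
        # Columns now match, so convergence is judged on the row margins.
        gap = max(abs(sum(table[i]) - t) for i, t in enumerate(row_targets))
        if gap < tol:
            break
    return table


# Hypothetical seed cross-tabulation from survey micro-data:
# rows = household size (1, 2, 3+), columns = vehicles owned (0, 1, 2+).
seed = [[30.0, 50.0, 20.0],
        [10.0, 60.0, 50.0],
        [5.0, 40.0, 80.0]]
size_targets = [100.0, 150.0, 250.0]      # zone control totals by household size
vehicle_targets = [80.0, 220.0, 200.0]    # zone control totals by vehicles owned

fitted = ipf(seed, size_targets, vehicle_targets)
for row in fitted:
    print([round(cell, 1) for cell in row])
```

The fitted cells keep the association pattern of the seed while reproducing both sets of control totals. Drawing whole households from the seed sample in proportion to these cell values would then yield a synthetic population for the zone.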

2.2.2. Accessibility by Time of Day

To represent employment opportunities and the spatio-temporal distribution of activity participation opportunities and the desirability of activity locations, opportunity-based accessibility indicators are developed at fine spatial resolutions. In this way, we represent the ease (or difficulty) of reaching different types of industries (representing the opportunities for activity participation) from each geographical location within a pre-specified travel time (or generalized cost-time buffer). In a recent application called SimAGENT, we used 10, 20, and 50 minutes of roadway travel buffers from each of the 203,000 micro-zones and computed the number of industry-specific employees that can be reached (Chen et al., 2011). To reflect the time-of-day variation in accessibility, different values are obtained for the morning peak period (6–9 am), midday (9 am to 3 pm), evening peak period (3–7 pm), and at night (7 pm to 6 am) capturing not only the different roadway conditions, but also the patterns of opening and closing of businesses during the day. Data needed for this block include (a) indicators of industry-specific presence (e.g., number of firms by industry and number of employees, number of firms by industry type and floor space, number of employees in a geographical area by industry type and associated activity); (b) travel time and travel cost among all pairs of origins and destinations by time of day; and (c) availability of firms or groups of firms by time of day (or proxies of this availability).
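The buffer-based indicator described above amounts to a cumulative count of opportunities. The following sketch is an illustration only: the three zones, the travel times, the employee counts, and the `open_share` proxy for business opening hours are invented, and `accessibility` is a hypothetical function name rather than part of SimAGENT.

```python
BUFFERS_MIN = (10, 20, 50)   # travel time buffers named in the text

# Hypothetical inputs.
# travel_time[period][(origin, destination)] -> roadway minutes
travel_time = {
    "am_peak": {("A", "A"): 5, ("A", "B"): 25, ("A", "C"): 45},
    "night":   {("A", "A"): 4, ("A", "B"): 15, ("A", "C"): 30},
}
# employees[zone][industry] -> number of employees
employees = {
    "A": {"retail": 100},
    "B": {"retail": 200},
    "C": {"retail": 400},
}
# open_share[period][industry] -> share of firms open (availability proxy)
open_share = {
    "am_peak": {"retail": 0.5},
    "night":   {"retail": 0.1},
}


def accessibility(travel_time, employees, open_share, buffers=BUFFERS_MIN):
    """Return {(origin, period, buffer, industry): reachable employees}."""
    result = {}
    for period, times in travel_time.items():
        for (origin, destination), minutes in times.items():
            for buffer in buffers:
                if minutes > buffer:
                    continue                      # outside this buffer
                for industry, count in employees[destination].items():
                    key = (origin, period, buffer, industry)
                    available = count * open_share[period][industry]
                    result[key] = result.get(key, 0.0) + available
    return result


indicators = accessibility(travel_time, employees, open_share)
for key in sorted(indicators):
    print(key, round(indicators[key], 1))
```

Because both the travel times and the availability shares are indexed by period, the same origin can score very differently in the morning peak and at night, which is the time-of-day variation the indicators are meant to capture.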

2.2.3. Long-Term Choices

In a schema such as the one in Figure 2.1, the residential location of a household is generally established by the population synthesis process for the base year (the left side of the figure). However, for subsequent model years and to evolve the population over time or for locating households in individual parcels or building units, a residential location/relocation model is also needed. Conditional on home locations, other long-term decisions are then modeled in this “long-term choices” block. In particular, every student (whether full time or part time) needs a school location and every worker needs a work location. These locations often serve as anchors (outside home) around which people engage in discretionary activities and travel. School location choice models are often difficult to simulate. Many children study at neighborhood schools, but a substantial proportion also go to private or charter schools that are located farther away from home. Adults may choose to attend different universities and colleges in the metropolitan area. For this reason additional models are developed at the person level. For example, when we examine persons in college, a model is used to assign a college location that is a hierarchical function of accessibility. In addition to standard demographic characteristics and measures of accessibility, information about school enrolment and school quality would be useful for modeling school location choice. It is generally possible to obtain information about school locations and enrolment. However, it is more difficult to get information about school quality, one of the key predictors of school location choice.

Workers are identified using a labor force participation model that is a function of age, gender, education, and presence of children in the household. Employed persons are then assigned using probabilistic choice models to their type of industry, work location (which is also a function of accessibility), weekly work duration, and work flexibility. Each individual is also assigned driver’s license status depending on age, gender, and race. Using these characteristics, household income is computed as a function of race, presence of elderly individuals, education level of members of households, and employment industry of workers in the household. Then, a residential tenure model (own or rent) follows with a housing type model to assign each household to a single-family detached, single-family attached, apartment, or mobile home or trailer type of residence.
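The chain of probabilistic assignments described in this section can be illustrated with a binary logit for labor force participation followed by a multinomial logit for work location. This is a sketch under stated assumptions: every coefficient, zone name, and accessibility value below is invented for illustration, and the function names are hypothetical rather than taken from the authors' model system.

```python
import math
import random


def participation_probability(person):
    """Binary logit for labor force participation (made-up coefficients)."""
    utility = (
        1.5
        - 0.03 * abs(person["age"] - 40)
        - 0.3 * person["female"]
        + 0.4 * person["college_degree"]
        - 0.2 * person["children_in_household"]
    )
    return 1.0 / (1.0 + math.exp(-utility))


def location_probabilities(accessibility, beta=0.8):
    """Multinomial logit over zones: P(z) = exp(beta*a_z) / sum_j exp(beta*a_j)."""
    top = max(accessibility.values())             # shift for numerical stability
    weights = {z: math.exp(beta * (a - top)) for z, a in accessibility.items()}
    total = sum(weights.values())
    return {z: w / total for z, w in weights.items()}


def draw(probabilities, rng):
    """Monte Carlo draw of one alternative from a probability table."""
    threshold = rng.random()
    cumulative = 0.0
    for alternative, p in probabilities.items():
        cumulative += p
        if threshold < cumulative:
            return alternative
    return alternative                            # guards against round-off


rng = random.Random(42)                           # fixed seed for repeatability
person = {"age": 35, "female": 1, "college_degree": 1, "children_in_household": 1}
# Hypothetical accessibility scores, e.g., the log of reachable jobs.
zone_accessibility = {"zone_1": 2.0, "zone_2": 1.0, "zone_3": 0.5}

p_work = participation_probability(person)
is_worker = rng.random() < p_work                 # step 1: worker or not
shares = location_probabilities(zone_accessibility)
work_zone = draw(shares, rng) if is_worker else None   # step 2: work location
print(round(p_work, 3), is_worker, work_zone)
```

The same draw-from-probabilities pattern would be repeated for industry, work duration, work flexibility, license holding, tenure, and housing type, each conditional on the attributes assigned in earlier steps.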

2.2.4. Car Ownership and Type

Policy analysis is expected to provide estimates of greenhouse gas emissions and energy consumption. This motivates the inclusion of model components that are capable of explicitly modeling vehicle fleet composition and usage, and the allocation of vehicles to primary drivers in the household. This set of models may also be viewed as longer- or medium-term choices that households make, as opposed to the short-term choices that are made in the context of daily activity-travel engagement. This type of model in essence determines the predicted non-commercial regional vehicle fleet mix that is used as input to the emission estimation software. This is also particularly important because of the expected market penetration of ‘‘new’’ technology fuel-efficient vehicles (such as electric cars) and the incentive programs
created at the state and federal levels in the United States to promote such vehicles. A model system like this can be used to assess different incentive structures promoting environmentally friendly technologies in cars. One of the inhibitors in building car ownership, car body type, car vintage, and make models is the existence of many possible alternatives based on the combinations of the different body types, vintages, and makes. Further, a household may own more than one vehicle. A very good option is to use a multiple discrete continuous extreme value (MDCEV) model capable of simulating the entire fleet of vehicles in a household and the annual mileage that each vehicle is driven or used. The MDCEV model extends the classic multinomial logit model in innovative ways to accommodate the multiple discreteness in the choice process and simultaneously predict the total annual mileage that a vehicle would be driven. This annual mileage may be treated as a general mileage budget that guides the use of a vehicle on a day-to-day basis (for example, a household may choose to use a special collector’s edition car only sparingly while using the family van for everyday chores). As indicated earlier, vehicle types are defined by body type, fuel type, and vintage of the car. Within each of the body and fuel types, the MDCEV model is capable of simulating the exact make and model of the vehicle, thus providing detailed information regarding the vehicle fleet in the population. By tracking individual vehicles throughout the course of a day, a complete trajectory can be built and this information can then be used to (a) estimate energy consumption and emission models and (b) offer the baseline data for ecodriving advisory programs. Following (or in combination with) the development of a vehicle fleet composition model, it is desirable to allocate each vehicle in the fleet to a primary driver in the household (see Vyas et al., 2011). 
This facilitates a ‘‘higher level’’ allocation of vehicles to drivers with the idea that drivers generally use the vehicle(s) allocated to them, particularly when undertaking drive alone trips. When undertaking joint activities and trips, it is possible that a tour level vehicle choice is exercised and subsequent model components can effectively simulate such processes. However, the choice of vehicle will be influenced by the ‘‘higher level’’ allocation of vehicles to primary drivers. For this reason, questions about vehicle allocation are very useful. At this point of the simulation cascade on the left side of Figure 2.1, the model system produces the spatial distribution of all the residents by different social and demographic levels (including race) as well as employment and school locations assigned to each person. In addition, each household is assigned to a housing type. This resembles a complete census of the resident population and can be done at any level of spatial aggregation. One could also draw samples from this population or proceed to the next step using 100% of the simulated residents. This is particularly convenient when there is a whole range of scenarios or policies to examine. In this case, a sample can be used to quickly narrow down to the ones to study in more detail, and these can be taken through the 100% of simulated residents. It is also possible to focus on a specific subarea (e.g., a city) and perform detailed analysis and modeling while keeping the rest of the region as an evolving background. Thus, the data needs will depend on the spatial resolution and the specific use of the model.

2.2.5. Daily Activity Schedules and Travel Choices

For each synthetically generated household and person within each household, daily activity and travel patterns are created in this block of models. While there are a variety of model systems that have been implemented or developed in the activity-based modeling field, two families of models appear to have emerged. One family of models, which may be labeled as tour-based models (see Bradley, Bowman, & Griesenbeck, 2010; Goulias, 2007; Vovsha, Bradley, & Bowman, 2005), focuses on the generation of tours and the trips that comprise the tours, as well as the tour/trip characteristics. These models have been successfully implemented in practice and provide a robust framework capable of accounting for interdependencies among trips. These frameworks can be expanded to account for interactions among household members, although such extensions have not been very common. The tour-based models in practice are largely composed of a series of multinomial logit and nested logit models that are strung together to form a long chain of models, so that logsums (i.e., the logarithm of the sum of exponentiated utility functions) from one set of logit models can feed into another choice model specification at a higher level in the hierarchy. This structure provides the ability to account for accessibility impacts on tour generation, destination choice, mode choice, and time of day choice. Time of day is treated as a discrete choice with the day being split into one-hour, 30-minute, or 15-minute slices. By introducing logsum terms into the model specifications of higher level models, the tour-based modeling paradigm is able to account for key behavioral phenomena of interest. Improvements in accessibility (as represented by changes in logsum values) will impact destination choice and tour generation, not to mention time of day choice and mode choice.
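As a minimal illustration of how a logsum carries accessibility from a lower level model to a higher level one, the sketch below computes the logsum of mode utilities for each destination and feeds it into a destination choice logit. The zone names, utilities, size terms, and the logsum coefficient `theta` are hypothetical values chosen for illustration, and the function names are our own; none of them are taken from the model systems cited above.

```python
import math


def logit_probabilities(utilities):
    """Multinomial logit probabilities for a dict of alternative -> utility."""
    max_u = max(utilities.values())  # subtract the maximum for numerical stability
    exp_u = {alt: math.exp(u - max_u) for alt, u in utilities.items()}
    total = sum(exp_u.values())
    return {alt: value / total for alt, value in exp_u.items()}


def logsum(utilities):
    """Logarithm of the sum of exponentiated utilities."""
    max_u = max(utilities.values())
    return max_u + math.log(sum(math.exp(u - max_u) for u in utilities.values()))


def destination_probabilities(mode_utilities, size_term, theta):
    """Destination choice logit in which each zone's utility includes its mode choice logsum."""
    destination_utilities = {
        zone: size_term[zone] + theta * logsum(utils)
        for zone, utils in mode_utilities.items()
    }
    return logit_probabilities(destination_utilities)


# Hypothetical mode utilities from one origin to two destination zones.
mode_utilities = {
    "zone_A": {"car": -1.0, "transit": -2.0, "walk": -3.5},
    "zone_B": {"car": -1.5, "transit": -1.2, "walk": -5.0},
}
size_term = {"zone_A": 0.2, "zone_B": 0.5}  # hypothetical attractiveness terms
theta = 0.7  # hypothetical logsum coefficient

base = destination_probabilities(mode_utilities, size_term, theta)

# Improve transit service to zone_A and recompute: its logsum, and hence its share, rises.
improved_utilities = {
    "zone_A": {"car": -1.0, "transit": -1.0, "walk": -3.5},
    "zone_B": mode_utilities["zone_B"],
}
improved = destination_probabilities(improved_utilities, size_term, theta)
```

In this sketch, an improvement in one mode to one zone changes the destination shares without any change to the destination model itself, which is the mechanism the text describes.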
While the tour-based models provide a powerful practical framework for modeling daily activity-travel patterns, they are somewhat restrictive in their ability to simulate the emergent nature of activity-travel demand. Pendyala et al. (2011) review some of their limitations in detail. We favor continuous time activity-based model systems in which time plays an all-encompassing role, activity durations and time use are explicitly modeled, and spur-of-the-moment activity generation can be explicitly represented. Figure 2.2 constitutes a simplified representation of a worker’s daily activity-travel pattern and also illustrates the temporal resolution that survey data should provide. In this example, simulation starts at 3 am on the first day and proceeds for 24 hours until 3 am on the second day. Within this framework, the continuous time model will first determine the fixed activity pegs (geographical anchors) around which discretionary activities and travel must be undertaken. The fixed activity pegs typically correspond to work activities, the timing of which is modeled in an earlier step based on the work-related characteristics (work duration, flexibility, etc.) predicted in the long-term choice module. In the most recent versions of the continuous-time activity system for a large mega-region in the United States, the fixed activity pegs include joint activities undertaken by individuals in the household (see Bhat et al., 2011). The activity pegs determine the daily time-space diagram of a person’s path, based on the notion of a time-space prism (Hägerstrand, 1970, 1989).

[Figure 2.2 diagram: timeline from 3 a.m. on day d to 3 a.m. on day d+1 for a worker, showing home-stay durations, a before-work tour, the home-work commute, work-stay durations (marked as temporal fixity) separated by a work-based tour, the work-home commute, and an after-work tour, with stops S1 to S6 and the associated departure and arrival events.]

Figure 2.2: Representation of a worker’s daily activity-travel pattern. See also Bhat, Guo, Srinivasan, and Sivakumar (2004).

The time periods corresponding to the fixed activity pegs are locked and do not constitute open time-space prisms in which individuals can pursue discretionary activities and travel. After higher level models establish the locked periods and the open time-space prisms, the simulator generates discretionary activities within open time-space prisms along the course of a day. History dependence is incorporated into the model framework so that activities in the latter part of the day are influenced by activities undertaken earlier in the day. For example, if a person has done two shopping episodes in the earlier part of the day, it is less likely that this person will undertake yet another shopping activity toward the end of the day. In each open prism, an individual can make multiple stops outside home, or simply choose to stay home until travel must be undertaken to the next fixed activity. The number of stops undertaken in each open time-space prism is dependent on the time available in the prism (the size of the prism), which is in turn influenced by the network accessibility and level of service measures. It should be noted that the figure is a rather simplified representation of a pattern. It is technically feasible to have any number of tours or trip chains within an open time-space prism, and within each tour, it is possible to have any number of stops. For a non-worker, Figure 2.3 offers another example of a daily activity-travel pattern. In the case of non-workers, the degrees of freedom are greater as there are only home and joint activities (including serve passenger trips) as the fixed pegs around which discretionary activities and travel must be undertaken.
In particular, for both workers and non-workers, it is necessary to first establish household responsibilities and child dependencies that may constrain certain periods of the day and limit the flexibility in scheduling and undertaking non-work activities and trips.
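The separation of a simulated day into locked periods and open windows can be sketched as follows. This is a simplified illustration under stated assumptions: the day runs from 3 a.m. to 3 a.m. at one-minute resolution, locked periods are given as (start, end) minute pairs, and the travel time needed to reach each fixed peg (which shrinks the usable prism) is ignored. The function names and the example work schedule are hypothetical.

```python
DAY_MINUTES = 1440   # one simulated day at one-minute resolution
DAY_START_HOUR = 3   # the simulated day runs from 3 a.m. to 3 a.m.


def to_sim_minute(hour, minute=0):
    """Convert a clock time to minutes elapsed since 3 a.m."""
    return ((hour - DAY_START_HOUR) % 24) * 60 + minute


def open_windows(locked_periods):
    """Return the (start, end) windows of the day not covered by any locked period."""
    windows = []
    cursor = 0
    for start, end in sorted(locked_periods):
        if start > cursor:
            windows.append((cursor, start))
        cursor = max(cursor, end)  # tolerate overlapping locked periods
    if cursor < DAY_MINUTES:
        windows.append((cursor, DAY_MINUTES))
    return windows


# Hypothetical worker: at work 8:30-12:00 and 13:00-17:00.
work_periods = [
    (to_sim_minute(8, 30), to_sim_minute(12)),
    (to_sim_minute(13), to_sim_minute(17)),
]
windows = open_windows(work_periods)
# Three windows result: before work, the midday break, and after work.
```

A fuller implementation would subtract the commute and other required travel from each window before generating discretionary stops, since the size of the prism depends on network level of service.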

[Figure 2.3 diagram: timeline from 3 a.m. on day d to 3 a.m. on day d+1 for a non-worker, showing the morning home-stay duration, the 1st through Mth tours with stops S1, S2, ..., SK-1, SK, the home-stay durations before each tour, the return-home episodes, and the last home-stay duration.]

Figure 2.3: Representation of a non-worker’s daily activity-travel pattern. See also Bhat et al. (2004).

In the case of non-workers, the simulation process then proceeds through the successive generation of tours with one or more stops in each tour. Individuals can return home for temporary stays during the course of the day. Within each tour, an unlimited number of stops can be made for various purposes, each with a destination, mode, and duration attached to it. The above framework is a hybrid of multiple elements and brings together the best of the behavioral paradigms in the activity-based travel modeling arena. On the one hand, household and individual level day-patterns are imposed such that certain periods of the day that need to be dedicated to activities that are fixed in space and time are set aside. On the other hand, the model framework accommodates the emergent nature of activity engagement and travel demand with the possibility for discretionary activities and travel to be undertaken on the fly. The examples here can be implemented at the level of the traffic analysis zone (TAZ), and this is the preferred starting point of model implementation in the United States because many regions already have an operational four-step model with TAZs populated with data. However, an implementation that treats space in as disaggregate a manner as possible (similar to the continuous disaggregate treatment of time) is ideal. This puts more pressure on land use data and network fidelity. These data are available and often of better quality at the TAZ level, which is the level that many operational models use. Over time, however, network fidelity is getting better, land use models are becoming more detailed, and land use data reliability at small spatial scales is improving. As a consequence, the spatial resolution of the activity-based model system is
increasingly becoming more disaggregate. In fact, the spatial unit for activity-based model systems is moving to the census block group, block, or even individual activity locations (parcels) and points on a network. It is also important to note that model systems are being designed to operate at multiple scales, enabling developers to use data from different scales depending on data supply and methods to convert data from one scale to another (downscaling or upscaling). This discussion is clearly pointing to the need for surveys to record all locations visited using longitude and latitude as well as verified addresses, while time should be measured at the finest possible level. At this stage of the model system, the output contains more complete data than an activity survey diary database (because it recreates the activity-travel patterns of the entire population of a region). This is an important consideration for verification and validation, since the simulation produces more information than is currently available from other sources to use as the gold standard or benchmark. Typical external data used to verify the output are commute statistics among subregions, other local surveys, or counts of persons at specific locations by time of day.

2.2.6. Routes, Assignment to Networks, and Emissions

The output of a continuous time activity-based travel model system such as the one just discussed can be used in many different ways. For example, policy scenarios may be tested and their impact assessed by examining the timing decisions of individuals (e.g., advancing or postponing the start of trips). The model can also be coupled with other algorithms that undertake traffic assignment, producing estimates of the number of cars on a network and/or routes chosen by vehicles from an origin to a destination. Continuous time activity-based models may also be interfaced with algorithms that are able to produce point-by-point stops in the daily activity and travel paths of simulated persons. While this approach to assignment is appealing from the standpoint of predicting each individual’s actual path through the network, it should also be noted that the current state-of-the-art in activity-based travel demand models covers only personal passenger travel; it does not cover freight travel, travel by non-residents of the region, tourist and visitor travel, and possibly some special generator travel such as airport travel. Some applications in this direction exist but are not widely used yet. For the above reason, post-processing routines typically convert the activity-travel patterns into origin-destination trip tables and combine these trips with origin-destination trip tables of the other types of trips (obtained using existing classical processes) prior to feeding the entire travel demand into traffic assignment processes. This is the procedure followed when a static traffic assignment process is used. If dynamic traffic assignment procedures are used, then it may make sense to retain the detailed trip records coming out of the activity-based travel model and feed individual trips or tours to the dynamic assignment model.
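The post-processing step for static assignment can be sketched as a simple aggregation. This is a minimal sketch, assuming each simulated trip record carries an origin and a destination zone and counts as one vehicle trip; the record layout, zone numbers, and the freight table are hypothetical, and a real routine would also split demand by time period and mode.

```python
from collections import Counter


def trips_to_od_table(trips):
    """Count simulated trip records by (origin zone, destination zone)."""
    table = Counter()
    for trip in trips:
        table[(trip["origin"], trip["destination"])] += 1
    return table


def combine_od_tables(*tables):
    """Add several origin-destination tables cell by cell."""
    total = Counter()
    for table in tables:
        total.update(table)  # Counter.update adds counts rather than replacing them
    return total


# Hypothetical trip records produced by the activity-based model.
simulated_trips = [
    {"origin": 1, "destination": 2},
    {"origin": 1, "destination": 2},
    {"origin": 2, "destination": 1},
]
# Hypothetical freight table obtained from a separate, conventional process.
freight_table = Counter({(1, 2): 5, (3, 1): 4})

passenger_table = trips_to_od_table(simulated_trips)
total_demand = combine_od_tables(passenger_table, freight_table)
```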
Origin-destination trip tables of the other types of trips can be fed to the dynamic assignment model in the usual way; however, routines need to be set up to ensure that there is consistency in how trips from the activity-based model and trips from the other origin-destination trip tables are routed through the network along the continuous time axis. Similar processing is also required when using traffic microsimulation techniques (such as the ones embedded in TRANSIMS and MATSim), with the key difference of converting all travel into travel plans to mimic activity and travel schedules of people. One example of this type of application is TASHA in Toronto (Hao, Hatzopoulou, & Miller, 2010). Energy consumption and emission estimates still rely (even in advanced models) on averaged vehicle characteristics, average link operating speeds, and other approximations. Modeling advances are rapidly being made toward developing emission speed profiles by different types of vehicles (Barth & Boriboonsomsin, 2008) that, in turn, can be used to develop daily emission profiles for each vehicle and driver in a simulated activity-travel schedule at fine temporal and spatial resolutions.

2.3. Agent and Environment Evolution

The application of an activity-based travel model system requires the generation of a synthetic population for the entire region. The activity-travel patterns of the individuals in the synthetic population are simulated using the activity-based travel model. In this regard, there is considerable interest in being able to forecast the characteristics of the population in the future through the evolution of a base year population. The base year population would be taken through a series of models that mimic the life cycle processes of households and individuals (the agents). These include household emigration and immigration, entry into and exit from the labor force, aging, birth and death, marriage and divorce, household formation and dissolution, entry into and exit from school, acquisition of a driver’s license, and any other socioeconomic phenomena that characterize the evolution of the population over time. There are thus two ways to generate a future year synthetic population that also define the type of data needs we face. Marginal distributions on control variables of interest for a future year may be obtained from a land use model, and these future year marginal distributions may be used to generate a synthetic population for the future year in a manner that is similar to that in the base year. Alternatively, the base year synthetic population may be aged and evolved through a series of life cycle stage transition models on an annual time step to better replicate the demographic processes at play in the region. Life cycle stage transitions are in essence the simulation of turning points in the life of individuals. The population synthesis process within the activity-based model system should then be enhanced to include a series of life cycle evolution models that would provide a future year population.
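The second approach, evolving the base year population on an annual time step, can be sketched as below. This is a deliberately reduced illustration covering only three of the processes listed above (death, aging, and driver's license acquisition). The person record layout, the `mortality_rate` function, the `license_rate` value, and the minimum licensing age of 16 are all hypothetical assumptions; an operational system would estimate each transition model from data and would also handle births, migration, and household formation and dissolution.

```python
import random


def evolve_one_year(persons, mortality_rate, license_rate, rng):
    """Apply one annual step of death, aging, and license acquisition."""
    survivors = []
    for person in persons:
        if rng.random() < mortality_rate(person["age"]):
            continue  # death: the person leaves the population
        aged = dict(person, age=person["age"] + 1)  # copy so the input is not mutated
        eligible = not aged["licensed"] and aged["age"] >= 16
        if eligible and rng.random() < license_rate:
            aged["licensed"] = True
        survivors.append(aged)
    return survivors


def evolve(persons, years, mortality_rate, license_rate, seed=0):
    """Evolve a base year population through a number of annual steps."""
    rng = random.Random(seed)  # seeded so that a run can be reproduced
    for _ in range(years):
        persons = evolve_one_year(persons, mortality_rate, license_rate, rng)
    return persons


# Hypothetical base year persons.
base_year = [
    {"age": 15, "licensed": False},
    {"age": 40, "licensed": True},
]

# Extreme rates make the mechanics easy to see: nobody dies, everyone eligible gets a license.
future = evolve(base_year, years=5, mortality_rate=lambda age: 0.0, license_rate=1.0)
```

Each simulated year is a snapshot, so the path the population follows between the base year and the horizon year is retained, not only the end state.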
An evolution model system should also include processes to explain regional vehicle fleet evolution and predict the future year vehicle fleet as well as the path this fleet follows. In fact, just as a population of households and individuals evolves over time, so does a fleet or population of vehicles. New vehicles are acquired, old vehicles are scrapped, and some vehicles are simply swapped, traded, or replaced. It would be desirable to implement a series of vehicle transition or transaction models that allow one to evolve the vehicle fleet in response to changes in socioeconomic characteristics of the household, the available vehicle types and technologies, the regulatory policy environment, incentives and tax policies, costs of acquiring and maintaining vehicles, and vehicle attributes. In this way, vehicle fleet composition and usage levels can be forecast for any horizon year. The vehicle evolution models should include a suite of components capable of simulating the acquisition, disposal, and replacement of vehicles on an annual cycle. The environment of firms and establishments (and, therefore, the opportunity space for activity participations of individuals and households) also changes, which, in turn, affects the behavior of individuals and households. One way to capture the evolving nature of the environment of firms and establishments is to adopt similar techniques as for household evolution. We can consider every establishment or firm and take it through transitions of creation, location choice, dissolution, movement to a different location, or any other type of morphing. This is known as firmographics and it is used to some extent in microsimulation-based land use models. The task of evolving this aspect of the environment is usually in the domain of land use models and their integration with travel demand models. There is yet another evolutionary aspect of the environment that is surprisingly somewhat neglected in modeling and simulation. A fundamental piece of information we use in travel behavior is travel impedance from one point to another, which is a function of the transportation infrastructure. Changes in the nature of the characteristics of the infrastructure have an impact on impedance and thus on behavior. For instance, schedules of transit and availability of lanes on a highway change over time (e.g., weekday vs. weekend, winter vs. summer) and they also change from one year to the next (e.g., addition and elimination of routes, opening of a new lane, conversion of lanes to HOV/HOT).
However, most models do not consider such varying impedance levels, and do not collect information on travel behavior characteristics in the context of the impedance characteristics being experienced. The end result is differences in infrastructure data vintage and survey data vintage, and the resulting obvious potential threat to validity when one combines the two data sources to estimate behavioral models. In this context, it may be desirable to have evolutionary models of infrastructure that provide yearly snapshots to parallel all the other models discussed above.

2.4. The Units of Analysis

The unit of analysis for an activity-based travel model is the individual, as individual activity-travel patterns are simulated through the time-space domain. However, it is necessary to recognize that individual activity-travel patterns are influenced by household interactions and child dependencies. To account for and model these interactions, the household is also a unit of analysis and considered as the behavioral entity (decision making unit). Outcomes are determined and reported at both levels (household and household member). At any point in the simulation, one should be able to collectively report the activity-travel engagement patterns of all individuals in a household and thus obtain a more holistic perspective on the household situation. Different models are estimated at appropriate levels to reflect the behavioral nature of the phenomenon under study. Work location and school location choice models are estimated at the level of the individual person. Vehicle fleet composition and usage models are estimated at the household level. However, there are additional levels within the person and household levels of decision making. Mode choice models are estimated at the tour and trip (segment) level. Destination choice models are estimated at the individual stop/trip level. Household-level joint activity engagement and allocation patterns may be estimated using the MDCEV model at the household level (Bhat et al., 2011). Activity type choice and generation models are, however, estimated at the individual activity level for activities generated on the fly. Activity duration models are estimated at the individual activity level as well. Vehicle type choice models are estimated at the tour level as one generally needs to retain the same vehicle through the completion of a tour. All this points to the need to have data on activity locations, activities, stops, trips, tours, vehicles used for each movement, persons that act, and companionship in activities and travel as locations, stops, trips, and activities. As mentioned earlier, time is to be treated as a continuous entity, while space, although it may continue to be treated in a more aggregate fashion with existing data, is preferably recorded as the longitude and latitude of every location visited during a diary period. Continuous time is generally approximated using a one-minute resolution, thus creating a total of 1440 possible time slices per day. Appropriate definitions should also be provided to remove any ambiguity regarding the interpretation or representation of a tour in contrast to a trip.
A tour is best viewed as a series of trips with the origin of the first trip and the destination of the last trip being identical. In other words, a closed chain with intermediate stops constitutes a tour. There have been some alternative definitions for tours/chains, but the closed chain definition is likely to serve the activity-based travel model development effort well as one can account for all interdependencies across all trips that are linked together in some way.
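The closed chain definition translates directly into a rule for grouping diary trips into tours. The sketch below is one possible reading of that definition, assuming trips are recorded in chronological order as (origin, destination) pairs with consistent location labels; the function name and the example day are hypothetical. A chain is closed as soon as a trip ends where the chain began, so a work-based sub-tour stays inside its enclosing home-based tour and is not split out separately.

```python
def split_into_tours(trips):
    """Group chronologically ordered (origin, destination) trips into closed chains.

    Returns the list of completed tours and any trailing trips that never
    returned to the location where their chain started.
    """
    tours = []
    current = []
    for origin, destination in trips:
        current.append((origin, destination))
        if destination == current[0][0]:  # back at the origin of the first trip
            tours.append(current)
            current = []
    return tours, current


# Hypothetical diary day: a work tour with a shopping stop, then a short evening tour.
day = [
    ("home", "work"),
    ("work", "shop"),
    ("shop", "home"),
    ("home", "gym"),
    ("gym", "home"),
]
tours, unclosed = split_into_tours(day)
```

Returning the unclosed remainder separately makes incomplete diaries visible (for example, a respondent who ends the diary day away from home), which is useful when checking survey data against the tour definition.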

2.5. Total Design Data Needs

Figure 2.4 represents a somewhat ideal complete total data provision survey design scheme that we use as a benchmark for this paper. A design of this type (Goulias & Morrison, 2010; Pendyala, 2011) contains a main household survey (the Core) that collects the base data elements needed for an activity-based model system but also serves other simplified versions of travel demand forecasting. This is the main household, person, and base diary portion of Figure 2.4. Then, subsamples are randomly selected and invited to participate in more detailed surveys targeting a specific topic area. Using suitable statistical methods (e.g., to account for selectivity bias), the responses can then be expanded to the entire main household sample and those, in turn, can be expanded to the entire population using synthetic population methods when desired. Surrounding the main household and activity diary portion are a variety of ‘‘satellite’’ surveys that shed light on specific behavioral facets and provide data for modeling and simulation. The satellite surveys of Figure 2.4 are defined by a theme, but there is no theoretical underpinning requiring each satellite component to be a separate entity from the rest. For example, one of the satellite surveys of Figure 2.4 spans an entire week for a smaller portion of the sample. This core component could also be designed as a wave of a panel and/or a continuous survey design to capture additional behavioral dynamics that are not included in our large-scale models yet, but we know would enhance our predictive ability if made available. In this design, a set of complementary survey components may be added to provide more in-depth behavioral data (about behavioral facets addressing other model components) as well as data for verification and validation. This overall design aims at collecting in-depth data while minimizing the overall cost and time to implement the survey and also provides flexibility in designing relatively independent survey modules (satellites). Each satellite aims at a set of individual objectives and distributes survey burden to a different group of participants from the main survey, minimizing survey burden and fatigue as well as implementation costs. For these reasons we name this type of survey design a total data needs survey design that covers as many behavioral facets as possible to enable new generations of travel demand forecasting models. We proceed here with a brief description of possible satellite surveys (clockwise).

Figure 2.4: The data collection overall scheme.

2.5.1. One Week Activity (and Travel) Diary

In this component, households are recruited to participate in an entire week diary. This will enable the creation of models that account for day-to-day variation in activity scheduling and travel and attempt to identify shifting of tasks and activities from one day of the week to the next. The survey design will dictate the optimal timing of this component to maximize response rate and completion rate, and minimize any biases. It is important that a survey of this type benefit from a design that is able to capture the behavioral processes of scheduling activities, and planning and subsequent rescheduling modifications as in Auld, Mohammadian, and Doherty (2009).

2.5.2. In-Depth Car Ownership and Use

In this component, we envision the design of an in-depth survey to identify the determinants for each of the car ownership, car type (e.g., new/used, model, make, and fuel type), and car assignment decisions. In the car assignment data collection, both the primary and secondary drivers should be identified. Questions should also be created to identify determinants of changes in car ownership, type, and assignment of cars to household members. Particular emphasis should be given to policy controlled determinants (e.g., taxation, incentives). One approach to study this latter part is using combinations of revealed and stated preference surveys.

2.5.3. Location Choice and Activity Satisfaction

Destination choice in conventional models is treated as a naive selection among comparable objects. From environmental psychology, however, we know that places have symbolic and other meanings that travel behavior models neglect (see sense of place in Deutsch & Goulias, 2012). This component identifies how destinations are perceived and what role these perceptions play in their selection. It also aims at quantifying the contribution to subjective well-being experienced for each activity and travel episode (see, for example, Goulias, Ravulaparthy, Yoon, & Polydoropoulou, 2012). We would also suggest the design of a small-scale survey following the day reconstruction method (DRM). There are two key objectives for this component: (a) provide a benchmark for the diary instrument; and (b) create an assessment of activities (including trips) and subjective experiences that is able to capture preferences, satisfaction, and perceived quality of life. This second objective will enable estimation of choice models with latent variables and classes that are far richer and more informative than their observed-variable discrete choice counterparts.

2.5.4. Residence, Workplace, and School Location Choice

This is a critical survey component for behaviorally integrated land use travel demand models. We expect this component to be an in-depth survey to identify the determinants for each of the residential, workplace, and school choices (see Kortum, Paleti, Bhat, & Pendyala, 2012). Both primary locations and secondary locations should be examined in more detail than typical household surveys and data collected to estimate choice models for each facet.

2.5.5. Retrospective and Prospective Location Choices

Questions should also be created to identify determinants of change for each location examining behavior retrospectively and prospectively. Particular emphasis should be given to policy controlled determinants. This portion is shown in Figure 2.4 as a separate survey component because of the possible need to also add questions about personal biography of each household member using techniques that are not used by typical household surveys (e.g., ethnography).

2.5.6. Toll Willingness to Pay

In this supplemental survey, we envision identifying attitudes and willingness to pay for tolls on highways. The data in this component can be used to develop behavioral equations of the willingness to pay, which, in turn, enable the large-scale regional simulation models to develop pricing strategies (Bhat & Castelar, 2002; Bhat & Sardesai, 2006).
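As a minimal sketch of what such a behavioral equation might look like, assume a linear-in-parameters utility with one travel-time coefficient and one toll-cost coefficient. Under that assumption, the implied willingness to pay for travel-time savings is the ratio of the two coefficients. The function name and the coefficient values below are hypothetical and are not taken from Bhat and Castelar (2002) or Bhat and Sardesai (2006).

```python
def willingness_to_pay(beta_time_per_min: float, beta_cost_per_dollar: float) -> float:
    """Implied willingness to pay, in dollars per hour of travel time saved.

    Assumes utility U = beta_time * minutes + beta_cost * dollars + ...,
    so the marginal rate of substitution is beta_time / beta_cost.
    """
    if beta_cost_per_dollar == 0:
        raise ValueError("cost coefficient must be non-zero")
    return 60.0 * beta_time_per_min / beta_cost_per_dollar


# Hypothetical coefficients, chosen only for illustration.
wtp = willingness_to_pay(beta_time_per_min=-0.05, beta_cost_per_dollar=-0.25)
print(f"Implied willingness to pay: ${wtp:.2f} per hour saved")
```

In practice the coefficients would be estimated from the revealed and stated preference data collected in this component, and they may vary across travelers rather than being single fixed values.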

2.5.7. Long-Distance Travel

Travel models in mega regions and statewide applications also need to capture interregional and long-distance travel. Many of the trips in this class are business related, leisure related, or simply long commutes. Regional forecasting applications need data to estimate this type of trip making, but also need data to correlate long-distance travel and short-distance travel. This data component aims to accomplish exactly this objective, and to enable the study of trade-offs people make when they engage in travel that, for example, requires an overnight stay outside the home base. In addition, it is also desirable to study the relationship between land use and the propensity to make long-distance trips.

2.5.8. Expenditures and Budgeting Survey

Annual, monthly, or even weekly expenditures for activity participation, travel, vehicle and housing unit ownership and maintenance, and energy consumption are not collected in typical travel surveys. The most recent policy actions, however, require linking housing to transportation demand. In addition, efforts to develop more complete household greenhouse gas footprints will increase and add to the move toward developing models of comprehensive accounting of energy demand. This component will provide the data needed to enable a direct association between travel and at-home energy consumption to eventually create models of the type in Fissore et al. (2011).

2.5.9. In-Depth Mode Supplement and Active Living Questions

Mode choice is of paramount importance, particularly when linked to destination choice. This add-on survey collects information about detailed reasons for not using specific modes, including non-motorized modes for active living studies. The survey objective is to identify situational constraints, attitudes, and predispositions in favor of or against modes such as walk, bike, and public transportation. Moreover, collecting information about the chosen and not chosen modes enables the creation of models to study policy actions that go beyond time-cost-comfort analysis. It is also possible to add stated choice, intention, and preference components to this module. Equally important is the added detail of collecting data about walking and biking either as a main mode for each trip or as an access mode to another main mode (e.g., walking from a parking lot to an office, biking to a bus stop and then taking the bus).

2.5.10. GPS and GPS OBD (Verification, Special Days, Emissions)

This is a GPS household member tracking component to (a) develop a database to correlate destinations to routes and identify a typology of routes and stop-making patterns; (b) develop a route choice model; (c) estimate the level and nature of misreported trips, by mode, in the main two-day activity diary; (d) verify day-to-day behavioral change in other survey components and day-of-the-week effects; and (e) provide detailed operating characteristics of the household vehicles. This component for persons carrying GPS devices (wearable GPS) can also be supplemented with an online diary, vehicle-mounted GPS (weeklong, to capture day-to-day variation), and On-Board Diagnostics (OBD) devices (to identify driving patterns and link them with emission models).
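One way objective (c) might be approached is to derive stops and trips from the GPS trace and compare the trip count with the diary. The sketch below is illustrative only: the dwell-based stop rule, the 200 m radius, the 5-minute dwell threshold, and all function names are assumptions, not parameters or procedures of any survey described here.

```python
import math
from typing import List, Tuple

Fix = Tuple[float, float, float]  # (time in seconds, latitude, longitude)


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two points given in degrees."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def detect_stops(trace: List[Fix], radius_m: float = 200.0,
                 min_dwell_s: float = 300.0) -> List[Tuple[float, float]]:
    """Return (start, end) times of stays within radius_m lasting >= min_dwell_s."""
    stops = []
    i, n = 0, len(trace)
    while i < n:
        j = i
        # Extend the candidate stop while fixes stay near the anchor fix i.
        while j + 1 < n and haversine_m(trace[i][1], trace[i][2],
                                        trace[j + 1][1], trace[j + 1][2]) <= radius_m:
            j += 1
        if trace[j][0] - trace[i][0] >= min_dwell_s:
            stops.append((trace[i][0], trace[j][0]))
        i = j + 1
    return stops


def misreport_rate(gps_trips: int, diary_trips: int) -> float:
    """Share of GPS-detected trips that are missing from the diary."""
    if gps_trips == 0:
        return 0.0
    return max(gps_trips - diary_trips, 0) / gps_trips


# Synthetic trace: 10 minutes at home, a short move, then 10 minutes at work.
home = [(t, 34.00, -118.00) for t in range(0, 601, 60)]
move = [(660, 34.01, -118.00), (720, 34.02, -118.00), (780, 34.03, -118.00)]
work = [(t, 34.04, -118.00) for t in range(840, 1441, 60)]
stops = detect_stops(home + move + work)
gps_trips = max(len(stops) - 1, 0)  # one trip between each pair of consecutive stops
```

A production procedure would also need to handle signal loss, cold starts, and mode-specific speeds, which this sketch ignores.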

2.5.11. Panel of Households and Persons and Multi-Day Activity

Repeated observation of the same people over time provides a unique source of information for understanding change in behavior and developing more accurate travel demand models. Examples of this type of survey include designs such as Mobidrive (Axhausen, Zimmermann, Schönfelder, Rindsfüser, & Haupt, 2002), used to identify weekly rhythms in activity scheduling and travel, and household panel surveys that enable disentangling temporal ordering and causality in behavior (see the edited volume by Golob, Kitamura, & Long, 2007).

Figure 2.4 cannot be implemented in its entirety within the resources available for most surveys, and it requires modification to meet the additional data needs of model components for the specific region, state, or country developing its model system. The minimum data elements required are household composition information, person characteristics (age, gender, education, employment, driver’s license, marital status), and vehicle data. A base activity-travel diary (two days are desirable to study day-to-day variation, but a single-day diary is current practice in the United States) shall include a complete record of each person’s daily schedule, including all activities engaged in and all trips made (with a record of their assembly into tours), locations visited, the persons with whom each activity and trip were made, and activities carried out at home and at other places. Ideally, this diary will be for a prespecified pair of days for all persons in the household and will spread over a 12-month period (to mimic the American Community Survey) with a uniform distribution of interviews throughout the survey period. Development of the next generation activity-based model(s) requires a greatly enhanced activity diary of complete households. The additional travel behavior indicators needed for the new models are usually incorporated in activity diaries.
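The minimum data elements listed above could be organized as a nested record structure. The following is a sketch only: every class name, field name, and the completeness check are hypothetical and chosen for illustration, not taken from any survey instrument.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Episode:
    """One activity or trip in a person's daily schedule (hypothetical fields)."""
    kind: str                      # "activity" or "trip"
    start_min: int                 # minutes after midnight
    end_min: int
    location: str                  # place visited, or trip destination
    companions: List[str] = field(default_factory=list)  # persons taking part
    tour_id: Optional[int] = None  # links trips assembled into the same tour


@dataclass
class Person:
    person_id: str
    age: int
    gender: str
    education: str
    employment: str
    has_drivers_license: bool
    marital_status: str
    diary: List[Episode] = field(default_factory=list)


@dataclass
class Vehicle:
    vehicle_id: str
    body_type: str
    model_year: int


@dataclass
class Household:
    household_id: str
    persons: List[Person] = field(default_factory=list)
    vehicles: List[Vehicle] = field(default_factory=list)


def diary_is_complete(person: Person, day_minutes: int = 1440) -> bool:
    """Check that the episodes tile the whole day with no gaps or overlaps."""
    clock = 0
    for ep in sorted(person.diary, key=lambda e: e.start_min):
        if ep.start_min != clock or ep.end_min <= ep.start_min:
            return False
        clock = ep.end_min
    return clock == day_minutes
```

A check of this kind is one simple way to flag diaries that do not form the complete daily record the text calls for; a two-day diary would repeat the structure per diary day.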
The usual design in this setting is a household questionnaire that includes social and demographic information on each household member, housing characteristics, and auto ownership. The activity diary portion of the questionnaire includes, but is not limited to, respondents’ activities, travel, and characteristics of the surrounding environment (including parking locations and parking rates). The overall survey design will also enable the study of toll modes and willingness to pay; gather additional data on walk/bike modes, transportation demand management program participation, and auto ownership (including fuel efficiency and fuel usage); and enable the study of equity and environmental justice issues. More detailed listings of the variables needed are included in the references to this paper.

Current activity-based models are based on data that are stitched together from a variety of surveys. Assumptions about the validity of this practice usually go untested, out of necessity and for lack of suitably designed surveys. Attempts, however, are made to design decennial household travel surveys (e.g., the CHTS (California Household Travel Survey), 2012) and fill important gaps of information supporting large-scale models. Similar attempts are also made to develop the data required to model the dynamically changing environment in which simulated agents live. In parallel, new model development continues to use whatever data are available, with substantial improvements in our ability to simulate policies and to validate models using information both internal and external to the new survey. In data collection, however, we see positive change, driven mainly by the recent legislative innovation of coordinating land use policies with transportation policies.

A much simpler version of Figure 2.4 is being explored for the California Household Travel Survey (CHTS, 2012). CHTS is currently (May 2012) in its initial months, with a possible delivery of data in March 2013. CHTS includes a core (household and single-day diary covering the entire state, with additional observations for some of the California regions), a three-day wearable GPS and a seven-day vehicle GPS with a small sample of OBD, a long-distance (trips longer than 50 miles) travel log to capture trips made in the two weeks preceding the diary day, a car ownership and type revealed and stated preference survey, and an “augment” survey for the Southern California region with questions about location choices. This shows that survey practice is already attempting to satisfy some of the data needs of large-scale regional simulation models, but it is far from complete.
Although theoretically a satellite design of a large-scale survey may solve the problems of the curse of dimensionality, respondent burden and fatigue, and exploding data collection budgets, it is unknown whether a survey of this type is feasible from the viewpoint of survey data collection management and operations. It is also unknown whether multiple contractors are able to coordinate their data collection efforts and provide a harmonized database for modeling and simulation (in CHTS there are three different contractor teams collecting data without a contractual obligation or pressure from the funding agencies to coordinate and collaborate). From this viewpoint, the California Household Travel Survey is a real-life experiment in a state that requires policy analysis supported by this type of data in a timely fashion to demonstrate strategies for meeting greenhouse gas emission targets set by policy. For an example of these targets, see the California Air Resources Board website http://www.arb.ca.gov/cc/sb375/final_targets.pdf (2012). The policy environment, substantial progress in modeling and simulation, and willingness to innovate in data collection will give us many opportunities to rethink data needs and model development in new ways and at a faster pace than in any past epoch.

Acknowledgments

Funding and other support for this chapter were provided by the Southern California Association of Governments, the University of California (UC) Lab Fees program through a grant to UCSB on Next Generation Agent-based Simulation, and the UC Multicampus Research Program Initiative on Sustainable Transportation. Past research grants to UCSB from the University of California Transportation Center (funded by US DOT RITA and Caltrans), to UT Austin from the Texas Department of Transportation, and to Arizona State University by the Federal Highway Administration have also supported the development of ideas in this chapter. A thank you goes to Rajesh Paleti of UT Austin who created Figures 2.2 and 2.3. This paper does not constitute a policy or regulation of any public agency.

References

Auld, J., Mohammadian, A., & Doherty, S. (2009). Modeling activity conflict resolution strategies using scheduling process data. Transportation Research Part A: Policy and Practice, 43(4), 386–400. Retrieved from http://dx.doi.org/10.1016/j.tra.2008.11.006
Axhausen, K. W., Zimmermann, A., Schönfelder, S., Rindsfüser, G., & Haupt, T. (2002). Observing the rhythms of daily life: A six-week travel diary. Transportation, 29(2), 95–124. Retrieved from http://dx.doi.org/10.1023/A:1014247822322
Barth, M., & Boriboonsomsin, K. (2008). Real-world CO2 impacts of traffic congestion. Retrieved from http://www.uctc.net/research/papers/846.pdf
Bhat, C. R., & Castelar, S. (2002). A unified mixed logit framework for modeling revealed and stated preferences: Formulation and application to congestion pricing analysis in the San Francisco bay area. Transportation Research Part B: Methodological, 36(7), 593–616. Retrieved from http://dx.doi.org/10.1016/S0191-2615(01)00020-0
Bhat, C. R., Goulias, K. G., Pendyala, R. M., Paleti, R., Sidharthan, R., Schmitt, L., & Hu, H. (2012). A household-level activity pattern generation model for the simulator of activities, greenhouse emissions, networks, and travel (SimAGENT) system in Southern California. Paper presented at the 91st Transportation Research Board annual meeting, January 22–26, Washington, DC. In the TRB 91st Annual Meeting Compendium of Papers DVD.
Bhat, C. R., Guo, J. Y., Srinivasan, S., & Sivakumar, A. (2004). Comprehensive econometric microsimulator for daily activity-travel patterns. Transportation Research Record, 1894, 57–66. Retrieved from http://dx.doi.org/10.3141/1894-07
Bhat, C. R., & Sardesai, R. (2006). The impact of stop-making and travel time reliability on commute mode choice. Transportation Research Part B, 40(9), 709–730. Retrieved from http://dx.doi.org/10.1016/j.trb.2005.09.008
Bowman, J. (2009). Historical development of activity based model theory and practice. Retrieved from http://jbowman.net/papers/2009.Bowman.Historical_dev_of_AB_model_theory_and_practice.pdf
Bradley, M., Bowman, J. L., & Griesenbeck, B. (2010). SACSIM: An applied activity-based model system with fine-level spatial and temporal resolution. Journal of Choice Modeling, 3(1).
California Air Resources Board. (2012). Attachment 4: Approved regional greenhouse gas emission reduction targets. Retrieved from http://www.arb.ca.gov/cc/sb375/final_targets.pdf. Accessed on November 2012.
California Department of Transportation. (2012). California Household Travel Survey (CHTS). Retrieved from http://www.californiatravelsurvey.com/welcome.aspx. Accessed on November 2012.
California Government SB 375 (2011). Senate Bill No. 375. Retrieved from http://www.leginfo.ca.gov/pub/07-08/bill/sen/sb_0351-0400/sb_375_bill_20080930_chaptered.pdf. Accessed on November 2012.
Chen, Y., Ravulaparthy, S., Deutsch, K., Dalal, P., Yoon, S. Y., Lei, T., … Hu, H.-H. (2011). Development of indicators of opportunity-based accessibility. Transportation Research Record: Journal of the Transportation Research Board, 2255, 58–68. (Transportation Research Board of the National Academies, Washington, DC)
Deutsch, K., & Goulias, K. G. (2012). Understanding places using a mixed method approach. Paper 12-2984 presented at the 91st annual meeting of the Transportation Research Board, January 22–26, Washington, DC. In the TRB 91st annual meeting compendium of papers DVD.
Donnelly, R., Erhardt, G. D., Moeckel, R., & Davidson, W. A. (2010). Advanced practices in travel forecasting. National Cooperative Highway Research Program, Synthesis 406, Transportation Research Board, Washington, DC.
Ferdous, N., Sener, I. N., Bhat, C. R., & Reeder, P. (2009, October). Tour-based model development for TxDOT: Implementation steps for the tour-based model design option and the data needs. Report 0-6210-1, prepared for the Texas Department of Transportation.
Fissore, C., Baker, L. A., Hobbie, S. E., King, J. Y., McFadden, J. P., Nelson, K. C., & Jakobsdottir, I. (2011). Carbon, nitrogen, and phosphorus fluxes in household ecosystems in the Minneapolis-Saint Paul, Minnesota, urban region. Ecological Applications, 21(3), 619–639. Retrieved from http://dx.doi.org/10.1890/10-0386.1
Golob, T. F., Kitamura, R., & Long, L. (2007). Panels for transportation planning: Methods and applications. Boston, MA: Springer.
Goulias, K. G. (2007). Activity based travel demand model feasibility study. Final Report Submitted to the Southern California Association of Governments. Contract Number 07-046-C1. Work Element Number 07-070.SCGC08. June, Solvang, CA.
Goulias, K. G., Bhat, C. R., Pendyala, R. M., Chen, Y., Paleti, R., Konduri, K. C., Lei, T., … Hu, H. (2011). Simulator of activities, greenhouse emissions, networks, and travel (SimAGENT) in Southern California. Paper Presented for Presentation at the 2012 Transportation Research Board Annual Meeting.
Goulias, K. G., & Morrison, E. L. (2010). Pre-survey design consultant for the year 2010 postcensus regional travel survey. Final Summary Report Project Number 10-046-C1 (April 2010 to July 2010). Submitted to Southern California Association of Governments and Caltrans. June, Solvang, CA.
Goulias, K. G., Ravulaparthy, S., Yoon, S. Y., & Polydoropoulou, A. (2012). An exploratory analysis of the time-of-day dynamics of episodic hedonic value of activities and travel. Paper presented at the 2012 International Association for Travel Behavior Research Conference, July 15–20, Toronto, Canada.
Hägerstrand, T. (1970). What about people in Regional Science? Papers in Regional Science, 24(1), 6–21. Retrieved from http://dx.doi.org/10.1007/BF01936872
Hägerstrand, T. (1989). Reflections on “what about people in regional science?”. Papers in Regional Science, 66(1), 1–6.
Hao, J. Y., Hatzopoulou, M., & Miller, E. (2010). Integrating an activity-based travel demand model with dynamic traffic assignment and emission models: Implementation in the Greater Toronto, Canada, area. Transportation Research Record, 2176, 1–13. Retrieved from http://dx.doi.org/10.3141/2176-01
Kortum, K., Paleti, R., Bhat, C. R., & Pendyala, R. M. (2012). A joint model of residential relocation choice and underlying causal factors. Paper 12-3769 presented at the 91st Annual Meeting of the Transportation Research Board, January 22–26, Washington, DC. In the TRB 91st annual meeting compendium of papers DVD.
Pendyala, R. (2011). A household travel survey data collection plan. Report for Maricopa County, Tempe, AZ, USA.
Pendyala, R. M., Bhat, C. R., Goulias, K. G., Paleti, R., Konduri, K. C., Sidharthan, R., Hu, H., … Christian, K. P. (2011). The application of a socio-economic model system for activity-based modeling: Experience from Southern California. Paper Presented for Presentation at the 2012 Transportation Research Board Annual Meeting and publication in the Transportation Research Record.
Rossi, T., Bowman, J., Vovsha, P., Goulias, K. G., & Pendyala, R. (2010). CMAP strategic plan for advanced model development. Final Report of the CMAP Advanced Travel Model Cadre. Chicago, IL, USA.
SimTRAVEL Research Initiative. (2008–2011). Reports. Retrieved from http://urbanmodel.asu.edu/intmod/reports.html
SustainCity. (2012). Welcome to SustainCity. Retrieved from http://www.sustaincity.org/index. Accessed on November 2012.
Vovsha, P., Bradley, M., & Bowman, J. (2005). Activity-based travel forecasting models in the United States: Progress since 1995 and prospects for the future. In H. Timmermans (Ed.), Progress in activity-based analysis (pp. 389–414). Oxford, UK: Elsevier Science.
Vyas, G., Paleti, R., Bhat, C. R., Goulias, K. G., Pendyala, R. M., Hu, H., Adler, T. J., & Bahreinian, A. (2011). A joint vehicle holdings (type and vintage) and primary driver assignment model with an application for California. Paper Presented for Presentation at the 2012 Transportation Research Board Annual Meeting.
Yagi, S., & Mohammadian, A. (2010). An activity-based microsimulation model of travel demand in the Jakarta Metropolitan Area. Journal of Choice Modelling, 3(1), 32–57.

Appendix 2.A.1. Excerpt from California Legislative Initiative

Senate Bill 375 (California Government SB 375, 2011) was enacted to reduce greenhouse gas emissions from automobiles and light trucks through integrated transportation, land use, housing, and environmental planning. Under the law, SCAG is tasked with developing a Sustainable Communities Strategy (SCS), a newly required element of the 2012 Regional Transportation Plan (RTP) that provides a plan for meeting emission reduction targets set forth by the California Air Resources Board (ARB). On September 23, 2010, ARB issued a regional 8% per capita reduction target for the planning year 2020, and a conditional target of 13% for 2035.
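To make the per capita targets concrete, the short sketch below converts the two percentage reductions into target emission levels. It is illustrative only: the baseline value and its units are hypothetical, since this excerpt does not state a baseline level, and the function name is an assumption.

```python
def target_per_capita(baseline: float, reduction_pct: float) -> float:
    """Per capita emissions after applying a percentage reduction to a baseline."""
    return baseline * (1.0 - reduction_pct / 100.0)


# Hypothetical baseline, e.g., pounds of CO2 per person per day (an assumption).
baseline = 20.0
targets = {
    2020: target_per_capita(baseline, 8.0),   # 8% per capita reduction
    2035: target_per_capita(baseline, 13.0),  # conditional 13% reduction
}
for year, value in targets.items():
    print(year, round(value, 2))
```

Because the targets are per capita, total regional emissions could still rise if population grows faster than the per capita rate falls.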

2.A.1.1. SCS Requirements

According to SB 375, “each metropolitan planning organization shall prepare a sustainable communities strategy, including the requirement to utilize the most recent planning assumptions considering local general plans and other factors. The Sustainable Communities Strategy shall:

• identify the general location of uses, residential densities, and building intensities within the region;
• identify areas within the region sufficient to house all the population of the region, including all economic segments of the population, over the course of the planning period of the regional transportation plan taking into account net migration into the region, population growth, household formation and employment growth;
• identify areas within the region sufficient to house an eight-year projection of the regional housing need for the region;
• identify a transportation network to service the transportation needs of the region;
• gather and consider the best practically available scientific information regarding resource areas and farmland in the region;
• consider the state housing goals specified in Sections 65580 and 65581;
• set forth a forecasted development pattern for the region, which, when integrated with the transportation network, and other transportation measures and policies, will reduce the greenhouse gas emissions from automobiles and light trucks to achieve, if there is a feasible way to do so, the greenhouse gas emission reduction targets approved by the state board;
• allow the regional transportation plan to comply with the federal Clean Air Act”

PART II FOCUS ON IMPROVED METHODS: THEMES 1 TO 5

THEME 1 MAINSTREAMING MOBILITY-AWARE AND ON-LINE TECHNOLOGIES

Chapter 3

Cell Phone Enabled Travel Surveys: The Medium Moves the Message

Jane Gould

Abstract

Purpose: To assess how cell phone technology might impact the collection of travel data in the future.

Design/methodology/approach: Two different types of cell phone enabled studies are considered. First, we examine how the text feature of phones can be used for person-to-person surveys, and second, we explore an aggregate level survey enabled by an anonymous and passive GPS trace.

Findings: This study explores the types of travel information that are likely to be inferred from text surveys and cell phone traces. It recognizes that a passive GPS trace might change the level of measurement and the inferences we make about travel behaviors.

Research limitations/implications: The study is prospective. It anticipates that over the next 10–15 years cell phone tracking technology will improve, as well as the speed and capability of algorithms for post-processing the information.

Practical implications: Cell phone enabled studies may provide a new tool and new level of measurement, as traditional survey response rates decline, and it becomes more difficult and expensive to conduct conventional travel surveys. The capacity of cell phones for travel survey work is improving, but it is not fully realizable today (2012).

Originality/value: This study provides a context to understand how the technology of the cell phone might be integrated with more traditional travel surveys to streamline data collection, and produce new types of spatial detection, measurement, and tracking.

Keywords: Cell phones; GPS; passive tracking; future surveys; mobile phones; small screens

Transport Survey Methods: Best Practice for Decision Making. Copyright © 2013 by Emerald Group Publishing Limited. All rights of reproduction in any form reserved. ISBN: 978-1-78190-287-5

From the time that cell phones were used chiefly by “early adopters” they have played a significant role in travel and mobility. Initially, cell phones helped people to coordinate their meeting points during travel. In 2003 a Finnish researcher observed that cell phone users communicated their present location (e.g., “I’m at the train station”) about 70% of the time, compared to just 5% for landline users (Oulasvirta, 2005). With the addition of Internet service, the cell phone has become an indispensable tool for mobile route planning and navigation. Soon, a next generation of phones, equipped with GPS, will transform how we coordinate with others, how we make travel payments, and when and how we travel. It will also transform how transportation researchers seek to study and measure travel behaviors.

The process for gathering data is evolving from “intrusive” surveys that take place on the web, by phone, in person, or by mail to methods that collect data electronically and passively. Until recently, GPS units (with accelerometers) were stand-alone, separate recording devices, but they are increasingly becoming a standard feature of mobile cell phones.

The mobile cell phone is a “pattern break” of singular importance to survey designers. The act of recording travel, like any experiment, may interact with and change the underlying behavior. Stand-alone GPS devices, while small, are a reminder that behaviors are being “recorded”; researchers take pains to say that the respondents quickly learn to disregard the presence of the recording devices. Assuming this is so (it is a difficult proposition to test), the separate recording device has an additional drawback: it is difficult to recruit a truly random sample. Stopher (2009a), for example, reported from a 2007 study that in the few instances where a GPS was used as a stand-alone method for travel measurement, the response rates for recruitment were similar to those for conventional surveys.
The timing for passive electronic data collection is opportune, for the foundation of transportation models is built on the representativeness and accuracy of survey data. Stopher (2009a) notes the irony between the precision these models require and the measurement error introduced from factors such as declining response rates, increasing globalization of the population, declining literacy, and respondents who view surveys as irrelevant. Among those who do participate, survey designers confront respondents who seem to have shorter attention spans, less “investment” in reporting detailed trips, and more missing trip data overall.

In this chapter, “passive GPS” describes a future data collection using the mobile phone instead of a stand-alone GPS device. When incorporated into a phone, the passive GPS acts as a sensor in the background. There is no active decision to use it, other than to turn the telephone on and carry it. Unlike with a stand-alone GPS, people seldom forget to charge it, and seldom leave home without it. Today, stand-alone GPS devices offer superior accuracy but the precision from cell phone traces is improving.¹ An assumption underlying this chapter is that, like Moore’s Law of computer chip memory,² there will be advances both to the algorithms and to cell phone operations, like battery life, memory, CPU usage, and “time-to-first fix” (Reddy et al., 2010). Another assumption, the more fundamental one, is that the cell phone will evolve as an even more essential “wearable mobile computer.”

We posit that GPS tracking on cell phones will fundamentally and substantively change not only how we collect our data, but also what transportation researchers seek to know. A preview of the change is found in this chapter’s literature review, which draws from several fields. “Location-based studies,” which use the cell phone for tracking, record data at a system-wide or aggregate level. This is somewhat like beginning with the answers to a survey, but then working back to deduce the questions that were posed. In 2006, Ratti, Pulselli, Williams, and Frenchman noted that aggregate level tracking portended significant change for traffic engineers: cell phone traces could reveal in real time the actual patterns of movement, in lieu of simulating them through models or estimates. In conventional travel surveys there is considerable “noise” in the data, as validity depends on factors like the representativeness of the sample, and then on the integrity and veracity of self-reports. With a passive GPS trace, the “noise” from survey measurement and error is minimized. The trade-off is that as we reduce the “noise” from survey measurement the “signal” or data is modified. We consider how the collection of real-time tracking may change measures like trip frequency, mode-share, and activity coding.

Recognizing that there will still be a need, albeit reduced, for studies that take place at the level of the individual respondent, we also study the applicability of cell phones to send and receive text surveys. Again, the concept of signal-to-noise ratio is useful.
The small screen size and limited attention of the user suggest that cell phone enabled surveys will be shorter and more superficial, hence “noisier.” However, there are also new types of content, that is, “signals” created by novel features like picture-taking and real-time transmission.

Organization of this Chapter

1. We briefly review some key studies from the growing literature on mobility and cell phone applications. Much of this work has taken place in computer science and media studies. This will establish the groundwork for the next section.
2. We examine emerging issues for text-based surveys that will be sent and received over cell phones. While the need for these surveys is likely to be diminished, there may be inventive ways of collecting survey data using smart phones.
3. Finally, we assume that the cell phone trace replaces many conventional travel studies. What data elements might be improved, and what information (or signals) is entirely new?

¹ As of 2011 most cell phone tracking studies use a less precise method: estimating location by triangulating the distances between cell phone tower locations.
² The point is that the technologies will advance exponentially. Moore’s Law, linked to digital electronics, observes that the capacity of microchips has doubled every 18 months, and grown by an order of magnitude every 5 years (cited in Oulasvirta, 2005).
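The two figures quoted in the footnote on Moore’s Law (a doubling every 18 months and an order of magnitude every 5 years) can be checked against each other with a line of arithmetic. The sketch below assumes pure exponential growth at a constant doubling time; the function name is illustrative.

```python
def growth_factor(months: float, doubling_months: float = 18.0) -> float:
    """Multiplicative growth after `months`, given a constant doubling time."""
    return 2.0 ** (months / doubling_months)


five_year = growth_factor(60)      # 2 ** (60 / 18), roughly a tenfold increase
ten_year = growth_factor(120)      # lower end of the chapter's 10-15 year horizon
fifteen_year = growth_factor(180)  # 2 ** 10, upper end of that horizon
print(round(five_year, 2), round(ten_year, 1), round(fifteen_year, 1))
```

Under this assumption, an 18-month doubling time does imply growth of about one order of magnitude over 5 years, so the two statements are consistent.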

3.1. Literature (Cell Phone Tracking)

New technologies are often packaged in familiar ways: the initial cell phone-trace studies were similar to those with stand-alone GPS devices. In a comprehensive review of GPS, Chorus and Timmermans (2010) date the first cell phone study (with a PDA) to 2005 in Japan (see, e.g., Asakura & Hato, 2004), and they cite five additional cell phone trials through 2009. By 2009, there is recognition that cell phone data have new features and that their “location-based services” may provide a different vantage.

In 2005, Asakura, Hato, and Sugino discuss the usefulness of tracking tourists, pedestrians, and cyclists using “dot data analysis” that has both space and time dimensions. Instead of conducting personal surveys of intentions and activities, they propose studying movement at the micro-movement level and then generating thousands of profiles for a simulation model.

A related thread of studies grew out of the computer science and communications field. Eagle and Pentland (2005) proposed that location aware devices, specifically cell phones, could collect data that focused less on the micro-movement of individuals and more on patterning in the collective or aggregate. They observed:

“Surveys are plagued with issues however, such as bias, scarcity of data, and lack of continuity between discrete questionnaires. It is this absence of dense, continuous data that also hinders the machine learning and agent-based modeling communities from constructing more comprehensive predictive models of human dynamics. Over the last two decades there has been a significant amount of research attempting to address these issues by building location-aware devices capable of collecting rich behavioral data.”

There were several early demonstrations of this concept, particularly by Ratti et al. (2006), Reades, Calabrese, and Ratti (2009), and Reades, Calabrese, Sevtsuk, and Ratti (2007), who worked with anonymized records from mobile operators in Italy and mapped mobility patterns in Milan and then Rome by capturing the latitude, longitude, and time of day when calls or texts were made. These researchers observed that a more sophisticated procedure would allow the continuous tracing of users’ movements at regular intervals, throughout the day. Another early study acknowledged that mobile phone records, which routinely record the location of millions of cell phones, could be mobilized in emergencies, as well as for routine operations by traffic engineers and public works (Madey et al., 2007). This paper demonstrated an algorithm to detect the mode difference between walking and in-vehicle travel.
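To illustrate the general idea of separating walking from in-vehicle travel, the sketch below classifies a sequence of location fixes by their median speed. This is not the algorithm of Madey et al. (2007), whose details are not given here; the speed threshold of 2.5 m/s and the function names are assumptions made purely for illustration.

```python
import math
from statistics import median
from typing import List, Tuple

Fix = Tuple[float, float, float]  # (time in seconds, latitude, longitude)


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two points given in degrees."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def classify_segment(fixes: List[Fix], walk_max_mps: float = 2.5) -> str:
    """Label a segment 'walk' or 'vehicle' from the median speed between fixes."""
    speeds = []
    for (t0, la0, lo0), (t1, la1, lo1) in zip(fixes, fixes[1:]):
        dt = t1 - t0
        if dt > 0:  # skip duplicate or out-of-order timestamps
            speeds.append(haversine_m(la0, lo0, la1, lo1) / dt)
    if not speeds:
        return "unknown"
    return "walk" if median(speeds) <= walk_max_mps else "vehicle"
```

The median is used rather than the mean so that a few erroneous position jumps do not dominate the label. With tower-based positions rather than GPS fixes, the location error would be far larger and a simple threshold of this kind would be less reliable.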

Cell Phone Enabled Travel Surveys: The Medium Moves the Message

In 2008 Gonzales, Hidalgo, and Barabasi published a landmark study, based on the analysis of one hundred thousand mobile phone user records, over a six-month period. The research reconstructed travel trajectories, by recording cell phone tower locations whenever those selected in a subsample initiated or received a call or text. They observed a high degree of regularity in the daily travel patterns, captured by a high return probability to a few highly frequented locations (e.g., home, work). These findings were surprising to the researchers, as they studied outside the transportation field. Over the past year or two, there are several studies that demonstrate that cell phone traces correlate closely with travel statistics, like mode choice and volume of travel, collected by the US Census. There have also been more studies that demonstrate improvement in the post-processing algorithms, and the ability to reproduce a travel route (Bierlaire, Chen, & Newman, 2010; Chen, Newman, & Bierlaire, 2009). A small, but promising study by Reddy et al. (2010) tested a mobile phone equipped with a GPS receiver and an accelerometer. They coded travel mode over 124 hours of data and 16 users, and did not do any user-specific training. They were able to detect with a 93.6% accuracy rate, differences between being stationary, walking, running, biking, and motorized travel. In the last year, there have also been some novel applications to transportation issues. Becker et al. (2011) analyzed Call Detail Records (CDR) from 35 cell towers (and 300 antennas) within a five-mile radius of downtown Morristown, New Jersey. The team studied CDR patterns to identify travel information not reported in the US Census, like the inflow to the city after commute hours, and the source/destination of tourists and occasional visitors. 
They suggest this data would “be useful to urban planners for example to reduce traffic congestion by organizing new transit and bike routes or park and ride programs … also (to place) late night shuttle buses to keep inebriated drivers off the road.” Privacy issues were addressed by stripping the CDRs of identifiers with the exception of the home zip code. A similar study was conducted by Isaacman et al. (2011). They also used anonymous CDR (not a continuous GPS trace). They mapped home-to-work commute trips, and other mobility patterns, in both New York City and Los Angeles. They cite a numerical consistency between New York and Los Angeles for the number of “important places” visited, which is similar to the finding in Gonzales et al. (2008). They also try to estimate carbon emissions from their aggregate-level data, but end up having to assign trips to modes. A recent study, in which the algorithms are constructed from the GPS trace (not CDR), is reported by Van der Hoeven (2010). The Dutch navigation firm TomTom developed an application that tracks travel congestion on major roads in the Netherlands by positioning, in real time, all 4 million subscribers to Vodafone’s mobile phone services. Van der Hoeven (2010) observes that if all mobile carriers provided data, there would be an accurate and continuous survey of travel demand, since nearly everyone in Holland carries a cell phone. At an airport, for example, continual tracking would provide insight into visitors’ route choice and mode. Like the TomTom service, there is an increasing number of field tests and apps that provide traffic information and transportation-based services that are personalized

Jane Gould

to the individual users or their mobile device (Bayir, Demirbas, & Cosar, 2010; Manasseh, Ahern, & Sengupta, 2009). The latter use triangulation of distances from cell phone towers to determine the user location, but they note that GPS traces are feasible and “are left as a future work.”
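The idea of a few highly frequented “important places” in Call Detail Records can be made concrete with a small counting exercise. The sketch below is an assumption-laden illustration, not the method of Gonzales et al. (2008) or Isaacman et al. (2011): the (user, tower) event format, the function name, and the 10% share cutoff are hypothetical.

```python
from collections import Counter, defaultdict


def important_places(events, min_share=0.10):
    """Rank the towers each user is seen at and keep the frequently used ones.

    events: iterable of (user_id, tower_id) pairs, one per call or text
            (an assumed, simplified CDR layout).
    min_share: minimum fraction of a user's events for a tower to count
               as an "important place" (illustrative value).
    Returns {user_id: [(tower_id, share), ...]} ordered by descending share.
    """
    per_user = defaultdict(Counter)
    for user, tower in events:
        per_user[user][tower] += 1

    result = {}
    for user, counts in per_user.items():
        total = sum(counts.values())
        # Sort by frequency, breaking ties by tower id for a stable order.
        ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        result[user] = [
            (tower, n / total) for tower, n in ranked if n / total >= min_share
        ]
    return result
```

Because a CDR is logged only when a call or text occurs, shares computed this way describe where people use their phones, which may differ from where they spend their time.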

3.2. New Opportunities: Active Surveys on Phones

A smart phone with GPS “leaps over” many hurdles facing survey research. The frequency of travel trips can be collected in the aggregate, without identifying the ID of individual respondents or their message content. This helps address privacy concerns. And, travel information can be collected without attempting to contact, and then incentivize to participate, a random sample. Looking forward, we expect there will be fewer “conventional” household surveys done via the Internet, paper, phone, or in-person. In Section 3.3, we consider the implications of this. Here, in Section 3.2, we look at more conventional surveys that continue to recruit a sample, and in most cases, use incentives for participation. We anticipate that more of these surveys will take place over smart phones or tablet devices. We focus specifically on self-completed surveys, those where respondents link to a web site or survey form, and text back their responses. The transformation of data collection to the cell phone, from conventional methods, and even stand-alone GPS devices, is a significant pattern break. Table 3.1 groups some of the key changes. The second column of the table is the baseline, summarizing data points, which are collected today. Because much of this data is based on self-report, the so-called signal-to-noise ratio is low. Location-based studies, using the cell phone, reduce this noise and collect “outside” data like time, latitude and longitude, and duration. We discuss (in Section 3.3) how this introduces new ways of measuring travel behavior. Here, in Section 3.2, we discuss studies that use the cell phone to send/receive text surveys. Survey practitioners are concerned that mobile phones produce more measurement error than landlines because there is less fidelity, a shorter attention span, less privacy, and other distractions (Kennedy & Everett, 2011). 
However, participants accessing the text feature may have different ways to respond. For example, they can provide immediate on-the-spot feedback and deploy distinctly new features: they can troll the net for outside information, be ‘‘pinged’’ in real time to participate, and be prompted to use the camera feature to share visual content. But, unlike the GPS trace, the attitudinal survey conducted via text or voice on a mobile phone is traceable to the individual respondent. Thus respondents will need to be recruited or ‘‘opt-in’’ and they may expect to be compensated for their opinions or time. In the following discussion we elaborate on this. We also consider how specific types of surveys used in transportation, such as the stated preference survey; the survey of attitudes and opinions; and the level of service study, may be modified and perhaps improved when they are sent and received over the cell phone.

Table 3.1: A comparison of data elements from survey modes.

Approach                                        Cell phone text (a)        Cell phone trace (b)
Design                                          Likely to be panel         Likely to be at aggregate level
Sample or population                            Recruited                  Community-wide/Anonymous
Duration of survey                              Variable                   Ongoing
Measurement level                               Within/Between samples     Mass movement behavioral patterns
Type of variables:
  Mode                                          Self-report                Objective
  Speed                                         Self-report                Objective
  Time of day                                   Objective                  Objective
  O/D                                           Self-report                Inferred
  Route choice                                  Self-report                Objective
  Person traveling together                     Self-report                NA
  Activities outside home                       Self-report                Inferred
  Demographics (age, income, educ., household)  From recruitment           NA … May be inferred

(a) Surveys that are sent/received via text on cell phones are discussed in Section 3.2.
(b) Surveys and studies based on a cell phone passive trace are discussed in Section 3.3.

3.2.1.

Cell Phone Surveys, Panels, and Non-Probability Sampling

While large household surveys may be expected to decline in number, researchers may seek to learn individual ‘‘motivations’’ by reaching convenience samples, quota samples, and panel studies. Panels offer a particular advantage for transportation studies, because demographic variables need to be collected only once and the panel can be ‘‘pinged’’ or ‘‘prompted’’ on the cell phone to report characteristics of their trip taking over time. But, it is hard to imagine a cell phone sample being selected through probability-based random sampling. Panels are likely to be identified by address-based sampling or proprietary lists, and then recruited through an e-mail (not voice) contact. Or, respondents may increasingly self-identify an interest through Facebook, Twitter, or other social media and choose to ‘‘opt-in’’ to a posted survey. Either method depends on non-probability based methods for recruitment. Researchers will then need to make post-survey adjustments and weighting, in order to make their inferences more valid and reproducible. Recently, a comprehensive review by the American Association for Public Opinion Research (AAPOR, 2010) examined the validity of panels. They raised critical concerns about the overall accuracy of the methodology and the ability to compensate for non-probability sampling. One of the underlying problems is: what criteria motivate respondents to participate in panels? The AAPOR report suggests five typical appeals:     

• A contingent incentive or prize
• Self-expression (to register one’s opinions)
• Fun (entertainment value)
• Social comparison (to find out what other people think)
• Convenience (easy to join and participate)

These five criteria underscore why social media might be a boon for recruiting subjects, and why Facebook, etc. may be so useful. But, thinking about “why” people agree to participate is important. Researchers cannot assume, even if they have a demographically matched panel, for example, that the participants are fair and unbiased toward the topic. Panels put high demands on their respondents, panel responses are seldom anonymous, and there is often a substantial dropout rate over successive waves. According to the AAPOR report (AAPOR, 2010), results of most panel studies are typically not generalizable to the larger population. Moreover, post-survey adjustments, like post-stratification and propensity weighting, rely on assumptions that may increase overall measurement error.
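A post-stratification adjustment of the kind mentioned above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a procedure taken from the AAPOR report: the cell labels, the function names, and the population shares in the usage note are hypothetical.

```python
def poststratification_weights(sample_counts, population_shares):
    """Weight each cell so the weighted sample matches known population shares.

    sample_counts: {cell: number of respondents in the cell}
    population_shares: {cell: share of the target population}, summing to 1
    Returns {cell: weight applied to every respondent in that cell}.
    """
    n = sum(sample_counts.values())
    weights = {}
    for cell, count in sample_counts.items():
        if count == 0:
            # An empty cell cannot be weighted up; cells must be collapsed first.
            raise ValueError(f"no respondents in cell {cell!r}")
        weights[cell] = population_shares[cell] / (count / n)
    return weights


def weighted_mean(cell_means, sample_counts, weights):
    """Weighted mean of a survey variable from per-cell means and counts."""
    num = sum(weights[c] * sample_counts[c] * cell_means[c] for c in sample_counts)
    den = sum(weights[c] * sample_counts[c] for c in sample_counts)
    return num / den
```

For a hypothetical opt-in panel with 60 respondents aged 18–34 and 40 aged 35 and over, and assumed population shares of 0.3 and 0.7, the weights come out at about 0.5 and 1.75. The adjustment corrects only the margins that are observed; it rests on the assumption that panel members resemble non-members within each cell, which is the assumption the AAPOR report questions.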

3.2.2.

Cell Phone Surveys ‘‘At Large’’: Adapting to the Small Screen

In order to minimize the disruption and obtrusiveness of a telephone call, researchers are likely to contact their panels using text surveys (or prompts), accessed over the phone.

Very little is known today about compressing a survey into the physical space of a cell phone screen. One can speculate that the reduced screen size requires that fewer items are asked, and that both rating scales and write-in text are shorter. Two other likely impacts, also unknown, are primacy and recency: because of the physical layout on the cell phone “page,” there may be a tendency to select the “first” or “easy” answers. Closely related to this tendency is the more general category of “satisficing,” which concerns the cognitive effort that respondents devote to the survey process; some believe that respondents take more “shortcuts” in self-administered surveys (AAPOR, 2010; Kennedy & Everett, 2011). Among the shortcuts are answering all questions uniformly (non-differentiation), answering randomly, answering more quickly, skipping items, or answering “don’t know.” Although there is no research evidence about this yet, surveys completed by cell phone would appear likely to manifest “satisficing” behaviors. The speed of completion for a questionnaire, the ability to use the phone for other activities like web surfing or navigation, and abbreviated text messages might impact survey quality. The “satisficing” behavior might also be more likely among the type of users, for example, young and male, who may be willing to participate in a cell phone survey. However, there are characteristics of cell phone surveys that could provoke a more reflective stance. The first is “real-time” capability: just as respondents can blog or report news as events unfold, they can also provide impressions and rate items in real time. An important, and still untapped, enhancement is how to integrate users’ pictures and video into the survey package. A second enhancement is that respondents can use online capabilities to see their responses in relation to others. 
Although reflexive feedback was the key feature of the Rand/Delphi Survey (Rand, 1968) it is unclear how this feature might be integrated into new smart phone surveys. One of the adjuncts to current GPS studies has been the use of non-probability surveys that use the GPS record to augment self-recall and ‘‘prompt’’ respondents to verify or describe their travel behaviors. The cell phone may turn out to be a compatible medium for additional prompted recall studies; respondents can report their behaviors in situ, and report not only what activity they are engaged in, but why they did not choose a different one (i.e., the decision to not cycle or walk). However, the prompted recall, particularly since it is a ‘‘disruptive-survey’’ by design, faces the problem of survey recruitment, and whether those who opt-in are systematically different. Stopher (2009b) raises an additional concern that prompted surveys, sent to a person engaged in travel, would flag privacy issues.

3.2.3.

Cell Phone Surveys and Stated Preference/Revealed Preference Studies

A customized survey that asks respondents why they did/did not engage in a behavior in real time may give new direction to stated preference studies. An often-levelled criticism of these studies is that they are too abstract for respondents and do not capture real behaviors in real settings. There are several reasons for this.

First, for many transportation choices, it may be too expensive and nonproductive for travelers to do extensive search. Second, many transportation behaviors are thought to be habitual (Chorus & Timmermans, 2010). Moreover, the process of participating in a stated preference study is taxing, in terms of respondent recruitment and participation. Surveys taken on cell phones may eliminate some of the participation burden, and make stated preference studies more realistic and concrete for respondents. The stated preference study, somewhat like a prompted recall, can be conducted in real time, while the participant is engaged in the travel activity. By way of example, suppose a research team wanted to estimate customer demand for a future train route. Researchers might first identify respondents who travel the proposed route today using a different mode, like bus or airplane. In real time, the researcher could send the traveler a customized stated-preference study. This survey could reflect the traveler’s current travel time, cost, and service preferences. The survey software, might then ask respondents, based on the trip they are currently taking, to develop their own weights or to name and weight new attributes. This could improve these studies in a way articulated by Chorus and Timmermans (2010), to be able to change the perception of attributes of alternatives.

3.2.4.

Cell Phone Surveys of Attitudes/Opinions

Almost all researchers who write surveys for cell phones will encounter the need to write short ‘‘haiku’’ like questions. This economy of words may be particularly troublesome for those who need to explore multiple dimensions of a topic, or tease out multiple factors, say of ‘‘like’’ and ‘‘dislike.’’ Because of the limited screen size, one survey question may have to be asked in lieu of several. Again, by way of example, consider a survey that asks respondents to rank and rate their satisfaction with a travel mode, say light rail. With traditional Internet or paper surveys, a battery of questions about the experience on the rail might be asked, along with a final one about the quality of the trip overall. In data analysis, a factor analysis would be used to identify the important and critical questionnaire items. Researchers using cell phone studies will often not have the luxury to ask multiple questions without burdening the respondent and risking non-completion. The brevity of the cell phone survey is likely to work against detailed attitude and opinion studies. Short cell phone surveys may not allow researchers to ask the traditional battery of items needed to reliably measure attitudes and a single question or two may not suffice. However, this problem is superseded by a different challenge to conventional attitude and opinion measurement. Research that began in the 1980s questions the predictive validity of the attitude survey. There is a branch of investigation in social psychology cautioning that attitudes reported in surveys are superficial and not reflective of genuine viewpoints, which are reflected in more subconscious reasoning and reflections (Beatty, 2010). If this research stream is viewed seriously, it upends the validity of doing attitudinal studies with a battery of like/dislike probes.

3.2.5.

Cell Phone Surveys and Level of Service Questionnaires

Level of service (LOS) questionnaires are frequently used in many areas of transportation and they probe conditions, such as whether the service was reliable, timely, clean, and comfortable. Unlike the attitude and opinion survey, they report relatively objective phenomena and might be fairly easy for respondents to report/complete on a small cell phone screen. However, with a passive GPS and the opportunity for remote sensing, there is likely to be less dependence on these surveys. Sensors are able to update arrival and departure times; cameras can record conditions and identify, in the case of transit, load factors. While the need for traditional LOS surveys may decline, there may be totally novel ways for travelers to report their requirements. For example, a user-based LOS will empower cell phone users to geotag locations where they would like additional public transit service or identify where bicycle lanes are inadequate or failing.

3.2.6.

Cell Phone Surveys and the Search for External Records/Demographics

As survey researchers design for the small screen and short format of the cell phone survey, there is likely to be pressure to either recruit more panel studies or reduce the number and type of demographic variables that are asked. Collecting demographic data, like income, age, gender, and household size, is vital to researchers, but is also likely to become more difficult when surveys are “downsized” to fit on a smart phone. In order to retain the spontaneity and speed of a mobile survey, researchers may face a trade-off, asking fewer personal and in-depth questions. Moreover, collecting these demographic profiles then creates privacy risks for both cell phone users and cell phone companies. Opportunely, there is some evidence from transit research that demographic information might be inferred from the GPS trace, and then linked to census data. In addition, in the special case of transit, smart cards can be linked, in theory, to the point of sale, such as a credit card, and then to other external data sources. Apart from their links to demographic information, smart cards leave a “breadcrumb trail” that can also be used for O/D studies, to analyze transfer activity, and to study day of the week patterns. Some transit researchers have suggested that passenger surveys are no longer necessary because this information can be inferred from the location (Chapleau, Trepanier, & Chu, 2008). Transit studies provide a hint that as cell phone enabled surveys increase, researchers will seek external sources of data. An initial location-based study by Lee-Gosselin and Harvey (2006) identified unique external data sources such as license plates and electronic tolls. Demographic information is likely to be inferred from geolocations. It should not be overlooked that an external source for geolocation data is often the U.S. Census. While it is deemed to have a high degree of accuracy, census data is a mix of self-report and interviewer data, it is centrally cleaned and standardized, and there are time lags before it is published. The final irony is that census information is itself based on surveys of individuals and households.

3.3. New Opportunities: Passive Surveys on Phones

Surveys conducted over the phone, on paper, etc. are fundamentally “noisy” because their accuracy and reliability depend first on recruiting a representative sample, and second, on the quality and veracity of individual reports. Over the next 10–15 years, what if transportation data were collected primarily by passive GPS? There are many original and new inferences to be made from the passive GPS trace. For example, the time of travel becomes readily available and might provide new insights for models of congestion pricing and environmental impacts. Another novel dimension might be the objective speed of travel, not perceived/reported travel time. In the following discussion, we consider the practice of employing an aggregate level GPS trace in lieu of conventional survey data to infer travel mode, trip counts and trip activities, route choice, and dynamic route change.

3.3.1.

Passive GPS and Inferences from Travel Mode: Walking/Bicycle/Vehicle

3.3.1.1. Walking trips Ground-truth counts consistently show that survey respondents underreport their walking trips. The reasons are multiple: survey respondents ‘‘forget,’’ they view the walking trip as too brief, or, in the context of a survey, it burdens the time to completion. However, a more profound source of bias is that walking trips are not well captured as an ‘‘activity.’’ Walking is both the mode (e.g., to get to the gym) and the activity (recreation). In the future, reliable data from a GPS/passive may provide more complete information than surveys. The passive GPS may facilitate an understanding of ‘‘where’’ and ‘‘why’’ walking takes place, alongside the role of the built environment (or a reinterpretation of existing studies). Using aggregate level data, the GPS enumerates walking distance with much more precision than a self-reported survey. And, when walking routes are viewed alongside maps of the built environment, GPS data may provide insight about factors that facilitate or impede pedestrian activity, like dark or windy corridors, the absence of sidewalks, or topography. The GPS/passive trace has an immediate application and extension from walking to transit research. Conventional surveys oversimplify transit taking because respondents do not focus on transit’s multimodal aspect. Survey respondents also overestimate the time spent waiting for transit. The passive GPS trace can provide more complete information on the characteristics of walking to/from a bus stop and door-to-door travel time (Li & Shalaby, 2008). A particularly interesting ‘‘in-site’’ experiment is measurement of the walking premium associated with rapid-bus service.

3.3.1.2. Bicycles/Cycling A GPS trace offers many of the same advantages for learning about bicycle travel as it does for walking. Bicycle trips, like walking trips, may be undercounted by traditional surveys because they are less likely to take place on a regular, recurring basis. An additional problem is that many bicycle riders are young and male, and as Murakami (2008) notes, they shun survey taking. The GPS trace may be quite attractive to this demographic group, because cyclists can share with each other and with planners their information on route choice. It will also provide more accurate data on the multiday patterns of bicycle commuting and mixed mode choice. Finally, the GPS trace will better count and delineate the differences in route choice and time of day between recreational bikers and commuters. This new volume and detail on cycle usage may help communities reexamine the accuracy and usefulness of their bike-master plans.

3.3.1.3. Parking Our understanding of vehicle use may be enhanced by the passive GPS trace. Today, few travel surveys ask about the dynamics of parking and many transportation models still do not explicitly include the time and cost to park a vehicle. But, in large urban areas, the availability and price of parking will influence whether a trip takes place at all and then, the choice of travel mode too (e.g., taxi or train). “Location-based” software that runs on the cell phone is likely to help count parking events, provided that new algorithms are written. Today, GPS helps drivers identify “open” parking spaces. An interesting experiment, already taking place, records the difference in search time and curb-behaviors for GPS users and nonusers (Rodier & Shaheen, 2010). Using a measure of duration, we might also learn whether parking itself generates “new” trips and cold-starts, as drivers move their vehicles to new locations when their “metered time” expires. Parking rates, recorded in the ITE manual, might be recalculated with new aggregated GPS results. These counts can provide a more accurate and reliable means of estimating trip generation by activity centers with different characteristics. For example, parking requirements associated with, say, an urban hospital with good public transit service might be compared to those of a hospital, in a similar urban setting, lacking transit service.

3.3.2.

Passive GPS and Inferences about Trip Counts and Activities

Counting the number and frequency of travel trips with a passive GPS trace is likely to disrupt current and known data keeping. First, GPS devices are known to enumerate more trips than self-reported surveys: the undercounting of trips by self-report occurs because respondents (a) avoid survey fatigue, (b) cannot recall their complete travel, or (c) do not cognitively grasp “trip chains.” Second, data keeping from self-reported surveys typically reports behaviors over a short duration, usually one day, on occasion a week, and seldom for a mix of both weekdays and weekends.

There is ample evidence that travel behavior is more dynamic, and less predictable over longer time periods. Yet another limitation of self-reported survey data is that activities are not as discrete as their measurement. For example, in a travel diary a respondent might code their ‘‘evening out’’ as recreation. In real time, using text and IM on their cell phone that respondent engages in multiple activities: they order take-out on the phone, window shop for a future purchase, network with friends to find a meeting point, and catch up on work related e-mails while traveling to meet. Researchers may find they can no longer validly code trips into discrete categories. However, the passive GPS may afford new opportunities: it may, at the aggregate level, capture the previously unmeasured influence of ‘‘connected’’ behaviors. Writing from the marketing field, Tancer (2009) describes how Internet behavior can be analyzed collectively: the volume and location of searches reveal larger patterns, which are never detected individually. Transportation planners know that there are network level impacts that are associated with weather, holidays, large sporting events, and the like. The FHWA, for example, funded a project to study the patterns and regularity of crowd movements during large planned special events (cited by Calabrese, Pereira, Lorenzo, Liu, & Ratti, 2010). The passive GPS trace may link travel patterns with previously unexplored social and temporal phenomena.
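How a passive trace “counts” trips depends on a rule for where one trip ends and the next begins, and that rule is itself a modeling choice. The sketch below shows one simple possibility, a dwell-time rule; the function name, the projected (metre-based) coordinates, and both thresholds are assumptions for illustration, not values drawn from the studies cited here.

```python
from math import hypot


def count_trips(fixes, min_speed=0.5, min_dwell=300):
    """Count trips in a trace using a dwell-time rule.

    fixes: time-sorted list of (t_seconds, x_m, y_m) in a projected,
           metre-based coordinate system (assumed format).
    min_speed: speed in m/s at or above which a segment counts as moving.
    min_dwell: seconds of continuous stillness that end a trip; shorter
               pauses (a traffic light, a bus stop) do not split a trip.
    """
    trips = 0
    in_trip = False
    still_time = 0.0
    for (t0, x0, y0), (t1, x1, y1) in zip(fixes, fixes[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicate or out-of-order fixes
        moving = hypot(x1 - x0, y1 - y0) / dt >= min_speed
        if moving:
            if not in_trip:
                trips += 1
                in_trip = True
            still_time = 0.0
        else:
            still_time += dt
            if in_trip and still_time >= min_dwell:
                in_trip = False
    return trips
```

Lowering `min_dwell` splits a chained journey into more trips, while raising it merges short stops into one trip, so counts from a trace are comparable with survey counts only when the trip definition is stated.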

3.3.3.

Passive GPS and Inferences about Route Choice and Dynamic Travel Information

A fundamental assumption of the activity diary/travel survey is that underlying “activities,” which are then completed at “destinations,” motivate trip taking. While fill-in surveys can facilely present a list of “in-home/out-of-home” activities, they are not a good medium for probing what motivated the activity or how the travel-trip was chosen. Doherty (2006), Lee-Gosselin, Doherty, and Papinski (2006), and others have used prompted surveys and custom designed web studies to fill in this data gap. Paradoxically, while passive GPS traces measure macro-level movement, they may have a dual role, helping to mine the confluence of individual choice. Increasingly cell phones are being used to provide travelers with real-time dynamic traffic information (Technology Review, 2011). It is becoming feasible to correlate route and mode changes with the distribution of traffic updates and alternative choices. As an example, consider a transit rider consulting a real-time, crowd-sourced bus-location app on their cell phone (e.g., “Tiramisu,” see Steinfeld, Zimmerman, Tomasic, Yoo, & Aziz, 2011). Depending on the predicted wait time, the potential rider may, on the spot, select a different route or a different mode (walking), or they may choose to complete their activities (say shopping and recreation) in a different temporal sequence. The original mode, destination, and activity have each changed with updated information. Potentially, the passive GPS will record travel that is less predictable than traditional surveys and will provide insights into how people respond to dynamic information, like traffic congestion, in situ.

Passive GPS will also provide useful information to anticipate natural experiments. How do travelers react when there are disruptions to the transportation network, preannounced road closings, detours, or bridge closings? Or, more mundanely, how might variable congestion pricing impact the network, in terms of both temporal and spatial shifts? Given baseline data, the GPS trace can then compare the changes in traffic volumes, route-taking, and temporal distributions. With a bank of experiential data, self-reported surveys may no longer need to ask travelers what they ‘‘might do.’’

3.4. Conclusions and Future Issues for Study

The tools we use to collect travel data are intricately linked to our data processing capabilities (Kitamura, Chen, Pendyala, & Narayanan, 2000). In the 1950s and 1960s computing power and data storage were limited; the four-step model was not able to incorporate predictive data elements, like time of day and multimodal trip taking. In the 1970s, data processing capabilities expanded and models then incorporated more behavioral and household level observations. Today, the speed of computer processing and new data mining tools make it feasible to envisage continuous mobility traces that process quantum amounts of aggregated and anonymous data points. The cell phone GPS trace will soon collect travel information silently and continuously and reveal actual patterns of movement, rather than those estimated through simulation or models. Fewer survey studies may mean less information about the demographic and psychological motivations for travel, which are important predictors in current modeling. The trade-off is that as we reduce the “noise” from survey measurement the “signal” will be modified, and record far more system-wide and baseline travel data. In many ways this is a stronger signal because it will be produced in real time, across different transportation modes. More than five years ago, Doherty (2006) wrote a paper with the provocative title “Should We Abandon Activity Type Analysis.” He suggested then that transportation researchers look again at the use of traditional activity classifications like work and shopping. His weeklong scheduling study found that behavior was far more dynamic than what was captured by the conventional survey measures. This work suggests that survey results were “noisier” than originally imagined because the studies classified behavior through the “lens” of researchers, not of respondents. 
As we move forward, we are less likely to want to actively recruit samples that report on their daily travel behavior if we fundamentally ask new questions. It is recognized that we will still encounter segments of the population who do not wish to carry the phone or who have earlier software. However, the market shows tremendous growth. In six countries more than 50% of the population uses smart phones: these are Australia, the United Kingdom (sic), Sweden, Norway, Saudi Arabia, and the United Arab Emirates. An additional seven countries (the United States, New Zealand, Denmark, Ireland, Netherlands, Spain, and Switzerland) now have more than 40% smart phone penetration (Think with Google, 2012).
The source notes that, ‘‘a global movement is happening as smart phone adoption moves mainstream. Mobile devices have become indispensable to people’s lives and are driving massive changes in consumer behavior.’’ For reaching emergency services, GPS is generally regarded as a useful feature. In the future, acceptance of its value may tip public sentiment to support development of a vast ‘‘data commons,’’ particularly for traffic information and congestion mitigation. Recruitment for additional, specialized studies, say for attitudinal data, short stated preference studies, or demographic information, might facilely take place, still using the telephone, if users are compensated with either free cell phone plans or air time. In conventional travel surveys there is considerable ‘‘noise’’ in the data, as validity depends on factors like the representativeness of the sample, and then on the integrity and veracity of self-reports. With a passive GPS trace, the ‘‘noise’’ from survey measurement and error are minimized. The trade-off is that as we reduce the ‘‘noise’’ from survey measurement the ‘‘signal’’ or data is modified. We consider how the collection of real-time tracking may change measures like trip frequency, modeshare, and activity coding. For some segments, like bicycle users, there is immediate value in visualizing daily travel, and making it into more of a participative venture. Hence, for some groups the passive GPS trace may evolve into a more interactive and personalized visual travel logger. It is also possible that ‘‘collective,’’ ‘‘crowd-sourced’’ travel behaviors might emerge, say for daily rail and bus riders. It is a useful speculation whether a GPS trace can be used, in the words of Kitamura, Fujii, and Pas (1997), to make people more aware of their travel patterns, and help them individually and collectively, assess the impact of transportation on their quality of life. 
At the collective level, real-time traffic information is already being used to redirect vehicle trips and reduce air pollution and fuel consumption. Perhaps the GPS data could be used, at a macro level, to analyze natural experiments; for example, when drivers encounter less traffic on a regular commute, do they make additional new trips on the same tour, or new trips later in the day or evening? It is necessary and inevitable, as this new technology is scoped, to address whether the GPS traces are confidential and anonymous. This was anticipated by Ratti et al. (2006), who viewed privacy issues as a concern when the data was provided to a third party, other than the mobile phone operator. When working with anonymous, aggregated data, individual movements cannot be tracked and individual privacy is a non-issue (Ratti et al., 2006, citing Fisher & Dobson, 2003; see also Wigan & Clarke, 2009).

In April 2011 (parenthetically, while this chapter was being written) cell phone traces became a popular news story. A hacker revealed that Apple iPhones, and to a lesser extent Android phones, were surreptitiously recording locational data. Once installed, the operating system upgrade logged location data, including time and date, and stored it in a hidden file. The software used triangulation from cell phone towers and Wi-Fi. It was not as accurate as it would be using GPS data (Homeland Security Newswire, 2011). Ironically, Quercia, DiLorenzo, Calabrese, and Ratti (2011) had developed a cell phone application to measure outdoor advertising exposure that scrambles the name/identity of the recipient and reports erroneous locations. Ingeniously, the randomized response algorithm still makes it possible to accurately collect and process aggregated location data.

In this discussion of cell phone privacy issues, we should recall that the demand for location-based services began with a European Universal Service Directive (Directive 2002/22/EC) which required fixed and mobile network operators to transmit the location of people calling “112” emergency lines, in the best possible way based on their national emergency standards and the technological possibilities of the networks (GIS news, cited by Ratti et al., 2006). A later implementation in the United States was mandated by the Federal Communications Commission and launched the emergency e-911 service.

As cell phones with smart applications grow, they will increasingly be used to help people navigate, select routes, and save fuel or travel time. They will also be used to purchase goods, and carry transit fares. In many ways, they will become the indispensable “computer that you wear.” Like e-911, the privacy issues raised by their breadcrumb trail are likely to be offset by new applications that are seen to “empower” consumers rather than “surveil” them (Lee-Gosselin, Doherty, & Shalaby, 2010).

So, as data becomes ubiquitous and continuous, the challenge will be not to conceal it but to exploit it usefully. Developing “good” data algorithms has some similarity to developing “good” survey questions because in both cases the researcher influences what is measured: for example, inferences about speed and stops are used to deduce the mode of travel, and inferences from GIS overlays and duration are used to code activities. Srinivasan, Bricka, and Bhat (2010) observe, “The use of passive GPS technology in travel surveys shifts considerable burden from the respondent to the analyst.” There is no standardization, so different algorithms will make different observations about the same “objective” travel event (Stopher, 2009b).
An analogy from self-reported surveys is that there are multiple ways to get at the same information and elicit different answers (e.g., “What is your age?”; “When is your birthday?”; “How old are you?”). The final observation, and a challenge for the future, is that although cell phone traces are an entirely new way of measuring mobility patterns, researchers will indubitably choose to interpret them in familiar ways. Resistance is due, in part, to professional norms of the research community and in part to the extensive investments made in existing simulations and models. But, in the literature review we noted how computer scientists and media specialists were beginning to probe mobility data for spatial/temporal patterns. Not unlike the fleetness of the horseless carriage, the passive GPS brings a previously unimagined volume and speed to travel data. The continuing push to investigate this data from other disciplines, and the opportunity for both new revenue and new data streams, will likely refocus transportation efforts.
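
The observation that non-standardized algorithms can read the same trace differently can be illustrated with a minimal sketch. It does not reproduce any published algorithm: the trace, the stop-speed cutoff, and the dwell-time thresholds are all hypothetical, chosen only to show how a single parameter changes the number of trips inferred from identical data.

```python
from typing import List, Tuple

# Hypothetical trace: one GPS fix per minute as (minute, speed in km/h).
TRACE: List[Tuple[int, float]] = (
    [(m, 30.0) for m in range(0, 10)]      # moving
    + [(m, 0.0) for m in range(10, 13)]    # 3-minute stop (congestion? drop-off?)
    + [(m, 30.0) for m in range(13, 25)]   # moving
    + [(m, 0.0) for m in range(25, 40)]    # 15-minute stop
    + [(m, 30.0) for m in range(40, 50)]   # moving
)


def count_trips(trace, min_dwell_minutes, stop_speed_kmh=1.0):
    """Count trips in a one-fix-per-minute trace.

    A trip ends once speed has stayed below stop_speed_kmh for at least
    min_dwell_minutes consecutive fixes (an assumed rule, for illustration).
    """
    trips = 0
    in_trip = False
    dwell = 0
    for _, speed in trace:
        if speed >= stop_speed_kmh:
            if not in_trip:
                trips += 1          # movement after a stop starts a new trip
                in_trip = True
            dwell = 0
        else:
            dwell += 1
            if in_trip and dwell >= min_dwell_minutes:
                in_trip = False     # the stop is long enough to count as a trip end
    return trips


print(count_trips(TRACE, min_dwell_minutes=2))  # 3
print(count_trips(TRACE, min_dwell_minutes=5))  # 2
```

With a 2-minute dwell rule the short stop is treated as a trip end, while a 5-minute rule absorbs it into a single longer trip, so the two analysts would report different trip frequencies for the same traveler.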

Acknowledgments

I would like to acknowledge the assistance and offer of mobile survey data from Geoff Palmer at Survey-on-the-Spot, and the assistance of the marketing department at the MBTA (Massachusetts Bay Transportation Authority), which allowed me to internally test mobile surveys on their Blackberry phones. I would also like to acknowledge the help of Ryan Chin, who helped me navigate the MIT Media Lab, in a different context.

References

American Association for Public Opinion Research (AAPOR). (2010). Research synthesis on online panels. Public Opinion Quarterly, 74(4), 711–781.
Asakura, Y., & Hato, E. (2004). Tracking survey for individual travel behaviour using mobile phones: Recent technological development. Transportation Research C, 12, 207–233.
Asakura, Y., Hato, E., & Sugino, K. (2005). Simulating travel behaviour using location positioning data collected with a mobile phone system. Springer Operations Research/Computer Science Interfaces Series, 31, 183–203.
Bayir, M., Demirbas, M., & Cosar, A. (2010). A web-based personalized mobility service for smartphone applications. The Computer Journal, 54(5), 800–814.
Beatty, G. (2010). Why aren’t we saving the planet? A psychologist’s perspective. London: Routledge.
Becker, R., Cáceres, R., Hanson, K., Loh, J. M., Urbanek, S., Varshavsky, A., & Volinsky, C. (2011, June). Clustering anonymized mobile call detail records to find usage groups. 1st Workshop on Pervasive Urban Applications (PURBA). Retrieved from http://www.kiskeya.com. Accessed on June 23, 2011.
Bierlaire, M., Chen, J., & Newman, J. (2010). Modeling route choice behavior from smartphone GPS data. Transport and Mobility Laboratory Report No. 101016, Ecole Polytechnique Federale de Lausanne, Switzerland.
Calabrese, F., Pereira, F., Lorenzo, G., Liu, L., & Ratti, C. (2010). The geography of taste: Analyzing cell-phone mobility and social events. Pervasive Computing, 6030, 22–37. doi:10.1007/978-3-642-12654-3_2
Chapleau, R., Trepanier, M., & Chu, K. K. A. (2008). The ultimate survey for transit planning: Complete information with smart card data and GIS. Presented at the 8th international conference on transport survey method, Annecy, France.
Chen, J., Newman, J., & Bierlaire, M. (2009). Modeling route choice behavior from smart-phone GPS data. Retrieved from http://transpor2.epfl.ch/proceedings/CHEN09_IATBR.pdf. Accessed on June 2, 2011.
Chorus, C., & Timmermans, H. (2010). Ubiquitous travel environments and travel control strategies: Prospects and challenges. In M. Wachowicz (Ed.), Movement aware applications for sustainable mobility: Technology and applications (pp. 30–51). Hershey, PA: IGI.
Doherty, S. (2006). Should we abandon activity type analysis? Redefining activities by their salient attributes. Transportation, 33(6), 517–536. doi:10.1007/s11116-006-0001-9
Eagle, N., & Pentland, A. (2005). Eigenbehaviors: Identifying structure in routine. Retrieved from http://vismod.media.mit.edu/tech-reports/TR-601.pdf
Fisher, P. F., & Dobson, J. E. (2003). Who knows where you are, and who should, in the era of mobile geography? Geography, 88(4), 331–337.
Gonzales, M., Hidalgo, C., & Barabasi, A. (2008). Understanding individual human mobility patterns. Nature, 453, 479–482.
Homeland Security Newswire. (2011, April 22). Cell phone privacy. Retrieved from http://www.homelandsecuritynewswire.com/cell-phone-privacy. Accessed on May 24, 2011.
Isaacman, S., Becker, R., Caceres, R., Kabourov, S., Martonosi, M., Rowland, J., & Varshavsky, A. (2011). Identifying important places in people’s lives from cellular network data. Conference Paper. Retrieved from http://www.cs.arizona.edu
Kennedy, C., & Everett, S. (2011). Use of cognitive shortcuts in landline and cell phone surveys. Public Opinion Quarterly, 75(2), 336–348. doi:10.1093/poq/nfr007
Kitamura, R., Chen, C., Pendyala, R., & Narayanan, R. (2000). Micro-simulation of daily activity-travel patterns for travel demand forecasting. Transportation, 27, 25–51. doi:10.1023/A:1005259324588
Kitamura, R., Fujii, S., & Pas, E. (1997). Time-use data, analysis and modeling: Toward the next generation of transportation planning methodologies. Transport Policy, 4(4), 225–235. doi:10.1016/S0967-070X(97)00018-8
Lee-Gosselin, M., Doherty, S., & Papinski, D. (2006). Internet-based prompted recall diary with automated GPS activity-trip detection: System design. Proceedings of the 85th annual meeting of the Transportation Research Board, Paper No. 06-1934, Washington, DC.
Lee-Gosselin, M., Doherty, S. T., & Shalaby, A. (2010). Data collection on personal movement using mobile ICTs: Old wine in new bottles? In M. Wachowicz (Ed.), Movement-aware applications for sustainable mobility: Technologies and approaches (pp. 1–14). Hershey, PA: Information Science Reference.
Lee-Gosselin, M., & Harvey, A. (2006). Non web technologies. In P. Stopher (Ed.), Travel survey methods: Quality and future directions (pp. 561–568). Bingley, UK: Emerald.
Li, Z., & Shalaby, A. (2008). Web based GIS system for prompted recall of GPS assisted personal travel surveys: System development and experimental study. 87th Transportation Research Board meeting, Washington, DC.
Madey, G., Barabasi, A., Chawla, N., Gonzalez, M., Hachen, D., Lantz, B., Pawling, A., … Yan, P. (2007). Enhanced situational awareness: Application of DDDAS concepts to emergency and disaster management, Computational Sciences – ICCS. Lecture Notes in Computer Science, 4487, 1090–1097.
Manasseh, C., Ahern, K., & Sengupta, R. (2009). The connected traveler: Using location and personalization on mobile devices to improve transportation. ACM Digital Library. Proceedings of the 2nd International Workshop on Location and the Web, Lisbon, Portugal.
Murakami, E. (2008, January). Hard to reach populations. Presentation to N.Y. Metropolitan Transportation Council and Region 2 UTRC meeting. In Resources on Household Travel Survey Methods, New York, NY.
Oulasvirta, A. (2005). Grounding the innovation of future technologies. Human Technology, 1(1), 58–75.
Quercia, D., DiLorenzo, G., Calabrese, F., & Ratti, C. (2011). Mobile phones and outdoor advertising: Measurable advertising. IEEE Pervasive Computing, 10(2), 28–36. doi:10.1109/MPRV.2011.15
RAND. (1968). Retrieved from http://www.rand.org/pubs/memorandaRM588.html. Accessed on September 1, 2011.
Ratti, C., Pulselli, R. M., Williams, S., & Frenchman, D. (2006). Mobile landscapes: Using location data from cell phones for urban analysis. Environment and Planning B: Planning and Design, 33(5), 727–748. doi:10.1068/b32047
Reades, J., Calabrese, F., & Ratti, C. (2009). Eigenplaces: Analyzing cities using the space-time structure of the mobile phone network. Environment and Planning B, 36, 824–836. doi:10.1068/b34133t
Reades, J., Calabrese, F., Sevtsuk, A., & Ratti, C. (2007). Cellular census: Explorations in urban data collection. IEEE Pervasive Computing, 6(3), 30–38.
Reddy, S., Mun, M., Burke, J., Estrin, D., Hansen, M., & Srivastava, M. (2010). Using mobile phones to determine transportation modes. ACM Transactions on Sensor Networks, 6(2), article 13.
Rodier, C., & Shaheen, S. (2010). Transit based smart parking: An evaluation of the San Francisco bay area field test. Transportation Research, Part C, 12, 225–233.
Srinivasan, S., Bricka, S., & Bhat, C. (2010). Methodology for converting GPS navigational streams to the travel-diary data format. Transportation Research Board annual meeting paper, Washington, DC.
Steinfeld, A., Zimmerman, J., Tomasic, A., Yoo, D., & Aziz, R. (2011). Mobile transit rider information via universal design and crowd-sourcing. Transportation Research Board annual meeting paper, Washington, DC.
Stopher, P. (2009a). The travel survey toolkit: Where to from here. In P. Bonnel, M. Lee-Gosselin, J. Zmud & J. Madres (Eds.), Transport survey methods (pp. 15–46). Bingley, UK: Emerald Group Publishing.
Stopher, P. (2009b). Collecting and processing data from mobile technologies. In P. Bonnel, M. Lee-Gosselin, J. Zmud & J. Madres (Eds.), Transport survey methods (pp. 361–391). Bingley, UK: Emerald Group Publishing.
Tancer, B. (2009). Click. London: Harper Collins.
Technology Review. (2011). Social surveillance yields smarter directions by Tom Simonite. Technology Review, February 2.
Think with Google. (2012). Retrieved from http://www.thinkwithgoogle.com/mobileplanet. Accessed on July 12, 2012.
Van der Hoeven, F. (2010). Setting the stage for the integration of demand responsive transport and location-based services. In M. Wachowicz (Ed.), Movement aware applications for sustainable mobility: Technology and applications. Hershey, PA: IGI Global Publishing. doi:10.4018/978-1-61520-769-5.ch004
Wigan, M. R., & Clarke, R. (2009). Transport and surveillance aspects of location-based services. Transportation Research Board, 2105, 92–99. doi:10.3141/2105-12

Chapter 4

A Case Study: Multiple Data Collection Methods and the NY/NJ/CT Regional Travel Survey

Jean Wolf, Jeremy Wilhelm, Jesse Casas and Sudeshna Sen

Abstract

Purpose: The Regional Household Travel Survey (RHTS) was a large-scale regional household travel survey that covered 28 counties in the New York, North New Jersey, and Connecticut regions (i.e., the New York City “megaregion”). Data collection for the survey began in October 2010 and concluded in November 2011. The chapter discusses the multiple modes and methodologies used in the RHTS, and presents the participation rates and trip rates obtained using this multimodal approach.

Methodology/approach: This survey used a combination of web, telephone, and mail-out/mail-back methods to collect household and travel information from approximately 18,800 households. Ten percent of the sampled households participated in the survey by using wearable global positioning system (GPS) devices that collected detailed travel data which, in turn, were processed and presented back to the households in a GPS-based prompted recall interview administered by web or telephone. The GPS component was used to generate trip rate correction factors for the other 90% diary-based households.

Findings: This large regional survey was the first to use this specific combination of methods and technologies, and provides many insights into the success of targeted survey modes and methods for different population groups.

Keywords: GPS; Household Travel Survey; large scale; multimode; New York; New Jersey; Connecticut

Transport Survey Methods: Best Practice for Decision Making
Copyright © 2013 by Emerald Group Publishing Limited
All rights of reproduction in any form reserved
ISBN: 978-1-78190-287-5

4.1. Introduction

The Regional Household Travel Survey (RHTS), sponsored by the New York Metropolitan Transportation Council (NYMTC) and the North Jersey Transportation Planning Authority (NJTPA), was a large-scale regional household travel survey conducted from the fall of 2010 through the fall of 2011. This survey used a combination of web, telephone, and mail-out/mail-back methods to recruit and retrieve household, person, vehicle, and travel information from approximately 18,800 households located in the 28-county region that spans areas in New York, New Jersey, and Connecticut. In addition, 10% of the sample participated by using wearable global positioning system (GPS) devices that collected detailed travel data which were, in turn, processed and presented back to the households in a GPS-based prompted recall interview administered by web or telephone. The primary purpose of the GPS component was to generate trip rate correction factors (to account for trip underreporting typically found in diary-based surveys) for the other 90% of the sampled households. This large regional survey is the first to use this specific combination of methods and technologies and provides many insights into the success of targeted methods for different population groups.

The pretest was conducted in the spring of 2010 to test and evaluate these methods, with the main survey starting in the fall of 2010 and completing in the fall of 2011. An address-based sampling frame was used, with a portion of the households matched to known/available landline telephone numbers (i.e., “matched” sample) and the remainder considered as “unmatched.” Predefined percentages of the matched and unmatched sample were flagged for GPS participation in order to recruit the targeted 10% subsample. Any household that stated that it did not want to receive GPS devices or participate in the GPS component was asked to remain in the diary portion of the study. Monetary incentives were offered, ranging from 25 USD to 75 USD per household, with the amount varied based on matched/unmatched status, GPS participation, and use of the web option for the recruit and retrieval interview.

This chapter reviews the multiple survey modes and methodologies used in the RHTS and presents preliminary results reflecting the effectiveness of this multimodal approach in collecting the desired representative sample of households in the region.
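
The idea of a trip rate correction factor can be sketched as a simple ratio. This is a minimal illustration only, assuming the factor for a group is the mean GPS-measured trip rate divided by the mean diary-reported trip rate; the function name and all trip counts are hypothetical and are not RHTS results or the survey team's actual procedure.

```python
from statistics import mean


def correction_factor(gps_trip_counts, diary_trip_counts):
    """Ratio of mean GPS-measured trips to mean diary-reported trips per person-day.

    Assumed definition for illustration; real studies may stratify by
    household or person characteristics before computing factors.
    """
    diary_rate = mean(diary_trip_counts)
    if diary_rate == 0:
        raise ValueError("diary trip rate is zero; factor undefined")
    return mean(gps_trip_counts) / diary_rate


# Hypothetical person-day trip counts for one comparable group
gps_counts = [5, 4, 6, 5]      # mean 5.0 trips
diary_counts = [4, 4, 5, 3]    # mean 4.0 trips

factor = correction_factor(gps_counts, diary_counts)
adjusted = [round(trips * factor, 2) for trips in diary_counts]

print(factor)    # 1.25
print(adjusted)  # [5.0, 5.0, 6.25, 3.75]
```

A factor above 1 indicates that the diary group reported fewer trips than comparable GPS participants recorded, which is the underreporting pattern the GPS subsample is meant to quantify.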

4.2. Background

The past decade of research in the household travel survey profession has seen a preponderance of questions about the cause and effect of declining response rates (Alsnih, 2006). Much of the literature has identified the promise of multiple method data collection as a means for overcoming these declining rates while simultaneously cautioning about the potential for fragmentation that may occur with the application of multiple survey modes (Zmud, 2003). What follows is a cursory, chronological review of the research and commentary on the application of mixed modes of data collection.

There are five broad categories of data collection modes:

• face-to-face or intercept surveys, also referred to as personal interviews, administered on paper (PAPI, Paper and Pencil Interview) or on a computer (CAPI, Computer-Assisted Personal Interview);
• telephone (most often administered using a computer and known as CATI, Computer-Assisted Telephone Interview);
• postal or mail (self-administered by PAPI);
• Internet (known as CASI, Computer-Assisted Self Interview, or CAWI, Computer-Assisted Web Interview); and
• GPS methods, which include the collection of second-by-second GPS traces that are processed into trip information, using only software algorithms or in tandem with a GPS-based prompted recall interview (which could in turn be administered by any of the first four survey modes listed).

There are advantages and disadvantages to each of these modes. These trade-offs have been covered in detail by numerous researchers, including Wolf (2000), Bonnel (2003), Morris and Adler (2003), and Bayart, Bonnel, and Morency (2009). A comparison matrix created first by Ettema, Timmermans, and Van Veghel (1996), adapted for use by Morris and Adler (2003), and presented at the 2001 ISCTSC conference is shown in Table 4.1; this table presents a view of the trade-offs as seen at the time. A simple summation of the ordinal rankings indicates that Morris and Adler (2003) might have ranked face-to-face first (22 points), telephone and Internet equally (17 points), and postal questionnaires last (13 points). One category that may be different today is the cost rankings, particularly as pertains to labor. The larger the share of a project that can be conducted by web, the lower the cost of the manual labor needed for dialing households.
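
The summation of ordinal rankings can be checked with a short sketch. The scores are the counts of "+" signs in Table 4.1 (4 is best); the variable and function names are illustrative, and adding ordinal ranks as if they were interval scores is the same simplification the text makes. The second call shows the effect of exchanging the telephone and Internet cost scores, which is an assumed scenario rather than a published re-ranking.

```python
# Ordinal scores from Table 4.1: number of "+" signs per mode (4 is best).
MODES = ["face-to-face", "telephone", "postal", "internet"]
SCORES = {
    "coverage":            [4, 2, 3, 1],
    "response rate":       [4, 2, 1, 3],
    "data quality":        [4, 3, 1, 2],
    "language/literacy":   [4, 3, 1, 2],
    "question complexity": [3, 1, 2, 3],
    "costs":               [1, 3, 4, 2],
    "quality control":     [2, 3, 1, 4],
}


def totals(scores):
    """Sum the ordinal scores for each mode across all attributes."""
    return {
        mode: sum(row[i] for row in scores.values())
        for i, mode in enumerate(MODES)
    }


print(totals(SCORES))
# {'face-to-face': 22, 'telephone': 17, 'postal': 13, 'internet': 17}

# Hypothetical scenario: telephone and Internet exchange their cost scores.
swapped = dict(SCORES, costs=[1, 2, 4, 3])
print(totals(swapped))
# {'face-to-face': 22, 'telephone': 16, 'postal': 13, 'internet': 18}
```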
As such, the ordinal cost rankings for the telephone and Internet modes could be switched, giving a slight edge to the Internet mode in overall ranking. It should also be noted that GPS methods were not considered as a viable option at the time this table was created. One key observation to be made when reviewing a comparison table such as this is that the inherent weaknesses of each mode can potentially be offset by offering a mix of modes in a given study. With this in mind, different modes could be offered together, separately, or in combinations, depending on factors such as the needs of the target population, eligible respondents, data needs, and budgetary considerations (Zmud, 2003). Furthermore, by offering a range of options for participation, overall coverage of the target populations should increase by supporting different reporting preferences.

Table 4.1: Relative merits of modes of primary data collection.

Attribute                 Face-to-face   Telephone    Postal          Internet
                          interview      interview    questionnaire   questionnaire
Coverage                  ++++           ++           +++             +
Response rate             ++++           ++           +               +++
Data quality              ++++           +++          +               ++
Language/Literacy         ++++           +++          +               ++
Complexity of questions   +++            +            ++              +++
Costs                     +              +++          ++++            ++
Quality control           ++             +++          +               ++++

Cultural issues, by survey mode:
• Face-to-face interview: Patriarchal societies may present difficulties gaining individual responses.
• Telephone interview: Some genders/cultures may be uncomfortable disclosing sensitive information (e.g., income).
• Postal questionnaire: Less suited to situations where multiple languages are spoken. Sharing of addresses, particularly in developing countries, poses problems.
• Internet questionnaire: Access to the Internet may be restricted in developing countries (e.g., parts of Africa).

Note: +, ++, +++, ++++ represent an ordinal scale, with ++++ indicating the best score.

Up through the 1970s, the face-to-face, in-home approach of data collection was the dominant mode available to researchers. In the 1970s, the telephone interview began to supplant the face-to-face approach within the United States, while the face-to-face interview remained the primary mode in many other countries. These two modes remained the primary methods for collecting travel survey data until recently, when the rise in availability and widespread adoption of the Internet allowed web-based surveying to become a part of the “toolkit” (Stopher, 2009b). In 2003, a paper at the ISCTSC conference compared three common modes: mail, telephone, and face-to-face surveys (Bonnel, 2003). Of note, Internet-based self-administered surveys did not merit mention in that discussion. In the same proceedings, Morris and Adler (2003) compared four modes (excluding GPS) and
noted that among the primary shortcomings of the Internet-based survey were its limited coverage and high costs (when compared to telephone or postal questionnaires). The authors speculated that these drawbacks would diminish over time as the Internet became more widespread.

In the current generation, it is tempting to view web-based surveys as having the potential to replace prior approaches. However, research is showing that the provision of multiple modes can be an effective tool to aid in the collection of data with adequate representativeness by providing a menu of options for people to respond in a way that fits their comfort level: some may elect to respond by web, but others may prefer to mail in diaries or speak to someone on the telephone (Bayart et al., 2009).

The development of GPS as a tool for data collection hit its first major milestone in 1996 with the FHWA-sponsored Lexington pilot study (Wagner, Murakami, & Neumeister, 1997) and in 1997 with the Austin household travel survey (Casas & Arce, 1999). Since then, ongoing improvements in the technology have led to options for the use of both in-vehicle and wearable GPS devices, allowing for the large-scale capture of more accurate and more detailed data about travel behavior in a passive, objective manner. Wolf identified a range of GPS-based survey options at the ISCTSC conference in 2004 (Wolf, 2006) and Stopher and Wolf (Wolf, 2009) each provided an update on the state of the art for GPS use in travel surveys at the 2008 conference. Most recently, the use of GPS technology to replace traditional methods has gained momentum, with GPS-only surveys leveraging GPS-based prompted recall interviews implemented successfully for household travel surveys conducted in Jerusalem (Oliveira et al., 2011), New York/New Jersey/Connecticut (Wilhelm, Wolf, & Oliveira, 2012), and in Cincinnati, Ohio (Giaimo, Anderson, Wargelin, & Stopher, 2011).
GPS methods offer reduced respondent burden with higher levels of data accuracy and more robust datasets; the exact technologies used to capture these data have also evolved, with lower cost GPS data loggers and smart phone options now available. Although some researchers have cautioned that GPS appears to have different influences on participation rates for different members of the population (Bricka, 2009), others have found that response rates to GPS-only surveys are not appreciably different from the rate seen in other modes (Stopher, 2009a). The survey presented in this chapter is one of the first to offer GPS as an option alongside other, more traditional methods within the same survey effort.

4.3. Regional Household Travel Survey (RHTS) Overview

The target sample size for the RHTS was 18,800 households located across a 28-county region, with 10% of these households using GPS devices rather than travel diaries to capture trip details. To obtain adequate representation of the region’s population, an address-based sampling frame was used that contained a current listing of all city and rural route residential postal addresses in the 28-county regional
area. A main advantage of using an address-based sampling frame is its reach into population groups that typically participate at lower-than-average levels, largely due to coverage bias (such as households with no phones or cell phone-only households). For efficiency of data collection, the addresses were matched to telephone numbers and had a listed name of the household appended to it. The frame was then stratified by county and sampling bins. A total of 21 sampling bins were defined based on transit accessibility and area type to adequately capture the varying travel behavior characteristics of residents in the 28-county study area. A systematic random sample of the addresses in the sampling frame was drawn within each of the strata — a combination of county and sampling bin. Sample generation was done on a bimonthly basis. All sampled households received advance letters informing them about the purpose of the study, the desire to have households such as theirs participating in the study, the incentive amount (as appropriate), and instructions for participating. Recipients of the advance letter were provided with a telephone number or website (and personal identification number, or PIN) which they could use to initiate the recruitment interview on line. Households that were considered to be matched sample were also informed that they would receive a call soon to start their survey participation. Survey interviewers then attempted to make contact with the matched households that had not already completed the online survey, and collected household, person, and vehicle-level details about the household before assigning a specific travel date for participants. The same data were collected for respondents who self-recruited online. Recruited households were then mailed a survey packet that included pertinent information about the survey in a cover letter and a customized diary for each member of the household. 
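
The sample draw described above, a systematic random sample within each county-by-sampling-bin stratum, can be sketched as follows. This is a minimal illustration under stated assumptions: the frame layout, the single sampling rate applied to every stratum, and the function names are hypothetical, and the actual RHTS sample design (for example, stratum-specific rates) is not reproduced here.

```python
import random
from collections import defaultdict


def systematic_sample(units, n, rng):
    """Draw n units from an ordered list using a random start and a fixed interval."""
    if n <= 0 or not units:
        return []
    n = min(n, len(units))
    step = len(units) / n              # sampling interval (>= 1)
    start = rng.random() * step        # random start within the first interval
    last = len(units) - 1
    return [units[min(int(start + i * step), last)] for i in range(n)]


def stratified_systematic_sample(frame, rate, seed=0):
    """Sample each (county, sampling_bin) stratum at the given rate.

    frame: iterable of (address_id, county, sampling_bin) tuples,
    a hypothetical layout for an address-based sampling frame.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for address_id, county, sampling_bin in frame:
        strata[(county, sampling_bin)].append(address_id)
    return {
        key: systematic_sample(addresses, round(len(addresses) * rate), rng)
        for key, addresses in strata.items()
    }


# Hypothetical frame: 2 counties x 2 sampling bins x 100 addresses each
frame = [
    (f"{county}-{sampling_bin}-{i}", county, sampling_bin)
    for county in ("A", "B")
    for sampling_bin in (1, 2)
    for i in range(100)
]
sample = stratified_systematic_sample(frame, rate=0.1, seed=1)
print({key: len(ids) for key, ids in sample.items()})
```

Because the interval is at least one address wide, each stratum yields distinct addresses spread evenly through its listing, which is the property that makes systematic sampling attractive for an ordered address frame.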
Respondents were asked to enter their travel data online the day after their assigned travel date; those who did not report their data online and who were from the matched sample were called. Unmatched sample was called if a phone number was collected during the recruit interview. Households were sent reminder texts and/or e-mails if they provided a mobile phone number (and indicated that they would like to receive text messages) or an e-mail address. GPS households were flagged in the sample prior to recruitment. Once recruited, wearable GPS devices were provided for all household members between the ages of 16 and 75, with diaries provided for other household members (i.e., those under the age of 16 or older than 75). In addition, simple memory joggers (as seen in Figure 4.1) were provided with the GPS devices to help GPS participants remember some of the key details of travel during the prompted recall interview. The pretest revealed that a memory jogger was necessary, especially during CATI retrieval interviews when the participants could not see the map with their GPS-based trips displayed. Details of the pretest were covered by Chaio et al. (2011). Once households completed the recruit interview (and GPS households were informed of the GPS component), diary packets were mailed (or GPS packages were shipped), with further instructions provided about how to participate. To handle participant self-administered travel reporting, a web-based application was developed and implemented that utilized Google Maps to display a map interface

A Case Study: Multiple Data Collection Methods

77

Figure 4.1: GPS memory jogger example (front and back).

complete with the road and transit networks, points of interest, and either shortest path routes for diary participants or actual travel routes for GPS participants. This application, called TripBuilder Web, was developed by GeoStats to support all methods of travel reporting for GPS or diary households — telephone interviews (CATI), participant self-completion (CASI), and mail-back data entry. By using one tool for the collection of trips reported by different methods, consistent rules and logic for data entry and validation could be assured, reducing potential biases inherent in each mode. In addition, follow-up phone calls were made to any mail-back households where the diaries returned have incomplete or unclear information. As mentioned previously, GPS-based prompted recall was supported for both CASI and CATI retrieval methods. Once GPS devices were returned by study participants, data were immediately downloaded, imported into the study GPS database, and algorithms were run on the data to automatically identify trip ends and travel modes. Data analysts at GeoStats would perform a quick review of the processed trip details and then the data were released for the prompted recall interview. Households that reported that they wanted to complete the survey by phone were called immediately, with those who indicated a preference for web retrieval being contacted and encouraged to go online to complete the survey.


Jean Wolf et al.

This survey also used automated text messages and/or e-mails to remind respondents of their upcoming travel date or to remind them to provide their data after the travel date had passed. These reminders were offered as an alternative to telephone calls to increase contact rates. They included a travel day reminder delivered the day before the assigned travel day, a retrieval reminder for web households (diary or GPS-based prompted recall), and a GPS equipment return reminder for those who did not return their equipment when instructed. These options worked well for participants who did not like telephone contact. When combined with the web recruit and retrieval interviews, they allowed some participants to avoid all telephone contact with the survey administrators.
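The reminder rules described above can be sketched as a small scheduling function. This is a minimal illustration only: the function names, the reminder labels, and the two lag parameters (one day for the retrieval reminder, three days for the equipment reminder) are assumptions, since the chapter states only that the travel day reminder went out the day before the travel date.

```python
from datetime import date, timedelta


def reminder_channels(mobile_opt_in, has_email):
    """Channels a household can be reached on, per the opt-in rules in the text."""
    channels = []
    if mobile_opt_in:      # mobile number provided AND text messages requested
        channels.append("text")
    if has_email:
        channels.append("email")
    return channels


def planned_reminders(travel_date, channels, web_household, gps_household,
                      retrieval_lag_days=1, return_lag_days=3):
    """Return a list of (send_date, reminder_kind) for one household.

    The lag values are hypothetical defaults, not figures from the survey.
    """
    if not channels:
        return []          # no text/e-mail contact possible; telephone only
    plan = [(travel_date - timedelta(days=1), "travel_day")]
    if web_household:
        plan.append((travel_date + timedelta(days=retrieval_lag_days), "retrieval"))
    if gps_household:
        # Sent only if the equipment has not been returned by this date.
        plan.append((travel_date + timedelta(days=return_lag_days),
                     "gps_return_if_outstanding"))
    return plan


plan = planned_reminders(date(2011, 5, 10), reminder_channels(True, False),
                         web_household=True, gps_household=True)
for send_date, kind in plan:
    print(send_date.isoformat(), kind)
```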

4.4. Results through September 2011

As of September 28, 2011,¹ a total of 28,934 households had been successfully recruited to participate in the Regional Household Travel Survey (RHTS). This total was calculated based on the assumption that 65% of the recruited households (or 18,800 households) would complete the survey and report travel details. Travel retrieval was approximately 83% complete (15,692 of 18,800) at the end of September, with data collection completed in November 2011. As seen in Table 4.2, 87.5% of households were recruited via CATI, and 12.5% were recruited online (CASI). GPS households showed a higher tendency to complete recruitment online rather than by telephone, with 45% of GPS households completing the recruitment survey online, compared to 9% of diary households. Table 4.3 summarizes the overall data collection mode among those who were recruited via CATI and CASI. Those who were recruited in a particular mode were more likely to follow through with retrieval using the same mode. The benefit of the two-stage survey approach is that respondents can opt to complete one part of the survey in one mode and switch to another, based on what is most convenient for them. A slightly higher percentage of CASI recruits completed retrieval via CASI (57%) than CATI recruits completed retrieval via CATI (51%). Switching modes after recruitment was slightly less likely among CASI recruits than among CATI recruits, with 34% of CATI recruits switching to CASI and only 31% of CASI recruits switching to CATI. When looking at survey modes by sample type (as seen in Table 4.4), as expected, about half of the unmatched sample completed the survey online. This may reflect a preference for not providing information over the phone, which is perhaps why these households disconnected their landline telephone in the first place. Those who were

1. Although the fieldwork is now complete and the survey results have been summarized, the sponsoring agencies have requested that the statistics reported here cover only the period through September 2011, rather than the complete study period, until the final report for the survey is published.


Table 4.2: Recruit mode by GPS versus diary status (universe = all recruited households).

Recruitment mode    GPS households        Diary households      Total
                    Count     Percent     Count     Percent     Count     Percent
CATI                1,788     55.5        23,534    91.5        25,322    87.5
CASI                1,434     44.5        2,178     8.5         3,612     12.5
Total               3,222     100.0       25,712    100.0       28,934    100.0

Table 4.3: Recruitment and retrieval modes.

Recruit mode    Retrieval mode    Count     Percent
CATI            CATI              7,534     50.8
CATI            CASI              5,107     34.4
CATI            Mail              2,204     14.8
CATI            Total             14,845    100.0
CASI            CATI              804       31.0
CASI            CASI              1,484     57.3
CASI            Mail              302       11.7
CASI            Total             2,590     100.0

recruited online were also more likely to follow through with retrieval online. About half of these households (51%) completed retrieval online, compared to 36% of all households. About half of the households from the matched sample completed the survey over the telephone (CATI), compared to just over one-third of the matched sample participating online. Households from the unmatched sample (70%) were more likely to follow through with the entire survey process (both recruitment and retrieval) than their matched counterparts (54%). GPS households were more likely to come from the unmatched sample than from either the matched or the targeted Hispanic surname sample, with 54% of GPS households retrieved from the unmatched sample (see Table 4.5). GPS households were also more likely to complete the survey online if they were in the unmatched sample frame. Overall, 59% of GPS households confirmed their GPS trips by telephone, while 41% confirmed travel using the web survey.
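The mode-retention and mode-switching percentages quoted from Table 4.3, and the retrieval progress figure, can be recomputed directly from the published counts. The short sketch below does this; the variable and function names are illustrative only.

```python
# Household counts copied from Table 4.3: recruit mode -> retrieval mode -> count.
TABLE_4_3 = {
    "CATI": {"CATI": 7534, "CASI": 5107, "Mail": 2204},
    "CASI": {"CATI": 804, "CASI": 1484, "Mail": 302},
}


def retrieval_shares(counts):
    """Percent of a recruit group retrieved by each mode, to one decimal place."""
    total = sum(counts.values())
    return {mode: round(100 * n / total, 1) for mode, n in counts.items()}


shares = {recruit: retrieval_shares(counts) for recruit, counts in TABLE_4_3.items()}

# Share of each recruit group that stayed in its recruitment mode for retrieval.
stayed = {recruit: shares[recruit][recruit] for recruit in shares}

# Retrieval progress against the 18,800-household figure quoted in the text.
progress = round(100 * 15692 / 18800)

print(shares)
print(stayed)
print(progress)
```

The recomputed shares match the rounded figures in the text: 51% and 57% stayed in their recruitment mode, and 34% and 31% switched between CATI and CASI.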


Table 4.4: Retrieved households by data collection mode by matched/unmatched sample (universe: all retrieved households; data to date).

Retrieval mode                  Matched sample       Unmatched sample     Targeted Hispanic surname    Total
                                Count     Percent    Count     Percent    Count     Percent            Count     Percent
CATI                            7,019     50.3       669       41.1       77        65.8               7,765     49.5
CASI                            4,789     34.3       832       51.1       31        26.5               5,652     36.0
Mail                            2,138     15.3       128       7.9        9         7.7                2,275     14.5
Total                           13,946    100.0      1,629     100.0      117       100.0              15,692    100.0
Productivity (retrieval rate)   53.7%                68.9%                19.3%                        54.2%

Table 4.5: GPS retrievals by sample and survey mode (universe: all GPS retrieved households; data to date).

Retrieval mode    Matched sample       Unmatched sample     Targeted Hispanic surname    Total
                  Count     Percent    Count     Percent    Count     Percent            Count     Percent
CATI              676       77.3       439       42.6       2         66.7               1,117     58.6
CASI              188       21.5       584       56.7       1         33.3               773       40.5
Mail              10        1.1        7         0.7        0         0.0                17        0.9
Total             874       100.0      1,030     100.0      3         100.0              1,907     100.0

Table 4.6 shows all combinations of recruit and retrieval modes, broken down by GPS and diary households. This table also includes households that were recruited but not retrieved. Overall, respondents recruited by CASI had a higher tendency to report their diary data (69%) than their CATI counterparts (56%). When comparing GPS and diary households, GPS households did a slightly better job of completing retrieval (61%) than diary households (57%). Table 4.7 presents summary statistics for all retrieved households. The average household size for all participating households is 2.2 persons. Households completing retrieval via the mail-back option tend to be larger (3.0 persons) than the

Table 4.6: Recruit and retrieval mode by GPS versus diary (universe = all recruited households).

Recruit mode/Retrieve mode               GPS                          Diary                        Total
                                         Count    Col. %   Row %      Count    Col. %   Row %      Count    Col. %   Row %
CATI/CATI                                781      43.7     10.7       6,508    27.7     89.3       7,289    28.8     100.0
CATI/CASI                                144      8.1      3.0        4,615    19.6     97.0       4,759    18.8     100.0
CATI/Mail                                13       0.7      0.6        2,082    8.8      99.4       2,095    8.3      100.0
CATI/Not retrieved                       850      47.5     7.6        10,329   43.9     92.4       11,179   44.1     100.0
Subtotal CATI recruits                   1,788    100.0    -          23,534   100.0    -          25,322   100.0    -
Subtotal CATI recruited and retrieved    938      52.5     -          13,205   56.1     -          14,143   55.9     -
CASI/CATI                                367      25.6     47.0       414      19.0     53.0       781      21.6     100.0
CASI/CASI                                648      45.2     45.1       790      36.3     54.9       1,438    39.8     100.0
CASI/Mail                                7        0.5      2.4        280      12.9     97.6       287      7.9      100.0
CASI/Not retrieved                       412      28.7     37.3       694      31.9     62.7       1,106    30.6     100.0
Subtotal CASI recruits                   1,434    100.0    -          2,178    100.0    -          3,612    100.0    -
Subtotal CASI recruited and retrieved    1,022    71.3     -          1,484    68.1     -          2,506    69.4     -
Total recruited                          3,222    100.0    11.1       25,712   100.0    88.9       28,934   100.0    100.0
Grand total recruited and retrieved      1,960    60.8     -          14,689   57.1     -          16,649   57.5     -


Table 4.7: Summary statistics.

                        CATI                CASI                Mail                Total
Avg. household size     2.2                 2.1                 3.0                 2.2
Avg. income             $50,000–74,999      $75,000–99,999      $75,000–99,999      $50,000–74,999
Avg. household trips    11.6                12.1                15.6                11.9
Avg. person trips       4.1                 4.5                 4.3                 4.3
Avg. age                45.2                38.5                44.1                42.5

overall population, and therefore tend to report more household trips (15.6). The average reported income range² for CATI households is lower than for households completing retrieval via CASI and mail-back but equal to the range for the overall average. Respondents completing retrieval over the web tend to be younger than the survey population overall (38.5 compared to 42.5, respectively). This demonstrates that the use of multiple modes for data collection benefits the overall representativeness of the dataset, because large households (which tend to complete via mail-back) and young households (which tend to complete online) are two of several significant population segments that are more difficult to survey using CATI methods only. As seen in Table 4.8, as household size increased, the percentage of households completing retrieval over the telephone decreased while, conversely, completion percentages for web retrieval increased. The same percentage (15.5%) of two-person households and of households with three or more members completed retrieval via the mail-back option. Table 4.9 shows the retrieval modes used by type of household phone. Cell-only households participated in the survey online at a higher rate than other types of households. About half (53%) of cell-only households completed retrieval using the web survey, compared to 20% of landline-only and 36% of dual-phone households. Cell-only households were also less likely to complete retrieval using the mail-back option (8%, compared to 15% for the other groups). When comparing the retrieval mode split, it appears that CATI was the preferred survey mode when considering all households. However, when comparing retrieval mode by ethnicity (as seen in Table 4.10), Asian respondents tended to complete the retrieval interview online (52%). This is compared to all other ethnicities, where a

2. Household income was only collected as a range in bins ranging in size from $15,000 to $50,000. Therefore, specific incomes are not known.


Table 4.8: Household size by retrieval mode.

Retrieval mode    1 person             2 persons            3 or more persons    Total
                  Count     Percent    Count     Percent    Count     Percent    Count     Percent
CATI              2,856     58.0       2,734     48.8       2,175     42.1       7,765     49.5
CASI              1,465     29.7       1,997     35.7       2,190     42.4       5,652     36.0
Mail              607       12.3       869       15.5       799       15.5       2,275     14.5
Total             4,928     100.0      5,600     100.0      5,164     100.0      15,692    100.0

Table 4.9: Type of phone for household by retrieval mode.

Retrieval mode    Cell only            Land only            Dual                 Total
                  Count     Percent    Count     Percent    Count     Percent    Count     Percent
CATI              456       39.2       965       65.5       6,299     48.6       7,720     49.5
CASI              615       52.9       287       19.5       4,706     36.3       5,608     36.0
Mail              92        7.9        222       15.1       1,952     15.1       2,266     14.5
Total             1,163     100.0      1,474     100.0      12,957    100.0      15,594    100.0

Note: Responses of "Don't Know" or "Refused" are excluded from the table.

plurality of respondents completed retrieval over the telephone. African American households completed the survey by CATI at a higher rate than any other ethnicity, with nearly two-thirds doing so. Table 4.11 shows that respondents between the ages of 25 and 54 made up the largest age group for all retrieval modes. Compared to other retrieval modes, CATI had a higher percentage of respondents aged 65+, while CASI had the highest percentage of respondents aged less than 25. Overall, the percentage of respondents under 25 is lower than the census figure, while the percentages for the 25–54 and 65+ age groups closely match the census distributions. The percentage in the 55–64 age group is higher than that found in the census. This is a common distribution and is addressed when weighting and expanding the data. Also noteworthy is that the CASI figures show a better rate for the under-25 group and match the census more closely than the other two modes for the 25–54 and 55–64 age bins. Not surprisingly, the 65+ responses came largely by CATI. When evaluating retrieval mode based on reported household income (as seen in Table 4.12), CATI percentages were highest for the lowest income bins and decreased consistently as income increased (starting at 73.4% for the lowest bin and decreasing to 37.9% for the highest income bin). Conversely, the CASI percentages were lowest

Table 4.10: Ethnicity of respondent by retrieval mode (universe = total retrievals, persons).

Retrieval mode    Caucasian          African American   Asian              Hispanic/Mexican   Other races        Total
                  N        Percent   N        Percent   N        Percent   N        Percent   N        Percent   N        Percent
CATI              11,079   43.4      2,040    64.4      720      33.1      1,959    58.0      474      56.2      16,272   46.4
CASI              10,459   41.0      619      19.5      1,137    52.3      988      29.2      279      33.1      13,482   38.4
Mail              3,999    15.7      509      16.1      317      14.6      432      12.8      90       10.7      5,347    15.2
Total             25,537   100.0     3,168    100.0     2,174    100.0     3,379    100.0     843      100.0     35,101   100.0

Note: Responses of "Don't Know" or "Refused" are excluded from the table.


Table 4.11: Age of household participant by retrieval mode.

Retrieval mode         Less than 25       25–54              55–64              65+                DK/RF              Total
                       Count    Row %     Count    Row %     Count    Row %     Count    Row %     Count    Row %     Count    Row %
CATI                   3,657    22.1      5,947    35.9      3,860    23.3      2,825    17.0      289      1.7       16,578   100.0
CASI                   3,625    26.5      6,432    46.9      2,554    18.6      825      6.0       264      1.9       13,700   100.0
Mail                   1,306    24.1      1,869    34.5      1,324    24.5      800      14.8      113      2.1       5,412    100.0
Total                  8,588    24.1      14,248   39.9      7,738    21.7      4,450    12.5      666      1.9       35,690   100.0
U.S. Census 2010 (a)   7,048    32.4      9,346    42.9      2,516    11.6      2,855    13.1      NA       NA        21,765   100.0

(a) U.S. Census Bureau: Profile of General Population and Housing Characteristics 2010 (U.S. Census Bureau, 2012).


Table 4.12: Household income by retrieval mode (rows and columns transposed for legibility; percentages are of households within each income bin).

Household income      CATI               CASI               Mail               Total
                      N        Percent   N        Percent   N        Percent   N        Percent
Less than $15,000     838      73.4      176      15.4      128      11.2      1,142    100.0
$15,000–29,999        1,086    63.8      359      21.1      258      15.1      1,703    100.0
$30,000–49,999        1,148    56.2      582      28.5      314      15.4      2,044    100.0
$50,000–74,999        1,191    47.9      953      38.3      343      13.8      2,487    100.0
$75,000–99,999        791      41.6      848      44.6      263      13.8      1,902    100.0
$100,000–149,999      1,098    38.9      1,308    46.3      419      14.8      2,825    100.0
$150,000–199,999      423      38.1      528      47.5      160      14.4      1,111    100.0
$200,000 or more      382      37.9      463      45.9      164      16.3      1,009    100.0
Total                 6,957    48.9      5,217    36.7      2,049    14.4      14,223   100.0


for the lowest income bin and generally increased with income (from 15.4% in the lowest bin to 45.9% in the highest, with a peak of 47.5% in the $150,000–199,999 bin).

4.4.1. Trip Rates

When looking at household-level trip rates (see Table 4.13), GPS households had higher trip rates than non-GPS households. Overall, the average household trip rate is 9.7 trips, but GPS households made, on average, almost 12 trips on their assigned travel date. Table 4.14 shows that the same pattern holds for person trip rates: respondents in GPS households averaged 5.4 trips, whereas diary respondents made 4.1 trips on their assigned travel date. This difference is significant and could be attributable to the GPS-based prompted recall method and/or to the possibility that GPS participants make more trips per day than non-GPS participants. Table 4.15 takes a closer look at the person-level and household-level trip rates broken down by GPS type and by retrieval mode. Mean person trip rates were higher for respondents completing via the web survey than for participants completing the survey using either CATI or mail-back retrieval modes. Similarly, households that went online to provide their trip information reported more trips than those who responded via CATI or mail-back, regardless of GPS or diary household type. Mail-back respondents (an option only available for diary households) reported almost as many trips as CASI respondents, likely because larger households participated via mail-back than via other data reporting modes.
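As a consistency check on the figures in Tables 4.13 and 4.14, the overall trip rates can be reproduced as sample-size-weighted means of the GPS and diary rates. The sketch below uses the published means and counts; the function name is illustrative. It does not test statistical significance, which would require variances that the chapter does not report.

```python
def pooled_mean(groups):
    """Weighted mean of group means; groups is a list of (mean, n) pairs."""
    total_n = sum(n for _, n in groups)
    return sum(mean * n for mean, n in groups) / total_n


# (mean trips, sample size) for GPS and diary groups.
household_rate = pooled_mean([(11.9, 1907), (9.4, 13785)])   # Table 4.13
person_rate = pooled_mean([(5.4, 4170), (4.1, 31520)])       # Table 4.14, all respondents

# Ratio of GPS to diary person trip rates.
gps_to_diary_ratio = 5.4 / 4.1

print(round(household_rate, 1), round(person_rate, 1), round(gps_to_diary_ratio, 2))
```

The pooled values round to the published totals of 9.7 household trips and 4.3 person trips, and the GPS person trip rate is roughly a third higher than the diary rate.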

Table 4.13: Average household trip rates by survey type.

Survey type    Mean    N
GPS            11.9    1,907
Diary          9.4     13,785
Total          9.7     15,692

Table 4.14: Person trip rates by GPS type.

Survey type    All respondents     Respondents age under 65
               Mean    N           Mean    N
GPS            5.4     4,170       5.5     3,831
Diary          4.1     31,520      4.2     26,743
Total          4.3     35,690      4.5     30,574


Table 4.15: Average person and household trip rates by GPS type and by retrieval mode.

Retrieval mode    Persons, GPS      Persons, Diary     Households, GPS    Households, Diary
                  Mean    N         Mean    N          Mean    N          Mean    N
CATI              5.3     2,461     3.9     14,117     11.6    1,117      8.2     6,648
CASI              5.6     1,658     4.3     12,042     12.1    773        10.6    4,879
Mail              0       0         4.3     5,361      0       0          10.1    2,258
Total             5.4     4,119     4.1     31,520     11.8    1,890      9.4     13,785

4.5. Findings and Next Steps

As mentioned previously, the purpose of providing multiple methods of data collection in the RHTS was to maximize the opportunity for participation. It was thought that by providing an opportunity to participate in the survey through the web, telephone, or mail, survey coverage across different demographic segments would be increased. The preliminary results, based on 83% of the targeted sample, appear to support this expectation. For example, offering a web-based survey increased participation from younger respondents, who tend to be in cell-only households and who are more Internet-savvy than the general population. Asian respondents also used the web reporting option more than the other two options offered. Older respondents, who were not as Internet-savvy, preferred to talk to someone on the phone. Mail-back options for diary households also proved to be an attractive reporting option for a significant number of households, particularly larger households. This is understandable: households recruited by CATI had already experienced the amount of phone time required to report basic household-level information and were likely to prefer mailing diaries back to engaging in another lengthy telephone interview. Although it was expected that different households would prefer different reporting modes, one unpredicted finding was that different members of a household preferred different reporting modes. Since the system was designed to support multiple reporting modes within a household, there was no problem supporting these intrahousehold differences. The reporting mode by participant was captured, with a household-level reporting mode assigned based on the last method used by a household member to complete the reporting for the household. This further supports the theory that multimodal surveys will improve both the coverage and participation rates within the survey.
With respect to comparing GPS households and non-GPS households (i.e., diary households), as expected, the GPS-only method combined with the GPS-based


prompted recall interview produced significantly higher trip rates at both the person and household level. Web-based methods for both GPS and diary households also produced higher trip rates, leading to new research questions with respect to how interviewer-mediated survey methods compare to web-based surveys in the collection of travel information. Data collection for this large-scale regional travel survey was completed in November 2011. Since then, all collected data were cleaned and delivered, first in draft datasets and then again as final datasets. Team members have been working on data weighting and expansion, leveraging the GPS results to better reflect expected trip rates, and are now finalizing the report and summarization of results. Work is also underway on the development and implementation of a web-based survey data visualizer and query system. Future reports and papers should reveal details of these elements as well as provide final survey results.

Acknowledgments

The authors gratefully acknowledge the clients/sponsors of the Regional Household Travel Survey (Jorge Argote of NYMTC and Bob Diogo of NJTPA) for their direction and feedback throughout the survey effort. We would also like to recognize our consulting team partner on this survey, Parsons Brinckerhoff, Inc. (PB).

References

Alsnih, R. (2006). Characteristics of web-based surveys and applications in travel research. In P. Stopher & C. Stecher (Eds.), Travel survey methods: Quality and future directions. Oxford: Elsevier.

Bayart, C., Bonnel, P., & Morency, C. (2009). Survey mode integration and data fusion: Methods and challenges. In P. Bonnel, M. Lee-Gosselin, J. Zmud & J.-L. Madre (Eds.), Transport survey methods: Keeping up with a changing world. Bradford: Emerald.

Bonnel, P. (2003). Postal, telephone, and face-to-face surveys: How comparable are they? In P. Stopher & P. Jones (Eds.), Transport survey quality and innovation. Oxford: Pergamon.

Bricka, S. (2009). What is different about non-response in GPS aided surveys? In P. Bonnel, M. Lee-Gosselin, J. Zmud & J.-L. Madre (Eds.), Transport survey methods: Keeping up with a changing world. Bradford: Emerald.

Casas, J., & Arce, C. (1999, January). Trip reporting in household travel diaries: A comparison to GPS collected data. Presented at the 78th annual meeting of the Transportation Research Board, Washington, DC.

Chaio, K. A., Argote, J., Zmud, J., Hilsenbeck, K., Zmud, M., & Wolf, J. (2011, January). Continuous improvement in regional household travel surveys: The NYMTC experience. Presented at the 90th annual meeting of the Transportation Research Board, Washington, DC.

Ettema, D. E., Timmermans, H. J. P., & Van Veghel, L. (1996). Effects of data collection methods in travel and activity research. In Research Report of European Institute of Retailing and Service Studies, Eindhoven University of Technology, Eindhoven, The Netherlands.

Giaimo, G., Anderson, R., Wargelin, L., & Stopher, P. (2011). Will it work? Transportation Research Record: Journal of the Transportation Research Board, 2176, 26–34. doi:10.3141/2176-03

Morris, J., & Adler, T. (2003). Mixed mode surveys. In P. Stopher & P. Jones (Eds.), Transport survey quality and innovation. Oxford: Pergamon.

Oliveira, M. G. S., Vovsha, P., Wolf, J., Birotker, Y., Givon, D., & Paasche, J. (2011). Global positioning system-assisted prompted recall household travel survey to support development of advanced travel model in Jerusalem, Israel. Transportation Research Record: Journal of the Transportation Research Board, 2246, 16–23. doi:10.3141/2246-03

Stopher, P. R. (2009a). Collecting and processing data from mobile technologies. In P. Bonnel, M. Lee-Gosselin, J. Zmud & J.-L. Madre (Eds.), Transport survey methods: Keeping up with a changing world (pp. 361–392). Bradford: Emerald. [Chapter 21].

Stopher, P. R. (2009b). The travel survey toolkit: Where to from here? In P. Bonnel, M. Lee-Gosselin, J. Zmud & J.-L. Madre (Eds.), Transport survey methods: Keeping up with a changing world (pp. 15–46). Bradford: Emerald. [Chapter 2].

U.S. Census Bureau. (2012). American FactFinder - Search. Retrieved from http://factfinder2.census.gov/faces/nav/jsf/pages/searchresults.xhtml?refresh=t. Accessed on August 13, 2012.

Wagner, D., Murakami, E., & Neumeister, D. (1997). Global positioning system for personal travel surveys. Washington, DC: FHWA, U.S. Department of Transportation.

Wilhelm, J., Wolf, J., & Oliveira, M. G. S. (2012, January). Applications of GPS-based prompted recall methods in two household travel surveys. Presented at the 91st annual meeting of the Transportation Research Board, Washington, DC.

Wolf, J. (2000). Using GPS data loggers to replace travel diaries in the collection of travel data. Dissertation, Georgia Institute of Technology, School of Civil and Environmental Engineering, Atlanta, GA.

Wolf, J. (2006). Applications of new technologies in travel surveys. In P. Stopher & C. Stecher (Eds.), Travel survey methods: Quality and future directions (pp. 531–544). London: Elsevier Science [Chapter 29].

Wolf, J. (2009). Chapter 22: Mobile technologies: Synthesis of a workshop. In P. Bonnel, M. Lee-Gosselin, J. Zmud, & J.-L. Madre (Eds.), Transport survey methods: Keeping up with a changing world (pp. 393–401). Bingley, UK: Emerald.

Zmud, J. (2003). Designing instruments to improve response. In P. Stopher & P. Jones (Eds.), Transport survey quality and innovation. Oxford: Pergamon.

Chapter 5

Conducting a GPS-only household travel survey

Peter R. Stopher, Christine Prasad, Laurie Wargelin and Jason Minser

Abstract

Purpose: This paper describes what the authors believe to be the first GPS-only full-scale household travel survey.

Design/methodology: The survey commenced in early 2009 with a pilot survey to help establish various parameters and procedures for the main survey. The main survey commenced in August 2009 and was completed in August 2010. It was designed as a household travel survey to be collected steadily over a 12-month period. The target sample size was originally set at over 3500 households, although this target was reduced during the course of the survey. Each household member over the age of 12 was asked to carry a GPS device with them everywhere they went for a period of 3 days. After the 3-day collection period was completed, GPS devices were retrieved from households, the data were downloaded and processing of the data commenced. The study also involved a prompted recall (PR) survey performed on the Internet.

Findings: The paper concludes with lessons learnt from this GPS-only survey and suggestions for how future GPS-only surveys might be conducted.

Originality/value of the paper: The paper describes the first GPS-only household travel survey and concludes that it is now feasible to conduct household travel surveys by GPS.

Keywords: Household travel survey; Global Positioning System survey; prompted-recall survey; imputation of trips; mode; purpose

Transport Survey Methods: Best Practice for Decision Making
Copyright © 2013 by Emerald Group Publishing Limited
All rights of reproduction in any form reserved
ISBN: 978-1-78190-287-5


Peter R. Stopher et al.

5.1. Introduction and Overview

For the past decade, GPS devices have been used increasingly as a means to validate household travel surveys (Casas & Arce, 1999; Draijer, Kalfs, & Perdok, 2000; Hawkins & Stopher, 2004; Kracht, 2006; Murakami & Wagner, 1999; Wolf, 2006; Wolf, Guensler, & Bachmann, 2001; Yalamanchili, Pendyala, Prabaharan, & Chakravarthy, 1999; inter alia). These devices provide measured versus reported travel activities, thereby eliminating a variety of problems including under-reporting and misreporting of travel. Self-reporting of travel information, whether retrieved by mail, Computer-Assisted Telephone Interviewing (CATI), or web-based formats, has been demonstrated to result in trip under-reporting. For instance, recent work by Wolf, Loechl, Thompson and Arce (2003), Forrest and Pearson (2005), Wolf (2006) and others has shown that diary information retrieved through CATI suggests trip under-reporting ranges from 20 to 30 percent, while Stopher and Greaves (2009) suggest it may be as high as 60 percent. In addition to the failure to report the number of trips correctly, it is known that respondents vary markedly in their ability to provide accurate information on other components of their travel. For instance, the tendency of respondents to round travel times to the nearest 5, 10 or 15 minutes is well known. Non-motorised travel (particularly walking) is also thought to be poorly recalled compared to motorised travel, although the extent of this discrepancy has not yet been established scientifically. Location information tends to be even more problematic, with people rarely able to provide address information for even commonly visited destinations such as work, school and the local grocery store to the degree of specificity required for geocoding and planning purposes (Stopher, 2004).
The situation is even more problematic when trying to determine the route taken, with few people able to detail the route taken in terms of a sequence of street names. An additional perceived problem with diary-based approaches, and for that matter any type of survey of this nature, is respondent burden. This burden obviously increases as the level of detail required and the numbers of days of observation increase. While most Household Travel Surveys (HTSs) are 1 or 2 day surveys, evidence suggests that extending the survey period for a week or even longer results in greater statistical efficiency and may significantly lower sample size and cost requirements (Richardson & Meyburg, 2003). However, in reality the drop-off in reporting after even 1 day has tended to undermine the utility of this in practice. These issues aside, arguably the most pressing problem faced by all surveys is non-response. Starting with recruited households, while there is marked variability dependent on the exact strategies employed, as a rule response rates around 20–30 percent from a mail-back survey, 40–60 percent from a telephone survey and 60–75 percent for a face-to-face interview can be anticipated. However, non-response rates are not evenly distributed across the population, with certain groups (teenagers, larger and low-income households and those who travel more) under-represented in surveys (Stopher & Greaves, 2007). This leads to the potential for significant bias, which can only partially be accounted for in post-survey factoring of survey results.
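Two of the self-reporting problems described in this introduction, trip under-reporting and rounding of travel times, can be expressed as simple diagnostics. The sketch below is illustrative only: the definitions are one reasonable choice rather than the ones used in the cited studies, the function names are hypothetical, and the sample inputs are toy values, not survey data.

```python
def under_reporting_rate(measured_trips, reported_trips):
    """Share of GPS-measured trips that are missing from the self-report.

    Assumes the GPS count is treated as the reference; other studies may
    define the rate differently.
    """
    if measured_trips <= 0:
        raise ValueError("measured_trips must be positive")
    return (measured_trips - reported_trips) / measured_trips


def heaping_share(durations_min, base=5):
    """Share of reported durations (whole minutes) that are multiples of `base`.

    A share far above 1/base suggests respondents are rounding travel times.
    """
    if not durations_min:
        raise ValueError("durations_min must not be empty")
    rounded = sum(1 for d in durations_min if d % base == 0)
    return rounded / len(durations_min)


# Toy inputs for illustration only.
rate = under_reporting_rate(measured_trips=10, reported_trips=8)
share = heaping_share([5, 10, 12, 15, 20, 7, 30, 45])
print(rate, share)
```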


Recent evidence pointing to the inaccuracies of diary-based and other self-reporting approaches, concerns about respondent burden, and rising non-response necessitate a fundamental change in the way we conduct HTSs. With the many recent developments in improving the capabilities and the user friendliness of small portable GPS devices, the time appeared ripe to test the potential for GPS to replace travel diaries. A multi-day GPS survey offers strong potential to tackle most, if not all, of the problems with self-reporting approaches. It also adds the capability to obtain multiple days of data from each person, something that is infeasible with self-report methods, especially diaries. This has two distinct advantages over a 1-day survey. First, it provides information on the day-to-day variability in travel, which is of increasing importance as we attempt to reduce both greenhouse gas emissions and the energy demands of day-to-day travel. Second, it enables acceptable levels of statistical accuracy in trip-making to be achieved with a substantially smaller sample size compared to a 1-day sample. GPS-based household travel surveys can be further enhanced by recruiting from address-based samples rather than traditional random-digit-dial (RDD) sampling frames. When all sources of undercoverage in RDD frames (i.e. households with no telephones, those in zero blocks, and cell-phone-only households) are considered, the percentage of U.S. households not covered by RDD frames may be as high as 30–45 percent. Address-based sampling has the potential to improve the representativeness of HTS samples by including households that cannot be captured by land-based phones. This approach also improves the ability to define specific geographic strata. This paper describes a GPS-only full-scale household travel survey that commenced in early 2009 with the conduct of a pilot survey to help establish various parameters and procedures for the main survey.
The pilot survey has been documented elsewhere (Giaimo et al., 2009). The main survey commenced in August 2009 and was completed in August 2010. The target sample size was originally set at over 3500 households to be collected steadily over a 12-month period, although this target was reduced during the course of the survey. Sampling used an address-based sampling procedure, with households contacted initially by a combination of mail and telephone. While the overall response rate from households was similar to that of more conventional methods of surveying in the United States, some of the biases encountered in conventional surveys were not encountered in this survey and a representative sample of completed households was obtained. Each household member over the age of 12 was asked to carry a GPS device (a GPS Personal Passive Activity Logger, or PPAL) with them everywhere they went for a period of 3 days. The household received travel packets two days prior to scheduled travel dates containing a one-page PPAL instruction sheet, household and person information forms, a PPAL for each person aged 13+, one charger for every two devices and postage-paid return mailing materials. Household members under the age of 13 received a simplified 'child diary', which was to be completed on the first day of the travel period. PPALs were set to collect data on a second-by-second basis (since this has been found to provide a superior basis for imputing travel characteristics). After the 3-day collection period was completed, PPALs were retrieved from households, the data were downloaded and processing of the data commenced.
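The earlier claim that multi-day data permit a smaller sample than a 1-day survey can be illustrated with a standard variance decomposition. The sketch below assumes a simplified model that is not taken from the chapter: each person's daily trip count varies around a personal mean, days are independent within a person, and the variance components and target precision are hypothetical values chosen for the example.

```python
def variance_of_mean(sigma_b2, sigma_w2, persons, days):
    """Variance of the estimated mean daily trip rate.

    sigma_b2: between-person variance of mean daily trips (assumed known)
    sigma_w2: within-person, day-to-day variance (assumed known)
    """
    return sigma_b2 / persons + sigma_w2 / (persons * days)


def persons_needed(sigma_b2, sigma_w2, days, target_variance):
    """Sample size (persons) giving the target variance with `days` per person."""
    return (sigma_b2 + sigma_w2 / days) / target_variance


# Hypothetical variance components and precision target, for illustration only.
n_one_day = persons_needed(sigma_b2=2.0, sigma_w2=4.0, days=1, target_variance=0.01)
n_three_day = persons_needed(sigma_b2=2.0, sigma_w2=4.0, days=3, target_variance=0.01)
print(round(n_one_day), round(n_three_day))
```

Under these assumptions the saving grows with the share of total variance that is day-to-day; if most variation is between persons, additional days help much less.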


Peter R. Stopher et al.

Household and person information forms sent to households asked respondents to indicate whether they had left home on each travel day, and whether they carried or forgot to take the PPAL with them on one or more days while they travelled. In addition, respondents were asked to provide workplace, school, and shopping addresses for household members. Together with the home address of the household, these addresses were geocoded and used in the GPS processing. An Internet Prompted Recall (PR) survey was conducted with respondents, based on Google Maps and providing a playback of the GPS records for 1 day of their travel. Respondents were asked to fill in certain information about their travel. This information served two primary purposes: to validate the results of the processing software that imputed trip ends, mode, occupancy, and purpose, and to provide a means for improving the software by identifying those situations where the software did not perform as well as expected. The achievements and challenges of the processing software are described elsewhere (Stopher, Zhang, & Prasad, 2011). In this paper, the results of the PR survey are documented in terms of response rates and usability of the results from the survey. The paper concludes with lessons learnt from this GPS-only survey and suggestions for how future GPS-only surveys might be conducted. The paper also concludes that it is now feasible to conduct household travel surveys by GPS, although there are some issues still to be explored.

5.2. Survey Methodology

For the pilot and main survey, an address-based sampling approach was used with addresses within the Cincinnati region randomly selected from the most current U.S. Postal Delivery Sequencing File (DSF), sorted by census block groups. Two geographic groups were oversampled because they are important to travel modelling and are often underrepresented: (1) residential census blocks identified as having access to public transport and (2) block groups in residential areas near major universities within the region, and therefore having a higher propensity for student residential housing. Once the sample was designed and selected, address matching with land-based phone numbers was conducted, resulting in a 55 percent match between addresses and phone numbers over the data collection period from August 2009 to September 2010. All households were sent an advance letter describing the project and its importance, which was signed by the director of the Ohio-Kentucky-Indiana Metropolitan Planning Organization. The 45 percent of households without a match to a land-based phone number were also sent an Internet address for online recruitment and a phone number to call for CATI recruitment. If the matched households did not complete the online recruitment, then they were telephoned. The recruitment interview consisted of collecting household and person demographic data and assigning travel days for GPS recording (three consecutive days) for all members of the household over 12 years old. Once recruited, PPALs and instructions, household and person forms, and simplified child diaries (for those under 13 years old) were shipped by courier, scheduled

Conducting a GPS-Only Household Travel Survey


Figure 5.1: The GPS-PPAL unit.

to arrive 1–3 days before the assigned travel days. The forms were designed to collect the address information and the GPS usage status for each member for each day. Each person assigned a PPAL was asked whether s/he carried the device all day, whether s/he forgot the PPAL for part of the day, whether the battery died during the day, whether the PPAL was forgotten for the entire day or whether s/he did not travel. A reminder phone call or email was placed the day before the first assigned travel day. The forms and child diaries could be returned with the PPALs, or the information could be entered online with a provided password. The PPAL, shown in Figure 5.1, is a personal unit that can be carried in a pocket or purse, or clipped on a belt or wristband. It records all modes of travel including car, public transport, bicycle and walk, and can record inside many buildings. For the most part, the units recorded 3 days of travel. Respondents were provided with battery chargers and instructions for use, and were encouraged to charge the units each night. The PPAL is described elsewhere (Stopher, FitzGerald, & Zhang, 2008). Once the units were returned, data were downloaded. Each data entry (GPS file, forms data, and household recruitment information) had an associated household and person ID. These data were compiled into a metadata file that was referenced to the GPS data.
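The linking of GPS files and form data through household and person IDs can be illustrated with a minimal sketch. The record layout, field names and status labels below are hypothetical; the chapter does not describe the actual structure of the metadata file.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional


class DayStatus(Enum):
    """Daily PPAL usage statuses asked about on the person form."""
    CARRIED_ALL_DAY = "carried all day"
    FORGOT_PART_OF_DAY = "forgot part of day"
    BATTERY_DIED = "battery died"
    FORGOT_ALL_DAY = "forgot all day"
    DID_NOT_TRAVEL = "did not travel"


@dataclass
class PersonRecord:
    household_id: str                      # hypothetical ID format
    person_id: str
    gps_file: Optional[str] = None         # None if no GPS file was returned
    day_status: Dict[int, DayStatus] = field(default_factory=dict)


def build_metadata(gps_files, form_rows):
    """Merge GPS file names and form rows on (household_id, person_id)."""
    records = {}
    for hh, person, path in gps_files:
        records.setdefault((hh, person), PersonRecord(hh, person)).gps_file = path
    for hh, person, day, status in form_rows:
        rec = records.setdefault((hh, person), PersonRecord(hh, person))
        rec.day_status[day] = status
    return records
```

Keying every record on the pair of IDs means that a person who returned a form but no GPS file (or the reverse) still appears in the metadata and can be flagged during processing.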

5.3. The PR Survey

For verification purposes and to provide data to improve the software for mode and purpose identification, a web-based PR survey was implemented. The PR survey displayed the respondent’s travel on a map in a common web interface (Google Maps), and posed a series of questions regarding that travel, such as mode and trip purpose. The PR responses were then used to improve software to impute trip mode, purpose, and other missing data for the completed surveys (Stopher et al., 2011).

5.3.1.

Design Principles for the PR

The PPALs record all location data. The only errors that can occur in the location data are as follows:

1. Cold start problem: The device does not find position until after a trip has actually started. The data processing software corrects this for all except the first trip on the first day, which is corrected by the manual map editing that precedes the PR survey.
2. Lost signal: This is only a problem if it occurs near the end of a trip and results in a premature destination recording. This is normally corrected during the manual map editing process prior to the PR survey.

Other errors in the GPS record arise if the person did not carry the device for the entire day, or if the battery ran out. In these cases, if it was indicated that s/he forgot the device for part of the day or that the battery ran out, then that day was excluded from sampling for the PR survey. Apart from these issues, the start and end times of travel on the GPS record must be correct, and, provided that the travel is also along the street or rail networks, a trip must have taken place. Thus, the only things that may need to be corrected with the GPS record by PR respondents are the following:

1. Processing missed identifying a brief stop (less than 120 seconds). Respondents can insert one or more stops, splitting one trip into two or more trips.
2. Processing has identified as a stop what was actually just a traffic stop or other delay of more than 120 seconds. Respondents can delete one or more stops, linking together two or more travel episodes.
3. Trips may be missing because the respondent forgot to take the device along, or because the trip was so short that position could not be acquired before the trip ended. Missing trips are particularly likely to occur at the beginning or end of a person’s day, when the GPS device may be forgotten.
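The role of the 120-second threshold can be illustrated with a minimal sketch of dwell-based trip splitting. This is not the processing software used in the survey: the speed cut-off, the input layout and the function name are assumptions made only for illustration.

```python
DWELL_THRESHOLD_S = 120   # from the text: shorter stops are not detected
STATIONARY_SPEED = 0.5    # m/s, hypothetical cut-off for "not moving"


def split_into_trips(points, dwell=DWELL_THRESHOLD_S, still=STATIONARY_SPEED):
    """Split second-by-second (time_s, speed_m_s) fixes into trips.

    A new trip starts whenever the gap between two consecutive moving
    fixes is at least `dwell` seconds. Returns (start, end) time pairs.
    """
    trips = []
    trip_start = None     # time of the first moving fix of the current trip
    last_moving = None    # time of the most recent moving fix
    for t, speed in points:
        if speed <= still:
            continue
        if trip_start is None:
            trip_start = t
        elif t - last_moving >= dwell:
            trips.append((trip_start, last_moving))
            trip_start = t
        last_moving = t
    if trip_start is not None:
        trips.append((trip_start, last_moving))
    return trips
```

Under such a rule a 60-second halt stays inside one trip (error type 1 in the second list above), while a 220-second traffic delay produces a spurious trip end (error type 2). These are the two cases that PR respondents are asked to correct.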
Based on these assumptions, confirmation of the start and end times of each trip was not requested in the main PR Survey, although this was included in the pilot. To allow for the deletion of a stop, two consecutive trips were displayed to the PR respondent at a time. Because the respondent can see and edit both trips, s/he has the option of deleting the middle stop of the travel, thereby joining together the first and second travel events into a single trip. To allow adding a stop, a question is included to ask whether the person travelled from the origin to the destination without stopping. A negative response causes an


Figure 5.2: Example of PR web survey format.

edit box to appear allowing respondents to insert the time they stopped and the time they started to travel again. If respondents recall reasonably accurately the time they stopped for each added stop, then the software can identify the additional stop location with reasonable accuracy. Following completion of all other travel details (companions, mode, and the activity at the stops), the respondent clicks on continue, which then displays the next trip pair. This continues in pairwise fashion to the end of the day. A closing question asks whether there is any travel or any other stop that the respondent remembers making on that day that was not recorded on the survey. If so, s/he is asked to record stops and approximate times. The PR survey is then complete. Figure 5.2 provides an example page from the PR survey.
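The two edits available to PR respondents, deleting the stop between a displayed pair of trips and inserting a stop within a trip, amount to a merge and a split of trip records. The sketch below assumes a hypothetical `Trip` record holding only start and end times in seconds since midnight; the actual survey software is not described at this level of detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trip:
    start: int   # seconds since midnight
    end: int


def delete_stop(first, second):
    """Join two consecutive trips when the stop between them is deleted."""
    if second.start < first.end:
        raise ValueError("trips must be consecutive")
    return Trip(first.start, second.end)


def insert_stop(trip, stop_start, stop_end):
    """Split one trip into two at a respondent-reported stop."""
    if not (trip.start < stop_start <= stop_end < trip.end):
        raise ValueError("stop must fall inside the trip")
    return Trip(trip.start, stop_start), Trip(stop_end, trip.end)
```

For example, `insert_stop(Trip(28800, 30600), 29400, 29700)` yields one trip from 08:00 to 08:10 and another from 08:15 to 08:30, and `delete_stop` applied to that pair restores the original trip.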

5.4. Analysis of Results

5.4.1.

GPS Data Collection and Processing

Each person provided with a PPAL was asked to carry it with them everywhere they went for 3 days, beginning the day after the PPAL was received. Following this period, the PPALs were to be returned by the household. In some instances, households kept their PPALs for much longer than intended and also sometimes continued to use the PPAL while they travelled, so that some people provided more than 3 days of travel data.


Once the GPS files and metadata were prepared, the processing commenced. This involved initial software runs to convert the entire GPS trace for each respondent into discrete trips by day. The GPS data were then processed and used to generate PR survey data. For each household, a selection was then made of one of the days of recording for the PR survey. Once the data from a household were completed, URLs were generated for each household member. While PR survey data were generated for all households, only those households that provided email addresses received URLs for the PR Survey. The PR data were then compiled and used for comparison with the results of further processing of the GPS data. The data were used to help identify shortcomings and errors in the software. However, it also became apparent that respondents often provided incorrect information in the PR survey, similar to errors often noted in diaries. In both the pilot and the main survey, it was found that a significant proportion of respondent errors were from respondents defining a trip as a round trip, and attempting to combine trips that were separated by an activity at a place.
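The selection of one recording day per household for the PR survey can be sketched as follows. The chapter does not say how the day was chosen, so the random draw here is an assumption, as are the status labels. The eligibility rule combines the exclusion of days with partial device use (Section 5.3.1) with the requirement of a common day for all PPAL carriers (Section 5.4.2).

```python
import random

# Hypothetical status labels; a day is usable for a person if the device
# was carried all day or the person claimed not to have travelled.
USABLE = {"carried_all_day", "did_not_travel"}


def eligible_days(household):
    """Days usable for every PPAL carrier, on which at least one travelled.

    `household` maps person_id -> {day: status}.
    """
    if not household:
        return []
    common = set.intersection(*(set(days) for days in household.values()))
    return sorted(
        d for d in common
        if all(p[d] in USABLE for p in household.values())
        and any(p[d] == "carried_all_day" for p in household.values())
    )


def pick_pr_day(household, rng=random):
    """Draw one eligible day at random, or None if there is none."""
    days = eligible_days(household)
    return rng.choice(days) if days else None
```

Requiring at least one travelling member reflects the later observation that the PR data contain no no-travel days.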

5.4.2.

GPS Data Collection Results

For this survey, a household was defined as complete only if all persons provided with PPALs had at least one common day of data recorded, or a claimed no-travel day on that day. A total of 2059 households provided fully completed GPS data, with an additional 549 incomplete households with significant GPS data. Of the 2059, there were 17 one-person households that were GPS complete but where the household member did not travel. Hence, there are 2042 households whose travel is reported in this section. A summary of the responses is provided in Table 5.1. The completed households provided 3849 person records. Thus, an average of 1.88 persons per household carried PPALs on at least one common day. The average number of PPALs with data for incomplete households was 1.47. The average trip rate of 4.61 trips per person per day, or 8.62 trips per household per day, is higher than that usually measured in diary surveys. The weekday trip rate is higher still at 5.06 trips per person or 9.46 trips per household. The minimum number of trips recorded for any individual was 0 and the maximum was 43. The trip file includes 27 percent of persons who claimed a no-travel day on at least one of the days that they carried GPS devices, and 18.5 percent of households where all household members claimed not to have travelled on one day. It must be kept in mind that these figures include weekend days, where there are normally much higher rates of no-travel days than in weekday-only surveys. There were 2883 persons who claimed to use the GPS on all three of the targeted days. In addition, 779 persons claimed to have used the GPS on 2 of the 3 days and 308 claimed to have used the GPS all day on one of the three targeted days. This totals 3970 persons, the discrepancy with the number of persons (GPS complete plus GPS incomplete) shown in Table 5.1 being due to persons who claimed to have used the GPS device for only


Table 5.1: Disposition of the final sample.

| Statistic | GPS complete: number | GPS complete: percent | GPS incomplete: number | GPS incomplete: percent | Total |
|---|---|---|---|---|---|
| Households | 2,059 | 78.9 | 549 | 21.1 | 2,608 |
| Persons | 3,849 | 82.7 | 807 | 17.3 | 4,656 |
| Travel days | 13,210 | 83.2 | 2,670 | 16.8 | 15,880 |
| Trips | 60,900 | 84.2 | 11,336 | 15.8 | 72,236 |
| Average daily trip rate | 4.61 | – | 4.25 | – | 4.55 |
| Average weekday daily trip rate | 5.06 | – | 4.64 | – | 4.99 |
| Average trip distance (all days) | 6.11 miles | | 6.29 miles | | 6.14 miles |
| Average trip distance (weekdays) | 6.21 miles | | 6.48 miles | | 6.25 miles |
| Average trip travel time (all days) | 0:13:07 | | 0:13:17 | | 0:13:09 |
| Average trip travel time (weekdays) | 0:13:05 | | 0:13:21 | | 0:13:07 |
| Average daily travel time (all days) | 01:22:11.1 | | 01:19:27.1 | | 01:21:44.4 |
| Average daily travel time (weekdays) | 01:21:10.5 | | 01:19:26.6 | | 01:20:53.7 |

part of a day, persons who did not travel, and those who returned GPS data, but did not fill out the GPS use card. There were 206 respondents (3.6 percent) who did not take the GPS device with them on the first day, 93 (1.6 percent) who did not take it with them on the second day, and 95 (1.7 percent) who did not take it with them on the third day. Similarly, 232 respondents (4.1 percent) said that they took the device with them for only part of the first day, 197 (3.5 percent) on the second day, and 173 (3.0 percent) on the third day. A total of 417 persons (7.4 percent) said they did not travel away from home on the first day, 374 (6.6 percent) on the second day, and 346 (6.1 percent) on the third day. A total of 1289 respondents did not fill out the GPS status card at all, and between 151 (on Day 1) and 188 (on Day 3) left at least 1 day blank on the card. A total of 601 households completed the PR survey, comprising 989 persons, or 1.65 persons per household. This was lower than the number of GPS persons per household (which was 1.88); however, most households that completed the PR survey did so with all members of the household that carried PPALs completing the survey. No information was collected on who filled out the PR survey, and only one email address was collected from each household. A total of


920 households containing 1,895 persons aged 13 or over provided no email address. A number of the email addresses that were provided also proved to be incorrect, although statistics on this number are not available. Only households with email addresses were asked to do the PR survey, and then only if there was at least one complete day of GPS data for each person in the household. Assuming that all 2608 households in Table 5.1 were eligible to receive the PR survey, the 601 completing households out of the 1688 households that provided email addresses indicate a response rate of about 35.6 percent of households. However, given that there were invalid email addresses, the response rate is actually higher than this. In the PR survey, a more detailed list of modes was requested, including identification of driver and passenger. In addition, the total number of people on the trip was also ascertained. More detailed purposes were also collected. However, considerable care must be taken in interpreting the PR survey results, because there are some fundamental differences between the PR data and the full GPS data, which must be borne in mind when comparing the two. In addition, an in-depth analysis of the PR data shows that these data are often not even close to the ‘ground truth’ that is desired. It must also be kept in mind that there is much greater trip reporting from GPS than is customary from diary surveys. An important difference between the GPS and PR data is that the full GPS data include 4064 no-travel days, whereas the PR data include no no-travel days. Of these no-travel days, 2491 are on weekdays. Further, the distribution of weekdays and weekend days is radically different, with only 13 out of 4831 trips (0.3 percent) in the PR data being on a Saturday and none on Sunday, compared to 1921 (3.0 percent) in the full GPS data being on Saturday and 1526 (2.4 percent) being on a Sunday. Hence, most PR trips are weekday trips, unlike the GPS data.
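The household response rate to the PR survey follows directly from the counts given above. The short calculation below is only a worked check; the function name is illustrative.

```python
def pr_response_rate(total_households, no_email, completed):
    """Return (households with email, response rate in percent)."""
    with_email = total_households - no_email
    return with_email, 100 * completed / with_email


# 2608 households (Table 5.1), 920 without an email address, 601 PR completions
with_email, rate = pr_response_rate(2608, 920, 601)
print(with_email, round(rate, 1))
```

Because an unknown number of the supplied email addresses were invalid, the true denominator is smaller and this figure is a lower bound on the response rate among households that could actually be reached.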
There are also clear problems in the completion of the PR data. An in-depth analysis of the PR data revealed that about 6 percent of the responses on the mode of travel used appear to be highly questionable, and about 15 percent of the trip purposes identified also appear to be highly questionable. For mode, the most common issues were trips claimed to be by walking at a speed beyond the capability of a human being, and trips claimed to be by car that were at a slow walking pace. For purpose, the major issues relate to respondents combining two or more trips into one round trip or tour, and providing an incorrect purpose for the combination. As a general conclusion, the PR data are subject to almost all of the common problems found in self-report diaries, even though respondents have a map showing where the GPS says that they travelled and from which they just need to fill in the details of their travel. Experience from the pilot, when respondents were allowed to change the trip times, showed that respondents often have a completely incorrect recollection of when the travel took place. While some proportion of the PR data probably represents ‘ground truth’, a significant amount of the data is actually incorrect and unreliable. While an in-depth analysis can reveal some of the probable problems, it is not possible to determine which PR data are correct and which are incorrect at an overall level.
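A check of the kind described for mode, comparing the reported mode with the average speed implied by the GPS trace, might look like the sketch below. The speed bounds are hypothetical values chosen only for illustration; the chapter does not state the thresholds used in the in-depth analysis.

```python
# Hypothetical plausibility bounds on average trip speed, in km/h.
SPEED_BOUNDS_KMH = {
    "walk": (0.0, 10.0),
    "bicycle": (3.0, 40.0),
    "car": (8.0, 160.0),
}


def is_questionable(mode, distance_km, duration_s):
    """Flag a reported mode whose implied average speed is implausible."""
    if duration_s <= 0:
        return True
    speed_kmh = distance_km / (duration_s / 3600)
    low, high = SPEED_BOUNDS_KMH.get(mode, (0.0, float("inf")))
    return not (low <= speed_kmh <= high)
```

A 5 km trip reported as walking and completed in 10 minutes implies 30 km/h and would be flagged, as would a 0.5 km trip reported as car travel that took 10 minutes.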

5.4.3.

Trip Characteristics

Some more detailed statistics are described in this section, both from the GPS and the PR survey. Table 5.2 provides a breakdown of trips by mode from the complete households and also provides a household trip rate by mode from the GPS data after final processing. As expected, Table 5.2 shows that the majority of trips recorded were by car. Slightly less than 5 percent of trips could not be identified to a specific mode, usually because the trip was inserted in map editing and there was no trace for the trip. Table 5.3 shows a similar report for the PR survey. Trips by motor vehicle are 2.4 percentage points lower in the GPS data than the PR data, total bus trips are 0.6 percentage points lower, bicycle trips are 0.5 percentage points higher and walk trips are 0.1 percentage points higher. The overall trip rate by households in the PR survey is lower than for the GPS survey, partly because of fewer persons per household completing the PR survey, and possibly also because those who did complete it may have been those with fewer trips in the sampled day. Table 5.4 shows the distribution of origin and destination activities and rates of these per household per day from the GPS survey. Because the G-TO-MAP software is provided principally with home, work, school and some shopping locations, there will always be a large proportion of ‘other’ activities, these being those that cannot be classified to one of the other four purposes. It should be noted that there are no missing activities, because the software improvements removed this category, primarily through the use of the activity duration for potential work and school trips. Table 5.5 shows similar information for the origin activities as Table 5.4, but from the PR survey, with a somewhat richer set of options for the activities. The activities in Table 5.5 are grouped so as to make comparison to Table 5.4 easier. 
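The percentage-point differences quoted above can be reproduced from the published mode shares in Tables 5.2 and 5.3. In this worked check, the GPS bus share is taken as the sum of the bus and school bus rows so that it is comparable with ‘Total bus’ in the PR data; that grouping is an interpretation of the text.

```python
# Published percent-of-trips values (Tables 5.2 and 5.3).
GPS_SHARE = {"motor vehicle": 88.2, "bus": 0.9 + 0.4, "walk": 5.1, "bicycle": 1.0}
PR_SHARE = {"motor vehicle": 90.6, "bus": 1.9, "walk": 5.0, "bicycle": 0.5}


def share_differences(gps, pr):
    """GPS share minus PR share, in percentage points, by mode."""
    return {mode: round(gps[mode] - pr[mode], 1) for mode in gps}


differences = share_differences(GPS_SHARE, PR_SHARE)
```

Negative values mean that the mode has a lower share in the GPS data than in the PR data.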
A comparison of Tables 5.4 and 5.5 indicates that the proportion of trips originating and destinating at home and at work is lower in the GPS software, while school trips are very close. Shopping is too high in the GPS data, which is surprising

Table 5.2: GPS trips and daily household trip rate by mode.

| Mode of travel | Number of trips | Percent of trips | Daily household trip rate |
|---|---|---|---|
| Motor vehicle | 53,734 | 88.2 | 7.60 |
| Bus | 537 | 0.9 | 0.08 |
| Walk | 3,125 | 5.1 | 0.44 |
| Bicycle | 585 | 1.0 | 0.09 |
| School bus (GPS and PR) | 247 | 0.4 | 0.03 |
| Unknown | 2,672 | 4.4 | 0.38 |
| Total | 60,900 | 100.0 | 8.62 |


Table 5.3: Breakdown of PR trips by mode and daily household trip rate by mode.

| Mode of travel | Number of trips | Percent of trips | Daily household trip rate |
|---|---|---|---|
| Driver of auto/van/truck | 3816 | 79.4 | 6.39 |
| Passenger of auto/van/truck | 452 | 9.4 | 0.76 |
| Driver of carpool | 21 | 0.4 | 0.04 |
| Passenger of carpool | 44 | 0.9 | 0.08 |
| Passenger of vanpool | 9 | 0.2 | 0.02 |
| Motorcycle/Moped | 13 | 0.3 | 0.02 |
| Total motor vehicle | 4376 | 90.6 | 7.31 |
| Bus | 62 | 1.3 | 0.10 |
| School bus | 30 | 0.6 | 0.05 |
| Total bus | 92 | 1.9 | 0.15 |
| Taxi/Paid limo | 2 | 0.0 | 0.0 |
| Walk | 235 | 5.0 | 0.40 |
| Bicycle | 22 | 0.5 | 0.04 |
| Other | 83 | 1.7 | 0.14 |
| Unknown | 15 | 0.3 | 0.03 |
| Total | 4804 | 100.0 | 8.05 |

Table 5.4: Breakdown of trips by origin and destination activity and by household rate from the GPS survey.

| Purpose | Origin: number | Origin: percent | Origin: daily rate | Destination: number | Destination: percent | Destination: daily rate |
|---|---|---|---|---|---|---|
| At home | 15,419 | 25.3 | 2.18 | 15,155 | 24.9 | 2.15 |
| Paid work | 7,308 | 12.0 | 1.03 | 7,406 | 12.2 | 1.05 |
| School | 1,871 | 3.1 | 0.27 | 1,905 | 3.1 | 0.27 |
| Pick-up/Drop-off | 2,243 | 3.7 | 0.32 | 2,224 | 3.7 | 0.32 |
| Catch bus/train/plane | 1,552 | 2.5 | 0.22 | 1,548 | 2.5 | 0.22 |
| Shop | 14,452 | 23.7 | 2.04 | 14,457 | 23.7 | 2.04 |
| Other | 18,055 | 29.6 | 2.55 | 18,205 | 29.9 | 2.58 |
| Missing | 0 | 0 | 0 | 0 | 0 | 0 |
| Total | 60,900 | 100.0 | 8.62 | 60,900 | 100.0 | 8.62 |

given that only two shopping locations were requested from each household and these were grocery shopping locations. As noted below, one of the problems that arise is where the address given is actually a relatively large site, and the address is coded at a point that may be close to or on the road, or may be in the centre of the site. This is


Table 5.5: Breakdown of trips by origin and destination activity and by household rate from the PR survey.

| Purpose | Origin: number | Origin: percent | Origin: daily rate | Destination: number | Destination: percent | Destination: daily rate |
|---|---|---|---|---|---|---|
| At home | 1376 | 28.7 | 2.31 | 1,360 | 28.3 | 2.28 |
| Paid work | 842 | 17.5 | 1.41 | 858 | 17.8 | 1.44 |
| School | 136 | 2.8 | 0.23 | 128 | 2.7 | 0.21 |
| Volunteer work | 56 | 1.2 | 0.09 | 56 | 1.2 | 0.10 |
| Pick-up/Drop-off | 227 | 4.7 | 0.38 | 222 | 4.6 | 0.37 |
| Social/Recreational/Church | 296 | 6.1 | 0.50 | 312 | 6.5 | 0.52 |
| Catch bus/train/plane | 23 | 0.5 | 0.04 | 25 | 0.5 | 0.04 |
| Transfer from a bus/train/plane to another | 35 | 0.7 | 0.06 | 33 | 0.7 | 0.06 |
| Shop | 622 | 13.0 | 1.04 | 621 | 12.9 | 1.04 |
| Personal business | 322 | 6.7 | 0.54 | 327 | 6.8 | 0.55 |
| Eat meal | 222 | 4.6 | 0.37 | 225 | 4.7 | 0.38 |
| Go for a drive | 25 | 0.5 | 0.04 | 25 | 0.5 | 0.04 |
| Work related | 162 | 3.4 | 0.27 | 155 | 3.2 | 0.26 |
| School related | 57 | 1.2 | 0.10 | 58 | 1.2 | 0.10 |
| Don’t know/refused | 388 | 8.1 | 0.65 | 380 | 7.9 | 0.64 |
| Missing | 0 | 0.3 | 0.03 | 0 | 0.3 | 0.03 |
| Total | 4804 | 100.0 | 8.05 | 4,803 | 100.0 | 8.05 |

particularly a problem for shopping and work trips. It must also be recalled, however, that there are very important differences between the PR and the full GPS data, so that strict comparability would be highly unlikely in these statistics. Some fraction of the mismatching reported here is a result of errors in the PR survey responses.

5.4.4.

In-Depth Comparison of GPS and PR Data

5.4.4.1. Overall analysis

As noted earlier, people make various errors in completing the PR survey similar to those made in self-report diaries. The first is linking individual trips together into tours, and omitting to report on trips that are claimed not to have been made. There are 5362 trips recorded by the GPS devices for the PR households, but PR respondents reported on only 4827 trips. Further analysis shows that there were 24 trips reported by PR respondents that did not correspond to a GPS trip, while there were 554 GPS recorded trips that did not correspond to a PR reported trip. In most cases, these 554 arose from situations where


respondents grouped trips together from the GPS and called them a single trip. A few of these cases may be real situations where the GPS software identified a trip end that was actually not a trip end. However, most of these cases are where trips have been joined together into tours. The ‘trip under-reporting’ implied by these 554 trips is 10.3 percent. This must immediately call into question the reliability of the PR data. A second part of this analysis was to look at trips that were reported as lasting longer than 1 hour. In the GPS data, there were 92 trips that lasted more than an hour. The range of trip durations was from 1 hour and 19 seconds to 5 hours, 46 minutes and 22 seconds. The mean was 1 hour, 39 minutes and 7 seconds, and the standard deviation was 50 minutes and 26 seconds, suggesting that most long trips would be less than 3 hours in duration. For the corresponding PR data, there were 199 trips that lasted over an hour, with a range from 1 hour and 19 seconds to 21 hours, 31 minutes and 2 seconds. The mean travel time was 3 hours, 32 minutes and 49 seconds, with a standard deviation of 3 hours, 24 minutes and 57 seconds, suggesting that most long trips would be less than 10 hours in duration. This 7-hour difference is further evidence of the joining together of trips that have substantial activity durations between trips, and calling them a single trip. Many of the 54 cases where the PR respondent stated that a trip was missing turn out to be cases where the PR respondent split a GPS trip into two, with one component of the trip being only a few seconds in length. This is shown rather clearly by the fact that there are 36 cases in the PR data of a trip of less than 30 seconds in duration, whereas there are only 11 trips of less than 30 seconds duration in the GPS data. Looking at the former, the range is from 0 seconds to 29.98 seconds with a mean of 10.8 seconds.
For the latter, the range is from 10.97 seconds to 29.03 seconds with a mean of 22.1 seconds. In 10 of the cases where the PR data has a less than 30 second travel time, the GPS data are missing, indicating that these are trips that have been split. Only five cases of the GPS data of less than 30 seconds correspond to missing trips in the PR data. Notwithstanding these issues, the overall statistics of mode and activity at origin and destination have been compared between the PR and GPS data. The comparison of mode is shown in Table 5.6. From these data, it appears that the GPS processing has produced very close results. Total motor vehicle trips are overestimated by 80 trips (1.8 percent), and bus trips are underestimated by 22 trips, with the majority of these being school bus trips. Walk is almost exactly correct with 235 versus 236, while bicycle is still overestimated by the software by 13 trips. Overall, however, the results are remarkably close. Table 5.7 shows the distribution of trips by mode for the PR sample, as recorded by the GPS devices and processed, together with the percentage splits in the PR data. Table 5.7 also shows that, when comparing the results of the processing of the GPS trips against the percentage mode shares from the PR survey, the results are remarkably close, with motor vehicle underestimated by 1.6 percent, bus underestimated by 0.2 percent, school bus underestimated by 0.2 percent, walk is exactly correct and bicycle is overestimated by 0.3 percent. There are 3.7 percent of trips that have an unknown mode, which are trips added by the GPS map editing, which had no traces and therefore could not be processed.
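The trip under-reporting rate and the mode count differences discussed in this section can be recomputed from the figures in the text and in Table 5.6. This is a worked check only; the dictionary layout and function name are illustrative.

```python
def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# GPS trips with no corresponding PR trip, out of all GPS trips for PR households
under_reporting = pct(554, 5362)

# Trip counts by mode from Table 5.6 as (GPS processing, PR survey)
MODE_COUNTS = {
    "total motor vehicle": (4437, 4357),
    "total bus": (70, 92),
    "walk": (236, 235),
    "bicycle": (35, 22),
}
count_differences = {m: gps - pr for m, (gps, pr) in MODE_COUNTS.items()}
motor_overestimate_pct = pct(count_differences["total motor vehicle"], 4357)
```

A positive difference means that the GPS processing assigned more trips to the mode than respondents reported.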


Table 5.6: Comparison of GPS and PR survey on mode.

| Mode of travel | GPS number of trips | PR number of trips |
|---|---|---|
| Driver of auto/van/truck | – | 3816 |
| Passenger of auto/van/truck | – | 452 |
| Driver of carpool | – | 21 |
| Passenger of carpool | – | 44 |
| Passenger of vanpool | – | 9 |
| Motorcycle/Moped | – | 13 |
| Taxi/Paid limo | – | 2 |
| Total motor vehicle | 4437 | 4357 |
| Bus | 54 | 62 |
| School bus | 16 | 30 |
| Total bus | 70 | 92 |
| Walk | 236 | 235 |
| Bicycle | 35 | 22 |
| Other | – | 83 |
| Unknown | 26 | 15 |
| Total | 4804 | 4804 |

Table 5.7: Breakdown of GPS trips by mode for the PR households, compared to PR percentages.

| Mode of travel | Number of trips | Percent of trips | Percent from PR responses |
|---|---|---|---|
| Total motor vehicle | 4773 | 89.0 | 90.6 |
| Bus | 59 | 1.1 | 1.3 |
| School bus | 20 | 0.4 | 0.6 |
| Walk | 267 | 5.0 | 5.0 |
| Bicycle | 42 | 0.8 | 0.5 |
| Other | 0 | 0 | 1.7 |
| Unknown | 201 | 3.7 | 0.3 |
| Total | 5362 | 100.0 | 100.0 |

Table 5.8 shows the comparison of the GPS processing of purpose to the results of the PR survey on origin and destination activities. Table 5.8 shows only those trips that more or less correspond between the GPS and PR survey, although it must be kept in mind here that, if two or more trips were combined by the PR respondent, the convention was used of matching the first GPS trip to the combined PR trip, leading to some anomalies in comparing origin and destination activities.


Table 5.8: Comparison of GPS and PR survey activities at origin and destination.

| Purpose | Origin: GPS number | Origin: PR number | Destination: GPS number | Destination: PR number |
|---|---|---|---|---|
| At home | 1374 | 1376 | 1339 | 1360 |
| Paid work | 659 | 842 | 653 | 858 |
| School | 158 | 136 | 163 | 128 |
| Volunteer work | – | 56 | – | 56 |
| Pick-up/Drop-off | 155 | 227 | 169 | 222 |
| Social/Recreational/Church | – | 296 | – | 312 |
| Catch bus/train/plane | 90 | 23 | 96 | 25 |
| Transfer from a bus/train/plane to another | – | 35 | – | 33 |
| Shop | 1127 | 622 | 1138 | 621 |
| Personal business | – | 322 | – | 327 |
| Eat meal | – | 222 | – | 225 |
| Go for a drive | – | 25 | – | 25 |
| Work related | – | 162 | – | 155 |
| School related | – | 57 | – | 58 |
| Don’t know/refused | – | 388 | – | 380 |
| Other | 1241 | 0 | 1245 | 0 |
| Total | 4804 | 4804 | 4803 | 4803 |

The number of trips with an origin or destination activity at home is very close in the PR and GPS surveys. Pick-up and drop-off are identified less often by the GPS software than they are reported in the PR survey. However, some of these could also be confused with catching a bus, train or plane in the GPS data, which could partially account for the too high frequency of this activity. Similarly, the GPS software cannot distinguish between catching a bus, train or plane and transferring. In total, the software identifies 245 origins and 265 destinations that are either pick-up/drop-off, or catch or transfer between buses, trains and planes. In comparison, the PR survey shows 285 origins and 280 destinations for these three activities, suggesting that overall the software is doing a reasonable job on these activities. The shopping trips are the principal activity that is seriously overestimated. However, because many supermarkets are within major shopping centres, and these centres may offer opportunities for both personal business and some social and recreational activities (such as a gym, movies, eat meal, etc.), the overestimate is probably not unreasonable. A more detailed examination of the work trips shows that about 180 of the work trip origins according to the PR data were categorised as ‘other’ by the GPS software


and 98 were categorised as ‘shop’ trips, with figures of 192 and 109, respectively, for destinations. Adding these numbers into the work plus volunteer work trips will actually produce larger totals of work trips than the PR survey measured, but is indicative of some of the reasons for difficulty in determining correctly the work activity. Again, it seems useful to compare the percentages of trips from the GPS processing with the overall percentages from the PR data, using the percentages shown in Table 5.5 and comparing these to the percentages for the full 5362 GPS recorded trips. The results are shown in Table 5.9. A review of the numbers in Table 5.9 shows that trips with an origin or destination at home are correct to within about 0.7 percent. Work trips are underestimated by approximately 3.5 percent, while school trips are overestimated by about 0.4 percent. Pick-up and drop-off are underestimated by about 1.5 percent, and the combination of catching and transferring between bus, train or plane is overestimated by 0.7 percent. Shopping is the most overestimated purpose, with an overestimate of around 10 percent, but this discrepancy was explained previously. Other purposes are slightly overestimated by about 2.1 percent, while the GPS results have no missing or refused/don’t know results, which account for 8.4 percent of the PR total. Again, this comparison suggests that the processing results are relatively quite accurate and should be suitable to support most modelling applications. Following the completion of the various software improvements, a subsample of 41 households who made 429 trips was used to carry out an in-depth analysis of the

Table 5.9: Breakdown of GPS trips by activity at origin and destination, compared to PR percentages. Purpose

Origin activity Number Percent

At home Work School Pick-up/Drop-off Catch bus/train/plane Transfer from a bus/train/ plane to another Shop Other Don’t know/refused Missing Total

Destination activity

PR Number Percent PR percent percent

1501 740 173 172 103 0

28.0 13.8 3.2 3.2 1.9 0

28.7 17.5 2.8 4.7 0.5 0.7

1484 742 176 171 103 0

27.7 13.8 3.3 3.2 1.9 0

28.3 17.8 2.7 4.6 0.5 0.7

1275 1398 0 0

23.7 26.0 0 0

13.0 23.7 8.1 0.3

1279 1407 0 0

23.9 26.2 0 0

12.9 24.3 7.9 0.3

100.0

100.0

108

Peter R. Stopher et al.

results of the software processing in comparison to the prompted-recall results. For this comparison, the focus was on mode and purpose. For the 429 trips analysed, 362 matched exactly on mode of travel after software processing. After identifying the reasons for mismatch, it was found that 51 of the 67 mismatched modes were respondent issues in the PR survey. Thus, software errors appear to concern just 16 trips, suggesting 95.8 percent accuracy in this test. A similar analysis was performed with trip purpose. Of the 476 trips in this analysis, 363 matched on origin and 346 on destination. In this case, it was found that 77 of the origin mismatches and 90 of the destination mismatches were due to respondent problems in the PR survey. Hence, the accuracy of the origin and destination activities is 91.0 and 89.6 percent respectively.
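The accuracy figures quoted above are consistent with dividing the number of matched trips by the number of trips that remain once mismatches attributed to respondents are set aside. The short sketch below reproduces that arithmetic. It is an inference from the reported numbers rather than a formula stated in the text, and the function name is purely illustrative.

```python
def imputation_accuracy(total_trips, matched, respondent_mismatches):
    """Percent of trips imputed correctly, after excluding mismatches
    attributed to respondent problems in the PR survey (assumed definition)."""
    usable_trips = total_trips - respondent_mismatches
    return 100.0 * matched / usable_trips


# Mode: 429 trips, 362 matches, 51 respondent-related mismatches
mode_accuracy = imputation_accuracy(429, 362, 51)
# Purpose at origin and destination: 476 trips in that analysis
origin_accuracy = imputation_accuracy(476, 363, 77)
destination_accuracy = imputation_accuracy(476, 346, 90)

print(round(mode_accuracy, 1))         # 95.8
print(round(origin_accuracy, 1))       # 91.0
print(round(destination_accuracy, 1))  # 89.6
```

Under this reading, the denominator shrinks with every mismatch blamed on the respondent, so the result is sensitive to how those mismatches were classified.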

5.4.5. Representativeness of the Completed Sample

The overall completed-household sampling goal was reduced as the study proceeded, mainly because the loss of GPS units left too few units to deploy to the full sample of recruited households. Meanwhile, three-quarters of the way through the study, low completion rates were identified, as would be expected, for certain hard-to-reach household categories (those with low incomes, those with zero vehicles and those with four or more members). Starting in May 2010, distribution of GPS units was concentrated on recruited households meeting the undersampled criteria.

Upon completion of the study, 2059 households fully met the criteria for a completed household. Overall, units were deployed to 4238 of the 5564 households recruited (76 percent). Households that did not receive GPS units also did not provide any other data, because the other forms were provided only to GPS households. The reason for non-deployment was generally that no GPS devices were available within the required time to send to these households, in part because of an exceptionally high loss rate of GPS devices. Thus, the completion rate for the households that were actually provided with a GPS device and forms was 48.6 percent (2059 of 4238). According to AAPOR (private communication, 2012), there is no agreed-upon way to handle known and unknown eligibility households within an address-based sample or mixed frame model. As a result, calculation of a standard response rate is considered not to be possible at this time.

Within the framework of the reduced sample size, a very representative sample was recruited and completed for all geographic and household demographic sampling variables. Table 5.10 shows counts and percentages for household completions. These percentages are compared with sampling plan (census-based) targets, which included adjusted geographically based oversampling for transit propensity households and households around major universities. The completed sample region-wide was representative by county, household size, number of vehicles and number of workers, with the exception that zero-vehicle households were under-represented in completes by 5.3 percent, despite $25 incentives for completion. Likewise, 4+ person households were under-represented by only 4 percent. As in diary-based household travel surveys, households with incomes under $25,000 were under-represented by 8.8 percent and households with incomes over $75,000 were over-represented by 9.3 percent. Only 7.7 percent of completed households refused or did not report income. Life cycle (adult households, adult student households, retirees and households with children) was representative of census-based percentages. Additionally, the completed sample was representative by the regional breakdown of categories of interest to travel analysis zones, including number of autos by number of workers, number of autos by household size, and household size by number of workers. Thus, a representative sample of completed households can be obtained using GPS-only and address-based sampling methodologies.

Table 5.10: Sociodemographics of the sample compared to census target.

| Category | Complete: frequency | Complete: percent | Census PUMS 2000: households | Census PUMS 2000: percent of region | Collapsed oversample category | Sampling target with oversample (%) | Difference completes/target (%) |
|---|---|---|---|---|---|---|---|
| Study area households | | | | | | | |
| Butler | 425 | 20.6 | 123082 | 16.77 | | NA | 3.88 |
| Clermont | 162 | 7.9 | 66013 | 8.99 | | NA | -1.12 |
| Hamilton | 954 | 46.3 | 346790 | 47.24 | | NA | -0.91 |
| Warren | 167 | 8.1 | 55966 | 7.62 | | NA | 0.49 |
| Boone | 83 | 4.0 | 31258 | 4.26 | | NA | -0.23 |
| Campbell | 82 | 4.0 | 34742 | 4.73 | | NA | -0.75 |
| Kenton | 149 | 7.2 | 59444 | 8.10 | | NA | -0.86 |
| Dearborn | 37 | 1.8 | 16832 | 2.29 | | NA | -0.50 |
| Total | 2059 | 100.0 | 734127 | 100.00 | | | |
| Sample transit type | | | | | | | |
| Transit | 320 | 15.5 | NA | 10.39 | | 11.24 | 4.30 |
| University | 248 | 12.0 | NA | 5.17 | | 22.52 | -10.47 |
| Other | 1491 | 72.4 | NA | 84.44 | | 66.24 | 6.17 |
| Total | 2059 | 100.0 | NA | 100.00 | | 100.00 | |
| Household size | | | | | | | |
| 1 Person | 669 | 32.5 | 218622 | 27.27 | | NA | 5.22 |
| 2 Persons | 696 | 33.8 | 256794 | 32.04 | | NA | 1.76 |
| 3 Persons | 278 | 13.5 | 132757 | 16.56 | | NA | -3.06 |
| 4+ Persons | 416 | 20.2 | 193384 | 24.13 | | NA | -3.93 |
| Total | 2059 | 100.0 | 801557 | 100.00 | | | |
| Number of vehicles | | | | | | | |
| 0 Vehicles | 91 | 4.4 | 77555 | 9.68 | | NA | -5.26 |
| 1 Vehicle | 676 | 32.8 | 258690 | 32.27 | | NA | 0.56 |
| 2 Vehicles | 809 | 39.3 | 311104 | 38.81 | | NA | 0.48 |
| 3+ Vehicles | 483 | 23.5 | 154208 | 19.24 | | NA | 4.22 |
| Total | 2059 | 100.0 | 801557 | 100.00 | | | |
| Number of workers | | | | | | | |
| 0 Workers | 573 | 27.8 | 192037 | 23.96 | | NA | 3.87 |
| 1 Worker | 704 | 34.2 | 300001 | 37.43 | | NA | -3.24 |
| 2 Workers | 643 | 31.2 | 251120 | 31.33 | | NA | -0.10 |
| 3+ Workers | 139 | 6.8 | 58399 | 7.29 | | NA | -0.54 |
| Total | 2059 | 100.0 | 801557 | 100.01 | | | |
| Household income | | | | | | | |
| Up to $25,000 | 344 | 16.7 | 164803 | 20.56 | | 25.52 | -8.81 |
| Over $25,000 to $50,000 | 450 | 21.9 | 201541 | 25.14 | | 27.48 | -5.63 |
| Over $50,000 to $75,000 | 395 | 19.2 | 161650 | 20.17 | | 21.70 | -2.51 |
| More than $75,000 | 712 | 34.6 | 273563 | 34.13 | | 25.30 | 9.28 |
| Don’t know/Refused | 158 | 7.7 | | | | 0.00 | |
| Total | 2059 | 100.0 | 801557 | 100.00 | | 100.00 | |
| Household type (Lifecycle) | | | | | | | |
| Adult HH | 1025 | 49.8 | 368555 | 45.98 | | 43.64 | 6.15 |
| Adult student HH | 83 | 4.0 | 22894 | 2.86 | | 7.42 | -3.39 |
| Retiree HH | 377 | 18.3 | 116567 | 14.54 | | 15.30 | 3.01 |
| HH with children | 574 | 27.9 | 293541 | 36.62 | | 33.64 | -5.76 |
| Total | 2059 | 100.0 | 801557 | 100.00 | | 100.00 | |
| Autos vs. Workers | | | | | | | |
| 0 Autos, 0 Workers | 61 | 3.0 | 48513 | 6.05 | 0 Autos, 0 Workers | 6.85 | -3.89 |
| 0 Autos, 1 Worker | 27 | 1.3 | 22171 | 2.77 | 0 Autos, 1+ Workers | 4.06 | -2.60 |
| 0 Autos, 2 Workers | 3 | 0.1 | 5628 | 0.70 | | | |
| 0 Autos, 3+ Workers | 0 | 0.0 | 1243 | 0.16 | | | |
| 1 Auto, 0 Workers | 310 | 15.1 | 87473 | 10.91 | 1 Auto, 0 Workers | 12.12 | 2.93 |
| 1 Auto, 1 Worker | 336 | 16.3 | 142177 | 17.74 | 1 Auto, 1 Worker | 20.06 | -3.74 |
| 1 Auto, 2 Workers | 27 | 1.3 | 25938 | 3.24 | 1 Auto, 2 Workers | 3.58 | -2.12 |
| 1 Auto, 3+ Workers | 3 | 0.1 | 3102 | 0.39 | | | |
| 2 Autos, 0 Workers | 162 | 7.9 | 45332 | 5.66 | 2 Autos, 0 Workers | 5.79 | 2.08 |
| 2 Autos, 1 Worker | 239 | 11.6 | 101900 | 12.71 | 2 Autos, 1 Worker | 12.61 | -1.00 |
| 2 Autos, 2 Workers | 388 | 18.8 | 153695 | 19.17 | 2 Autos, 2 Workers | 16.91 | 1.94 |
| 2 Autos, 3+ Workers | 20 | 1.0 | 10177 | 1.27 | 2 Autos, 3+ Workers | 1.09 | -0.12 |
| 3+ Autos, 0 Workers | 40 | 1.9 | 10719 | 1.34 | 3+ Autos, 0 Workers | 1.39 | 0.55 |
| 3+ Autos, 1 Worker | 102 | 5.0 | 33753 | 4.21 | 3+ Autos, 1 Worker | 4.00 | 0.95 |
| 3+ Autos, 2 Workers | 225 | 10.9 | 65859 | 8.22 | 3+ Autos, 2 Workers | 7.09 | 3.84 |
| 3+ Autos, 3+ Workers | 116 | 5.6 | 43877 | 5.47 | 3+ Autos, 3+ Workers | 4.45 | 1.18 |
| Total | 2059 | 100.0 | 801557 | 100.00 | | 100.00 | |
| Autos vs. Household size | | | | | | | |
| 0 Autos, 1 HH Member | 68 | 3.30 | 45636 | 5.69 | 0 Autos, 1 HH Member | 6.61 | -3.30 |
| 0 Autos, 2 HH Members | 15 | 0.73 | 15153 | 1.89 | 0 Autos, 2+ HH Members | 4.33 | -3.22 |
| 0 Autos, 3 HH Members | 3 | 0.15 | 7781 | 0.97 | | | |
| 0 Autos, 4+ HH Members | 5 | 0.24 | 8985 | 1.12 | | | |
| 1 Auto, 1 HH Member | 509 | 24.72 | 144328 | 18.01 | 1 Auto, 1 HH Member | 24.76 | -0.04 |
| 1 Auto, 2 HH Members | 104 | 5.05 | 64271 | 8.02 | 1 Auto, 2 HH Members | 8.55 | -3.49 |
| 1 Auto, 3 HH Members | 36 | 1.75 | 25986 | 3.24 | 1 Auto, 3+ HH Members | 6.36 | -3.30 |
| 1 Auto, 4+ HH Members | 27 | 1.31 | 24105 | 3.01 | | | |
| 2 Autos, 1 HH Member | 74 | 3.59 | 23309 | 2.91 | 2 Autos, 1-2 HH Members | 21.03 | 3.01 |
| 2 Autos, 2 HH Members | 421 | 20.45 | 140850 | 17.57 | | | |
| 2 Autos, 3 HH Members | 123 | 5.97 | 55814 | 6.96 | 2 Autos, 3 HH Members | 6.30 | -0.33 |
| 2 Autos, 4+ HH Members | 191 | 9.28 | 91131 | 11.37 | 2 Autos, 4+ HH Members | 9.94 | -0.66 |
| 3+ Autos, 1 HH Member | 18 | 0.87 | 5349 | 0.67 | 3+ Autos, 1-3 HH Members | 4.67 | 9.42 |
| 3+ Autos, 2 HH Members | 156 | 7.58 | 36520 | 4.56 | | | |
| 3+ Autos, 3 HH Members | 116 | 5.63 | 43176 | 5.39 | | | |
| 3+ Autos, 4+ HH Members | 193 | 9.37 | 69163 | 8.63 | 3+ Autos, 4+ HH Members | 7.45 | 1.92 |
| Total | 2059 | 100.00 | 801557 | 100.00 | | 100.00 | |
| Household size vs. Workers | | | | | | | |
| 1 HH Member, 0 Workers | 327 | 15.88 | 99428 | 12.40 | | NA | 3.48 |
| 1 HH Member, 1 Worker | 342 | 16.61 | 119194 | 14.87 | | NA | 1.74 |
| 2 HH Members, 0 Workers | 206 | 10.00 | 68965 | 8.60 | | NA | 1.40 |
| 2 HH Members, 1 Worker | 187 | 9.08 | 82080 | 10.24 | | NA | -1.16 |
| 2 HH Members, 2 Workers | 303 | 14.72 | 105749 | 13.19 | | NA | 1.52 |
| 3 HH Members, 0 Workers | 21 | 1.02 | 12849 | 1.60 | | NA | -0.58 |
| 3 HH Members, 1 Worker | 79 | 3.84 | 42406 | 5.29 | | NA | -1.45 |
| 3 HH Members, 2 Workers | 126 | 6.12 | 58714 | 7.32 | | NA | -1.21 |
| 3 HH Members, 3+ Workers | 52 | 2.53 | 18788 | 2.34 | | NA | 0.18 |
| 4+ HH Members, 0 Workers | 19 | 0.92 | 10795 | 1.35 | | NA | -0.42 |
| 4+ HH Members, 1 Worker | 96 | 4.66 | 56321 | 7.03 | | NA | -2.36 |
| 4+ HH Members, 2 Workers | 214 | 10.39 | 86657 | 10.81 | | NA | -0.42 |
| 4+ HH Members, 3+ Workers | 87 | 4.23 | 39611 | 4.94 | | NA | -0.72 |
| Total | 2059 | 100.00 | 801557 | 100.00 | | | |

Note: Where detailed categories were collapsed for oversampling, the target and difference are given on the first row of the collapsed category, and the blank rows beneath it belong to the same collapsed category. Negative differences indicate under-representation in the completed sample.
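The differences between completes and target reported for Table 5.10 appear to be simple differences in percentage points between the share of completed households in a category and the corresponding target share (the oversample target where one exists, otherwise the census share). A minimal sketch under that assumption follows; the function name is illustrative and is not taken from the study.

```python
def share_difference(completes_in_category, total_completes, target_percent):
    """Completed-sample share minus target share, in percentage points
    (assumed definition of the difference column)."""
    sample_percent = 100.0 * completes_in_category / total_completes
    return sample_percent - target_percent


# Zero-vehicle households: 91 of 2059 completes against a 9.68 percent census share
zero_vehicle = share_difference(91, 2059, 9.68)
# Incomes up to $25,000: 344 of 2059 completes against a 25.52 percent oversample target
low_income = share_difference(344, 2059, 25.52)

print(round(zero_vehicle, 2))  # -5.26
print(round(low_income, 2))    # -8.81
```

Both values agree with the under-representation of 5.3 and 8.8 percent quoted in the text, which is why the percentage-point reading seems the most plausible one.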

5.5. Conclusions and Recommendations

The primary conclusions to be drawn from this paper are that it is feasible to undertake a GPS-only household travel survey, achieve a high standard of representativeness for the sample, and impute such data as mode and purpose at a sufficiently accurate level to support modelling work. Indeed, it has been demonstrated over recent years that most self-report surveys under-report travel, generally by at least 20 percent, meaning that such data cannot be representative of the region from which they are collected and cannot be readily adjusted to correct for respondent under-reporting. The high level of accuracy reported in this paper for imputing mode and purpose, with 96 percent on mode and around 90 percent on activity, is far superior to self-report surveys, and the richness of the time and distance information from this survey surpasses what can be achieved from any other form of survey.

There are improvements that could be made, however. For future GPS-only HTSs the authors recommend including the workplace address question for every person in the household in the recruitment, whether the recruitment is conducted by web or CATI. If workplace address information obtained this way is found to be insufficient for each worker in the household, the household should be considered a refusal at an early stage and replaced with an appropriate household from the same data cell.

The authors also recommend that a longer period of measurement be used for the GPS component of future surveys. A full week (or longer) of GPS data will enhance the ability to identify work trips, as well as provide much richer data on the variability of travel from day to day. In addition, a larger sample of weekend data can be obtained, which will have significant future use in a number of policy areas.

A better method of obtaining ‘ground truth’ is needed for further improvements in software processing. Detailed land use data would also be helpful, with land uses coded to such categories as retail, education, office and medical, so that the likely purpose of a person visiting a parcel can be deduced.



Chapter 6

The Role of Web Interviews as Part of a National Travel Survey

Linda Christensen

Abstract

Purpose: The paper analyses the effect of adding a web survey to a traditional telephone-based national travel survey by asking the respondents to log in on the web and answer the questions there (Computer Assisted Web Interview, CAWI). If they do not participate by web, they are called by telephone as usual (Computer Assisted Telephone Interview, CATI).

Design/methodology/approach: Multivariate regression analyses are used to analyse the difference in response rates between the two media and to analyse whether respondents answering by the two media have different travel patterns.

Findings: The analyses show that web interviews save money, even though more intensive post-processing is necessary. The analyses seem to show that the CAWI results in more careful answering, which in turn results in more trips being reported. A CAWI increases the participation of children and of the highly educated in the survey. It also offers greater flexibility to answer after a couple of days off. The CATI is, on the other hand, more useful for the elderly. In addition, the CATI survey proved to be more useful for busy people and for people not willing to participate in a survey at all. Young people and people with few resources, who are difficult to reach by telephone, are not reached on the web either. Most of the differences in the response shares can be compensated for by a weighting procedure. However, not all of them seem possible to compensate for. An effort to increase the number participating in the CAWI survey might increase the quality of the survey in general.

Originality/value of paper: In many countries authorities are considering how to reduce the cost of their national travel surveys. The value of the paper is to show that a combination of a CAWI and a CATI could be a good solution. Furthermore, it shows that the mixed mode could improve a CATI and therefore be a reason in itself to change methodology.

Keywords: Web survey; national travel survey; CAWI; CATI; response rates; survey costs

6.1. Background

Over the decades, national travel surveys (NTS) have been the most important source for the construction of national and regional transport models, and for policy-making on the basis of travellers’ behaviour. While in the past traffic counts and simple zone-based origin and destination information were sufficient, today most transport forecasting models include multiple modes and trip purposes, and new activity-based models are emerging. These models cannot be based on simple data collection methods, and interview-based data collection methods are becoming more and more important. Nevertheless, because data collection by means of interviews is very expensive, national and local authorities often look for ways to collect these data more cheaply. A web-based survey is an obvious solution for saving money.

In Denmark the NTS (Transportvaneundersøgelsen, TU, 2011) was held on an annual basis in 1992–2003 by Computer Assisted Telephone Interviews (CATIs). The survey was not held in 2004, partly due to poor data quality during 2000–2001 (Christensen, 2006) and partly because of a lack of money for collecting enough interviews. For this reason, experiments with web-based data collection were carried out in 2005. A new Computer Assisted Web Interview (CAWI) was developed which could be used either by the CATI interviewers or by the survey respondents without the aid of the CATI interviewers. As the web survey proved to be successful, a tender was launched to further reduce the price of the CATI interviews, and the government decided to hold the survey again as a continuous survey from 2006.

The purpose of the Danish NTS is to collect travel data for national and local transport policies, infrastructure investments, national transport modelling and research. The survey is based on a stratified random sample of individuals from the Danish Civil Registration System. The survey, consisting of a 1-day travel diary, is conducted all year round and elicits socio-economic background information, family characteristics and car ownership. More than 95% of all the trip origins and destinations are identified by addresses, which can be transformed into coordinates. Ninety-nine percent of the origins and destinations can be associated with a traffic zone.

The respondents are contacted by a letter requesting them to complete the questionnaire online (TU questionnaire, 2012). If they fail to complete the survey within 2 days after the assigned travel day, access to the web interview is closed and they are called by telephone for a CATI. For the CATI, each respondent can be called up to five times on specific days assigned in advance, spaced at intervals of 2–3 days. If the five attempts fail to yield a contact, no further attempts are made. Notably, 12% of the respondents cannot be contacted by phone because they lack a publicly available phone number. From 2012 this share has been reduced after a new tender.

6.2. Introduction

In Europe, NTS are based on representative population samples, drawn from either register data or census data, because it is very important that the resulting travel behaviour can be scaled up to the whole population and provide an accurate representation of the annual kilometrage by mode, purpose, weekday and geographical area. Such an accurate representation is fundamental for the evaluation of transport policies and infrastructure changes, and is highly dependent on the NTS data collection method (Bayart & Bonnel, 2012). In fact, Christensen (2006) shows that even small changes in the day and hour of the CATI interviews influence the daily person-kilometrage. In addition, any discontinuity in the survey method is translated into unexpected changes in the overall kilometrage. In light of these findings, the introduction of a web-based survey should be made as smoothly as possible (Bonnel & Madre, 2006), in particular for tracking trends over time (Bronner & Kuijlen, 2007).

This is possibly the reason that only a few countries have changed their household survey methods so far. Denmark, inspired by the web workshop at ISCTSC in 2004, was the first country to include a CAWI in the NTS, followed by the Danish time use and consumption survey from 2008 to 2009 (Bonke & Fallesen, 2009). In Holland several household surveys were changed to a mixed mode from 2008 (Buelens & van den Brakel, 2011; Kraan, van den Brakel, Buelens, & Huys, 2010), and from 2010 the NTS was changed to a mixed mode survey similar to the Danish one (Bouhuijs, 2011), although including a face-to-face survey (CAPI) with non-telephone holders. The German MOD2008 includes an offer to use CAWI for the background information. In France, where the NTS and regional travel surveys are conducted as face-to-face interviews, experiments have been conducted for the Lyon Household Survey (Bayart & Bonnel, 2012), followed up from autumn 2012 by a test with the travel survey for the whole Rhône-Alpes region. For both, a CAWI is conducted with non-respondents and refusers who do not want an interviewer in their home for a CAPI interview. Norway and Sweden are also considering a mixed mode survey for their next NTS. Lastly, Eurostat is discussing the possibility of a mixed mode European Work Force Survey (Kloek & van der Valk, 2011). In the United States many surveys made by central authorities have been changed to CAWI, often based on existing survey panels. In Australia no attempt has been made so far to include a web survey in the NTS in any of the metropolitan areas (Stopher, Zhang, Armoogum, & Madre, 2012).

Research from the last 10 years indicates that three questions are relevant when considering a transition to a CAWI or a mixed mode survey: (1) differences in the response rates of different groups, which might change the representativeness of the results, (2) differences in the quality of the answers and (3) the overall costs.

Most research comparing CAWI with other modes is based on panels for the CAWI, making comparison of the response rates difficult. Manfreda, Bosnjak, Berzelak, Haas, and Vehovar (2008) registered 40 panel-based surveys out of 45 CAWI in a meta-analysis comparing response rates with other modes except CAPI. The response rate for the CAWI varies between 11% and 82%, with a mean 11% lower than for other surveys. The lower response rate makes concerns about non-response bias in the CAWI even larger than for other modes.

Surveys based on panels can be created as representative samples of the population according to demographic variables; however, they still represent people who have access to a computer and the skills to use the Internet. Respondents who have agreed to be part of a survey panel are likely to be more knowledgeable, engaged and viewpoint-oriented than those answering by CATI (Chang & Krosnick, 2008) or CAPI (Duffy, Smith, Terhanian, & Bremer, 2005). Respondents who have participated in a panel for more than half a year have a higher response rate than those with less experience (Shin, Johnson, & Rao, 2011).

Other sampling strategies include advertisements on web pages relating to the subject of interest or emails to, e.g., members of relevant organisations (both used by Lee & Pino, 2012). Such samples are by definition not even socio-economically representative, as they only represent people with special interests. Lee and Pino (2012) compare the CAWI with a similar CATI survey about attitudes of motorcyclists. The middle age group is over-represented in the CAWI answers and the 18–25 year age group is under-represented. Fleming and Bowden (2009) analyse preferences of respondents contacted through web pages for a tourist attraction compared to people answering a mail-back questionnaire handed out at the selected destination. They find that the socio-demographic make-up of respondents to the two survey modes is not statistically different and that both modes yield similar consumer surplus estimates.

For Danish CAWI surveys with representative samples, the response rate varies between 17% (Bech & Kristensen, 2009) for a sample of the 50–75 year age group, 20% (Christensen, 2011) for a long-duration travel survey sampled from the 16–84 year age group, and 29% (Mabit & Fosgerau, 2011) for a survey among new car buyers. Bech and Kristensen (2009) show that CAWI respondents have a higher income and longer education than mail respondents. The response rate is highest for the 55–65 year age group and lowest for the 70–75 year age group.

When considering a CAWI it is important to be aware that access to computers and willingness to answer a survey on the web depend on, among other things, age and educational level and on the time available for completing the survey. In Denmark, 93% of the population has access to a computer (Bonke & Fallesen, 2009) and around 80% to the Internet (DST, 2012), but not everybody has the skills to use the Internet and answer a survey. The most important problem arises if the survey variables under consideration are correlated with the propensity to respond (Bonke & Fallesen, 2009; Groves, 2006).


Bonke and Fallesen (2009) argue that for a survey with a register-based sample it is possible to correct for these biases through a weighting procedure based on information from administrative registers. However, this does not completely remove the biases that arise when non-respondents have a different travel pattern from respondents, even when the socio-economic differences and other circumstances used in the expansion are taken into account (Groves, 2006). Duffy et al. (2005) argue that weighting is only possible if we can check for the variables that are responsible for the differences between samples.

Bonnel (2003) shows that travel behaviour is correlated with non-response or with the number of attempts needed to reach the respondent (see also Christensen, 2006). A possible explanation is that busy people who spend much time away from home, and possibly have long travel times, are difficult to contact for a CATI. Dixon (2002) finds that non-respondents to the Consumer Expenditures Quarterly Interview Survey have higher relative expenditure estimates for transportation. A CAWI offers the possibility to answer at a moment convenient to the respondent (e.g. Bonnel & Madre, 2006). However, according to Groves (2006), some research indicates that busy people do not seem to have a lower response rate than others. The relation between busyness and response rate is therefore unclear and should be analysed.

In the analyses we will look for three reasons for a low response rate by the CAWI:

• Little time to complete a web survey because of long working hours, long travel time, many activities related to the family or a high activity level in general
• Few resources because of no access to a computer, no knowledge of how to use one and, in general, few socio-economic resources
• Unwillingness to participate in surveys in general

Interviewer-administered surveys such as a CATI are subject to the interviewer effect, which might be overcome by a CAWI (Bonke & Fallesen, 2009; Lee & Pino, 2012). An interviewer is on the one hand able to increase the response rate by introducing the survey in a good manner, and on the other hand may reduce it through broken language, lack of politeness etc. Interviewers are furthermore able to explain difficulties in the questionnaire which are not easy to communicate in written form. However, they are always to some extent interested in getting the interview finished within an appropriate time, and some are more eager to finish than others (Christensen, 2006). Both refusals to participate and item non-response, eventually resulting in unreported trips, are very dependent on the actual interviewer (Christensen, 2006). In general, the presence or absence of an interviewer will influence the answers to certain questions (Bronner & Kuijlen, 2007).

A special interviewer effect is a soft refusal, in which the respondent reports no trips at all. This happens when the respondent is not willing to participate but does not say so directly. He will, for instance, agree to an appointment for a later call. After numerous calls and appointments he agrees to participate to avoid more calls. The soft refusal strategy could be to report no trips in order to stop the interview. The effect is a higher share of respondents with no trips (Christensen, 2006; Madre, Axhausen, & Brög, 2007). Soft refusal to please the interviewer is unlikely in a web survey, because the respondent would then probably never have started.

The choice is in fact between a higher response rate and a more correct outcome. Kojetin (1994) examined non-response in two population surveys and found that characteristics of refusers were more similar to respondents than those of non-contacts. This indicates that a lower response rate might be acceptable (Groves, 2006).

Since the available deliberation time is greater with CAWI, respondents have more time to retrieve correct answers from memory (Bronner & Kuijlen, 2007). Bonke and Fallesen (2009) show that responses to the CAWI in a mixed mode include more details compared to the CATI part. This holds for the questionnaire, where more questions were answered, for a diary with more registered activities and sequences, and for a booklet with a larger number of goods. However, Roster, Rogers, Albaum, and Klein (2004) report a higher item non-response rate for the web than for telephone. Telephone answers manifest more random measurement error (Chang & Krosnick, 2008). Research shows that answers from CAWI are slightly more accurate and reliable (Roster et al., 2004) and consistent (Braunsberger, Wybenga, & Gates, 2007) compared to interviewer-administered modes. Respondents are more willing to answer sensitive questions frankly (Lee & Pino, 2012) and their judgements about questions are more balanced regarding social problems (Bronner & Kuijlen, 2007). Furthermore, the answers of CAWI respondents show less social desirability bias and less satisficing (Chang & Krosnick, 2008).

The third question relevant for changing from an interviewer-administered survey to a CAWI or a mixed mode survey is whether it is possible to save money. Most papers show cost reductions after a change from a CATI to a CAWI survey; Roster et al. (2004), e.g., find 53% savings and Braunsberger et al. (2007) find 71%, both using a panel for sampling. However, the cost of establishing the panel is not considered in these figures. The fixed costs of developing the survey design are high and not always included in the comparison (Bonnel & Madre, 2006).

Against this background, the purpose of the paper is to cast light on whether the mixed mode survey

• brings the stated travel pattern closer to the real mobility or farther away
• makes a difference in representativeness, for instance by reaching respondents not easily accessible in a CATI
• saves money.

This will be done by

• analysing differences in response rate in the two media
• comparing travel behaviour, e.g. no trips, number of trips and trip length, for the two media
• discussing to what extent there is a relation between travel activity and response rate on the web and by CATI.

The Role of Web Interviews as Part of a National Travel Survey


6.3. Methodology

The dataset covers the period from May 2006 to the end of 2009. It includes 89,068 sampled persons, resulting in 53,500 final interviews. The response rate is 60%. The analyses are primarily based on disaggregate multivariate regression analyses carried out in SAS. The multivariate methodology has the advantage of separating different effects from each other, so that an effect attributed to one variable is not in fact caused by other explanatory variables. Results of the analyses are presented below as the estimated response values. In the current study the results of three different analyses are provided.

The first analysis is a disaggregate logistic regression focusing on the CATI and CAWI response rates. The dataset contains the sampled respondents with information about whether they answered by CAWI, CATI or not at all. The response probability is calculated with regard to demographics (i.e. age, gender and geography), the travel date and whether the respondent has a publicly available telephone number. The analyses are made stepwise. In the first step the web share is calculated in a logistic regression. In the next step respondents who have not answered online are analysed in a new logistic regression uncovering the response rate for the first day of call. Those who have still not answered are analysed for the second day and so on up to the fifth day of call. The final overall response rate is calculated by adding the estimated response rates of each day multiplied by the probability of no answer up to that date. This complicated analysis is used to uncover differences in the calling days and to account for the calling day effects.

The second analysis is conducted on the basis of the final interviews and is aimed at gaining knowledge about the underlying factors of the propensity to respond by CAWI. The share answering by CAWI is analysed by means of a disaggregate logistic regression. It is possible to use this method since the respondents are fully aware that they will be contacted by phone in case they fail to complete the survey online, and hence they are free to choose between the survey methods. Some respondents may choose to be contacted by phone (Alsnih, 2006), e.g. due to the inconvenience of completing the survey on the date assigned for online completion, while others may choose to answer online in order to avoid the hassle of being contacted by phone. It would have been interesting to base the first analysis on more socio-economic and travel-related information, which would have been possible if the data could be analysed together with register data for the sample. However, this has not been possible so far and must be left for future research.

The third analysis consists of a disaggregate multivariate regression analysis in two parts, focusing on uncovering the differences in travel behaviour between CAWI and CATI respondents. The first part comprises a logistic regression that reveals the difference in the share of respondents not travelling on the day of the survey. The second part of the analysis is conducted by means of different regression models for the travelling respondents. The number of trips per traveller, overall and by different distances, is analysed with a Poisson distribution. Trip length and duration are analysed with a log-linear model. The linear regressions are run in the SAS procedure GENMOD, the rest in the LOGISTIC procedure.
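The stepwise calculation of the overall response rate can be illustrated with a minimal sketch. It assumes that each stage (CAWI first, then each calling day) yields a response probability conditional on no answer at any earlier stage. The function name and the example rates are hypothetical and are not taken from the survey.

```python
def overall_response_rate(stage_rates):
    """Combine conditional stage response rates into one overall rate.

    stage_rates[k] is the probability of responding at stage k given
    no response at any earlier stage (stage 0 = CAWI, then call days).
    """
    overall = 0.0
    still_silent = 1.0  # probability of no answer before the current stage
    for rate in stage_rates:
        overall += still_silent * rate
        still_silent *= 1.0 - rate
    return overall


# Hypothetical conditional rates: CAWI followed by five calling days.
example_rates = [0.12, 0.40, 0.15, 0.08, 0.05, 0.03]
print(f"overall response rate: {overall_response_rate(example_rates):.3f}")
```

The same result is obtained as one minus the product of the stage-wise probabilities of no answer, which is a convenient cross-check.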


6.4. Results

6.4.1. The Response Rate Analysis

For the years 2006–2009 the response rate was 60%. Of the sampled persons, 11.8% completed the web interview, so that 19.6% of the completed interviews were made by CAWI. As shown in Table 6.1, all the considered variables and two interaction terms are highly significant in the first of the stepwise regression analyses, that for CAWI. Notably, fewer variables are significant in the consecutive steps of the phone call attempts, possibly because a growing share of those left in the calling base cannot be contacted at all, for instance because of wrong telephone numbers. As a supplement to Table 6.1, Table 6.2 shows the response rates for CAWI and for the five CATI calling days as estimated by the regression model.

Among the explanatory variables, the public availability of the telephone number has the highest effect on the CATI response rate. Notably, respondents without a publicly available telephone number are requested in the advance letter to communicate their telephone number, but only 0.5% actually respond to the request. The online response rate of these respondents is 8–9%, compared with a 13% online response rate for respondents with a known telephone number. Unwillingness to communicate the phone number may indicate a general unwillingness to participate in the NTS. Another explanation is low availability of Internet access among people without a telephone (Duffy et al., 2005). However, only a small share of people without a publicly available telephone number lack a telephone altogether; the share of households with a cell phone increased from 94% to 98% during the period (DST, 2012).

Age has the second highest effect on the response rate and the interaction with gender is highly significant. Up to the age of 60, women generally have a higher response rate than men, regardless of the survey method (i.e. CATI or CAWI). The CAWI response rate for women decreases after 60 years of age and online responses by women are rare over the age of 80. The CAWI response rate for men starts decreasing at an older age and the rate of decrease is slower. For children under 15 years of age, the response rate by CAWI is over 20%, and gender differences are small. For teenagers and young adults up to 25 years of age, the CAWI response rate decreases, but it increases again for adults between 25 and 60 years of age. The CATI response rate increases with age between 25 and 80 years of age. Like the CAWI response rate, the CATI response rate is lowest for the age group of 20–25. In fact, the response rate on the first calling attempt is only half the response rate for other age groups. Hence, both methods have difficulties in capturing these young adults. The response rate is also low for adults between 25 and 35. They respond less by web and at the first CATI attempt than other age groups.

With respect to geographical location, the response rate for both media is much lower in central Copenhagen than in the rest of the country. For the three other big university cities with more than 100,000 inhabitants (Aarhus, Aalborg, and Odense) the CAWI response rate is in line with the average rates, but the CATI response rate is slightly lower.

With respect to the response day, the estimated response rate does not differ significantly between the 4 working days for CATI, but is higher

Men Women

10–14 15–19 20–24

Age

0.0831

Gender

 0.2337

Copenhagen

3 big cities Rest of DK

Urban–rural

0.5317  0.1029  0.3608

 0.4846

0.0614*** 0.0672 0.0747***

0.0693***

0.0288**

0.0452  0.298  0.6647

 0.3089

 0.1102

 0.3482

 0.1717  0.3588  0.0977

0.0327*** 0.0342*** 0.0286

 0.2054  0.3635  0.0449

0.0368***

0.0479

0.0297**

 0.2771 0.0173

 0.1235

0.2574 0.1184

0.154 0.2351 0.1637

 0.3508  3.9656

Estimate

0.0586 0.0572*** 0.0634***

0.0542***

0.0236***

0.028***

0.0248*** 0.0254*** 0.0231***

0.0246

0.0296*** 0.0302

0.0204***

0.0342*** 0.0266***

0.0252*** 0.0209*** 0.0212***

0.0425*** 0.3193***

Standard Error

CATI 1. Day 77,649 observations

0.1084

0.0385*** 0.038***

 0.2012  0.2168

Monday Tues-Thurs Friday Saturday Sunday

0.0259**

 0.0972

Response day

0.0416** 0.0336

0.0311*** 0.0268*** 0.0269***

0.0504*** 0.0368***

Standard Error

0.1559 0.0429

Jan Feb+Mar Apr+Aug+Oct+Nov May+Jun+Sep July Dec

Month

0.2541 0.1239 0.1816

 1.6718  0.5026

Estimate

CAWI 90,997 observations

2006 2007 2008 2009

Intercept None Known

Telephone

Value

Year

Value

Parameter

0.335  0.101  0.300

 0.081

 0.091

 0.326

 0.229  0.276  0.196

0.052

 0.166 0.001

0.011

0.129 0.078

0.053 0.200 0.140

 1.197  5.105

Estimate

0.0858*** 0.0841 0.089**

0.0799

0.0348**

0.0405***

0.0382*** 0.0387*** 0.0348***

0.0352

0.0431** 0.0452

0.0301

0.0527* 0.0404

0.0378 0.031*** 0.0314***

0.0645*** 0.243***

Standard Error

CATI 2. Day 50,914 observations

0.3957 0.1594  0.1837

 0.0967

 0.0662

 0.3476

 0.0946  0.3712  0.1379

 0.0313

 0.2132  0.1908

 0.0262

0.1121 0.0222

0.1342 0.2344 0.1155

 1.6828  4.214

Estimate

0.0792*** 0.0745* 0.0801*

0.0323**

0.0466

0.0547***

0.0495 0.054*** 0.0471**

0.0487

0.0586** 0.0628**

0.0406

0.0703 0.0551

0.0508 0.0418*** 0.0429

0.0664*** 0.2049***

Standard Error

CATI 3. Day 41,611 observations

0.2541 0.1532  0.3755

 0.1205

 0.0157

 0.3312

 0.2304  0.4856  0.2178

0.0253

 0.0572 0.1578 0.1076

 2.0071  5.7952

0.1024* 0.0934 0.1042**

0.0423**

0.0604

0.0708***

0.0675** 0.0734*** 0.0616**

0.0623

0.068 0.0545** 0.055

0.0793*** 0.5778***

Standard Error

CATI 4. Day 37,029 observations Estimate

Table 6.1: Maximum-likelihood estimates for CAWI response rate and response rates at 1.–5. calling day.

0.0386 0.1083  0.2304

0.0767

 0.3048

0.0057  0.3605  0.2602

0.0355

 0.1511  0.3161

 0.0618

 0.8862  1.0746

 0.4643  1.6532  1.8119

 2.3683

Estimate

0.1966 0.1714 0.1847

0.1127

0.1351*

0.1201 0.1358** 0.1218*

0.1163

0.1292 0.1414*

0.0985

0.2532** 0.2007***

0.3091 0.1377*** 0.1466***

0.1409***

Standard Error

CATI 5. Day 21,617 observations

Age* gender

Parameter

10–14 15–19 20-24 25–29 30–34 35–39 40–44 45–49 50–54 55–59 60–64 65–69 70–74 75–79 80–84

25–29 30–34 35–39 40–44 45–49 50–54 55–59 60–64 65–69 70–74 75–79 80–84

Value

Table 6.1: (Continued )

0.0471  0.0171  0.089 0.0347 0.0476 0.2103 0.1218 0.3551

0.097 0.0974 0.0949** 0.0957*** 0.1081*** 0.1328*** 0.181*** 0.2596***

0.0332 0.166 0.2525 0.3201 0.4195 0.4912 0.5018 0.1782 0.3515 0.3355 0.4128 0.2472 0.1282 0.115

0.063 0.0637** 0.0628** 0.065 0.0758*** 0.0943*** 0.138*** 0.196***

0.0788 0.2267 0.1981 0.0015  0.3715  0.8577  1.6022  2.1878

 0.4046  0.1109  0.1344

Estimate

0.0773 0.08 0.0791 0.078 0.0833 0.0882* 0.0959 0.1054

0.0815*** 0.0796*** 0.0878*** 0.0873** 0.0806 0.0779

0.0547 0.0569** 0.0556*** 0.0552*** 0.0582*** 0.0612*** 0.0648*** 0.0696*

0.062*** 0.057 0.0553*

Standard Error

CATI 1. Day 77,649 observations

0.0911*** 0.101* 0.1155 0.115 0.1084 0.1018

0.0745** 0.0698** 0.0654

Standard Error

 0.2863  0.2475  0.1029

Estimate

CAWI 90,997 observations

Men 0.3805 Men 0.2038 Men 0.0324 Men 0.0727 Men 0.0485 Men  0.0013 Men Men 0.0571 Men 0.0755 Men 0.2926 Men 0.5383 Men 0.7816 Men 0.8133 Men 1.2501 Men 1.2429

Value

 0.251  0.095  0.112  0.041 0.014 0.318 0.176 0.332

0.095 0.323  0.020 0.050  0.057  0.028

0.218 0.133 0.140 0.175 0.112  0.077  0.232  0.592

 0.092 0.112 0.086

Estimate

0.1134* 0.1194 0.1187 0.1185 0.1294 0.1424* 0.1629 0.1858

0.118 0.1138 0.1242 0.1228 0.116 0.112

0.0814** 0.0873 0.086 0.086* 0.0929 0.1025 0.1132* 0.1264***

0.0886 0.0838 0.0814

Standard Error

CATI 2. Day 50,914 observations

0.1415 0.1096 0.0866 0.0556  0.1454  0.2371  0.5016  0.9867

 0.1334  0.0354 0.09

Estimate

0.0749 0.079 0.0791 0.0806 0.092 0.1031* 0.1199*** 0.1468***

0.0831 0.0793 0.0746

Standard Error

CATI 3. Day 41,611 observations

 0.0988 0.0298  0.0909  0.1546  0.3605  0.4162  1.0306  1.4357

 0.2876 0.0363 0.0177

Estimate

0.0985 0.1007 0.1031 0.1059 0.1214** 0.1352** 0.1806*** 0.2176***

0.1075** 0.0973 0.0946

Standard Error

CATI 4. Day 37,029 observations

 0.0772  0.0693 0.1068  0.2937  0.2403  0.3934  0.8398  1.5836

 0.4966  0.3898  0.2518

Estimate

0.178 0.1899 0.1802 0.2027 0.2119 0.2457 0.3066** 0.4286**

0.2098* 0.1995 0.1852

Standard Error

CATI 5. Day 21,617 observations

10–14 15–19 20–24 25–29 30–34 35–39 40-44 45–49 50–54 55–59 60–64 65–69 70–74 75–79 80–84

None None None None None None None None None None None None None None None

0.5522 1.0509 1.0503 0.5516 0.6613 0.4516 0.5514 0.5207 0.5517 0.4642 1.0512 0.5532 0.4823 0.556

 0.3702  1.8214  1.9731  0.3792  1.1161 0.1262  0.5697  0.3549  0.586  0.1283  2.1748  0.3679 0.4465 0.4753

Reference value shown in bold face. * Significant at 5% level, ** significant at 1% level, *** significant at 0.01% level. Note: Parameters and interactions for which estimates are not listed in the table are not significant.

Age* telephone


Table 6.2: Estimated response rates for CAWI and CATI and overall, calculated by the model. The web share of the estimated responses is shown too.

Parameter     Value                   CAWI (%)  CATI day 1 (%)  CATI day 2–5 (%)  Overall (%)  Web share (%)
Telephone     None                           9               1                 0           10             87
              Known                         13              39                16           68             19
Year          2006                          14              35                14           63             23
              2007                          13              37                15           64             20
              2008                          13              35                14           63             21
              2009                          11              32                17           60             19
Month         Jan                           15              40                10           65             23
              Feb + Mar                     14              37                11           62             22
              Apr + Aug + Oct + Nov         13              35                12           60             22
              May + Jun + Sep               12              32                14           58             21
              Jul                           11              29                13           54             21
              Dec                           11              35                12           58             19
Response day  Mon                           15              37                14           66             22
              Tue–Thu                       13              36                15           65             21
              Fri                           11              32                16           59             19
              Sat                           10              29                15           54             18
              Sun                           13              34                14           61             21
Urban–rural   Copenhagen                    10              28                15           54             19
              3 big cities                  14              33                15           62             22
              Rest of Denmark               13              35                15           63             20
Age*gender    10–14 Men                     21              37                13           72             30
              10–14 Women                   23              36                13           73             32
              15–19 Men                     11              30                21           61             18
              15–19 Women                   14              29                18           62             22
              20–24 Men                      7              24                17           49             15
              20–24 Women                   11              23                18           51             21
              25–29 Men                      8              26                18           53             15
              25–29 Women                   12              27                17           56             21
              30–34 Men                      8              29                19           57             14
              30–34 Women                   12              33                17           62             19
              35–39 Men                      9              29                20           57             16
              35–39 Women                   14              33                17           63             22
              40–44 Men                     10              29                19           58             17
              40–44 Women                   15              35                14           65             23
              45–49 Men                     11              31                17           59             19
              45–49 Women                   16              36                15           67             24
              50–54 Men                     13              32                16           61             21
              50–54 Women                   18              39                12           69             26
              55–59 Men                     15              32                15           62             24
              55–59 Women                   18              41                11           70             25
              60–64 Men                     16              36                13           65             24
              60–64 Women                   15              42                12           69             22
              65–69 Men                     14              39                11           64             22
              65–69 Women                   11              44                12           67             16
              70–74 Men                      9              44                12           66             14
              70–74 Women                    7              46                12           65             11
              75–79 Men                      7              42                11           60             12
              75–79 Women                    3              46                11           61              6
              80–84 Men                      4              40                10           54              8
              80–84 Women                    2              39                10           51              4

on Mondays for CAWI. It is significantly lower on Fridays and Saturdays for both media, and on Saturdays it is very low for CATI. On Sundays, the CAWI response rate is not significantly different from that on working days, but the CATI response rate is slightly but significantly lower. Among the CAWI respondents, 50% answer on the day after the travelling day, a quarter respond on the travelling day, and another quarter respond 2 days after. However, the shares depend on the weekday, and a higher share answers on the second day when this is a Sunday, a holiday or a Monday. It seems as if the CAWI respondents try to compensate by answering the following day when they have little time to answer on Friday and Saturday.

Holidays and seasons have a significant effect on the response rate. For both media, the response rate is high during winter and particularly low in July, which is the Danish summer holiday. Interestingly, in December, which is the month of Christmas preparations and many family activities, the response rate is significantly lower by CAWI but not by CATI.

Year effects are found for both survey response rates. The response rate in CAWI decreases over the years. The same is the case for CATI except for 2006. However, the 2006 survey started in May, which might explain why its CATI response rate is lower than that of the following year.
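As a rough check on how the columns of Table 6.2 relate to each other, the sketch below assumes that the CAWI and CATI columns are shares of the full sample, so that they sum to the overall rate, and that the web share is the CAWI share divided by the overall rate. The two rows are read from Table 6.2; the function name is illustrative only.

```python
# Rounded percentages read from two rows of Table 6.2:
# (CAWI, CATI day 1, CATI days 2-5).
rows = {
    "Telephone known": (13, 39, 16),
    "January": (15, 40, 10),
}


def summarise(cawi, cati_day1, cati_later):
    """Return the overall response rate and the web share, both in percent."""
    overall = cawi + cati_day1 + cati_later
    web_share = 100.0 * cawi / overall
    return overall, web_share


for label, shares in rows.items():
    overall, web_share = summarise(*shares)
    print(f"{label}: overall {overall}%, web share {web_share:.0f}%")
```

With these two rows the sketch gives 68% and 19% for respondents with a known telephone number and 65% and 23% for January, as in the table. Because the published figures are rounded, other rows can differ by a percentage point.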

6.4.2. Response Media Choice

Table 6.3 shows the model estimation results for the choice of response media. All variables included in the model are accepted at a 0.05 significance level. All of the variables used above except month can be included. Table 6.3 also includes the web shares estimated by the regression model. The estimated web shares are very close to the shares found in Table 6.2, with few exceptions. The share is lower for school children because, in this analysis, part of their high web-response rate is explained by the high web share of students and school children. For the elderly over 70 years of age it is higher, because part of their low web share is explained by the low web share of pensioners (pensioners comprise both the elderly and people unable to work). The web share is a little higher for people without a publicly known telephone number, which shows that the lower

Education

Job situation

Residence

Adults

Children

Income

Response day

Year

Intercept Telephone

Parameter

None Known 2006 2007 2008 2009 Monday Tues-Thurs Friday Saturday Sunday Denied Informed 0 1 2+ Single Couple House Flat Under education Pensioner Unemployed Working at home Employed 1.–10. grade High school Middle or long

Value

Value

Standard error

0.0752*** 0.1009*** 0.0352*** 0.0302 0.0303** 0.0400** 0.0324 0.0425*** 0.0437*** 0.0303*** 0.0408*** 0.0382*** 0.0320*** 0.0346** 0.1888 0.0848 0.0873*** 0.1071** 0.0723 0.0482*** 0.0377***

Estimate

 1.5578 3.4514 0.1758 0.0283 0.0867 0.1515 0.0425  0.1692  0.3346  0.4944 0.3525 0.1954  0.1797  0.0989  0.0512  0.0297  0.7413  0.4075 0.0799 0.2742 0.5661

84 19 21 19 20 19 22 21 18 16 20 15 21 21 19 17 18 20 19 21

Web share (%)

Age* Gender

Gender

Age

Parameter

10–14 15–19 20–24 25–29 30–34 35–39 40–44 45–49 50–54 55–59 60–64 65–69 70–74 75–79 80–84 Men Women 10–14 15–19 20–24 25–29 30–34 35–39 40–44 45–49 50–54 55–59 60–64

Value

Men Men Men Men Men Men Men Men Men Men Men

Value

0.1223 0.1308 0.3754 0.4200

0.2346 0.0460 0.0008 0.0235 0.0369  0.0325

 0.0446 0.0041  0.0748  0.1097  0.1523  0.5218  1.0532  1.1341  0.3288

0.2579  0.1901  0.2002  0.3800  0.3618  0.1024

Estimate

Table 6.3: Maximum-likelihood estimates for web response shares. Web shares estimated by the model are shown too.

0.1088 0.1097 0.1074** 0.1090**

0.1031 0.1139 0.1317 0.1314 0.1223 0.1141

0.0715 0.0756 0.0759 0.0817 0.1027 0.1230*** 0.1638*** 0.2250*** 0.0780***

0.1160* 0.1068 0.0975* 0.0890*** 0.0797*** 0.0739

Standard error

25 16 15 13 14 16 18 19 20 22 22

Web share (%)

No car Car owner Under education Pensioner Unemployed Working at home Employed Under education Pensioner Unemployed Working at home Employed

Employed

Working at home

Unemployed

Pensioner

Short or practical Under education

No car No car No car No car No car Car owner Car owner Car owner Car owner Car owner

1.–10. class High school Middle or long Short or practical 1.–10. class High school Middle or long Short or practical 1.–10. class High school Middle or long Short or practical 1.–10. class High school Middle or long Short or practical 1.–10. class High school Middle or long Short or practical

0.0878  0.4892  0.3701 0.1265

 0.3337

0.1617 0.1441 0.2424

0.2004 0.3985 0.6832

 0.2012  0.1839 0.0013

0.6470 0.0157 0.5995

0.0856 0.1417** 0.1113** 0.2567

0.0692***

0.2278 0.1748 0.1424

0.1193 0.1123** 0.0902***

0.1479 0.1309 0.1143

0.2139** 0.2006 0.2085**

22* 10 9 15 16 25 19 15 17 20

27 20 36 16 14 16 23 15 11 15 23 9 15 17 22 12 17 20 25 16 Urban-rural *Car

Urban-Rural

65–69 70–74 75–79 80–84 10–14 15–19 20–24 25–29 30–34 35–39 40–44 45–49 50–54 55–59 60-64 65–69 70–74 75–79 80–84 Central Cph 3 big cities Rest of DK Central Cph 3 big cities Rest of DK Central Cph 3 big cities Rest of DK

Reference values are indicated in bold. * Significant at 5% level, ** significant at 1% level, ***significant at 0.01% level. Note: Parameters and interactions for which estimates are not listed in the table are not significant.

Job situation *Car

Car

Job situation *Education

No car No car No car Car owner Car owner Car owner

Men Men Men Men Women Women Women Women Women Women Women Women Women Women Women Women Women Women Women

0.1228*** 0.1491*** 0.2029*** 0.2843*

0.3822 0.0955*** 0.1803 0.0912*

 0.2257 0.0600** 0.0808 0.0360*

0.6600 0.5817 0.8654 0.6271

17 19 15 17 22 20

25 19 15 12 27 20 19 17 17 21 23 22 23 21 21 20 15 10 9


CAWI response rate does not seem to be explained by the included socio-economic variables. Education is highly significant on its own, as is the interaction between education and employment status. Higher education and being a student are related to a substantially higher web-response share, while no education or a short or technical education is related to a low online response share. Employed respondents have a higher web-response share than retired and unemployed respondents, although the web-response shares of highly educated unemployed and retired respondents are similar to that of the employed. The interactions between car ownership and employment status and between car ownership and urbanisation class are significant too. Pensioners and unemployed respondents without a car in particular have a low web share (9–10%), whereas unemployed respondents with a car have a web share close to the mean. Respondents without a car have a web share lower than the web share found in Table 6.2 for all 3 urbanisation classes. The commuting distance has no significant effect on the web-response rates, except that respondents working at home have a significantly lower web share. Surprisingly, the web-response share is not significantly related to income. The only group with a significantly lower web share is those who did not state their income (15%). The result is possibly related to the correlation of income with education level and employment status, and to the fact that income is reported with many errors.

6.4.3. Response Media’s Effect on the Resulting Travel Behaviour

Table 6.4 gives an overview of the model estimation results for travel behaviour. Both the values calculated directly from the data and the estimated values are presented. According to the data values in Table 6.4, people responding by CAWI have 18.8% fewer days without trips than people responding by CATI. Those who travel report 8% more trips, but the trips are on average 1.9% shorter. The average time per trip is nearly the same. The combined effect of fewer respondents without trips and more trips per traveller is that the kilometres per respondent are 11.4% higher by CAWI than by CATI.

The right part of Table 6.4 shows the estimated results from the regression model. The variables included in the regression modelling are, on the one hand, a variable for the interview media and interactions between interview media and socio-economic variables and, on the other hand, all the socio-economic variables and their interactions included in the in-depth analysis above. The method is used to uncover whether the differences in travel behaviour can be explained by socio-economic variables alone or whether respondents by the two media actually behave differently. Appendix Table A1 presents the variables and interactions which are significant in the analyses of the four behavioural characteristics of travel.
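The differences quoted from the data values in Table 6.4 can be retraced with a short calculation. The sketch assumes that kilometres per respondent equal kilometres per person travelling multiplied by the share of respondents who travelled; this assumption is not stated in the text, but it is consistent with the 11.4% quoted. The helper name is illustrative only.

```python
def pct_diff(cawi, cati):
    """Percentage by which the CAWI figure exceeds the CATI figure."""
    return 100.0 * (cawi - cati) / cati


# Data values from Table 6.4 (directly calculated, not model estimates).
no_trip_share = {"CAWI": 0.130, "CATI": 0.160}
km_per_traveller = {"CAWI": 48.4, "CATI": 45.0}

# Assumed relation: kilometres per respondent are the travellers'
# kilometres scaled by the share of respondents who travelled at all.
km_per_respondent = {
    mode: km_per_traveller[mode] * (1.0 - no_trip_share[mode])
    for mode in ("CAWI", "CATI")
}

no_trips = pct_diff(no_trip_share["CAWI"], no_trip_share["CATI"])
trips = pct_diff(3.90, 3.61)
kilometres = pct_diff(km_per_respondent["CAWI"], km_per_respondent["CATI"])
print(f"no-trip share: {no_trips:.2f}%")
print(f"trips per traveller: {trips:.2f}%")
print(f"kilometres per respondent: {kilometres:.2f}%")
```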


Table 6.4: Five indicators for travelling activity calculated directly from data and estimated in multivariate regression analyses. The difference is calculated as the percentage by which CAWI figures are bigger than CATI figures.

                                        Data values                    Estimated values
Indicators for travelling activity      CAWI   CATI   Difference (%)   CAWI   CATI   Difference (%)
Share of respondents with no trips      13.0%  16.0%  −18.8            15.4%  15.4%  +0.0
Number of trips per person travelling   3.90   3.61   +8.0             3.82   3.62   +5.6
Number of trips >0 km, <2 km                                           1.06   1.10   −3.3
Number of trips ≥2 km, <6 km                                           1.18   1.03   +14.5
Number of trips ≥6 km, <10 km                                          0.42   0.40   +6.4
Number of trips ≥10 km, <20 km                                         0.53   0.49   +7.2
Number of trips ≥20 km                                                 0.63   0.60   +5.0
Average trip length                     16.1   16.5   −1.9             16.2   16.5   −2.1
Average time use per trip               22.2   22.1   +0.7             22.5   22.2   +1.4
Kilometres per person travelling        48.4   45.0   +7.6             47.7   45.3   +5.4

For the share of respondents with no trips, the interview media is not significant, neither alone nor in interaction with other variables. The observed difference in the number of immobile respondents is therefore explained by differences in socio-economic variables alone and is not related to the response media.

The estimated number of trips per traveller is 5.6% higher by CAWI than by CATI (see the right columns in Table 6.4). More but shorter trips are to be expected if respondents report all trips more carefully, because it is likely the short trips that are left out, or aggregated into one longer trip, when behaviour is reported less carefully. To uncover whether this is the case, the effect of interview media is analysed for different distance bands; the estimated results are shown in the right side of Table 6.4. When the socio-economic effects are taken into account, the number of trips of less than 2 km is 3.3% lower by CAWI, whereas the number of trips between 2 and 6 km is 14.5% higher by CAWI. For the longer distances the number is also higher by CAWI than by CATI, but the difference is smaller (5–7%). The maximum likelihood estimates for the 5 distance bands are shown in Appendix Table A2.

The estimated average trip distance is 2.1% shorter in CAWI responses than in CATI. The estimated average travel time per trip for those using the web is 1.4% higher. The mean kilometres per person travelling are estimated to be 5.4% higher by CAWI than by CATI.


The interaction between interview media and car ownership is significant in all four analyses of travel indicators and for the number of trips at all trip distances. The interactions between interview media and age, education, job situation and commuting distance, respectively, are significant for the number of trips and for some of the other travel indicators and trip distances (refer to Appendix Tables A1 and A2).

The estimated result for the interaction between response media and car ownership is remarkable. For car owners the picture follows the general results for CAWI relative to CATI. For respondents without access to a car, the number of trips is lower than for car owners except at the shortest distances under 2 km. The CAWI respondents, however, have more trips at all distances than the CATI respondents, which furthermore results in a longer mean distance.

The interaction between response media and age is significant for all trip distances except the longest and for the mean distance. For most age groups the number of trips is a little higher by CAWI than by CATI for all distances. CAWI respondents aged 70 and over travel significantly farther per trip than the CATI respondents, whereas CAWI respondents under 70 travel shorter distances.

The interaction between response media and commuting distance is significant for the number of trips. For respondents with work or school less than 50 km from home, the number of trips is higher for CAWI respondents than for CATI respondents. For long-distance commuters, however, the CAWI respondents have fewer trips.

6.5. Discussion

As mentioned in the introduction, the purpose is to uncover whether introducing the CAWI (1) makes a difference in representativeness, for instance by reaching respondents not easily accessible in a CATI, (2) brings the stated travel pattern closer to the real mobility or farther away or (3) saves money.

The combination of a CAWI and a CATI survey might on the one hand improve the data, because the quality of the CAWI data is higher (discussed above) and/or because the survey becomes more representative through contact with a broader group of Danes. On the other hand, it might bias the data negatively if the response rate is increased for groups which are already over-represented in the CATI and/or have a non-representative travel pattern.

A special problem for the analyses is that the two modes do not exclude each other: a person who does not answer by CAWI can be contacted by CATI. A low or high response rate at CAWI can therefore be offset by a higher or lower response rate at CATI. If the overall response rate is unknown, it is only possible to guess at the effect of differences in response rates at CAWI and CATI.

Most of the bias is not a real bias, because it can be corrected by up-weighting on the socio-economic variables in the administrative registers. A problem arises if the respondents causing the bias have a different travel pattern from the rest of their group, or if they cannot be identified by the relevant socio-economic variables.
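The up-weighting argument can be made concrete with a minimal post-stratification sketch. It is an illustration under stated assumptions and not the weighting procedure of the survey: the group labels, the register counts and the respondent counts are all hypothetical.

```python
def poststratification_weights(population_counts, respondent_counts):
    """Weight per group so that weighted respondents match population shares."""
    pop_total = sum(population_counts.values())
    resp_total = sum(respondent_counts.values())
    weights = {}
    for group, pop in population_counts.items():
        pop_share = pop / pop_total
        resp_share = respondent_counts[group] / resp_total
        weights[group] = pop_share / resp_share
    return weights


# Hypothetical register (population) and respondent counts.
population = {"age 20-24": 300, "age 25-69": 600, "age 70+": 100}
respondents = {"age 20-24": 20, "age 25-69": 70, "age 70+": 10}

weights = poststratification_weights(population, respondents)
for group, weight in weights.items():
    print(f"{group}: weight {weight:.2f}")
```

Under-represented groups receive weights above one. Such weights restore the composition of the sample, but they cannot correct for respondents whose travel pattern differs from that of non-respondents within the same group, which is the problem raised above.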

6.5.1. Effects from the Different Response Rates

The response rate by CAWI is very low compared to the CATI, indicating either a general preference of Danes for phone communication or a feeling of higher commitment to responding to the survey when contacted by a person. The overall low response rate by CAWI shows that this method cannot be used as a standalone method and should be used in a mixed setting with CATI. Due to the low CAWI response rate, the effect of this medium on the data representativeness is relatively small when it is used as a complementary method to CATI. Nevertheless, the data analysis shows that there are three cases in which the CAWI is particularly useful.

First, the CAWI is particularly useful because of the flexibility to answer a day later if the respondent is not available earlier. The most important bias of all NTS with a 1 day diary is the lower response rates in certain months and on certain weekdays, because people travelling over 2 or more days are less likely to be reached for an interview. However, the CAWI only compensates for brief travel activities like a weekend tour. The lack of knowledge about holidays cannot be compensated either by oversampling or by up-weighting, because those contacted in the oversample are more like those at home than those with whom there is no contact.

Second, the CAWI survey method is particularly suitable for highly educated respondents and university students. Higher education is related to better resources to own and use a computer, which is possibly the reason for a higher web share for both employed persons and persons outside the labour market. The same is the case for students in higher education. Because these groups have a travel behaviour similar to that of other groups, the high CAWI response rate does not bias the results. It might even be that some answer by CAWI who could not have been contacted by CATI.

Third, the CAWI is very relevant to schoolchildren and teenagers under 20 years of age, who have a high response rate on CAWI. This is likely due to the wide exposure of children to technology these days, their high familiarity with the medium and their relatively abundant time resources.

The CATI method is superior in three cases. The CATI is more useful than the CAWI for contacting elderly respondents. For the elderly above 70, a lower web-response rate is obviously due to a lack of access to a computer and of the skills to use one. The overall response rate is not that low except for the rather few sampled persons over 80 years. The trip length is higher for the elderly CAWI respondents than for the elderly CATI respondents. This indicates that the elderly answering by CAWI are still fully active, which possibly induces some bias for this age group, so the mixed mode is essential for gathering complete information for this age group.

The CATI is more useful than the CAWI for contacting respondents who are not inclined to answer by CAWI. This might be because of being busy over a longer period, as illustrated by a lower CAWI response rate in the busy month of December combined with an average response rate by CATI. These respondents apparently feel committed to responding to the survey when contacted by telephone, as opposed to by a letter, which is easier to neglect. The same is the case for persons with a low willingness to participate in the survey in general, as indicated by declining to answer about income. However, a lower response

134

Linda Christensen

rate by CAWI does not seem to bias the result because they do not have a significant different travel behaviour. The CATI is more useful than the CAWI for contacting respondents with few socio-economic resources. Indicators for these people who are known to have a low response rate in all surveys (Groves, 2006) are no or only a short education, unemployment or young pensioner, no car, and eventually living in the big cities. They travel shorter than other respondents. They have seldom access to the Internet and are not able to use the web even when they have the time. As shown by respondents without access to a car, respondents from this group who answer by CAWI often travel longer than respondents answering by CATI. Including the CAWI might therefore bias the survey a little extra because it is not easy to up-weight the travel pattern with shorter trips by a demographic weighting method taking care of the special characteristics of this group. Notably, there are two cases in which both methods have a particularly low response rates. Both methods yield very low response rates for young adults in their 20s. The CATI response rate is very low and they are less easily contacted by telephone because 18–19% have unknown telephone number which partly adds to the problem. Because of the very low response rate, oversampling has been decided from 2012. Five to 10 years old research (e.g. Groves, 2006) indicates a higher online response rate for young men under 35. The opposite is the case today, at least in Denmark. School children are more advanced in Internet use and have high response rates by CAWI. Furthermore women answer more than men by web. During holiday periods, especially in July, the response rates are particularly low with both methods due to increased travel abroad and to a summer cottage for a relatively long period of time. Travels with more than one or two nights stay outside home are as mentioned under-represented.

6.5.2. Effect of Web Response on the Travel Behaviour

The results show that there is a difference in data between CAWI and CATI in the share of immobile respondents. However, the difference is explained solely by differences in socio-economic variables. The share of respondents with ‘no trips’ is an important indicator of the quality of the interview process and of the interviewer effect in a CATI, as shown by Christensen (2006). The former problems with a massive interviewer effect seem to have been overcome by very active control of the interviewers. Another former problem was soft refusers, who seemed to over-report ‘no trips’ when they were persuaded to answer by many calls and insistent interviewers. It is obvious that people who have started the web questionnaire do not feel compelled to answer, and therefore they will not refuse to report their first trip. It therefore seems as if the former over-representation of ‘no trips’ because of soft refusers is overcome in the CAWI. However, it might still be possible that the no-trip rate is too high in both media because the respondents get tired of answering before they start the diary. A lower trip rate is likely in both media when the respondents find out that the answers are rather complicated and involve reporting many addresses.

The number of trips is a little higher by CAWI than by CATI except for the shortest trips. This might be explained by some more complicated mechanisms. For the shortest trips, the interviewers increase the number of trips because they are better able to explain when a chain of short trips has to be split into several separate trips, which is especially relevant for a shopping trip with visits to several shops, possibly combined with non-shopping purposes. For longer trips, especially between 2 and 10 km, it seems as if the CAWI respondents are more careful to report all trips, whereas the interviewers give the respondents too little time, resulting in an under-reporting of trips. Bonke and Fallesen (2009) also find more careful reporting of activities and spending by CAWI respondents in a time use and consumption survey. The extra trips by CAWI improve the survey results by bringing the travel distance closer to the real behaviour of the population. The lower number of short trips is less important because the travel activity is still included.

CAWI respondents might also be a little more travel active than those answering by CATI, as indicated by a slightly higher number of trips at longer distances, which should not easily be forgotten in a CATI. This can be due both to a generally higher travel activity among CAWI respondents and to the fact that respondents who travel a lot choose to answer by CAWI because they are not so easily contacted by telephone. The 30–39 age group could be a good example of the latter case. They have a relatively higher trip rate by CAWI than the rest and a low response rate by CATI, especially at the first telephone call. They are therefore not easy to contact. In such cases the CAWI brings contact with people who would not have been contacted if only a CATI were included.
Bayart and Bonnel (2012) find a lower number of trips per respondent in a CAWI (3.00) than in a CAPI (4.04) for the Lyon area. They suggest that one reason is under-reporting in the CAWI because people are busy and do not take the time to answer correctly. However, the CAWI respondents are non-respondents to the CAPI and have a higher income, longer education, etc. relative to the CAPI respondents. It is therefore more in accordance with the Danish survey that the CAWI group over-represents busy respondents with fewer but longer trips. The results from Bayart and Bonnel (2012) can easily be interpreted that way instead of as under-reporting by CAWI respondents.1

Another group that deserves special attention is those commuting long distances. They have a lower trip rate by CAWI, which indicates that those who are able to take the time to answer by CAWI are those who make fewer trips. By introducing the mixed mode it seems as if those commuting with most trips are under-represented, which biases the result. This bias cannot be compensated for by a demographic up-weighting procedure.

The difference in mean kilometres per trip between the CAWI and CATI is small and in most cases insignificant. However, for the elderly above 70 there is a significant difference. The elderly answering by CAWI make longer trips but do not travel more often. The difference in behaviour of the elderly answering by the two media might not bias the result because it should be expected that they would answer by CATI if the CAWI response possibility did not exist.

1. The CAWI respondents have more long trips, typically by car, starting in the morning and arriving home late. They seem to have less time for walking trips for lunch and for shopping in the afternoon.

6.5.3. Economy and Post-Processing

At first glance, web interviews save a great deal of money. According to the result of the tender with the survey firm, the marginal cost in 2009 was 113 DKK (15 EUR) for a telephone interview and 7.50 DKK (1 EUR) for a web interview. However, the necessary post-processing of the CAWI is much more time-consuming than that of the CATI. It is assessed that post-processing takes on average 9 minutes per handled interview, and that half of this time is used on the web interviews even though they make up only 20% of the interviews. The two most common problems are stages of one journey being reported by the self-administering respondent as separate trips, and the need to post-process addresses to get them correctly geocoded; the interviewers, by contrast, are skilled in finding pre-coded addresses. The resulting marginal price is therefore 31 DKK (4.15 EUR) per CAWI interview compared with 119 DKK (15.90 EUR) per CATI interview.
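The arithmetic behind these prices can be retraced with a short sketch. It uses only the figures quoted above; the cost of one minute of post-processing is not stated in the text, so it is backed out here from the CAWI figures as an assumption and then checked against the quoted CATI price. All names in the code are illustrative.

```python
# Figures quoted in the text (2009 prices, DKK per interview).
BASE_COST = {"CATI": 113.0, "CAWI": 7.50}      # marginal interview cost
QUOTED_TOTAL = {"CATI": 119.0, "CAWI": 31.0}   # marginal price incl. post-processing
MEAN_MINUTES = 9.0                             # mean post-processing time per interview
INTERVIEW_SHARE = {"CATI": 0.80, "CAWI": 0.20} # share of all interviews
TIME_SHARE = {"CATI": 0.50, "CAWI": 0.50}      # share of all post-processing time


def minutes_per_interview(mode: str) -> float:
    """Post-processing minutes per interview of the given mode."""
    return MEAN_MINUTES * TIME_SHARE[mode] / INTERVIEW_SHARE[mode]


# Assumption (not stated in the chapter): a single cost per minute,
# derived from the difference between the quoted and the base CAWI price.
rate_per_minute = (QUOTED_TOTAL["CAWI"] - BASE_COST["CAWI"]) / minutes_per_interview("CAWI")

# Consistency check: apply the same rate to the CATI post-processing time.
implied_cati_total = BASE_COST["CATI"] + rate_per_minute * minutes_per_interview("CATI")

print(f"CAWI post-processing: {minutes_per_interview('CAWI'):.1f} min per interview")
print(f"CATI post-processing: {minutes_per_interview('CATI'):.1f} min per interview")
print(f"Implied rate: {rate_per_minute:.2f} DKK per minute")
print(f"Implied CATI price: {implied_cati_total:.0f} DKK (quoted: 119 DKK)")
```

Under this assumption a web interview needs about four times as much post-processing time as a telephone interview, and the implied CATI price rounds to the quoted 119 DKK.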

6.6. Conclusion

The CAWI respondents report a higher number of trips, especially from 2 to 10 km, very likely because the respondents are more careful when filling in the questionnaire. Including the CAWI in the survey therefore improves the survey. For the shortest distances of less than 2 km, the CATI has the advantage because the interviewers are able to explain the complicated definition of a trip. However, the lack of short trips only influences the number of trips, not the resulting travel distances reported by the survey. By combining the two media, slightly more people seem to be included in the survey, especially some who are not so easy to contact because they are travelling or are not at home at normal calling times.

The analysis shows that the CAWI survey is particularly useful in three cases. First, because of its flexibility in the response day, the CAWI interview makes it possible to gather information about respondents who have been away for a couple of days or did not have time to complete the survey on a specified date. Second, the CAWI survey method is particularly suitable for highly educated respondents and university students. Third, the CAWI method is particularly useful for increasing the response rate of schoolchildren and teenagers.

However, the results of the current study show that in some cases the CATI survey is preferred. Specifically, the CATI is preferred for collecting information regarding the elderly population and people with low socio-economic status. In addition, the CATI survey proved to be more useful for busy people and people not willing to participate in a survey at all.

Both methods have low response rates in the case of young adults in their 20s and in holiday periods. The first problem can be solved by oversampling; the second can only be solved by a changed design of the survey.

By including the CAWI, the survey is biased a little in relation to travellers with long commuting distances, without a car, and with low socio-economic status. These biases can be removed only with difficulty by an up-weighting procedure based on central register data.

The current study shows that in Denmark the response rate by CAWI of a sample drawn from the central register and contacted by letter is relatively small in comparison with CATI. Hence, efforts should be made to explore the reasons for the low response rate and to increase participation, because of the proven advantage of the CAWI in increasing quality.

In terms of costs, supplementing with a CAWI saves money in an NTS with a high number of interviews with respondents sampled by a representative sampling method. Per interview completed on the web, a 74% saving of the marginal cost is realised in the Danish case, and for the survey as a whole, with 20% of the interviews as CAWI responses, the saving is 15%. The fixed cost of running the survey is not included.
However, if reminders are used, the saving per marginal extra CAWI respondent is much smaller, and with a response rate below 10% it might be negligible; the decision to choose this solution then depends on the other advantages of extra CAWI responses.
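The two saving percentages follow directly from the marginal prices reported in Section 6.5.3 (119 DKK per CATI interview and 31 DKK per CAWI interview, both including post-processing). The sketch below retraces them; the function names are illustrative, and fixed costs and reminder costs are left out, as in the text.

```python
CATI_PRICE = 119.0  # DKK per telephone interview, incl. post-processing
CAWI_PRICE = 31.0   # DKK per web interview, incl. post-processing


def saving_per_web_interview(cati_price: float, cawi_price: float) -> float:
    """Relative marginal saving when one interview is done by web instead of telephone."""
    return (cati_price - cawi_price) / cati_price


def saving_for_survey(cati_price: float, cawi_price: float, web_share: float) -> float:
    """Relative marginal saving of a mixed-mode survey compared with a CATI-only survey."""
    mixed_price = web_share * cawi_price + (1.0 - web_share) * cati_price
    return (cati_price - mixed_price) / cati_price


per_interview = saving_per_web_interview(CATI_PRICE, CAWI_PRICE)
whole_survey = saving_for_survey(CATI_PRICE, CAWI_PRICE, web_share=0.20)

print(f"Saving per web interview: {per_interview:.0%}")
print(f"Saving for the survey with 20% web interviews: {whole_survey:.0%}")
```

Because the survey-level saving is simply the web share times the per-interview saving, it scales linearly with the share of interviews completed on the web.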

Acknowledgements

The author wants to thank Joyce Dargay, Sigal Kaplan, and Carsten Jensen for advice, help and input.

References

Alsnih, R. (2006). Characteristics of web based surveys and applications in travel research. In P. Stopher & C. Stecher (Eds.), Travel survey methods: Quality and future directions (pp. 569–592). UK and the Netherlands: Elsevier.
Bayart, C., & Bonnel, P. (2012). Combining web and face-to-face in travel surveys: Comparability challenges. Transportation. doi:10.1007/s11116-012-9393-x
Bech, M., & Kristensen, M. B. (2009). Differential response rates in postal and web-based surveys among older respondents. Survey Research Methods, 3, 1–6.
Bonke, J., & Fallesen, P. (2009). The impact of incentives and interview methods on response quantity and quality in diary- and booklet-based surveys. Study Paper No. 25. University Press of Southern Denmark, Odense (ISBN 978-87-90199-21-0).
Bonnel, P. (2003). Postal, telephone and face-to-face surveys: How comparable are they? In P. R. Stopher & P. M. Jones (Eds.), Transport survey quality and innovation (pp. 215–237). London: Elsevier.
Bonnel, P., & Madre, J.-L. (2006). New technology: Web based. In P. Stopher & C. Stecher (Eds.), Travel survey methods: Quality and future directions (pp. 593–603). UK and the Netherlands: Elsevier.
Bouhuijs, I. (2011). The Dutch travel survey: From pen-and-paper to mixed mode. Retrieved from http://shanti.inrets.fr/index.php?id=19&tx_airfilemanager_pi1[path]=Presentations%2FM6%20Presentations%20in%20Eindhoven%2FWorking%20Group%204
Braunsberger, K., Wybenga, H., & Gates, R. (2007). A comparison of reliability between telephone and web-based surveys. Journal of Business Research, 60(7), 758–764. doi:10.1016/j.jbusres.2007.02.015
Bronner, F., & Kuijlen, T. (2007). The live or digital interviewer: A comparison between CASI, CAPI and CATI with respect to differences in response behaviour. International Journal of Market Research, 49(2), 167–190.
Buelens, B., & van den Brakel, J. (2011). Inference in surveys with sequential mixed-mode data collection. Discussion Paper No. 201121, Statistics Netherlands, The Hague/Haarlem. Retrieved from http://www.cbs.nl/NR/rdonlyres/C70ED9D5-6199-4E27-B37D-792A161D614D/0/2011x1021.pdf
Chang, L., & Krosnick, J. A. (2008). National surveys via RDD telephone interviewing vs. the Internet: Comparing sample representativeness and response quality. Public Opinion Quarterly, 73, 641–678. doi:10.1093/poq/nfp075
Christensen, L. (2006). Possible explanations for an increasing share of no-trip respondents. In P. Stopher & C. Stecher (Eds.), Travel survey methods: Quality and future directions (pp. 303–316). UK and the Netherlands: Elsevier.
Christensen, L. (2011). Experience with web interview as part of a NTS survey. Paper presented at a workshop of the Cost Shanti Action, TUD Action TU0804–Survey Harmonisation with New Technologies Improvement, 13–15 April, Vienna. Retrieved from http://shanti.inrets.fr/index.php?eID=tx_nawsecuredl&u=0&file=fileadmin/Documents%20Center/Presentations/M5%20Presentations%20in%20Vienna/Working%20Group%202/Christensen.pdf&t=1352940648&hash=75e5e3b2b47b32d616a947be95004cf0
Dixon, J. (2002). Nonresponse bias in the consumer expenditure quarterly survey. American Association for Public Research 2002: Strengthening our Community – Section on Survey Research Methods. Available at http://www.amstat.org/sections/srms/proceedings/y2002/Files/JSM2002-000410.pdf
DST, Statistics Denmark. (2012). Retrieved from http://www.statistikbanken.dk/statbank5a/default.asp?w=830
Duffy, B., Smith, K., Terhanian, G., & Bremer, J. (2005). Comparing data from online and face-to-face surveys. International Journal of Market Research, 47(6), 615–639.
Fleming, C. M., & Bowden, M. (2009). Web-based surveys as an alternative to traditional mail methods. Journal of Environmental Management, 90(1), 284–292. doi:10.1016/j.jenvman.2007.09.011
Groves, R. M. (2006). Nonresponse rates and nonresponse bias in household surveys. Public Opinion Quarterly, 70, 646–675. doi:10.1093/poq/nfl033
Kloek, W., & van der Valk, J. (2011). Reflections on web based data collection in a mixed mode design: The case of the EU Labour Force Survey. Workshop on Multimode data collection. Luxembourg, 22–23 September.
Kojetin, B. A. (1994). Characteristics of nonrespondents to the current population survey (CPS) and consumer expenditure interview survey (CEIS). Proceedings of the Survey Research Methods Section of the American Statistical Association, Alexandria, VA. Available at http://www.amstat.org/sections/srms/Proceedings/
Kraan, T., van den Brakel, J., Buelens, B., & Huys, H. (2010). Social desirability bias, response order effect and selection effects in the new Dutch Safety monitor. Discussion Paper 10004. Statistics Netherlands, The Hague/Haarlem. Retrieved from http://www.cbs.nl/NR/rdonlyres/7DB17DBF-3FEA-47D9-9B1C-1A1F50410F81/0/201004x10pub.pdf
Lee, C., & Pino, J. (2012). Hang up the phone and get online: Measuring effectiveness of web-based surveys in transportation. Proceedings of the 91st Annual Meeting of the Transportation Research Board, 22–26 January, Washington, DC.
Mabit, S. L., & Fosgerau, M. (2011). Demand for alternative-fuel vehicles when registration taxes are high. Transportation Research D, 16, 225–231. doi:10.1016/j.trd.2010.11.001
Madre, J.-L., Axhausen, K. W., & Brög, W. (2007). Immobility in travel diary surveys. Transportation, 34(1), 107–128. doi:10.1007/s11116-006-9105-5
Manfreda, K. L., Bosnjak, M., Berzelak, J., Haas, I., & Vehovar, V. (2008). Web surveys versus other survey modes: A meta-analysis comparing response rates. International Journal of Market Research, 50(1), 79–104.
Roster, C. A., Rogers, R. D., Albaum, G., & Klein, D. (2004). A comparison of response characteristics from web and telephone surveys. International Journal of Market Research, 46(3), 359–374.
Shin, E., Johnson, T. P., & Rao, K. (2011). Survey mode effects on data quality: Comparison of web and mail modes in a US National Panel Survey. Social Science Computer Review, 1–17. doi:10.1177/0894439311404508. Retrieved from http://ssc.sagepub.com
Stopher, P., Zhang, Y., Armoogum, J., & Madre, J.-L. (2012). National household travel surveys: The case for Australia. Australasian Transport Research Forum 2011 Proceedings, 28–30 September, Adelaide, Australia. Available at http://www.atrf11.unisa.edu.au/Assets/Papers/ATRF11_0221_final.pdf
Transportvaneundersøgelsen, TU. (2011). Unpublished raw data of The Danish National Travel Survey. Retrieved from http://www.dtu.dk/centre/Modelcenter/TU.aspx
TU questionnaire. (2012). A test version can be accessed at the address. Retrieved from http://tu2012.dk/dev/starttest.php (the year in the address changes every new year)

Appendix

Table A1: Maximum-likelihood estimates for some travel indicators.

Parameter  Value               Degrees of freedom  Number of trips       Mean trip distance    Mean travel time      Travel distance per traveller
                                                   Estimate  Std. error  Estimate  Std. error  Estimate  Std. error  Estimate  Std. error
Intercept                      1                   1.6379    0.0254***   3.8285    0.0522***   3.6646    0.0418***   4.9759    0.0329***
Age        10–19               1                   -0.1363   0.0224***   -0.0036   0.0816      0.0673    0.0472      -0.1224   0.0496*
           20–24               1                   0.061     0.0208**    0.0793    0.077       0.0555    0.0477      0.0469    0.0353
           25–29               1                   -0.1211   0.0193***   -0.0308   0.0723      -0.0699   0.0469      0.0164    0.0291
           30–39               1                   -0.1127   0.014***    -0.096    0.0543      -0.086    0.0364*     -0.0333   0.0206
           40–44               0                   0         0           0         0           0         0           0         0
           45–59               1                   -0.1765   0.0138***   -0.0132   0.0503      0.0033    0.0336      0.0132    0.0188
           60–69               1                   -0.2074   0.0174***   0.135     0.0616*     0.1137    0.0394**    -0.0269   0.026
           70–74               1                   -0.2524   0.0252***   -0.0952   0.102       0.0553    0.0572      -0.1542   0.0539**
           75–79               1                   -0.3077   0.0284***   -0.3433   0.148*      -0.1033   0.0716      -0.2523   0.0703**
           80–84               1                   -0.3441   0.0359***   -0.1308   0.1844      0.0085    0.0842      -0.4338   0.1135**
Gender     Men                 1                   -0.1582   0.0156***   0.1991    0.0492***   0.0908    0.0378*     0.114     0.0127***
           Women               0                   0         0           0         0           0         0           0         0
Education  1–10 grade          1                   -0.1001   0.0156***   0.0406    0.0414      0.015     0.0221      -0.0086   0.0293
           High school         1                   0.042     0.0101***   0.0248    0.0287      0.042     0.018*      -0.0321   0.0198
           Middle or long      1                   0.0091    0.0084      0.0603    0.0224**    0.076     0.0152***   -0.0067   0.0151
           Short or practical  0                   0         0           0         0           0         0           0         0
Income     Denied              1                   -0.1073   0.0097***   -0.0383   0.0256      0.0084    0.0185      -0.1266   0.0189***
           <12,000 kr          1                   -0.1767   0.1001      0.0179    0.061       0.0806    0.0373*     -0.0991   0.051
           12–100,000 kr       1                   -0.09     0.0297**    -0.2241   0.0503***   -0.0731   0.0291*     -0.2179   0.0373***
           100–250,000 kr      1                   0.0414    0.0081***   -0.1174   0.025***    -0.0511   0.0172**    -0.1417   0.0173***
           >250,000 kr         0                   0         0           0         0           0         0           0         0
Children   0                   1                   -0.0923   0.0079***   0.1086    0.0256***   0.1415    0.0191***
           1                   1                   -0.0172   0.0079*     -0.0077   0.0281      0.0292    0.0205
           2+                  0                   0         0           0         0           0         0

Age*gender

Adults

Residence

Commuting distance

Travel year

Travel day

Urban-rural

Car

Job situation

Under education Pensioner Unemployed Employed No car Car owner Central Cph 3 big cities Rest of DK Monday Tues-Thurs Friday Saturday Sunday 2006 2007 2008 2009 No commute 0–9 km 10–19 km 20–49 km 50– km House Flat Single Couple 10–19 10–19 20–24 20–24 25–29 25–29

Men Women Men Women Men Women

1 1 1 0 1 0 1 1 0 1 0 1 1 1 1 1 1 0 1 1 1 1 0 1 0 1 0 1 0 1 0 1 0 0.0977 0  0.0778 0 0.2037 0

0.008  0.0644  0.0594 0  0.1107 0  0.0456  0.0452 0  0.0881 0 0.7106  0.1249  0.1989  0.0145  0.0694  0.0995 0 0.1136 0.0275  0.0239  0.0271 0

0.02*** 0 0.0262** 0 0.0256*** 0

0.0696 0.038 0.0296* 0 0.0106*** 0 0.012** 0.0081*** 0 0.0412* 0 0.0335*** 0.0412** 0.0466*** 0.0075 0.0064*** 0.0066*** 0 0.0255*** 0.0226 0.0243 0.0248 0

 0.0682 0  0.0651 0  0.1418 0  0.0816 0

 1.1034  1.6198  1.0526  0.515 0

 0.1421  0.2354  0.1793 0  0.2289 0  0.1108 0.0752 0 0.0269 0  0.0261  0.4193  0.4376

0.023** 0 0.0735 0 0.0894 0 0.0866 0

0.0524*** 0.0443*** 0.0443*** 0.0337*** 0

0.1261 0.0808** 0.069** 0 0.0496*** 0 0.0399** 0.0236** 0 0.0467 0 0.0509 0.0657*** 0.0747***

 0.0588 0  0.0341 0  0.1351 0  0.0235 0

0.0213 0  0.0098  0.2289  0.3503 0.0529 0.0217 0.0231 0  0.7947  1.0496  0.6747  0.316 0

 0.0677  0.127  0.0918 0 0.1264 0

0.0155** 0 0.049 0 0.061* 0 0.0617 0

0.0456 0 0.0493 0.0543*** 0.0672*** 0.0174** 0.0151 0.0153 0 0.0414*** 0.033*** 0.0363*** 0.0328*** 0

0.0542 0.0394** 0.0381* 0 0.0264*** 0

 0.2267  0.2448  0.2848 0  0.2919 0  0.1604 0.0465 0  0.0202 0  0.0257  0.5962  0.7264  0.0215 0  0.0577 0  0.9305  1.4399  0.8943  0.4661 0 0.043 0

0.0923* 0.0585*** 0.0522*** 0 0.038*** 0 0.0318*** 0.0175** 0 0.0341 0 0.0363 0.0552*** 0.0686*** 0.0179 0.0149 0.0155** 0 0.0373*** 0.0305*** 0.03*** 0.025*** 0 0.0179* 0

Education* Job situation

Urban-rural* Car

Parameter

Value

Men Women Men Women Men Women Men Women Men Women Men Women Men Women No car Car owner No car Car owner No car Car owner Under education Pensioner Unemployed Employed Under education Pensioner Unemployed

Value

30–39 30–39 40–44 40–44 45–59 45–59 60–69 60–69 70–74 70–74 75–79 75–79 80–84 80–84 Central Cph Central Cph 3 big cities 3 big cities Rest of DK Rest of DK 1–10 grade 1–10 grade 1–10 grade 1–10 grade High school High school High school

Table A1: (Continued )

1 0 0 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 0 0 1 1 1 0 1 1 1

Degree of freedom

0.0492 0 0 0 0.1419 0 0.1345 0 0.143 0 0.2048 0 0.1167 0 0.0545 0 0.0497 0 0 0  0.2895 0.041 0.0465 0  0.4278  0.0761  0.0551

Estimate

0.0196* 0 0 0 0.0183*** 0 0.0208*** 0 0.0299*** 0 0.0361*** 0 0.0475* 0 0.0195** 0 0.0193* 0 0 0 0.0371*** 0.032 0.023* 0 0.0327*** 0.028** 0.0221*

Standard Error

Number of trips

0.0515 0.0091  0.0639 0  0.3679  0.0897  0.0027

0.0326 0 0 0  0.0498 0  0.1666 0  0.1074 0 0.2546 0  0.4564 0

Estimate

0.1425 0.1353 0.1553 0 0.141** 0.1103 0.0957

0.0638 0 0 0 0.0569 0 0.0659* 0 0.1135 0 0.1616 0 0.2599 0

Standard Error

Mean trip distance

0.0366 0 0 0  0.0378 0  0.0919 0  0.0832 0 0.211 0  0.1783 0

Estimate

0.0482 0 0 0 0.0434 0 0.0477 0 0.0699 0 0.085* 0 0.1168 0

Standard Error

Mean travel time

 0.1673  0.0155 0.011 0  0.2952  0.0863 0.0448

Estimate

0.0992 0.0883 0.0993 0 0.105** 0.0848 0.0715

Standard Error

Travel distance per traveller

Travel day* Commuting distance

High school Middle or long Middle or long Middle or long Middle or long Short or practical Short or practical Short or practical Short or practical Monday Monday Monday Monday Monday Tues-Thurs Tues-Thurs Tues-Thurs Tues-Thurs Tues-Thurs Friday Friday Friday Friday Friday Saturday Saturday Saturday Saturday Saturday Sunday Sunday Sunday Sunday

Employed Under education Pensioner Unemployed Employed Under education Pensioner Unemployed Employed No commute 0–9 km 10–19 km 20–49 km 50– km No commute 0–9 km 10–19 km 20–49 km 50– km No commute 0–9 km 10–19 km 20–49 km 50– km No commute 0–9 km 10–19 km 20–49 km 50– km No commute 0–9 km 10–19 km 20–49 km

0 1 1 1 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 1 1 1 1 0 1 1 1 1 0 1 1 1 1

0  0.3478 0.0626 0.066 0 0 0 0 0 0.043 0.0499 0.0244 0.0226 0 0 0 0 0 0  0.7414  0.6664  0.6825  0.7352 0  0.0529 0.0399 0.0092 0.0144 0  0.0808  0.0745 0.004  0.0639

0 0.0368*** 0.0253* 0.0191** 0 0 0 0 0 0.0435 0.0425 0.0461 0.0468 0 0 0 0 0 0 0.0367*** 0.0352*** 0.0398*** 0.041*** 0 0.0441 0.0428 0.0472 0.0478 0 0.0493 0.0482 0.0518 0.0534

0  0.0677 0.2078  0.0484 0 0 0 0 0  0.0097 0.0077 0.0231  0.1814 0 0 0 0 0 0 0.0177 0.1894 0.2408  0.0426 0 0.6681 1.0256 0.6071 0.0846 0 0.9698 1.3173 0.7858 0.3658

0 0.0852 0.0772** 0.0718 0 0 0 0 0 0.0774 0.082 0.0848 0.0695** 0 0 0 0 0 0 0.0852 0.0841* 0.0847** 0.0734 0 0.0864*** 0.0833*** 0.0957*** 0.092 0 0.0883*** 0.0872*** 0.0959*** 0.0919***  0.0129 0.0062 0.006  0.1528 0 0 0 0 0 0 0.0103 0.1146 0.0801  0.0541 0 0.3873 0.5493 0.2585  0.0791 0 0.7171 0.8538 0.4954 0.1863

0.0582 0.0571 0.0647 0.0608* 0 0 0 0 0 0 0.0632 0.0606 0.0693 0.0648 0 0.0645*** 0.0622*** 0.0741** 0.0736 0 0.0736*** 0.0724*** 0.0803*** 0.0812*

0  0.0538 0.0809 0.0388 0 0 0 0 0  0.0164 0.0219  0.0318  0.1279 0 0 0 0 0 0 0.0275 0.1919 0.164  0.0287 0 0.6317 0.9986 0.5043 0.1715 0 0.8443 1.1693 0.6808 0.3869

0 0.0672 0.0668 0.0554 0 0 0 0 0 0.0548 0.054 0.0568 0.048** 0 0 0 0 0 0 0.0579 0.0543** 0.0557** 0.0501 0 0.0702*** 0.0652*** 0.075*** 0.0718* 0 0.0795*** 0.0763*** 0.083*** 0.0814***

Income* Job situation

Job situation* Car

Job situation* Commuting distance

Parameter

Value

50– km No commute 0–9 km 10–19 km 20–49 km 50– km No commute 0–9 km 10–19 km 20–49 km 50– km No car Car owner No car Car owner No car Car owner No car Car owner Under education Pensioner Unemployed Employed Under education Pensioner Unemployed

Value

Sunday Under education Under education Under education Under education Under education Employed Employed Employed Employed Employed Under education Under education Pensioner Pensioner Unemployed Unemployed Employed Employed Denied Denied Denied Denied <12,000 kr <12,000 kr <12,000 kr

Table A1: (Continued )

0 1 1 1 1 0 0 0 0 0 0 1 0 1 0 1 0 0 0 1 1 1 0 1 1 1

Degree of freedom

0.099  0.0291 0.0024 0 0.1904 0.1043 0.1719

0 0.0847 0.1893 0.1026 0.395 0 0 0 0 0 0

Estimate

0.0553 0.0387 0.0275 0 0.1142 0.1186 0.2181

0 0.0681 0.0419*** 0.0463* 0.0463*** 0 0 0 0 0 0

Standard Error

Number of trips

0.1214 0  0.1638 0  0.4137 0 0 0

0

Estimate

0.0794 0 0.1478 0 0.1176** 0 0 0

0

Standard Error

Mean trip distance

0  0.024 0.0492 0.1904 0.0936 0 0 0 0 0 0  0.0365 0  0.1039 0  0.1157 0 0 0

Estimate

0 0.1331 0.0571 0.0654** 0.0645 0 0 0 0 0 0 0.0416 0 0.0659 0 0.0459* 0 0 0

Standard Error

Mean travel time

0  0.0446 0.1681 0.2396 0.2184 0 0 0 0 0 0 0.0904 0  0.3213 0  0.3619 0 0 0

Estimate

0 0.1909 0.069* 0.0791** 0.0719** 0 0 0 0 0 0 0.0635 0 0.1316* 0 0.0978** 0 0 0

Standard Error

Travel distance per traveller

Age* interviewtype

Car* interviewtype

<12,000 kr 12–100,000 kr 12–100,000 kr 12–100,000 kr 12–100,000 kr 100–250,000 kr 100–250,000 kr 100–250,000 kr 100–250,000 kr >250,000 kr >250,000 kr >250,000 kr >250,000 kr No car No car Car owner Car owner 10–19 10–19 20–24 20–24 25–29 25–29 30–39 30–39 40–44 40–44 45–59 45–59 60–69 60–69 70–74 70–74

Employed Under education Pensioner Unemployed Employed Under education Pensioner Unemployed Employed Under education Pensioner Unemployed Employed CAWI CATI CAWI CATI CAWI CATI CAWI CATI CAWI CATI CAWI CATI CAWI CATI CAWI CATI CAWI CATI CAWI CATI

0 1 1 1 0 1 1 1 0 0 0 0 0 1 0 1 0 1 0 1 0 1 0 1 0 0 0 1 0 1 0 1 0

0 0.2733 0.0122  0.0105 0 0.0442  0.0548  0.0723 0 0 0 0 0  0.2422 0  0.2702 0 0.081 0  0.0272 0 0.0706 0 0.1193 0 0 0 0.0836 0 0.036 0 0.0193 0

0 0.0622*** 0.0588 0.041 0 0.0569 0.0359 0.0247** 0 0 0 0 0 0.0438*** 0 0.0391*** 0 0.0418 0 0.038 0 0.0335* 0 0.0241*** 0 0 0 0.0221** 0 0.0287 0 0.051 0 0.1733 0  0.0778 0  0.0583 0 0.0059 0  0.1522 0  0.05 0 0 0 0.0324 0  0.0701 0 0.358 0

0.0989 0 0.0585 0 0.1456 0 0.1288 0 0.1306 0 0.079 0 0 0 0.0672 0 0.0873 0 0.1687* 0

0.1822 0 0.0134 0

0.0528** 0 0.0399 0

0.0775 0  0.0449 0

0.0685 0 0.0319 0

Commuting distance* interviewtype

Job situation* interviewtype

Education* interviewtype

Parameter

Value

CAWI CATI CAWI CATI CAWI CATI CAWI CATI CAWI CATI CAWI CATI CAWI CATI CAWI CATI CAWI CATI CAWI CATI CAWI CATI CAWI CATI CAWI CATI

Value

75–79 75–79 80–84 80–84 1–10 grade 1–10 grade High school High school Middle or long Middle or long Short or practical Short or practical Under education Under education Pensioner Pensioner Unemployed Unemployed Employed Employed No commute No commute 0–9 km 0–9 km 10–19 km 10–19 km

Table A1: (Continued )

1 0 1 0 1 0 1 0 1 0 0 0 1 0 1 0 1 0 0 0 1 0 1 0 1 0

Degree of freedom

0.0206 0 0.1398 0  0.0231 0  0.0594 0 0.0002 0 0 0 0.0787 0 0.1189 0 0.1232 0 0 0 0.1706 0 0.2618 0 0.287 0

Estimate

0.0687 0 0.0985 0 0.0267 0 0.0201** 0 0.0161 0 0 0 0.0315* 0 0.0398** 0 0.0413** 0 0 0 0.045** 0 0.0349*** 0 0.0373*** 0

Standard Error

Number of trips

 0.0435 0 0.3626 0 0.0073 0 0 0

0.447 0 0.6274 0

Estimate

0.1271 0 0.0964** 0 0.1081 0 0 0

0.2035* 0 0.3805 0

Standard Error

Mean trip distance

0.094 0  0.076 0  0.0757 0

Estimate

0.0495 0 0.0471 0 0.0547 0

Standard Error

Mean travel time

 0.0069 0 0.0246 0  0.0154 0 0 0 0.1924 0 0.094 0 0.0347 0

Estimate

0.0528 0 0.0899 0 0.0806 0 0 0 0.0671** 0 0.044* 0 0.0478 0

Standard Error

Travel distance per traveller

CAWI CATI CAWI CATI

1 0 0 0 0 0 0

0.2454 0 0 0 0 0 1

0.0382*** 0 0 0 0 0 0 0 0 32.0749

0 0 0.1069

Reference values are indicated in bold. * Significant at 5% level, ** significant at 1% level, *** significant at 0.01% level. Note: Parameters and interactions for which estimates are not listed in the table are not significant.

Scale

interviewtype

20–49 km 20–49 km 50– km 50– km CAWI CATI

 0.109 0 0 0 0 0 28.0883

0.0523* 0 0 0 0 0 0.0936

0.0167 0 0 0 0 0 63.4847

0.0424 0 0 0 0 0 0.211
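The significance stars in Table A1 can be cross-checked from the printed estimates and standard errors. The sketch below assumes, since the chapter does not state which test was used, that each star rating comes from a two-sided Wald z-test (estimate divided by standard error, compared with the standard normal distribution) at the thresholds given in the table note. The function names are illustrative only.

```python
import math


def wald_p_value(estimate: float, std_error: float) -> float:
    """Two-sided p-value for z = estimate / std_error under a standard normal (assumed test)."""
    z = estimate / std_error
    return math.erfc(abs(z) / math.sqrt(2.0))


def stars(p_value: float) -> str:
    """Map a p-value to the star notation of the table note (5%, 1% and 0.01% levels)."""
    if p_value < 0.0001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""


# (estimate, standard error) pairs taken from the first page of Table A1.
examples = {
    "Number of trips, age 20-24": (0.061, 0.0208),
    "Number of trips, income 100-250,000 kr": (0.0414, 0.0081),
    "Number of trips, education middle or long": (0.0091, 0.0084),
    "Mean travel time, education high school": (0.042, 0.018),
}

for label, (estimate, std_error) in examples.items():
    p = wald_p_value(estimate, std_error)
    print(f"{label}: z = {estimate / std_error:.2f}, p = {p:.4f} {stars(p)}")
```

For these four rows the assumed test gives the same star ratings as those printed in the table, which suggests, but does not prove, that the published significance levels were obtained in a comparable way.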

Car

Job situation

Children

Income

Education

Gender

1 1 1 1 1 0 1 1 1 1 1 1 0 1 1 1 0 1 1 1 1 0 1 1 0 1 1 1 0 1 0

0.0543*** 0.0389*** 0.0344 0.0344*** 0.0244*** 0 0.0247*** 0.0309*** 0.0423*** 0.0455*** 0.0542*** 0.031*** 0 0.0322*** 0.0192*** 0.0169*** 0 0.0203*** 0.2052 0.0565 0.0153*** 0 0.0148*** 0.0141 0 0.116** 0.0737 0.0573 0 0.0165*** 0

Standard Error

0–2 km

0.3855  0.3971 0.0094  0.4338  0.3093 0  0.4893  0.4599  0.4614  0.4393  0.4678  0.5735 0  0.164 0.2261 0.0777 0  0.1731  0.2113  0.0759 0.2885 0  0.1513 0.004 0 0.3984  0.1008  0.0221 0 0.2432 0

10–19 20–24 25–29 30–39 40–44 45–59 60–69 70–74 75–79 80–84 Men Women 1–10 grade High school Middle or long Short or practical Denied <12,000 kr 12–100,000 kr 100–250,000 kr >250,000 kr 0 1 2+ Under education Pensioner Unemployed Employed No car Car owner

Degree of Freedom

Intercept Age

Value Estimate

Value

Parameter

Trip length

 0.1268  0.1062 0.001 0.0198  0.0007 0  0.0506  0.0738  0.1469  0.1302  0.2742  0.1304 0  0.1104  0.033 0.0679 0  0.0688  0.0618 0.064  0.0197 0  0.1999  0.0963 0 0.016 0.2408 0.2364 0  0.1532 0

Estimate

0.0534* 0.042* 0.0404 0.0357 0.0264 0 0.0263 0.032* 0.0459** 0.0507* 0.0675*** 0.0298*** 0 0.03** 0.0198 0.0158*** 0 0.0181** 0.1805 0.0532 0.0153 0 0.0147*** 0.0148*** 0 0.1427 0.0663** 0.0516*** 0 0.02*** 0

Standard Error

2–6 km

Table A2: Maximum-likelihood estimates for the number of trips of different lengths.

 1.0031  0.147  0.0607  0.0703  0.0709 0  0.118  0.1887  0.1889  0.6284  0.4906 0.0031 0  0.1149 0.009  0.0893 0  0.1159 0.0058  0.2506  0.0604 0  0.1201  0.0537 0  0.3102  0.0493  0.1005 0  0.519 0

Estimate

0.0876*** 0.068* 0.0658 0.059 0.0424 0 0.0416** 0.0523** 0.075* 0.0995*** 0.126*** 0.0457 0 0.0426** 0.0273 0.0225*** 0 0.0283*** 0.2596 0.0933** 0.024* 0 0.0238*** 0.0244* 0 0.2475 0.1062 0.0808 0 0.0381*** 0

Standard Error

6–10 km

 1.0071 0.0937 0.0672  0.0285  0.0287 0  0.0233  0.1587  0.2198  0.4215  0.4873 0.0845 0  0.0509  0.0576  0.0828 0  0.1027  0.2327  0.0735  0.0287 0 0.0723 0.0648 0  0.0241  0.1685  0.3641 0  0.4843 0

Estimate

0.0755*** 0.0592 0.0597 0.0546 0.039 0 0.0375 0.0478** 0.0723** 0.0881*** 0.1197*** 0.0408* 0 0.0356 0.0241* 0.0197*** 0 0.0218*** 0.0468*** 0.0346* 0.019 0 0.0212** 0.022** 0 0.1539 0.0563** 0.0528*** 0 0.0279*** 0

Standard Error

10–20 km

0.4827 0.0516 0.1867 0.0847  0.0388 0 0.0042  0.0558  0.2742  0.5049  0.5283 0.1485 0  0.0673  0.01 0.0113 0  0.0703  0.354  0.1173  0.1303 0 0.0752 0.0449 0  0.4179  0.38  0.2613 0  0.4413 0

Estimate

0.0443*** 0.0532 0.051** 0.0455 0.0337 0 0.0322 0.0416 0.069*** 0.0884*** 0.1209*** 0.0362*** 0 0.0241** 0.0186 0.0155 0 0.0213** 0.2511 0.0713 0.0197*** 0 0.0187*** 0.02* 0 0.1565** 0.0905*** 0.0645*** 0 0.0271*** 0

Standard Error

≥20 km

Age*gender

Commuting distance

Year

Travel day

Urban-rural

Central Cph 3 big cities Rest of DK Monday Tues-Thurs Friday Saturday Sunday 2006 2007 2008 2009 No commute 0–9 km 10–19 km 20–49 km 50– km 10–19 10–19 20–24 20–24 25–29 25–29 30–39 30–39 40–44 40–44 45–59 45–59 60–69 60–69 70–74 70–74 75–79 75–79 80–84 80–84

Men Women Men Women Men Women Men Women Men Women Men Women Men Women Men Women Men Women Men Women

1 1 0 1 0 1 1 1 1 1 1 0 1 1 1 1 0 1 0 1 0 1 0 1 0 0 0 1 0 1 0 1 0 1 0 1 0

0.1007  0.1079 0  0.1761 0 1.879  0.0021  0.0939  0.0819  0.231  0.2379 0 0.4931 0.3077  0.3161  0.1295 0 0.5375 0 0.0159 0 0.7526 0 0.2073 0 0 0 0.3612 0 0.3141 0 0.3128 0 0.423 0 0.3589 0

0.0214*** 0.0158*** 0 0.0985 0 0.0592*** 0.0902 0.1039 0.0135*** 0.0119*** 0.0121*** 0 0.0545*** 0.0501*** 0.0558*** 0.0558* 0 0.0366*** 0 0.0481 0 0.0476*** 0 0.0388*** 0 0 0 0.0369*** 0 0.0408*** 0 0.0547*** 0 0.062*** 0 0.0771*** 0

0.0008  0.0026 0  0.3749 0 0.1038  0.0096 0.1195 0.0124  0.0103  0.0664 0 0.3867 0.6625 0.0373  0.0227 0 0.0531 0 0.0234 0 0.0155 0  0.0107 0 0 0 0.1126 0 0.1045 0 0.2324 0 0.1615 0 0.1907 0

0.0212 0.0146 0 0.1037** 0 0.0916 0.0892 0.0926 0.0139 0.0119 0.0123*** 0 0.053*** 0.048*** 0.0525 0.0538 0 0.0379 0 0.05 0 0.0485 0 0.0371 0 0 0 0.0347** 0 0.0388** 0 0.0549*** 0 0.0648* 0 0.0876* 0 0.6119 0.7554 0.0332  0.2824 0  0.023 0  0.087 0 0.0105 0  0.0891 0 0 0 0.0605 0 0.1768 0 0.0657 0 0.5917 0 0.2076 0

 0.0092 0.1006 0 0.0226 0 0.1401 0.3487 0.0808

0.0341  0.1426 0.0285*** 0.0226***  0.0382 0.0204 0 0 0 0.1452 0.1168 0.1225 0 0 0 0.149 0.2055 0.1246 0.1279** 0.1568 0.119 0.1518 0.2073 0.1262 0.0915 0.0203*** 0.0583 0.0175** 0.0091 0.0179 0 0 0.087*** 0.5896 0.0738*** 0.0805*** 0.3866 0.0683*** 0.0883 1.1524 0.0689*** 0.0934**  0.0294 0.0762 0 0 0 0.0619  0.1929 0.0555** 0 0 0 0.081  0.1308 0.0716 0 0 0 0.0776  0.0338 0.0711 0 0 0 0.058  0.0147 0.0515 0 0 0 0 0 0 0 0 0 0.0535  0.071 0.0474 0 0 0 0.0606** 0.0117 0.055 0 0 0 0.0871  0.0422 0.0836 0 0 0 0.1174***  0.0134 0.1108 0 0 0 0.1578 0.2065 0.1459 0 0 0

 0.4009  0.1587 0 0.0045 0  0.0238  0.4335  0.6714  0.0419  0.0332  0.0516 0  0.7486  1.6217  0.4123 0.1055 0  0.2356 0  0.1519 0  0.0449 0 0.0345 0 0 0  0.0294 0  0.0506 0 0.0787 0 0.2326 0  0.1081 0

0.0301*** 0.02*** 0 0.0602 0 0.0642 0.0708*** 0.0859*** 0.0187* 0.0158* 0.016** 0 0.0436*** 0.0381*** 0.0372*** 0.0354** 0 0.0532*** 0 0.0641* 0 0.0611 0 0.0455 0 0 0 0.0418 0 0.0499 0 0.0819 0 0.1078* 0 0.1587 0

Travel day* Commuting distance

Education* Job situation

No car Car owner No car Car owner No car Car owner Under education Pensioner Unemployed Employed Under education Pensioner Unemployed Employed Under education Pensioner Unemployed Employed Under education Pensioner Unemployed Employed No commute 0–9 km 10–19 km 20–49 km 50– km No commute 0–9 km 10–19 km

1 0 1 0 0 0 1 1 1 0 1 1 1 0 1 1 1 0 0 0 0 0 1 1 1 1 0 0 0 0

0.0318** 0 0.031** 0 0 0 0.0604*** 0.059*** 0.0427** 0 0.0505*** 0.0513*** 0.039*** 0 0.0575*** 0.0459 0.0344 0 0 0 0 0 0.1014 0.1001 0.1097 0.1116 0 0 0 0

Standard Error

0–2 km

 0.1211 0 0.0819 0 0 0  0.422 0.2671 0.1505 0  1.1366  0.2339  0.2844 0  0.8087 0.0824  0.0348 0 0 0 0 0 0.0745 0.1211 0.2002  0.1474 0 0 0 0

Central Cph Central Cph 3 big cities 3 big cities Rest of DK Rest of DK 1–10 grade 1–10 grade 1–10 grade 1–10 grade High school High school High school High school Middle or long Middle or long Middle or long Middle or long Short or practical Short or practical Short or practical Short or practical Monday Monday Monday Monday Monday Tues-Thurs Tues-Thurs Tues-Thurs

Degree of Freedom

Urban-rural* Car

Value Estimate

Value

Parameter

Trip length

Table A2: (Continued )

0.1287 0 0.1146 0 0 0  0.2314  0.0655  0.0267 0  0.1573  0.0173 0.0319 0  0.2093 0.0353 0.0482 0 0 0 0 0 0.4077 0.3502 0.2328 0.3198 0 0 0 0

Estimate

0.035** 0 0.0347** 0 0 0 0.0726** 0.0585 0.0432 0 0.0644* 0.0494 0.0402 0 0.0697** 0.0438 0.0342 0 0 0 0 0 0.1066** 0.1052** 0.1147* 0.1157** 0 0 0 0

Standard Error

2–6 km

0.3247 0 0.1169 0 0 0  0.1624 0.1126 0.0078 0 0.0225  0.1635  0.0479 0  0.0106 0.0832 0.1884 0 0 0 0 0  0.1074  0.1099  0.3229 0.2933 0 0 0 0

Estimate

0.0616*** 0 0.0634 0 0 0 0.1284 0.0941 0.069 0 0.1159 0.0874 0.0667 0 0.128 0.077 0.0559** 0 0 0 0 0 0.1511 0.148 0.1672 0.1669 0 0 0 0

Standard Error

6–10 km

 0.0869  0.034 0.1288 0 0.1725 0.0258 0.0434 0  0.1809  0.0792 0.1705 0 0 0 0 0  0.1419  0.1094  0.1102 0.0478 0 0 0 0

Estimate

0.1147 0.0837 0.0616* 0 0.1075 0.0742 0.0641 0 0.132 0.0728 0.0531** 0 0 0 0 0 0.1286 0.1262 0.1281 0.1403 0 0 0 0

Standard Error

10–20 km

 0.0771  0.0222  0.0911  0.0682 0 0 0 0

Estimate

0.0722 0.072 0.0723 0.0686 0 0 0 0

Standard Error

≥20 km

Income* Job situation

Job situation* Commuting distance

Tues-Thurs Tues-Thurs Friday Friday Friday Friday Friday Saturday Saturday Saturday Saturday Saturday Sunday Sunday Sunday Sunday Sunday Under education Under education Under education Under education Under education Employed Employed Employed Employed Employed Denied Denied Denied Denied <12,000 kr <12,000 kr <12,000 kr <12,000 kr 12–100,000 kr 12–100,000 kr

20–49 km 50– km No commute 0–9 km 10–19 km 20–49 km 50– km No commute 0–9 km 10–19 km 20–49 km 50– km No commute 0–9 km 10–19 km 20–49 km 50– km No commute 0–9 km 10–19 km 20–49 km 50– km No commute 0-9 km 10–19 km 20–49 km 50– km Under education Pensioner Unemployed Employed Under education Pensioner Unemployed Employed Under education Pensioner

0 0 1 1 1 1 0 1 1 1 1 0 1 1 1 1 0 1 1 1 1 0 0 0 0 0 0 1 1 1 0 1 1 1 0 1 1

0 0  1.9759  1.8764  1.8336 -2.1192 0  0.3176  0.1387 0.1422  0.0342 0  0.358  0.3848 0.2257  0.193 0 0.4745 0.4775 0.2583 1.1584 0 0 0 0 0 0 0.1167 0.1337 0.269 0 0.2549 0.1659  0.3676 0 0.2318  0.0989

0 0 0.0648*** 0.062*** 0.0785*** 0.0814*** 0 0.0945** 0.0924 0.1033 0.1036 0 0.108** 0.1063** 0.1145* 0.1182 0 0.1082*** 0.0744*** 0.0865** 0.0821*** 0 0 0 0 0 0 0.0896 0.0744 0.0535*** 0 0.2232 0.2375 0.4941 0 0.1043* 0.113

0 0  0.066  0.0801 0.0389  0.0362 0  0.1753  0.2249 0.1738 0.2374 0  0.4653  0.4881 0.0299  0.0705 0 0.0122 0.0124  0.324 0.1159 0 0 0 0 0 0 0.1378  0.0712  0.152 0 0.0782 0.1543 0.1521 0 0.2298 0.0834

0 0 0.0954 0.0935 0.103 0.106 0 0.0937 0.0916* 0.101 0.1019* 0 0.0974*** 0.0951*** 0.1031 0.107 0 0.1387 0.091 0.1045** 0.1028 0 0 0 0 0 0 0.1071 0.0677 0.0487** 0 0.2094 0.2086 0.3662 0 0.1185 0.0997

0 0  0.2465  0.1224  0.1378 0.3503 0  0.4843  0.7048 0.0766 0.191 0  0.429  0.5419  0.1922 0.3132 0  0.5675  0.1723 0.1052 0.1052 0 0 0 0 0 0 0.2257  0.1501  0.013 0 0.0589  0.127 0.8154 0 0.614 0.059

0 0 0.1559 0.1519 0.1688 0.1708* 0 0.1358** 0.1325*** 0.1449 0.1514 0 0.1595** 0.156** 0.1714 0.1747 0 0.2641* 0.1471 0.163 0.1682 0 0 0 0 0 0 0.1924 0.1117 0.0781 0 0.3228 0.3196 0.4628 0 0.212** 0.1792

0 0  0.1786  0.0947  0.2235  0.1393 0  0.3111  0.1502  0.7322 0.1927 0  0.3531  0.3795  0.895  0.0849 0  0.7359  0.269  0.0184 0.1033 0 0 0 0 0 0

0 0 0 0 0.1314 0.0116 0.1285 0.2822 0.131 0.0053 0.1462  0.0484 0 0 0.1269* 0.528 0.1235 1.1788 0.1296*** 0.0654 0.137  0.2204 0 0 0.1335** 0.804 0.131** 1.3428 0.1363*** 0.3291 0.147 0.0101 0 0 0.2375**  0.0568 0.1272* 0.0997 0.1305 0.2731 0.1433 0.1695 0 0 0 0 0 0 0 0 0 0 0 0 0.0104  0.1614  0.153 0 0.3838 0.1112 0.3042 0 0.1383  0.2663

0 0 0.0769 0.0749** 0.0767 0.0735 0 0.0814*** 0.0781*** 0.086 0.0837** 0 0.0943*** 0.0922*** 0.0975** 0.0975 0 0.1689 0.0827 0.0884** 0.0866 0 0 0 0 0 0 0.1472 0.1016 0.069* 0 0.2907 0.3084 0.6321 0 0.1621 0.1636

Age* interviewtype

Car* interviewtype

Parameter

Trip length

12–100,000 kr 12–100,000 kr 100–250,000 kr 100–250,000 kr 100–250,000 kr 100–250,000 kr >250,000 kr >250,000 kr >250,000 kr >250,000 kr No car No car Car owner Car owner 10–19 10–19 20–24 20–24 25–29 25–29 30–39 30–39 40–44 40–44 45–59 45–59 60–69 60–69 70–74 70–74

Value

Table A2: (Continued )

Unemployed Employed Under education Pensioner Unemployed Employed Under education Pensioner Unemployed Employed CAWI CATI CAWI CATI CAWI CATI CAWI CATI CAWI CATI CAWI CATI CAWI CATI CAWI CATI CAWI CATI CAWI CATI

Value

1 0 1 1 1 0 0 0 0 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0

Degree of Freedom

0.1575 0  0.4702  0.127  0.0752 0 0 0 0 0  1.1183 0  1.2579 0 0.3513 0  0.0386 0 0.2782 0 0.4269 0 0 0 0.3705 0 0.2093 0 0.2472 0

Estimate

0.0779* 0 0.093*** 0.0693 0.0484 0 0 0 0 0 0.0946*** 0 0.0896*** 0 0.0792*** 0 0.0721 0 0.0639*** 0 0.0484*** 0 0 0 0.0459*** 0 0.0599** 0 0.1004* 0

Standard Error

0–2 km

 0.1475 0 0.3157  0.0106  0.0767 0 0 0 0 0 0.1534 0 0.2337 0  0.0266 0  0.0904 0  0.0223 0  0.0155 0 0 0  0.0462 0  0.1027 0  0.3484 0

Estimate

0.0724* 0 0.1097** 0.0627 0.0433 0 0 0 0 0 0.0518** 0 0.0389*** 0 0.0576 0 0.0637 0 0.0591 0 0.0436 0 0 0 0.04 0 0.0455* 0 0.0809*** 0

Standard Error

2–6 km

0.0123 0 0.4464  0.1155  0.0453 0 0 0 0 0  0.3689 0  0.4018 0 0.1169 0 0.1369 0 0.108 0 0.0993 0 0 0 0.1163 0 0.1536 0 0.3965 0

Estimate

0.1244 0 0.1964* 0.1025 0.0694 0 0 0 0 0 0.1526* 0 0.139** 0 0.0753 0 0.1029 0 0.0999 0 0.0732 0 0 0 0.066 0 0.0794 0 0.1225** 0

Standard Error

6–10 km

 0.0481 0 0.0848 0 0.0192 0 0.0833 0 0.019 0  0.0281 0 0 0  0.0824 0  0.0483 0 0.0132 0

Estimate

0.081 0 0.0484 0 0.106 0 0.0979 0 0.0914 0 0.0633 0 0 0 0.0565 0 0.0738 0 0.1316 0

Standard Error

10–20 km

 0.2106 0 0.2753  0.0327  0.1604 0 0 0 0 0 0.1839 0 0.0126 0

Estimate

0.1019* 0 0.151 0.0943 0.0614** 0 0 0 0 0 0.0591** 0 0.0176 0

Standard Error

≥20 km

CAWI CATI CAWI CATI CAWI CATI CAWI CATI CAWI CATI CAWI CATI CAWI CATI CAWI CATI CAWI CATI CAWI CATI CAWI CATI CAWI CATI CAWI CATI CAWI CATI CAWI CATI

1 0 1 0 1 0 1 0 1 0 0 0 1 0 1 0 1 0 0 0 1 0 1 0 1 0 1 0 0 0 0 0 0

0.0567 0 0.2726 0  0.1045 0  0.1222 0 0.0505 0 0 0 0.2357 0 0.2982 0 0.2651 0 0 0 0.6139 0 0.9293 0 1.1397 0 0.7955 0 0 0 0 0 1

0.1387 0.0396 0 0 0.1767 0.0368 0 0 0.0545  0.0323 0 0 0.0399**  0.0257 0 0 0.0328  0.0826 0 0 0 0 0 0 0.0558*** 0 0.0788** 0 0.0831** 0 0 0 0.0998*** 0 0.0809*** 0 0.0871*** 0 0.0901*** 0 0 0 0 0 0 0 0 1 0 0 0

0.1024 0 0.1662 0 0.0485 0 0.0365 0 0.029** 0 0 0

0.3486 0 0.3835 0 0.2155 0 0.3314 0 0 0 0 0 1

 0.0921 0  0.2252 0

Reference values are indicated in bold. * Significant at 5% level, ** significant at 1% level, *** significant at 0.01% level. Note: Parameters and interactions for which estimates are not listed in the table are not significant.

Scale

interviewtype

Commuting distance* interviewtype

Job situation* interviewtype

Education* interviewtype

75–79 75–79 80–84 80–84 1–10 grade 1–10 grade High school High school Middle or long Middle or long Short or practical Short or practical Under education Under education Pensioner Pensioner Unemployed Unemployed Employed Employed No commute No commute 0–9 km 0–9 km 10–19 km 10–19 km 20–49 km 20–49 km 50– km 50– km CAWI CATI 0.1389* 0 0.1317** 0 0.1428 0 0.1451* 0 0 0 0 0 0

0.2055 0 0.3701 0

0 0 1

 0.0903 0  0.0112 0 0.2286 0 0 0

0.1061 0 0.2197 0

0 0 0

0.0916 0 0.0763 0 0.0783** 0 0 0

0.1831 0 0.2573 0

0 0 1

 0.0061 0 0.1866 0 0.1156 0 0 0

0 0 0

0.0454 0 0.0686** 0 0.0544* 0 0 0

Chapter 7

Using Accelerometer Equipped GPS Devices in Place of Paper Travel Diaries to Reduce Respondent Burden in a National Travel Survey

Abby Sneade

Abstract

Purpose: The Department for Transport’s 2011 GPS National Travel Survey (NTS) pilot study investigated whether personal GPS devices and automated data processing could be used in place of the 7-day paper diary. Using GPS technology could reduce the relatively high burden that the diary places upon respondents, reduce costs and improve data quality.

Design/methodology/approach: Data was collected from c.900 respondents. Practical changes were made to the existing methodology where necessary, including the collection of information to support data processing. Processing was undertaken using the Eindhoven University of Technology’s Trace Annotator. Results from the GPS pilot were then compared to those from the main NTS diaries for the same period.

Findings: There were no insurmountable problems using GPS devices to collect data; however, the processed GPS data did not resemble the diary outputs, making GPS unsuitable for the NTS. The GPS data produced fewer and longer trips than the diary data. The purpose of a quarter of the GPS trips was unclear, and a disproportionate share started and ended at home.

Research limitations: Had time permitted, it would have been desirable to manually inspect the trips that validation identified as unfeasible and then refine the processing algorithms. GPS data processing may have been hindered by missing GPS data, particularly in the case of rail travel.

Transport Survey Methods: Best Practice for Decision Making
Copyright © 2013 by Emerald Group Publishing Limited
All rights of reproduction in any form reserved
ISBN: 978-1-78190-287-5


Originality/value: This research used an accelerometer-equipped GPS device to better predict the method of travel. It also combined addresses that respondents reported having visited during the travel week with GIS data to code the purpose of trips without using a post-processing prompted-recall survey.

Keywords: GPS; accelerometer; processing; travel; diary; survey

7.1. Introduction

In September 2010, the Department for Transport (DfT) began preparing a pilot survey to define how the National Travel Survey (NTS) might be run using GPS devices in place of the travel diary. The primary objective was to investigate whether the burden of completing the 7-day paper travel diary could be replaced with the relatively low-burden task of carrying (and charging) a personal GPS device for the same period. This also had potential for improving data quality, as diary data relies upon respondents accurately recording their trips in full, and some errors, rounding and omissions are inevitable. Furthermore, GPS data collection methods may be more cost-effective than the printing, distribution and coding of diaries and subsequent data entry and validation activities.

The pilot was conducted by the current NTS contractor, the National Centre for Social Research (NatCen). Fieldwork used sub-samples of the main NTS sample for February and March, and took place between 11 February and 8 June 2011. A team at the Eindhoven University of Technology (TU/e), led by Professor Harry Timmermans and comprising Joran Jessurun, Anastasia Moiseeva and Tao Feng, was sub-contracted to undertake the processing of the GPS data, building on their GPS ‘Trace Annotator’ and using accelerometer data to improve the prediction of mode of travel (Moiseeva, Jessurun, & Timmermans, 2010). The resulting processed trip data was then analysed by a DfT statistician.

A summary report of this analysis, with comparisons to results for the main NTS February and March responses, was published on the DfT’s website in January 2012 (Department for Transport, 2012b), alongside the technical reports on the fieldwork (Rofique, Humphrey, & Killpack, 2011a) and data processing (Feng, Moiseeva, & Timmermans, 2011). This document contains summary results on the analysis of the processed GPS data compared to results from the main NTS diary data collected over the same period, as presented at the ISCTSC workshop in Termas de Puyehue, Chile in November 2011.

7.2. Selecting a Device for Use in the Pilot

In 2008–2009, DfT contracted a small-scale feasibility study which explored the scope for using personal GPS devices in the NTS (Anderson, Abeywardana, Wolf, & Lee, 2009). The project collected data using 60 Atmel BTT08 GPS Data Loggers; however, there were a number of practical and technical problems experienced with this model:

• A small number of the devices contained or developed faults and did not work.
• Half of the devices lost their configuration settings such that data was logged every 100 metres rather than every 4 seconds, and these devices subsequently recorded fewer GPS trips.
• When out of signal range, these devices use an audible voice message indicating that they are searching for a signal. When the devices were issued to respondents this function was muted by turning the volume down. However, many reset themselves and became audible again. Interviewers then needed to further instruct respondents on how to silence them, possibly deterring a small number of respondents from using the devices further.
• The device had to be worn on a cord hung around the neck. Respondents did not like this and some said it was embarrassing.

In addition to this, we were also interested in using an accelerometer-equipped GPS device to see whether this could be used to improve the prediction of mode of travel.

In spring 2010, DfT contracted AECOM and Imperial College London to undertake a review of available GPS devices for use in a further pilot study. DfT required a GPS data logger with an accelerometer that could:

• accurately and reliably record all personal travel undertaken by respondents during a 7-day travel week;
• be simple to use and also look simple to use, so that it would not intimidate respondents who may be less confident about using technological devices;
• be small and lightweight, and continue to work if placed inside a pocket, bag or clothing; and
• be silent and passive, requiring no interaction other than periodic charging.
The review covered a range of technical details including cost per unit, functions, data recorded, settings, indications of the time it takes devices to find a signal on re-entering a signal area (warm and cold starts), trace accuracy, memory capacity, length of battery charge and required frequency of charge, ease of use, data format and the potential security of data download. Five devices were shortlisted, three of which were trialled by DfT staff: the Sprint Telematics SP01, the Forsberg A0057 prototype and the MGEdata Mobitest GSL. The last of these was selected for use in the pilot.

7.3. Survey Methodology

This section outlines the fieldwork procedures undertaken for the GPS pilot. The pilot methodology remained unchanged from the current NTS methodology wherever possible. A detailed explanation of standard NTS methodology may be found in the National Travel Survey 2010 Technical Report (Rofique, Humphrey, Pickering, & Tipping, 2011b).

7.3.1. Sampling and Response Rates

As noted in the introduction, the sample for the GPS pilot was drawn using random probability techniques from the sample for the main February and March 2011 NTS. It was therefore designed to be representative and comparable with the main NTS sample, with the aim of collecting GPS data from 1000 respondents aged 12 or more, drawn from 902 GB households. The standard NTS methodology requires personal travel data to be recorded for all household members; the reasons for limiting GPS data collection to those aged 12 and over are discussed in more detail in Section 7.3.8.

The fully productive response rate (that is, the proportion of households in which all eligible household members carried and returned a GPS device) was 52 per cent. This is lower than the respective diary-based rate for the main NTS over the same period (59 per cent). However, the partially productive response rate for GPS pilot cases was higher than for the main NTS (11 per cent compared with 6 per cent). The main reason is that individuals were slightly less likely to agree to carry and return GPS devices than to complete paper travel diaries. In total, 1074 respondents were interviewed, 84 per cent of whom agreed to carry and return a GPS device. This is lower than the 91 per cent of respondents interviewed in the February and March main NTS who completed and returned a travel diary.

Sample sizes for each age group are quite small, making it difficult to draw robust conclusions, but contrary to expectations, it would appear that younger respondents were marginally less willing to carry and return GPS devices than their older counterparts. Three quarters of the 65 respondents aged 12–15 years and 76 per cent of the 135 respondents aged 16–24 agreed to carry and return GPS devices, compared to 85 per cent of those aged 25 and over. Similarly, 88 per cent of 12- to 15-year-olds and 84 per cent of 16- to 24-year-olds returned a fully completed travel diary in the main NTS for the same period, compared to 91 per cent of those aged 25 and over.
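
As a rough check on the scale of the achieved sample, the reported figures of 1074 interviewed respondents and 84 per cent agreeing to carry and return a device can be converted into an approximate head count. The sketch below is illustrative only: the function name is hypothetical, and it assumes the published percentage was rounded to the nearest whole number, so the true count could lie anywhere in the range shown.

```python
from typing import Tuple


def implied_count_range(interviewed: int, rounded_pct: int) -> Tuple[float, float, float]:
    """Return (low, central, high) head counts implied by a rounded percentage.

    Assumes the percentage was rounded to the nearest whole number,
    so the unrounded value lies within +/- 0.5 percentage points.
    """
    low = interviewed * (rounded_pct - 0.5) / 100
    central = interviewed * rounded_pct / 100
    high = interviewed * (rounded_pct + 0.5) / 100
    return low, central, high


# Figures taken from the text: 1074 interviewed, 84 per cent agreed.
low, central, high = implied_count_range(1074, 84)
print(f"about {central:.0f} respondents (plausible range {low:.0f} to {high:.0f})")
```

The central estimate of roughly 900 respondents is consistent with the "c.900 respondents" quoted in the abstract.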

7.3.2. Incentives

The standard NTS incentive of a £5 (€5.7) (Oanda, 2012) voucher for each eligible person within the household is only paid when all members of the household fully complete the placement interview and travel diary. It is therefore normally possible for one person in a participating household to complete the travel diaries of other household members on their behalf, ensuring they are a fully productive household and securing the incentive. However, GPS data collection cannot be completed by proxy in this way. This may also contribute to the higher partially productive household rates in the GPS pilot. The incentive structure was altered for the GPS study, such that each respondent who returned a GPS device received a £5 (€5.7) voucher (regardless of whether they had used it). The devices retail at approximately €280 per unit, so incentivising safe return of the devices was a priority in managing survey costs. Nor did we want households giving up half-way through the week on account of just one person not using the GPS device.

7.3.3. The Fieldwork Process

As with the main NTS, interviewers were given advance letters including an unconditional incentive of a book of six first-class postage stamps to send to the sampled addresses in advance of their first call. A few days after the advance letters had been sent, interviewers made contact with respondents by personal visit to undertake or arrange the placement interview.

7.3.4. The Questionnaires

New questions were introduced to both the placement interview and pick-up interview to assess the practicality and acceptability of using GPS devices, and to assist in the processing of GPS data. These asked for addresses of destinations visited during travel week (including children’s school addresses and adults’ work addresses) and included questions about respondents’ attitudes to the collection of travel data using GPS devices, their experiences of using the devices, travel week mileage, working hours for any travel week work trips that would be excluded under standard NTS diary recording rules, the occupancy of their most recent car/van trip and use of taxis. Data was also collected on whether the respondent had forgotten to take their device with them on any particular days and any nights away from home to aid processing and weighting for missing data.

Another challenge in processing GPS data is how to allocate purpose of trip for people who work at multiple sites or have no fixed place of work. We therefore collected hours of work for the travel week from these people. Although the processing undertaken by TU/e did not utilise this data, it is something that can be further investigated during the final stage of analysis. Those who do not work traditional shift-based hours (e.g. plumbers or music teachers) present a further difficulty. For these individuals we collected the start time of the first job of the day and the end time of the last job of the day, for each day of the travel week. Although this is a rather naive approximation of a filter, it may give us some insight into the type of trips they make during their working day.

A selection of standard NTS questions was removed from the interview questionnaires in order to create space for the additional pilot questions.1

1. These included questions on local transport services, satisfaction with local transport services, distances to amenities, frequency of walks of 20 minutes or more, children as front/rear passengers, reasons for not driving, transport barriers to employment, difficulties travelling to work, road accidents, long-distance journeys, vehicle details, parking, time spent playing in the street (children), fuel gauge details and mileage details.

7.3.5. The Placement of Devices

At the placement interview, all respondents aged 12 and over were asked to carry a GPS device for seven consecutive days beginning the day after the interview. This differs from the main NTS, where interviewers allocate travel weeks according to a strict schedule which ensures travel weeks are evenly distributed throughout the year. This restriction was removed in the pilot survey so that interviewers could optimise their use of the limited number of devices. Interviewers explained the purpose of the devices and how to use them and provided respondents with a leaflet including this information. For households where there were children aged 12–15, respondents were provided with a copy of a letter to give to the children’s school or teacher explaining the purpose of the GPS pilot (many schools do not allow electronic devices on the premises2). Respondents were also provided with post-it notes that they could use to remind them to charge the device and to take it with them when they went out. Guidance included keeping the device with items like keys and mobile phones, which people tend to carry with them at all times.3 In households composed of two or more individuals, it was important to ensure that each person carried the same device throughout the travel week and that the data from that device could be matched back to the correct individual. Interviewers recorded device serial numbers as allocated to individuals in the CAPI program and on separate device allocation cards. They also used coloured stickers so that household members could differentiate between each other’s devices, and white stickers on which the name of the person allocated the device could be written.

7.3.6. Office Procedures

Once the interviewer had collected the devices and chargers from a household, they returned them to the NatCen Operations Department by courier, and the Mobitest data explorer software was used to check that there was data on the device and to download and save the data files. A file naming convention was used that incorporated the household serial number and person number to ensure data could easily be matched. Device data were then erased and the devices charged and reallocated to a new interviewer for reuse.
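
The file naming convention described above can be sketched in Python as follows. This is an illustration only: the chapter does not give the format NatCen actually used, so the pattern (`hh<household serial>_p<person number>.csv`), the `.csv` extension and both function names are assumptions made for this example.

```python
import re
from typing import Tuple

# Hypothetical pattern: household serial number, then a zero-padded person number.
_FILE_NAME_PATTERN = re.compile(r"^hh(?P<household>\d+)_p(?P<person>\d{2,})\.csv$")


def device_file_name(household_serial: int, person_number: int) -> str:
    """Build a file name that encodes who carried the device (assumed format)."""
    return f"hh{household_serial}_p{person_number:02d}.csv"


def parse_device_file_name(name: str) -> Tuple[int, int]:
    """Recover (household serial, person number) from a file name built above."""
    match = _FILE_NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"File name does not follow the assumed convention: {name!r}")
    return int(match.group("household")), int(match.group("person"))


# Example with made-up identifiers: person 3 in household 123456.
name = device_file_name(123456, 3)
household, person = parse_device_file_name(name)
```

Encoding both identifiers in the file name means a downloaded trace can be matched back to the interview data even if it becomes separated from the device allocation records.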

2. For example, handheld games, phones and iPods.
3. Respondents in a previous study reported that keeping devices with their mobile phone or house and car keys helped them to remember to take the device with them (Swann & Stopher, 2008).


Figure 7.1: The MGEdata Mobitest GSL device. Image courtesy of MGEdata web content.

7.3.7. Feedback on Carrying Devices

Six per cent of respondents (51 people) reported experiencing problems using the devices. When asked what these problems were, responses were varied, but the most common response was problems or difficulties understanding the lights or buttons. Ideally the device used would not have had any buttons at all other than ‘on’ and ‘off’; however, it would not have been feasible to have devices custom-made to specification for such a short-term ‘one-off’ exercise. The MGEdata Mobitest GSL has six buttons (see Figure 7.1). The green button turns the device on and the red button turns it off; the remaining buttons were not programmed. Respondents were advised to leave the device on at all times but to turn it off if they took a flight. The information sheet supplied to respondents explained the sequence of lights, although respondents did not need to know about these in order to use the device correctly.

Almost all respondents reported that the GPS devices were easy to use: 84 per cent said they were very easy to use and 15 per cent said they were fairly easy to use. Just 2 per cent found them fairly or very difficult to use, which is a positive indication that replacing the paper travel diary with a GPS device such as the Mobitest would reduce respondent burden.

The majority (83 per cent) of respondents said that they charged their devices on each day of the travel week. Seven per cent said they forgot on at least 1 day and 11 per cent were not sure or could not remember. A fifth (20 per cent) of respondents said there was at least 1 day when they made journeys but did not carry their devices, with 71 per cent claiming that there were no days when they did not carry the device. Ten per cent did not know or could not remember if there were any days on which they did not carry the device.4

4. Please note that where values do not sum to 100 per cent, this is due to rounding.


Some interviewers felt that respondents were less keen on the devices due to the perception that they were being ‘spied on’, the practical burden of remembering to charge and carry the device, and the level of detail in which their movements would be recorded. However, the simplicity and ease of the process was reported to be a positive factor in placing a device compared to the diary. Four devices were lost by respondents during fieldwork. This represented just 0.4 per cent of the total number of achieved travel weeks (in line with work undertaken by other market researchers5).

7.3.8. Children and GPS Devices

Some researchers have questioned the ethics of asking children to carry GPS devices, not least because it may make them vulnerable to bullying or there may be a temptation to ‘play’ with the devices and create ‘false’ trips. As mentioned previously, the devices are expensive and it could be considered unreasonable to ask a young child to be responsible for an item of this value. It was agreed that the GPS pilot would ask 12- to 15-year-old respondents to carry devices, but not those aged less than 12 years of age. Those aged less than 12 years of age were still interviewed. All respondents aged 12 and over living as part of households which included children aged 12–15 were asked whether or not they thought that GPS devices should be given to children in that age group. There were 158 respondents who fell into this category. Two thirds (67 per cent) said 12- to 15-year-olds should be asked to carry devices, 18 per cent said they should not, while 15 per cent said they did not know. When asked about the feasibility of placing devices with younger people (12- to 15-year-olds), interviewers felt that it was appropriate for this age group to carry devices, especially if they were part of an already willing family.

7.3.9. Recall Issues

Interviewers reported that, when asked about what they had done on days they forgot to carry the devices, respondents struggled to remember. The previous DfT feasibility study supplied respondents with a record sheet for use during the travel week to record their activities. It was agreed that in order to reduce respondent burden, a record sheet would not be used in this pilot. There were also concerns that if a record sheet was supplied, respondents who had forgotten to carry the device might prefer to report that they had not left the house that day than admit their forgetfulness.

5. Correspondence with Ipsos MORI in June 2010 indicated that the Postar travel survey using GPS experienced a similar loss rate (Ipsos MORI, 2010).

Using Accelerometer Equipped GPS Devices in Place of Travel Diaries


Address details for travel week destinations were also difficult for respondents to recall: mostly because of the volume of information and level of detail required. The main problem appeared to be remembering postcodes and uncertainty about providing friends’ and relatives’ details. Interviewers also raised concerns that some respondents did not tell them about every destination they had been to in order to reduce the lengthy follow-up questions each destination incurred. Interviewers felt it would have been appropriate to warn the respondent at the placement interview that a certain level of detail about particular aspects would be needed and suggested that a look-up style address database could have made the recording of addresses quicker and easier.

7.3.10. Incomplete Address Data

The address data collected in the placement and pick-up interview was processed to get as complete an address as possible from the information given before it was sent to TU/e. All addresses were run through the automated address matching program Matchcode.6 After Matchcoding, addresses that were still incomplete were sent to GatePost for manual look-up.

7.3.11. Outputs

Full CAPI data, GPS data and enhanced address data were delivered to TU/e for processing.

7.4. Data Processing

The objective of the GPS processing was to clean the data, process it into trips and then infer mode and purpose of trip. The MGE data devices recorded date, time, position (latitude and longitude), height (metres), horizontal accuracy (HACC: metres), vertical accuracy (VACC: metres), the number of satellites used for the position calculation and change in acceleration relative to the device on three axes (kHz; see Figure 7.2). When the horizontal accuracy was within 10 metres, the distance travelled in metres since the last recorded point was also recorded. The devices had also been expected to record the speed at which they were travelling; however, a misunderstanding in the device order meant that this was not recorded.

6. Matchcode is a Capscan product (Capscan, 2012).


Abby Sneade

Figure 7.2: The axes on which the accelerometer measures change in acceleration. Note: Image created by Yassine Mrabet. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License. Source: Mrabet (2008).

The device audio alerts were disabled so they did not cause participants any embarrassment. Considering the effect of urban canyons, enclosed spaces and erroneous data points, it was agreed that data utility would be optimal if it were recorded once every second. Collecting data any less frequently lacks precision (especially when erroneous points are removed) and risks missing events such as drop-offs (Stopher, 2010). The accelerometer trace was used to help identify mode (each mode has a different ‘signature’) and to assist in identifying movement when a GPS signal was not available, for example when travelling underground or in urban canyons, both of which are problems experienced in areas such as Central London. Figure 7.3 illustrates how different modes of transport produce distinct accelerometer traces.

The processed GPS data were intended to yield the same outputs as the diary data. They would include a description of the journey start and end points, derived from personal address data or GIS land use data, purpose of journey, the length of a journey (distance and time) and the mode of travel used. The answers to the questions added to the interviews on destinations visited by respondents during the travel week were linked to the addresses by geocoding and thus provided data for inferring the purpose of the journey. GIS data was provided to supplement this data or to act as a substitute where it was not supplied.
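As a rough illustration of what a mode ‘signature’ might look like numerically, the sketch below summarises a window of one-second accelerometer readings by the mean and standard deviation on each axis. The function name, the window contents and the sample values are hypothetical and are not taken from the pilot’s processing software.

```python
from statistics import mean, pstdev
from typing import Dict, List, Tuple

# One reading per second: change in acceleration on the x, y and z axes.
Reading = Tuple[float, float, float]


def axis_summary(window: List[Reading]) -> Dict[str, float]:
    """Return the mean and population standard deviation for each axis."""
    if not window:
        raise ValueError("window must contain at least one reading")
    summary: Dict[str, float] = {}
    for index, axis in enumerate(("x", "y", "z")):
        values = [reading[index] for reading in window]
        summary[f"mean_{axis}"] = mean(values)
        summary[f"stdev_{axis}"] = pstdev(values)
    return summary


# Made-up readings: a jerky trace and a smooth trace.
jerky = [(0.9, -0.8, 0.5), (-1.1, 0.7, -0.6), (1.0, -0.9, 0.4), (-0.8, 0.8, -0.5)]
smooth = [(0.05, 0.02, 0.01), (0.04, 0.03, 0.02), (0.05, 0.02, 0.01), (0.06, 0.03, 0.02)]

jerky_summary = axis_summary(jerky)
smooth_summary = axis_summary(smooth)
```

In this toy input the jerky trace has a much larger standard deviation on every axis than the smooth one, which is the kind of contrast a classifier could exploit.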


[Figure 7.3: four panels (Walking; Cycling; Tube; Local bus in slow moving traffic), each plotting XACC, YACC and ZACC against time of day on a vertical axis from –2.0 to 2.0.]

Figure 7.3: Accelerometer mode ‘signatures’ (change in acceleration relative to the device, G).

A range of GIS data was supplied to TU/e to enable the identification of roads, rail track, stops and stations, and bus stops for use in calculating proximity and overlap of GPS points with mode-specific elements of the transport network, together with information on land use for coding the purpose of trips. These are summarised in Table 7.1.


Table 7.1: Geographic Information System (GIS) data used in data processing.

Data set | Use/Description
Ordnance Survey MasterMap Integrated Transport Network layer | Road network and road routing information
ITN Paths Great Britain | ITN add-on including footpaths
London Underground trackline and stations | LU only, sourced from Transport for London
National Public Transport Access Node (NaPTAN) | Uniquely identifies all points of access to public transport in Great Britain. Contains every station, coach terminus, airport, ferry terminal, bus stop etc. Used for bus stops and ferry embarkation points
The National Public Transport Data Repository (NPTDR) | A snapshot of every Great Britain public transport journey for a selected week in October (2010) including buses, coaches, trains and ferries. Includes stops, schedules and routing. Used for bus stops
Meridian-2 | An Ordnance Survey mid-scale digital representation of Great Britain. Used for railway track and stations (including light rail outside of London such as Newcastle metro, Sheffield Tram etc.)
InterestMap GB | Derived from Ordnance Survey Points of Interest database: used to supplement station data from Meridian-2 and land use for identifying purpose of trip

The Points of Interest (POI) data and the personal address data collected for places visited during the travel week were used to code the purpose of trips to a destination. Table 7.2 outlines how destinations were coded. An ‘unknown’ purpose category was also added to these purpose codes for use where neither personal nor commercial GIS data could provide enough information to code the purpose of a trip. For example, if the respondent forgot they had visited a particular site and did not provide an address in the pick-up interview, it may not be possible to use GIS data to classify a trip to a large mixed-use complex such as a mall. Alternatively, it could be that there is no information on the address, or that there was no data (personal or POI) available for the location.

Table 7.2: Use of Points of Interest data and personal profile data to code purpose of journey.

Purpose | Description | Data source
Home | Home address | PP
Work | Work address | PP
Education/Education escort | School, centre/association etc. | POI: Education and health
Food/Grocery shopping | Food, drink and multi-item retail etc. | POI: Retail
All other types of shopping | Clothing and accessories, household, office, leisure and garden, motoring etc. | POI: Retail
Personal business - medical | Hospital, dental surgeries, physical therapy, clinics etc. | POI: Education and health
Personal business - other | Consultancies, employment and career agencies, hire service, advertising etc. | POI: Commercial services
Eat/Drink | Cafe, bar, restaurant etc. | POI: Accommodation, eating and drinking
Entertainment/Public social activities | Bodies of water, botanical and zoological, historical and cultural, landscape features, recreational, tourism | POI: Attractions
Sports | Gambling, outdoor pursuits, sport and entertainment support services, sports complex, venues, stage and screen | POI: Sport and entertainment
Holiday base | Camping, caravanning, mobile homes, holiday parks and centres etc. | PP and POI: Accommodation, eating and drinking
Visiting family/friends | Family or friend’s address | PP

PP, personal profile data; POI, Point of Interest data.

7.4.1. The Trace Annotator

The Trace Annotator system is a Bayesian Belief Network (BBN) or Bayesian classifier system which replaces commonly used ad hoc rules with a dynamic structure, leading to improved classification if consistent evidence is obtained over time from more samples (more traces). A BBN is a model for reasoning about uncertainty, which represents all factors deemed potentially relevant for observing a particular outcome and thus can be used to predict the conditional probability of observing that outcome. The network is a graphical representation of probabilistic causal information through a directed acyclic graph and sets of probability tables behind them. The graph consists of nodes and arcs, which represent discrete or continuous variables and causal/influential relationships between variables, respectively. The network calculates expected probabilities of different transport modes, given the structure of the BBN and associated conditional probabilities as input. Technically, the network used is a naive Bayesian classifier. More details on the Trace Annotator can be found in the full technical report supporting this work (Feng et al., 2011) or in Moiseeva et al. (2010).

For the purposes of this pilot, the Trace Annotator was updated to include new nodes for the accelerometer data, namely the mean and standard deviation of change in acceleration for each of the three axes and a new variable ‘STEPS’. STEPS was derived from the device-recorded variable ‘NOMOVE’: if the device does not sense movement in the accelerometer, NOMOVE is automatically increased by one unit per second. A new variable was created to equal the average value of the time duration STEPS. This was then used to distinguish between different types of motion, as the higher the STEPS value is, the more random the motion becomes. Therefore, walking, running and cycling were confirmed to have relatively high values and less turbulent modes such as bus and light rail have lower values.

In spring 2011, NatCen and DfT staff carried the GPS devices on a range of specific trips to collect a sample of data for as many modes and types of trips as possible. These included long-distance multi-mode leisure trips using National Rail, Sheffield Supertram and Tyne and Wear Metro; local car trips; running, cycling and motorcycle commutes; and commutes using a combination of bus, long-distance coach, London Underground, Docklands Light Railway and local rail. A record sheet was completed for each of these trips which confirmed (with approximate timings) the connections, purpose and modes of transport used.
This ‘training’ data (consisting of 80,670 records) was then split into two subsets in a 65:35 ratio, and used to (1) calibrate and (2) validate the performance of three models: an accelerometer-only model, a GPS-only model and a third model using both accelerometer and GPS data. These models were then expanded to include all transport modes and ‘activity modes’ and applied to a filtered version of the training data, which excluded the cases with unrealistic values of latitude and longitude, date, time etc. (Rail had been excluded from the first training data model as GPS data for large sections of some trips was missing and made the distinction between rail and light rail difficult.) Overall, the model combining accelerometer and GPS traces outperformed use of either the accelerometer or the GPS device in isolation. As outlined in Table 7.3, the GPS and accelerometer model correctly predicted the mode for 91 per cent of calibration data and 91 per cent of validation data, compared to 89 per cent for the accelerometer-only model and 79 per cent for the GPS-only model. Table 7.4 further demonstrates that the GPS and accelerometer model outperformed the accelerometer-only model and the GPS-only model in correctly predicting the mode for a higher proportion of data for all modes except rail. (This is likely to be because GPS data was not recorded for several long rail trips owing to poor signal availability.) The final model using both accelerometer data and GPS data correctly predicted mode for between 83 per cent (rail) and 100 per cent (running, cycling and motorcycle) of the data.
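The calibrate-and-validate procedure described above can be sketched in a few lines. The example below is a minimal illustration only, not the Trace Annotator itself: it assumes a categorical naive Bayesian classifier with add-one smoothing (the smoothing is an assumption of this sketch), a random 65:35 split, and made-up records whose feature names (AVGSPEED, STEPS) merely echo node names from the chapter.

```python
import math
import random
from collections import Counter, defaultdict
from typing import Dict, List, Sequence, Tuple

Record = Tuple[Dict[str, str], str]  # (discretised features, known mode)


def split_records(records: Sequence[Record], share: float = 0.65, seed: int = 0):
    """Shuffle the records and split them into calibration/validation sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * share)
    return shuffled[:cut], shuffled[cut:]


class NaiveBayes:
    """Categorical naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, records: Sequence[Record]) -> "NaiveBayes":
        self.mode_counts = Counter(mode for _, mode in records)
        self.value_counts = defaultdict(Counter)  # (mode, feature) -> class counts
        self.classes = defaultdict(set)           # feature -> classes observed
        for features, mode in records:
            for name, value in features.items():
                self.value_counts[(mode, name)][value] += 1
                self.classes[name].add(value)
        return self

    def predict(self, features: Dict[str, str]) -> str:
        total = sum(self.mode_counts.values())
        best_mode, best_score = None, -math.inf
        for mode, count in self.mode_counts.items():
            score = math.log(count / total)  # log prior for this mode
            for name, value in features.items():
                seen = self.value_counts[(mode, name)][value]
                options = max(len(self.classes[name]), 1)
                score += math.log((seen + 1) / (count + options))
            if score > best_score:
                best_mode, best_score = mode, score
        return best_mode


def hit_ratio(model: NaiveBayes, records: Sequence[Record]) -> float:
    """Share of records whose mode is predicted correctly."""
    hits = sum(model.predict(features) == mode for features, mode in records)
    return hits / len(records)


# Made-up, perfectly separable training records.
records: List[Record] = (
    [({"AVGSPEED": "C2", "STEPS": "high"}, "walking")] * 20
    + [({"AVGSPEED": "C6", "STEPS": "low"}, "car")] * 20
)
calibration, validation = split_records(records)
model = NaiveBayes().fit(calibration)
```

Calling `hit_ratio(model, validation)` gives the proportion of held-out records classified correctly, which is the kind of figure reported in Tables 7.3 and 7.4.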


Table 7.3: The proportion of data for which mode was correctly identified.

Model | Calibration data (%) | Validation data (%)
GPS-only | 79 | 79
Accelerometer-only | 89 | 89
GPS and accelerometer | 91 | 91

Based on calibration data (approximately 24,000 data points).

Table 7.4: Correctly identified hit ratios by activity type based on filtered data.

Activity | Accelerometer-only (%) | GPS-only (%) | Combined accelerometer and GPS (%)
Walking | 92 | 97 | 98
Running | 97 | 98 | 100
Cycling | 88 | 100 | 100
Bus | 78 | 87 | 98
Motorcycle | 100 | 100 | 100
Car | 93 | 98 | 99
Train | 89 | 58 | 83
Light rail | 98 | 98 | 99
Waiting | 33 | 84 | 83

Based on calibration data (approximately 24,000 data points).

The developed Trace Annotator was subsequently applied to the NTS GPS pilot survey to
• infer transport modes and activity-travel episodes and stages, using the BBN, combining GPS and accelerometer data; and
• infer activity type/trip purpose by fusing GPS data with GIS land use data and personal data.

The structure of the network illustrating the child nodes of the updated and revised Trace Annotator model is given in Figure 7.4.

Before the data was ready for input to the Trace Annotator, it required some basic processing:
• converting all the GIS data from OSGB1936 to the WGS84 coordinate system that the GPS data were recorded in;
• estimating missing distance and speed for the GPS data (the former of which used the law of Haversine) (Haversine Formula, 2012);
• geocoding of personal address data from the interviews, linking of the interview sourced data to GPS data and derivation of variables relating to bicycle, motorcycle and car ownership; and
• selecting GPS data for only the 7-day travel week (some traces contained additional points preceding or following the travel week where devices were already switched on before drop-off or left on after collection).

[Figure 7.4: network diagram with the MODE node and its child nodes. Node classes as printed in the figure:]
MODE: Walking, Running, Bicycle, Motorcycle, Bus, Car, Train, Light rail, Waiting
AVGACC: C1 0~0; C2 0~0.08; C3 0.08~0.19; C4 0.19~0.25; C5 0.25~0.5; C6 0.5~0.7; C7 0.7~50000
MAXACC: C1 0~0; C2 0~0.4; C3 0.4~0.7; C4 0.7~1.5; C5 1.5~5
ACCUMDIST: C1 0~0; C2 0~30; C3 11~90; C4 16~150; C5 26~240; C6 30~470; C7 50~760; C8 140~2000
AVGSPEED: C1 0~0; C2 0~2.5; C3 2.5~6; C4 6~12; C5 12~18; C6 18~32; C7 32~50; C8 50~135; C9 135~500
MAXSPEED: C1 0~0; C2 0~5; C3 5~10; C4 10~13.5; C5 13.5~19; C6 19~36; C7 36~42; C8 42~62; C9 62~140; C10 140~
RRDIST: C1 0~25; C2 25~50; C3 50~100; C4 100~500
RMDIST: C1 0~50; C2 50~500
RLRDIST: C1 0~50; C2 50~500
HACC: C1 0~3; C2 3~4.5; C3 4.5~5.5; C4 5.5~9; C5 9~11; C6 11~15; C7 15~18; C8 18~23; C9 23~50000
VACC: C1 0~10; C2 10~25; C3 25~100; C4 100~
SATS: C1 0~1; C2 1~5; C3 5~8; C3 8~15
STEPS: C1 0~1; C2 1~3; C3 3~9; C4 9~15; C5 15~27; C6 27~50; C7 50~70; C8 70~78; C9 78~50000
AVGXACC (printed three times): C1 0~80; C2 80~100; C3 100~120; C4 120~140; C5 140~160; C6 160~200
STDEVXACC: C1 0~2; C2 2~4; C3 4~8; C4 8~25; C5 25~50; C6 50~
STDEVYACC: C1 0~2; C2 2~3.5; C3 3.5~5.5; C4 5.5~8; C5 8~25; C6 25~
STDEVZACC: C1 0~3; C2 3~5; C3 5~8; C4 8~20; C5 20~50; C6 50~
CAROWN: C1 Yes; C2 No
MOTORCOWN: C1 Yes; C2 No
BIKEOWN: C1 Yes; C2 No

Figure 7.4: Updated and revised Trace Annotator model structure used to process NTS GPS pilot data.

The Trace Annotator produces probability matrices for each child node. It also uses conditional probabilities. Ideally these conditional probabilities would be based on a small-scale ‘pilot’ study and an assessment of the success of the classifier using that data. In the present study, we started with conditional probabilities which were successfully used in several previous pilot projects in the Netherlands. These were adjusted marginally until the transport modes of the training data were sufficiently well predicted. The Trace Annotator was originally designed so that corrections from a prompted recall survey would help the BBN to ‘learn’ and improve its performance. Section 7.4.3 explains why prompted recall was not included in the methodology of the GPS pilot survey.
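The distance and speed estimation step in the pre-processing list above can be illustrated with the haversine formula. This is a minimal sketch under stated assumptions: the Earth radius constant, the function names, the (timestamp, latitude, longitude) layout of a fix and the choice of km/h as the speed unit are all illustrative and are not taken from the pilot’s processing code.

```python
import math
from datetime import datetime
from typing import Optional, Tuple

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres (common approximation)

Fix = Tuple[datetime, float, float]  # (timestamp, latitude, longitude) in WGS84


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point rounding error.
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def speed_kmh(earlier: Fix, later: Fix) -> Optional[float]:
    """Average speed between two fixes, or None if time does not advance."""
    seconds = (later[0] - earlier[0]).total_seconds()
    if seconds <= 0:
        return None
    metres = haversine_m(earlier[1], earlier[2], later[1], later[2])
    return metres / seconds * 3.6


# Two made-up fixes ten seconds apart.
fix_a: Fix = (datetime(2011, 3, 1, 8, 0, 0), 51.5000, -0.1200)
fix_b: Fix = (datetime(2011, 3, 1, 8, 0, 10), 51.5010, -0.1200)
estimated_speed = speed_kmh(fix_a, fix_b)
```

Because the devices logged one point per second, summing `haversine_m` over consecutive fixes gives a trip distance, and the same pairs give a speed profile.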

7.4.2. Data Quality Issues

There were a small number of errors found within the GPS data as recorded by the devices, which include the following:
• Multiple consecutive records for the same time stamp (this happened on at least one occasion).
• Instances where individual or short periods of data are missing: these could be caused by the device going into sleep mode (after 600 seconds of no movement), or problems such as cold starts or urban canyons. This happened in 12 per cent of the trips.
• Instances where the date is wrong (e.g. 1980 or 2055): in total, 118,285 data points from a total of 103,848,420 epoch data (0.1 per cent), distributed among 842 of 899 trace files, had incorrect dates.
• Unfeasible latitude and longitude recorded for data pertaining to travel within Great Britain: there were 181,089 unrealistic longitudes (0.2 per cent) in 98 trace files.

These were typically dealt with by using the first (assumed correct) data point where there were duplicates, by correcting the date to match adjacent data points or by filtering incorrect or unfeasible data points.

7.4.3. Validation

In the original pilot design, it had been intended that inferred data would be validated using an expert assessment of the inferred purpose/mode for each trip/trip stage against mapped GPS trip traces. This was to be completed by a researcher rather than the respondent. To date, other (small-scale) GPS travel survey studies have employed web-based prompted recall validation surveys based on processed trip data, and the Trace Annotator was originally developed to encompass such a validation system. Unfortunately, it is simply not practicable to use online methods in a general population survey of the GB public: some 9.2 million UK adults have never used the Internet and 27 per cent of UK households do not have an Internet connection, and the complexity and costs of additional face-to-face or telephone follow-up surveys are not feasible (Office for National Statistics, 2010).

However, owing to time constraints imposed by the late running of the processing work, the planned expert assessment was replaced by a framework developed to assess the validity/feasibility of the processed traces. This consisted of rules applied to summary outcomes. In summary, these rules checked:
1. the speed and duration of trips with respect to the mode used;
2. the distance travelled with respect to the trip purpose; and
3. the start and end time of a trip in relation to the trip purpose.

The values of the parameters were determined using the expert knowledge of the project team.
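A rule framework of this kind could be sketched as follows. Every threshold, field name and purpose label below is a hypothetical placeholder chosen for illustration; the pilot’s actual parameter values were set by the project team and are not reproduced in this chapter.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical plausible average-speed ranges (km/h) by mode.
SPEED_RANGE_KMH: Dict[str, Tuple[float, float]] = {
    "walking": (0.5, 8.0),
    "cycling": (3.0, 40.0),
    "bus": (3.0, 100.0),
    "car": (3.0, 160.0),
}
# Hypothetical maximum plausible distance (km) by purpose.
MAX_DISTANCE_KM: Dict[str, float] = {"food shopping": 80.0}
# Hypothetical plausible start hours (0-23) by purpose.
PLAUSIBLE_START_HOURS: Dict[str, range] = {"education": range(6, 20)}


@dataclass
class Trip:
    mode: str
    purpose: str
    distance_km: float
    duration_min: float
    start_hour: int


def check_trip(trip: Trip) -> List[str]:
    """Return a list of rule violations; an empty list means 'plausible'."""
    problems: List[str] = []
    # Rule 1: speed and duration with respect to mode.
    if trip.duration_min <= 0:
        problems.append("non-positive duration")
    else:
        speed = trip.distance_km / (trip.duration_min / 60)
        low, high = SPEED_RANGE_KMH.get(trip.mode, (0.0, float("inf")))
        if not low <= speed <= high:
            problems.append("speed implausible for mode")
    # Rule 2: distance with respect to purpose.
    max_km = MAX_DISTANCE_KM.get(trip.purpose)
    if max_km is not None and trip.distance_km > max_km:
        problems.append("distance implausible for purpose")
    # Rule 3: start time with respect to purpose.
    hours = PLAUSIBLE_START_HOURS.get(trip.purpose)
    if hours is not None and trip.start_hour not in hours:
        problems.append("start time implausible for purpose")
    return problems


fast_walk = check_trip(Trip("walking", "food shopping", 30.0, 60.0, 10))
night_school = check_trip(Trip("car", "education", 5.0, 15.0, 3))
ordinary = check_trip(Trip("car", "food shopping", 5.0, 15.0, 10))
```

Returning a list of named violations, rather than a single pass/fail flag, makes it easier to see which rule a processed trace failed.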

7.4.4. Post-Processing of Traces

There were some additional requirements to create variables that allow the processed GPS data to be analysed on a comparable basis to results from the main NTS survey, which used the paper travel diary to collect personal travel data for the same period, namely:
• ‘series of calls’ and ‘round trips’ needed to be identified as in the NTS coding interviewers manual and treated in the same fashion;
• trips not usually recorded in an NTS diary needed to be filtered from the GPS data (off network travel and travel in the line of work); and
• non-travel episodes, for example children playing in the street, needed to be identified.

Dummy variables flagging trip stages to which these applied were created using rule-based algorithms. It was also agreed that, as a means of testing the value of potential new NTS variables, ‘waiting activity’ would also be flagged. The results were then converted into CSV file format and returned to NatCen. NatCen undertook some basic quality checks on variables, corrected any coding errors and created codes for missing values before forwarding the data in SPSS format to the DfT for analysis.
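To make the idea of rule-based dummy variables concrete, the sketch below flags a few stage-level conditions as 0/1 columns and writes the result as CSV. The input field names, the 100-metre off-network threshold and the rules themselves are hypothetical simplifications, not the algorithms used in the pilot.

```python
import csv
import io
from typing import Dict, List

OFF_NETWORK_M = 100.0  # hypothetical distance threshold from the network
FIELDNAMES = ["stage_id", "mode", "waiting", "off_network", "work_travel"]


def flag_stage(stage: Dict) -> Dict:
    """Derive 0/1 dummy variables for one processed trip stage."""
    return {
        "stage_id": stage["stage_id"],
        "mode": stage["mode"],
        "waiting": int(stage["mode"] == "waiting"),
        "off_network": int(stage["dist_to_network_m"] > OFF_NETWORK_M),
        "work_travel": int(bool(stage["in_course_of_work"])),
    }


def to_csv(stages: List[Dict]) -> str:
    """Write the flagged stages to a CSV string with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(flag_stage(stage) for stage in stages)
    return buffer.getvalue()


# Made-up stages for illustration.
stages = [
    {"stage_id": 1, "mode": "waiting", "dist_to_network_m": 5.0,
     "in_course_of_work": False},
    {"stage_id": 2, "mode": "car", "dist_to_network_m": 250.0,
     "in_course_of_work": True},
]
output = to_csv(stages)
```

Keeping the flags as separate columns, rather than deleting rows, leaves the decision about which stages to exclude to the analyst.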

7.5. Summary of Results

There were no major problems with fieldwork that could not have been overcome had more time or more devices been available to complete the project. Practical issues included an update for the MGE device management software not being available in time for use, and two variables (DIST and speed) not being recorded correctly because the devices’ settings were not correct; however, these were later estimated using position data during the data processing stage.

There are a number of disparities between the GPS personal travel data and the data from the NTS diary:
• On average, 7 per cent of GPS pilot respondents made journeys but forgot to carry the device with them for some or all of the trips made on each day of the travel week, and 3 per cent of respondents reported having forgotten to charge their GPS device each day of the travel week (Figure 7.A.1).
• There were far fewer trips and stages in the GPS data, but the time taken to complete these was much longer and the average journey length was also much longer (Table 7.A.1).
• Although the data processing confirmed that the use of combined GPS and accelerometer data improves the mode prediction, there were fewer walking trips in the processed GPS data and more rail trips (Figure 7.A.2).
• The distribution of GPS trips tailed off towards the end of the travel week (Figure 7.A.3).
• The GPS trip distribution does not contain the traditional morning and afternoon rush hour peaks, but instead peaks once in the early afternoon (Figure 7.A.4).

The GPS data contains a greater share of trips to or from home than the diary (Figures 7.A.5(a) and 7.A.5(b)). It also estimated fewer journeys to/from work, and there were a number of substantial differences in the proportion of trips coded with the purposes shopping, visiting friends or family and holidays. One quarter (25 per cent) of trips were missing either a ‘to’ or ‘from’ purpose code.

7.6. Conclusions

The differences between results for the GPS-sourced trip data and the diary trip data lead us to conclude that when used in the context of the NTS methodology and the Trace Annotator processing system, GPS data collection and processing do not provide an acceptable alternative to the paper travel diary. It is apparent that trips derived using GPS processing are not quite the same as the trips recorded in the diary. GPS-derived trips are possibly more akin to a series of diary trips which also include non-travel activity. It is also evident that the Trace Annotator only identified multi-stage trips where a waiting stage occurred between two public transport stages or boardings. The response to the 2011 public consultation on the future design of the NTS (Department for Transport, 2012a) stated that users were broadly supportive of the GPS approach, albeit with some reservations. However, because the pilot study did not identify trip start and end points, or infer trip mode and purpose, to the level of accuracy required, a move to GPS was deemed too risky and the paper travel diary will continue to be used in 2013 when the contract is renewed.


Acknowledgements

Much of the material for the Survey Methodology section of this paper draws upon the National Travel Survey 2011 GPS Pilot, a technical report on the pilot survey management and data collection by Josi Rofique, Alun Humphrey and Caroline Killpack of NatCen (Rofique et al., 2011a). Similarly, material pertaining to the Data Processing section of this paper summarises material from Processing of National Travel Survey GPS Pilot Data, a technical report prepared on behalf of the Department for Transport by Tao Feng, Anastasia Moiseeva and Professor Harry Timmermans at TU/e (Feng et al., 2011). The author would like to acknowledge the helpful advice provided by Roger Mackett (University College London), Nadine Rieser-Schüssler (ETH, Zurich), Peter Stopher (University of Sydney), Ashley Cooper (University of Bristol), John Polak (Imperial College London), Andrew Jones (University of East Anglia) and Kyle Roskilly (Royal Veterinary College) that influenced the design of the NTS GPS pilot, and thank David Strnad (MGE Data), Wayne Bull (Forsberg) and Ian Cleaver (Sprint) for the loan of GPS devices for trial purposes. Thanks also to Lyndsey Melbourne and the NTS team at DfT, and Alun Humphrey and the NTS team at NatCen for their invaluable, ongoing support and advice.

Postscript

The processed GPS data from the pilot study was not delivered until after the original deadline for ISCTSC 2011 conference papers (October 2011). This paper therefore focuses on the design and process of the study rather than the results. Some early findings were presented at the conference in November 2011, and a revised paper including a short overview of the results and conclusions was submitted for publication in January 2012. A more in-depth report, National Travel Survey 2011 GPS Pilot: Summary Analysis (Department for Transport, 2012b), and detailed reports on the study’s fieldwork (Rofique et al., 2011a) and data processing (Feng et al., 2011) are also available.

References

Anderson, T., Abeywardana, V., Wolf, J., & Lee, M. (2009, December). National travel survey GPS feasibility study final report. Retrieved from http://webarchive.nationalarchives.gov.uk/+/http://www.dft.gov.uk/pgr/statistics/datatablespublications/personal/methodology/ntsreports/ntsgpsstudy.pdf
Capscan. (2012). Matchcode products. Retrieved from http://www.capscan.com/matchcode.aspx. Accessed on 29 August, 2012.
Department for Transport. (2012a). Public consultation on the future design of the National Travel Survey, Department for Transport. Retrieved from http://assets.dft.gov.uk/consultations/dft-2011-16/nts-consultation-response.pdf
Department for Transport. (2012b). National Travel Survey 2011 GPS Pilot: Summary analysis. Retrieved from http://assets.dft.gov.uk/statistics/series/national-travel-survey/nts-gps-analysis-report.pdf
Feng, T., Moiseeva, A., & Timmermans, H. J. P. (2011). Processing of National Travel Survey GPS Pilot Data: A technical report prepared for the Department for Transport. Retrieved from http://assets.dft.gov.uk/statistics/series/national-travel-survey/nts-gps-processing-report.pdf
Haversine Formula. (2012). From Wikipedia, the free encyclopedia. Retrieved from http://en.wikipedia.org/wiki/Haversine_formula#The_law_of_haversines. Accessed 29 August, 2012.
Ipsos MORI. (2010). Postar. Retrieved from http://www.ipsos-mori.com/researchspecialisms/ipsosmediact/Measurement/postar.aspx. Accessed November 2012.
Moiseeva, A., Jessurun, A. J., & Timmermans, H. J. P. (2010). Semiautomatic imputation of activity travel diaries: Use of Global Positioning System traces, prompted recall, and context-sensitive learning algorithms. Transportation Research Record, 2183, 60–68. Retrieved from http://dx.doi.org/10.3141/2183-07
Mrabet, Y. (2008). File: Human anatomy planes.svg. Retrieved from http://en.wikipedia.org/wiki/File:Human_anatomy_planes.svg. Accessed on 10 September, 2012.
Oanda. (2012). Historical exchange rates. Retrieved from http://www.oanda.com/currency/historical-rates. Accessed on 29 August, 2012.
Office for National Statistics. (2010). Internet access - Households and individuals. Retrieved from http://www.ons.gov.uk/ons/rel/rdit2/internet-access---households-and-individuals/2010/stb-internet-access---households-and-individuals--2010.pdf
Rofique, J., Humphrey, A., & Killpack, C. (2011a). National Travel Survey 2011 GPS Pilot. Department for Transport. Retrieved from http://assets.dft.gov.uk/statistics/series/national-travel-survey/nts-gps-field-report.pdf
Rofique, J., Humphrey, A., Pickering, K., & Tipping, S. (2011b). National Travel Survey 2010 technical report. Retrieved from http://assets.dft.gov.uk/statistics/series/national-travel-survey/nts2010-technical.pdf
Stopher, P. R. (2010, June). Phone conversation with Peter Stopher, Professor of Transport Planning, The University of Sydney.
Swann, N., & Stopher, P. (2008). Evaluation of a GPS survey by means of focus groups. Paper presented at the 87th Annual Meeting of the Transportation Research Board, Washington, DC.


Appendix 7.A.1

[Figure 7.A.1: bar chart by day of the travel week (Day 1 to Day 7) with three series: No journey, No charge and No carry.]

Figure 7.A.1: Proportion of respondents who did not travel, did not charge the device, or made journeys and forgot to carry the device (by day). Note: Base = all respondents (871); excludes 26 ‘missing’/‘don’t know’ responses.

Table 7.A.1: GPS pilot and NTS diary journeys (February to March 2011).

Measure | GPS | Diary
Sample size | 897 | 1,726
Total number of journeys | 11,090 | 30,904
Total journey time (minutes) | 561,114 | 652,160
Total journey distance (miles) | 265,433 | 220,205
Average journey distance (miles) | 24 | 6
Average journey time (minutes) | 51 | 21
Journeys per person per year | 645 | 934
Average journey time per person per year (hours) | 544 | 328
Average journey distance per person per year (distance) | 15,429 | 6,652

Note: GPS data excludes journeys flagged ‘children playing in the street’ and trips made by those working in transportation jobs during working hours. Diary data for short walks collected on Day 7 are multiplied by a factor of 7, are based only on data from fully participating households, and exclude data for those aged under 12.

[Figure 7.A.2: bar chart comparing GPS (11,979) and Diary (34,010) stages by mode: Car/van/taxi/minicab, Walking, Bus, Train, Cycling, Light rail and Motorcycle.]

Figure 7.A.2: Stages by mode. Base shown in brackets.

[Figure 7.A.3: bar chart comparing GPS (11,090) and Diary (30,904) journeys for Day 1 to Day 7 of the travel week.]

Figure 7.A.3: Journeys by day of travel week. Base shown in brackets.

[Figure 7.A.4: chart of the proportion of trips (0% to 10%) by hourly start-time band from 00.00-00.59 to 23.00-23.59, comparing GPS (11,090) and Diary (30,547).]

Figure 7.A.4: Journeys by start time. Base shown in brackets.

[Figure 7.A.5(a): bar chart comparing GPS (11,090) and Diary (30,841) ‘purpose from’ shares for Home, Work, Personal Other, Shopping Other, Shopping Groceries, Visit friends/family, Education, Entertainment, Personal Medical, Sports, Holiday, Eating drinking and Unknown.]

Figure 7.A.5(a): Journey purpose by origin.

[Figure 7.A.5(b): bar chart comparing GPS (11,090) and Diary (30,853) ‘purpose to’ shares for Home, Work, Personal Other, Shopping Other, Shopping Groceries, Visit friends/family, Holiday, Education, Entertainment, Eating drinking, Personal Medical, Sports and Unknown.]

Figure 7.A.5(b): Journey purpose by destination.

Chapter 8

WORKSHOP SYNTHESIS: VALIDATING SHIFTS IN THE TOTAL DESIGN OF TRAVEL SURVEYS

Anthony J. Richardson and T. Keith Lawton

8.1. Purpose and Introduction

The purpose of Workshop A6 was described by the conference organisers as follows:

The process of trying to achieve an optimum balance in survey design decisions to achieve the best total quality is known as the ‘total survey design’ approach. In this approach, major efforts are taken to better understand, and therefore, to control both sampling and non-sampling errors throughout the design, capture, processing, and analysis of survey data. New approaches available to design travel surveys promise the capability of collecting better quality data while accommodating increasing budget restrictions and expectations. But the implications of these shifts in total survey design have not been well researched or documented. This workshop focuses on important issues in understanding the implications of implementing changes in survey design, such as improvements in telephone instruments, using GPS devices, or developing online survey systems. What are the implications in terms of the validity and reliability of the resulting information and for its utility for transportation planning and policy-making?

The workshop was attended by 16 people¹ from 11 different countries. In addition, Liz Ampt presented a summary of a paper she was presenting in another workshop, because of its relevance to the topic of this workshop (Ampt, 2011). The workshop also had the excellent services of a student from Chile, José Moore, to keep things running smoothly.

1. Participants: Elizabeth Ampt (Australia), Jimmy Armoogum (France), Hernán Carvajal Cortés (France), Bastian Chlond (Germany), Kelly Clifton (USA), Bernhard Fell (Germany), Birgit Kohla (Austria), Matthias Kowald (Switzerland), Keith Lawton (USA), Mancoba Mlotsa (South Africa), José Moore (Chile), Viviana Muñoz (Chile), Kseniya Nafigina (Russian Federation), Mariela Nerome (Argentina), Leda Pereyra (Argentina), Tony Richardson (Australia) (Chair), Petr Senk (Czech Republic), Alan Thomas (Chile).

Transport Survey Methods: Best Practice for Decision Making. Copyright © 2013 by Emerald Group Publishing Limited. All rights of reproduction in any form reserved. ISBN: 978-1-78190-287-5

8.2. Summary of Total Design Principles

The workshop was designed to explore how the principles of Total Design could be adhered to when implementing new forms of travel survey, especially where new technologies are involved. Given the diversity of the workshop participants, however, it was felt that a good understanding of what was meant by Total Design Principles was an essential pre-requisite for fully participating in the workshop. The Workshop Chair therefore provided an introductory presentation on Total Design (or Tailored Design, as it is now called by its creator, Prof. Don Dillman).

The Total Design method was developed by Don Dillman, and has been refined over the years and presented in the form of three books (Dillman, 1978, 2000, 2007), with the name changing from Total Design to Tailored Design in the 2000 publication (nonetheless, most people still refer to it as Total Design). The most recent major publication has expanded the scope to ‘mixed-mode’ surveys and added a couple of co-authors (Dillman, Smyth, & Christian, 2009).

The development of the Total Design method over the past 35 years parallels the developments in survey methodology. The 1978 publication, to quote Dillman et al. (2009), ‘introduced and helped to legitimise both mail surveys, which were considered inferior to face-to-face interviews, and telephone surveys, only rarely used for surveying before that time’. The 2000 publication covered new electronic modes of surveying including the Internet and email surveys. The 2007 update also introduced the role of mixed-mode surveys, in line with the new title of Tailored Design, where different methods may be more appropriate for different circumstances. The 2009 publication chronicles the demise of the telephone interview and the rise of the web survey. However, Dillman concludes that ‘despite its expanded use, the web is not yet a satisfactory replacement for telephone or mail in many survey situations’. The main change in the 2009 publication is that it ‘places front and centre the design and implementation of mixed-mode surveys’.

So, given the extensive pedigree of Total Design/Tailored Design, what exactly is meant by this concept? Dillman et al. (2009) suggest that the names imply two aspects of design. First, ‘Total’ implies that attention must be paid to all aspects of the design; there is no ‘silver bullet’ in the design process, but hundreds of small individual designs must receive attention, from the specific wording of the precontact letter through to the visual design of the final questionnaire. Second, ‘Tailored’ implies that no one method is applicable in all situations. Depending on the topic, the demographic composition of the population, the geographic location and the availability of secondary datasets (e.g. sampling frames), different methods may be most appropriate and in some cases a combination of methods (mixed-mode surveys) may be most appropriate.


More importantly, and underlying both Total Design and Tailored Design, is the idea of minimising Total Design Error (Groves, 1989), which is the summation of the following:

1. Coverage Error: When not all members of the population have a known, non-zero chance of inclusion in the sample, and where the excluded members of the population are systematically different to the included members with respect to a variable of interest.
2. Sampling Error: When the sample size is not large enough to be able to draw conclusions with the desired levels of precision, especially within sub-populations of interest.
3. Non-Response Error: When not all those sampled respond to the survey and where those who do respond are systematically different to those who don’t respond with respect to a variable of interest.
4. Measurement Error: When the answers obtained from respondents are inaccurate or imprecise. Such inaccuracy is largely due to the inherent characteristics of the measurement device, and to the attention paid to the design of the survey instrument.

In evaluating any survey method, Total Design requires that attention be paid to all four of the above types of error. A new survey method should not be used just because it might be good at minimising one of these types of error, if it results in increased error in one or more of the other types, thus increasing the overall Total Design Error. In the context of ‘Validating Shifts in the Total Design of Travel Surveys’, this workshop concentrated on examining Total Design Error for three traditional survey methods and two emerging survey methods, namely:

• Face-to-face interviews
• CATI interviews
• Paper-based self-completion diaries
• Web surveys
• GPS surveys

Given the recent emphasis on mixed-mode surveys, some attention was also paid to the use of mixed-mode methods for travel surveys.
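The requirement to weigh all four error types together, rather than chase an improvement in one of them, can be illustrated with a few lines of Python. This is only a sketch: the `ErrorProfile` class, the `is_worthwhile_shift` helper and every score below are hypothetical placeholders (they are not workshop results), and adding the components on a common scale simply mirrors the ‘summation’ wording used above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorProfile:
    """Hypothetical error scores for one survey method (arbitrary common scale)."""
    coverage: float
    sampling: float
    non_response: float
    measurement: float

    def total(self) -> float:
        # Total Design Error treated as the sum of the four components.
        return self.coverage + self.sampling + self.non_response + self.measurement


def is_worthwhile_shift(current: ErrorProfile, proposed: ErrorProfile) -> bool:
    """A shift is justified only if the total falls, not just one component."""
    return proposed.total() < current.total()


# Illustrative placeholder scores only; they do not come from the chapter.
traditional = ErrorProfile(coverage=2.0, sampling=3.0, non_response=2.0, measurement=4.0)
emerging = ErrorProfile(coverage=3.0, sampling=4.0, non_response=4.0, measurement=1.0)

# The emerging method improves measurement error...
print(emerging.measurement < traditional.measurement)
# ...but is not a worthwhile shift because the total error rises.
print(is_worthwhile_shift(traditional, emerging))
```

In practice the four components are rarely measured on one scale, so a comparison of this kind is better read as a checklist that forces each component to be considered than as a literal calculation.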

8.3. Presentation of Workshop Papers

The workshop was supported by three papers presented in the workshop, plus seven poster papers related to the theme of the workshop. In addition, because one of the workshop paper authors was unable at the last minute to attend the conference, and her results were only summarised by the workshop chair, a relevant paper presented in another workshop was also summarised in this workshop.

Abby Sneade from the UK Department for Transport presented a paper on ‘Using accelerometer equipped GPS devices in place of paper travel diaries to reduce respondent burden in a national travel survey’ (Sneade, 2011). For the reasons of reducing respondent burden and reducing survey costs, the UK Department for Transport (DfT) had previously concluded that GPS data collection had real promise for the Great Britain National Travel Survey (NTS) and offered a suitable option for delivering affordable and practical improvements in the quality and reliability of the NTS diary data. In 2011, therefore, DfT undertook a pilot study of how the NTS might be run using GPS technology in place of the travel diary. It was expected that the results would determine how well the GPS methodology worked, identify any unforeseen problems, and show whether this method provided similar results to traditional methods. It was also expected to reveal whether the algorithms used to predict mode and purpose were accurate and well functioning and whether the certainty of these predictions was within acceptable parameters. If the pilot proved successful, it was expected that GPS methodology could be used from the inception of the next NTS fieldwork contract in 2013.

The results of the pilot study presented at the conference, however, did not confirm these expectations. While the GPS survey was expected to reduce respondent burden, it actually obtained a lower response rate (52%) than the traditional 7-day paper-based self-completion diary (59%). While 98% of those who participated found the GPS easy to use, there were still a sizeable number of people who distrusted the technology. In particular, 18% of household members with children aged 12–15 said that those children should not be asked to carry a GPS device. There was also concern at the trip results obtained from the GPS survey, in comparison with the previous method. Fewer trips and trip stages were recorded in the GPS data, but the time and distance of the recorded trips were longer. In addition, the GPS data had far fewer walking trips and more rail trips. The distribution of trips over the day was different, with the GPS data not clearly recognising the morning and evening peaks. In addition, the GPS data had more trips to/from home, and fewer trips to/from work. As a result of the pilot study, it was concluded that the GPS data did not produce similar results to the traditional diary survey, and that such a major discontinuity in data trends could not be tolerated in a major national survey. Consequently, it was decided that the tender for the NTS beyond 2012 would continue to be on the basis of a diary survey. Reductions in cost and respondent burden would instead be achieved by removing some of the existing questions that had not been used in analysis, thus shortening the questionnaire.

Kelly Clifton and Keith Lawton presented a paper titled ‘Capturing and representing multimodal trips in travel surveys: a review of practice’ (Clifton, Muhs, & Lawton, 2011). Their concern was to identify efforts to capture multimodal trip making by identifying the issues that arise for data collection and representation of these trips. The paper included a review of a multitude of US-based household travel surveys with the purpose of identifying how the stages of multimodal trips are collected in the travel survey and represented in the data structure. Based upon this review and the authors’ experience with various travel surveys, the authors then made recommendations for approaches to practice, with an emphasis on various aspects of data collection and data representation. As their presentation proceeded, it became clear that the subtitle of the paper should have been ‘a review of practice in the USA’, since many of the shortcomings identified pertained mainly to CATI surveys, and to current US practice. Most of the workshop participants were from outside the United States, and many commented that what was being suggested was already common practice in non-US locations. Nonetheless, the recommendations were a timely reminder of the procedures needed to ensure proper collection of multimodal travel behaviour.

Linda Christensen offered a paper titled ‘The role of web interviews as part of a national travel survey’ based on her work with the Danish National Travel Survey (Christensen, 2011). Unfortunately Linda was unable to attend, and her presentation was made by the workshop chair, given its important findings with respect to mixed-mode surveys. For the 2006 Danish National Travel Survey, a mixed-mode survey involving CATI and web surveys was proposed in order to reduce costs, increase response rate (especially from younger respondents) and obtain better quality data. The web survey was added to the traditional CATI NTS by asking the respondents to first check in on the web and answer the questions there. If they had not participated on the web after 2 days, they were called by telephone and administered the usual CATI survey. The analyses show that the overall response rate increased by only 1%, owing to contact with people with no known telephone numbers. The young and people with low resources, who are difficult to reach by telephone, did not participate on the web either. It was concluded that the web interviews give a more correct picture of the number of trips than the CATI interviews, although the difference (1.2%) was very small. The difference in kilometres was more important. The results also showed that only using the web would bias the results. More people with social resources would participate while elderly people would be less involved. It would also bias the results to more kilometres but not to more trips. It would also reduce participation by those who are busy in their daily life, e.g. families with children. It was concluded that web interviews saved money. On a per-interview basis, web interviews were 74% cheaper than CATI interviews. However, overall, there was only a 15% saving due to a low response rate on the web and the need for much more post-processing of the web interview data compared to the CATI interview data. In general, the addition of the web surveys gave some marginal improvements, but not to the extent expected.

The finding about the effect on response rates is echoed by Dillman et al. (2009), who report on findings from Dillman et al. (2008) and others. In that study, respondents were given the option at the beginning between doing the survey by mail or on the web. Rather than increasing, the response rate decreased by between 1 and 9 percentage points, a finding confirmed by other studies. Dillman therefore advises against offering a choice of options, because it just creates another decision for the respondent to make, thereby increasing respondent burden and reducing response rate. He found that response rates could however be increased by offering one option at a time. If the first option was not adopted, then respondents could be offered a second option (and a third etc.). Since this also serves as a reminder each time a new option is provided, the response rate increases. Indeed, even if the method was not varied on the second and third contacts, one would likely see an increase in response rate. While this might be acceptable for time-insensitive measurements, it has been shown (Richardson, 2003) that while repeated contacts (which allow the specified Travel Day to slip over time, to avoid recall problems) do increase response rate, they also bias the measurement of travel, with decreasing trip rates in later contacts. One therefore needs to consider this issue if using mixed-mode methods sequentially in travel surveys.

An extra paper was presented in the workshop by Liz Ampt titled ‘Diagnostic testing: An innovative way to test survey design’ (Ampt, 2011). This paper was particularly concerned with the Measurement Error aspect of Total Design Error, and proposed a method by which respondents’ understanding of questionnaires and other survey materials could be tested. In this way, measurement error could be reduced because respondents better understood what they were being asked, while response rates could increase because of reduced respondent burden (due to reduced confusion for the respondent). Importantly, it was also highlighted how these methods could be used in mixed-mode surveys to ensure that the questions in each of the survey methods were indeed asking for, and receiving, the same information.
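The gap between the 74% per-interview saving and the 15% overall saving reported for the Danish survey can be explored with a back-of-envelope calculation. The sketch below is not taken from Christensen (2011): the function name `min_web_share` and the simple cost model in its docstring are assumptions made here for illustration, and the share it returns is only what the two published percentages would imply under that model.

```python
def min_web_share(overall_saving: float,
                  per_interview_saving: float,
                  extra_cost_share: float = 0.0) -> float:
    """Share of interviews completed on the web implied by an overall saving.

    Assumed model (hypothetical, for illustration): with a web share w, the
    total cost relative to an all-CATI survey is
        1 - per_interview_saving * w + extra_cost_share,
    where extra_cost_share is any additional cost (e.g. post-processing)
    expressed as a fraction of the all-CATI cost. Hence
        overall_saving = per_interview_saving * w - extra_cost_share.
    """
    if not 0.0 < per_interview_saving <= 1.0:
        raise ValueError("per_interview_saving must be in (0, 1]")
    return (overall_saving + extra_cost_share) / per_interview_saving


# Percentages quoted in the text: 74% cheaper per interview, 15% overall.
share_without_extra_costs = min_web_share(0.15, 0.74)
print(f"Implied web share with no extra costs: {share_without_extra_costs:.1%}")

# Any extra post-processing cost pushes the implied web share upwards.
share_with_extra_costs = min_web_share(0.15, 0.74, extra_cost_share=0.05)
print(f"Implied web share with 5% extra cost: {share_with_extra_costs:.1%}")
```

Under this model, roughly one interview in five completed on the web would already account for a 15% overall saving if post-processing were free, so the larger the post-processing overhead, the larger the web share needed to reach the same saving.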

8.4. Workshop Format

The objective of the workshop was to elicit information from participants about the various types of survey error associated with three traditional and two emerging survey methods. However, the workshop had 16 participants from 11 different countries, covering a wide range of professional backgrounds and levels of experience, and also a wide range of proficiencies in the conference language (English). From past experience with such workshops, it was known that an unstructured discussion format would be dominated by those with more experience and those who were more proficient in the conference language (and those who simply like to talk). Therefore, the workshop was structured using the Six Hats method developed by Edward De Bono (1985).

Six Hats Thinking recognises that most unstructured discussions are inefficient ways of gathering information, with much time spent reacting to what others have said, defending pre-existing positions, or trying to convince others to agree with us. Six Hats Thinking recognises that such discussions can usefully be split into five different directions, and assigns a different colour ‘hat’ to each of these modes of thinking, as follows:

• White Hat: information gathering
• Red Hat: emotions and intuition
• Yellow Hat: logical positive aspects
• Black Hat: logical negative aspects
• Green Hat: creative thinking

In addition, a sixth hat, the Blue Hat, is used for someone (the chairperson) to think about the thinking while the others are thinking about the topic. Six Hats Thinking has been found to be much more thorough in exploring a topic and more efficient with time, and it enables all participants to actually contribute to the outputs.

For this workshop, only three of the hats were used; the workshop chair wore the Blue Hat to direct the thinking process, while each participant wore the Yellow Hat and then the Black Hat to think about the good and bad aspects of each of the five survey methods. They wore each hat for only 2 minutes, and initially worked on their own, writing down as many ideas as they could under each hat for each survey method. After this initial 20 minutes of individual work, the workshop chair gathered these ideas by going around the room and asking each person for just one idea that they had written down under that hat that had not already been mentioned. If they had no further ideas, they could just ‘pass’. The process continued for each hat until no new ideas were forthcoming. The process then started at a different place in the room (to ensure that everyone had a chance of being an early contributor) for the next hat and survey method. This collective aspect of the Six Thinking Hats typically takes more time, since participants often need to explain or elaborate on their written notes, and this discussion often triggers more discussions and further ideas from other participants. In the workshop, each hat/method took approximately 30 minutes to gather the ideas. Given the relatively limited time available in the workshop, only three of the survey methods (CATI, GPS, web) were fully debriefed in this way.

The effectiveness of the Six Hats method can be seen by reference to Table 8.1, which shows the number of different ideas collected under each hat for the three survey methods that were fully debriefed. To gauge the increase in efficiency in generating this information, a straw poll was taken of participants as to how many individual ideas they had personally written down under each of the hat/method combinations. The number varied between 1 and 6 across all the combinations. Thus, even the most experienced of the participants could not generate more than six ideas in the 2 minutes allowed (given more time, they might have generated a few more, but probably not). The collective number of ideas from the 16 participants was four to six times greater than that of the most productive individual.
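The round-robin debrief described in this section (one previously unmentioned idea per person per turn, passes allowed, stopping when a full circuit yields nothing new) can be sketched as a short Python routine. The workshop itself was run by hand; the function name `round_robin_collect`, its arguments and the sample notes are hypothetical and serve only to make the procedure concrete.

```python
from typing import List, Sequence


def round_robin_collect(notes: Sequence[Sequence[str]], start: int = 0) -> List[str]:
    """Pool participants' written ideas one at a time, going around the room.

    notes[i] holds the ideas written down by participant i, in their own order.
    start is the seat at which each circuit begins (varied between hats so that
    everyone has a chance of being an early contributor).
    """
    pending = [list(ideas) for ideas in notes]  # copy so the input is untouched
    mentioned = set()
    pooled: List[str] = []
    n = len(pending)

    while True:
        anyone_contributed = False
        for offset in range(n):
            person = (start + offset) % n
            # Discard ideas that somebody else has already mentioned.
            while pending[person] and pending[person][0] in mentioned:
                pending[person].pop(0)
            if pending[person]:
                idea = pending[person].pop(0)
                mentioned.add(idea)
                pooled.append(idea)
                anyone_contributed = True
            # Otherwise this participant simply passes.
        if not anyone_contributed:
            return pooled


# Hypothetical Yellow Hat notes for one survey method.
sample_notes = [
    ["cheap to administer", "prompting possible"],
    ["prompting possible", "multiple callbacks"],
    [],
]
print(round_robin_collect(sample_notes, start=1))
```

Starting each circuit at a different seat only changes the order in which ideas are voiced, not the final set, which is why it is a cheap way to share out the early contributions.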

Table 8.1: Ideas generated for three survey methods.

Survey method    Yellow Hat    Black Hat
CATI             24            23
GPS              29            36
Web              30            27
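The ‘four to six times’ comparison can be checked directly against the counts in Table 8.1, taking six ideas as the best individual tally reported in the straw poll. The sketch below assumes the table reads CATI, GPS, Web within each hat; the names `IDEAS`, `BEST_INDIVIDUAL` and `group_to_individual_ratios` are introduced here for illustration only.

```python
# Distinct ideas collected per survey method and hat (Table 8.1).
IDEAS = {
    "CATI": {"yellow": 24, "black": 23},
    "GPS": {"yellow": 29, "black": 36},
    "Web": {"yellow": 30, "black": 27},
}

# Largest number of ideas any one participant wrote for a hat/method pair.
BEST_INDIVIDUAL = 6


def group_to_individual_ratios(ideas, best_individual):
    """Ratio of the pooled idea count to the best individual count."""
    return {
        (method, hat): count / best_individual
        for method, hats in ideas.items()
        for hat, count in hats.items()
    }


ratios = group_to_individual_ratios(IDEAS, BEST_INDIVIDUAL)
for (method, hat), ratio in sorted(ratios.items()):
    print(f"{method:4s} {hat:6s} {ratio:.1f}x")
print(f"range: {min(ratios.values()):.1f}x to {max(ratios.values()):.1f}x")
```

The ratios run from just under four (CATI, Black Hat) to exactly six (GPS, Black Hat), which is consistent with the rounded range quoted in the text.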


8.5. Workshop Outcomes

While the lists assembled from this process are interesting in themselves, the objective of the workshop was not just to compile lists of good and bad points for each of the survey methods. Rather, the objective was to examine the different types of survey error, which contribute to Total Design Error, for traditional and emerging survey methods, so as to validate shifts in the total design of travel surveys. This is done below by tabulating the main Advantages (Yellow Hat) and Disadvantages (Black Hat) of each survey method, with respect to each of the four types of survey error that constitute Total Design Error (i.e. Coverage Error, Sampling Error, Non-Response Error and Measurement Error). These results are predominantly based on the outputs from the workshop participants.

8.5.1. Face-to-Face Surveys

The main advantages of a face-to-face interview survey with respect to the four sources of Total Design Error are shown in Table 8.2, while the main disadvantages are shown in Table 8.3. In this context, a face-to-face interview refers to an interview of an entire household at their home.

Table 8.2: Advantages of face-to-face interview survey.

Type of error       Advantages
Coverage Error      • Generally good lists of household addresses available
Sampling Error      • Allows longer surveys, therefore more data per household
Non-Response Error  • Highest response rates (60–75%) from a random sample of households
Measurement Error   • The presence of an interviewer allows more accurate objective data to be collected
                    • The interview is customisable to the needs of the respondent
                    • The interviewer can explain terms, prompt, and note body language of the respondent

Table 8.3: Disadvantages of face-to-face interview survey.

Type of error       Disadvantages
Coverage Error      • Could be difficult to survey households in remote areas
Sampling Error      • More expensive per household, therefore fewer households can be surveyed within fixed budget
Non-Response Error  • Some respondents may have a fear of a ‘stranger’ entering their home
Measurement Error   • Respondents may have poor representations of travel time when they self-report

8.5.2. Self-Completion Surveys

The main advantages of a self-completion survey with respect to the four sources of Total Design Error are shown in Table 8.4, while the main disadvantages are shown in Table 8.5. In this context, a self-completion survey can be given to and collected from the respondent in various ways, e.g. personally delivered and collected, or mailed out and mailed back. Typically the former method gets higher response rates and better quality data but at a higher unit cost, whereas the latter has a lower unit cost but with lower response rates and poorer data quality.

Table 8.4: Advantages of self-completion survey.

Type of error       Advantages
Coverage Error      • Generally good lists of household addresses available. Postal surveys can reach remote areas as cheaply as built-up urban areas
Sampling Error      • Less expensive per household than face-to-face interviews, therefore more households can be surveyed within fixed budget
Non-Response Error  • Good response rates (60%) can be obtained from a random sample of households when the self-completion questionnaire is personally delivered and collected
Measurement Error   • The fact that the respondent can choose the time to complete the survey gives them more time to think about the answer or to collect required information

Table 8.5: Disadvantages of self-completion survey.

Type of error       Disadvantages
Coverage Error      • Could be difficult to survey households in remote areas if questionnaires personally delivered and collected
Sampling Error      • No specific disadvantage
Non-Response Error  • Poor response rates (30%) are obtained from a random sample of households when the self-completion questionnaire is mailed to the respondent and then mailed back
Measurement Error   • The absence of an interviewer who could prompt may mean that there could be an under-reporting of travel data
                    • Some approximations with time and distance reporting

8.5.3. CATI Surveys

The main advantages of a CATI survey with respect to the four sources of Total Design Error are shown in Table 8.6, while the main disadvantages are shown in Table 8.7. In this context, a CATI survey is assumed to use random-digit dialling as the sampling procedure, with a diary posted to the respondent upon recruitment. The travel data is then retrieved over the phone.

Table 8.6: Advantages of CATI survey.

Type of error       Advantages
Coverage Error      • Phone surveys can reach remote areas almost as cheaply as built-up urban areas
Sampling Error      • Cheaper to administer, therefore more surveys can be performed within fixed budget
Non-Response Error  • Multiple callbacks possible to reduce non-response
Measurement Error   • Prompting possible to elicit details

Table 8.7: Disadvantages of CATI survey.

Type of error       Disadvantages
Coverage Error      • Fixed-line phone coverage is decreasing
                    • Hard to contact all members of household
                    • Hard to sample mobile phones
Sampling Error      • Phone sampling is not a random sample of the population
Non-Response Error  • Poor response rates (<30%) are obtained from a random sample of households (if it could be chosen) because of recruitment and non-response loss
                    • Call screening and other strategies make it difficult to make contacts with sample
Measurement Error   • Limits on length of call limit amount of detail that can be collected
                    • No visual aids
                    • Approximations of travel time and distance

8.5.4. GPS Surveys

The main advantages of a GPS survey with respect to the four sources of Total Design Error are shown in Table 8.8, while the main disadvantages are shown in Table 8.9. In this context, a GPS survey is assumed to be a survey where GPS is the main means of data collection, using algorithms to process the travel traces to estimate mode and trip purpose.

Table 8.8: Advantages of GPS survey.

Type of error       Advantages
Coverage Error      • No language difficulties
                    • Depends on the means of contacting the potential sample. Random sampling is best
Sampling Error      • Multi-day surveys are easier to administer, therefore more data can be collected within fixed budget
Non-Response Error  • Lower respondent burden, therefore potentially higher response rates
Measurement Error   • Potentially high-resolution time and space data
                    • Actual data on route choice

Table 8.9: Disadvantages of GPS survey.

Type of error       Disadvantages
Coverage Error      • Black spots and cold starts limit coverage area
Sampling Error      • Relatively expensive per household (at the moment), hence generally smaller samples
Non-Response Error  • Still some reluctance to carrying GPS units, especially for school children
                    • Generally low take-ups of GPS units (25–50%)
                    • People forget to charge or carry GPS with them
Measurement Error   • Limited data on mode, purpose and vehicle occupancy
                    • Processing algorithms complex (but improving)
                    • Difficulties in identifying trip start and end points

8.5.5. Web Surveys

The main advantages of a web survey with respect to the four sources of Total Design Error are shown in Table 8.10, while the main disadvantages are shown in Table 8.11. In this context, a web survey is assumed to be a survey where the respondents are recruited by some means, and then asked to complete the travel survey via a web interface.

Table 8.10: Advantages of web survey.

Type of error       Advantages
Coverage Error      • Depends on the means of contacting the potential sample. Random sampling is best
Sampling Error      • Potentially cheaper to administer, therefore more data can be collected within fixed budget
Non-Response Error  • Potentially low respondent burden
                    • Can appeal to some problem demographic groups
                    • Respondent can answer in their own time
Measurement Error   • Can have context-dependent questions
                    • Potential for using online mapping to improve route and destination answers
                    • Can do online error-checking to improve quality of responses

Table 8.11: Disadvantages of web survey.

Type of error       Disadvantages
Coverage Error      • Limited web availability for some demographics
                    • Firewalls and security measures may block potential respondents
                    • Web survey panels are significantly biased
Sampling Error      • No specific disadvantages
Non-Response Error  • Respondents can drop survey in mid-survey
                    • Generally low response rates (30%) unless supplemented with another method
                    • Too much online checking may induce respondents to discontinue
                    • Login procedures may discourage respondents
Measurement Error   • Cannot simply transfer paper survey designs, e.g. replacing lists with dropdowns
                    • Difficult to design geocoding procedures for untrained respondents
                    • The absence of prompting may mean that there could be an under-reporting of travel data
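Several of the Sampling Error entries in this section link unit cost to precision through a fixed budget: a cheaper method buys more households, and more households give a smaller sampling error. A minimal Python sketch of that link follows. The budget and unit costs are hypothetical placeholders, not figures from the workshop, and the standard error formula assumes simple random sampling of a proportion, ignoring non-response and design effects.

```python
import math


def households_within_budget(budget: float, cost_per_household: float) -> int:
    """Number of households that can be surveyed for a given budget."""
    if cost_per_household <= 0:
        raise ValueError("cost_per_household must be positive")
    return int(budget // cost_per_household)


def standard_error_of_proportion(p: float, n: int) -> float:
    """Standard error of an estimated proportion under simple random sampling."""
    if n <= 0:
        raise ValueError("n must be positive")
    return math.sqrt(p * (1.0 - p) / n)


# Hypothetical budget and unit costs, chosen only to illustrate the trade-off.
BUDGET = 100_000.0
UNIT_COST = {"face_to_face": 200.0, "self_completion": 50.0}

for method, cost in UNIT_COST.items():
    n = households_within_budget(BUDGET, cost)
    se = standard_error_of_proportion(0.5, n)
    print(f"{method}: n = {n}, standard error = {se:.4f}")
```

Because the standard error falls with the square root of the sample size, a method that is four times cheaper per household only halves the sampling error, and that gain can easily be outweighed by larger non-response or measurement error.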

8.6. Conclusions

The design and incremental refinement of traditional survey methods (face-to-face interviews and self-completion diaries) over many decades has led to a gradual reduction in Total Design Error by paying deliberate attention to the four components of Coverage Error, Sampling Error, Non-Response Error and Measurement Error. Newer survey methods have been introduced largely as a means of reducing one or more of these sources of error. For example, GPS surveys offer the potential of much more accurate measurement of travel time and distance, while web surveys offer the potential of lower cost and the promise of reaching traditionally low response rate groups. However, these methods must now themselves go through this incremental refinement process, by paying particular attention to their weaknesses in the four areas of Total Design Error. Only when the Total Design Error has been reduced below that of traditional survey methods will they be worthy substitutes for the traditional methods.

References Ampt, E. (2011). Diagnostic testing: An innovative way to test survey design. Paper presented at the 9th International Conference on Transport Survey Methods (ISCTSC). Termas de Puyehue, Chile, November 14–18. Christensen, L. (2011). The role of web interviews as part of a national travel survey. Paper presented at the 9th International Conference on Transport Survey Methods (ISCTSC). Termas de Puyehue, Chile, November 14–18. Clifton, K., Muhs, C. D., & Lawton, T. K. (2011). Capturing and representing multimodal trips in travel surveys: A review of the practice. Paper presented at the 9th International Conference on Transport Survey Methods (ISCTSC). Termas de Puyehue, Chile, November 14–18. De Bono, E. (1985). Six thinking hats. Boston, MA: Little, Brown and Company. Dillman, D. A. (1978). Mail and telephone surveys: The total design method. New York, NY: Wiley-Interscience. Dillman, D. A. (2000). Mail and Internet surveys: The tailored design method. New York, NY: Wiley. Dillman, D. A. (2007). Mail and Internet surveys: The tailored design method (2nd ed.). Hoboken, NJ: Wiley. Dillman, D. A., Smyth, J. D., & Christian, L. M. (2009). Internet, mail and mixed-mode surveys: The tailored design method. Hoboken, NJ: Wiley. Dillman, D. A., Smyth, J. D., Christian, L. M., & O’Neill, A. (2008). Will a mixed-mode (mail/ Internet) procedure work for random household surveys of the general public? Paper presented at the Annual Conference of the American Association for Public Opinion Research, New Orleans, LA.


Groves, R. M. (1989). Survey errors and survey costs. New York, NY: Wiley. Richardson, A. J. (2003). Behavioural mechanisms of non-response in mailback travel surveys. Transportation Research Record, 1855, 191–199. Sneade, A. (2011). Using accelerometer equipped GPS devices in place of paper travel diaries to reduce respondent burden in a national travel survey. Paper presented at the 9th International Conference on Transport Survey Methods (ISCTSC). Termas de Puyehue, Chile, November 14–18.

Chapter 9

WORKSHOP SYNTHESIS: MULTI-METHOD DATA COLLECTION TO SUPPORT INTEGRATED REGIONAL MODELS

Eric J. Miller and Caitlin Cottrill

9.1. Introduction and Purpose

Much has happened in the past decade to develop integrated regional models of land use and transport systems, and their environmental impacts, as decision-support tools for urban regions. Used as a complement to well-established network-based travel demand forecasting models, they allow decision-makers to ‘try on for size’ (in a sensitivity-testing sense) different scenarios for technology, markets and policy, and to compare different development paths for the region. Many submodels covering different decision-making agents, including individuals, households, developers, employers and regulators, are designed to interact in integrated modelling platforms using the best available agent-based and econometric methods. More than ever, behavioural mechanisms for all agents need to be founded in an understanding of the spatial and temporal patterns of the activities of transport users. What is the best feasible data-collection strategy to both specify and run this new generation of integrated regional models? Are the available survey toolsets adequate for the purpose, or is new survey methodological research required? This workshop was charged with determining the state of these questions, and making recommendations for survey research in the shorter and longer terms.¹

1. Participants: Patrick Bonnel (France), Peter Bonsall (UK), Caitlin Cottrill (Singapore) (Rapporteur), Joao de Abreu e Silva (Portugal), Elizabeth Greene (USA), Oli Madsen (Denmark), Eric Miller (Canada) (Chair), Andres Monzon (Spain), Catherine Morency (Canada), Mohja Rhoads (USA), Susan Swain (USA), Marius Thériault (Canada), Jean Wolf (USA).

Transport Survey Methods: Best Practice for Decision Making Copyright © 2013 by Emerald Group Publishing Limited All rights of reproduction in any form reserved ISBN: 978-1-78190-287-5


9.2. Summary of Contributed Papers

Three papers were presented in this workshop. The first paper, ‘So what is all this data for?’ by Peter Bonsall (2011), was conceived as part of an EU-funded initiative (the SHANTI project) whose aim is to promote the development of effective and harmonized procedures for surveys in the transport sector. One task of this project was to establish data requirements for policy formulation in Europe and to consider how these requirements might best be met by surveys. The exercise has produced a valuable database of opinions on the usefulness of different types of data for different tasks in the transport planning, management and research sector. This database, which is still being expanded, throws interesting and potentially important light on the use that is made of certain data items, and on the apparent underuse of others. The paper is of particular relevance in the context of efforts to make best use of limited survey budgets and to capitalize on the savings and enhancements that can be made using new technologies. The paper explores some of the issues involved in considering the purposes for which data are used. It seeks to explain the importance of considering data needs at every stage in the development of data collection methodologies and of survey planning. But it also highlights the problems involved in defining the ‘purpose’ of data, the extent to which perspectives vary on the importance of different uses of data, and the fact that the relative importance of different end uses varies over time.

The second paper, ‘Integrated transportation and energy activity travel web-based survey’ by Camila Garcia et al. (2011), describes the design process of a transportation and energy activity-travel web-based survey.
The survey, developed as part of the iTEAM Project of the MIT-Portugal Program, aims to collect the information needed to investigate the relationships between individual travel behaviour and activity engagement, as well as the impacts that activity patterns have on energy consumption. The data collected from the survey will be used to develop an integrated activity-transportation and energy consumption model that can be used to predict energy consumption under different policy scenarios and to evaluate the effectiveness of candidate ‘green policies’.

The third paper, ‘The application of Internet based surveys in the Lisbon Metropolitan Area’ by de Abreu e Silva and Martinez (2011), describes three web-based surveys developed and conducted recently in the Lisbon Metropolitan Area for two research projects in the MIT-Portugal Program (SCUSSE and SOTUR). The first survey was a stated preference survey on mode choice for the innovative intermediate transport modes devised in the SCUSSE Project. The other two surveys consisted of data collection regarding residential choice (past and current), residential satisfaction and stated neighbourhood preferences, connected with regular mobility patterns and accessibility levels. The first survey needed to be programmed, whereas the other two used commercial web software platforms.


The paper compares the application of different surveying techniques (web-based and CAPI) and how they can be used to reduce costs and at the same time increase overall response quality while controlling for sampling biases. It also compares the accuracy of the answers obtained with both techniques and provides a detailed assessment of the evolution of response rates in the web-based applications. These are related to the dissemination strategies implemented, in order to analyse their impact on the response rate curve. The paper also analyses the variance in the time required to complete the surveys, the number of responses with the same IP address and the attrition rates for each component of the surveys.
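Two of the diagnostics mentioned above, responses sharing an IP address and attrition by survey component, are straightforward to compute from a response log. The following is a minimal sketch only: the record layout (`ip`, `completed`), the function names and the sample data are assumptions made for illustration and are not taken from the paper being summarized.

```python
from collections import Counter


def shared_ip_responses(responses):
    """Count responses whose IP address also appears on at least one other response."""
    counts = Counter(r["ip"] for r in responses)
    return sum(n for n in counts.values() if n > 1)


def attrition_by_component(responses, n_components):
    """Share of respondents who reached each component but did not complete it.

    Each response records `completed`: how many components were finished, in order.
    """
    rates = []
    for i in range(n_components):
        reached = sum(1 for r in responses if r["completed"] >= i)
        finished = sum(1 for r in responses if r["completed"] >= i + 1)
        rates.append((reached - finished) / reached if reached else 0.0)
    return rates


# Hypothetical response log (documentation-range IP addresses), for illustration only.
responses = [
    {"ip": "192.0.2.1", "completed": 3},
    {"ip": "192.0.2.1", "completed": 3},
    {"ip": "192.0.2.2", "completed": 2},
    {"ip": "192.0.2.3", "completed": 1},
    {"ip": "192.0.2.3", "completed": 0},
]

print(shared_ip_responses(responses))
print(attrition_by_component(responses, n_components=3))
```

A shared IP address is only a flag for possible duplicate or same-household responses, not proof of either, so in practice such cases would be reviewed rather than dropped automatically.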

9.3. Integrated Regional Models

Integrated regional models (IRMs) simulate the spatial evolution of a given study region’s system state over time as a function of various socio-economic, demographic and political processes. The region’s system state is highly multi-dimensional and usually includes the spatial distribution of the region’s resident population; the spatial distribution of the region’s employment and other out-of-home activities; the travel that occurs from point to point within the region over the course of a representative time period (usually a single ‘typical’ weekday); and, ideally, the flows of goods and services from point to point within the region over the course of a representative time period (Miller, 2004, 2009). Integrated regional models have been developed and used since the advent of mainframe digital computers and are seeing increasing use in urban regions worldwide (Wegener, 2004). As illustrated in Figure 9.1, IRMs are actually complex model systems that model the behaviour of many agents (persons, households, firms etc.) in considerable spatial detail over long (20-plus year) forecast horizons. They consist of the complex interaction of many individual models, including models of:

• population demographics;
• key spatial markets (housing, labour, commercial real-estate etc.);
• firmographics (employment, regional economy);
• built form;
• activity and travel (the demand for transportation, including auto ownership decision-making);
• the multi-modal transportation network (the supply and performance of the transportation system);
• many outcomes of policy interest (transportation, environmental, economic, social etc.).²

2. Note that, although not shown in Figure 9.1 due to space limitations, these ‘outcomes’ also feed back to influence agents’ behaviour in a variety of ways.


[Figure 9.1: Integrated regional model framework. Labelled elements: Demographics; Education; Built Form; Housing Market (demand/supply/prices); Labour Market (demand/supply/wages); Regional Economics / Firmographics; Auto Ownership; Activity/Travel; Networks (road, transit, ...); Emissions, dispersion, exposure; Efficiency, equity, etc.; Energy, GHGs; Health. Dimension labels: Space, Time, Agents.]

Given the comprehensiveness and complexity of IRMs, their data requirements for development and application are obviously very large, multi-faceted and challenging. Indeed, Lee (1973) famously argued that IRMs are doomed to failure due to their ‘data hungriness’, among other ‘sins’. While this may have been the case for the first-generation models that provided the basis for Lee’s comments, advances in data availability and collection methods (as well as computational capabilities, among other advances) have led to successful implementation of modern third- and fourth-generation IRMs in a growing number of urban regions worldwide (Miller, 2009). Nevertheless, the IRM data challenge remains with us.

9.4. Scoping the Data Collection Challenge for IRMs

Currently, IRMs are developed using data obtained from a wide variety of sources. The travel model component is developed from traditional travel (or possibly activity) surveys and transportation network models. The ‘land use’ components of the model system are generally developed quite independently of the travel model components, using data from a variety of sources. The process tends to be ad hoc and opportunistic in nature. Census data, special-purpose surveys and regional ‘macroeconomic’ data concerning employment and population forecasts are often utilized in a variety of ways. Data for model system validation are typically limited or missing altogether (Miller, 2004).


Given their hyper-dimensionality, it is inevitable that IRM development will continue to draw upon a variety of sources. The question remains, however, to what extent a more coordinated, systematic approach to IRM-related data assembly, collection and use can be developed that might be generally applicable and that might lead to significantly improved IRMs. Key themes/issues relating to this possibility that were identified in the workshop include the following:

• Improved, more extensive data concerning both ‘outcomes’ (building stock, prices, employment etc.) and decision processes (market processes, individual travel and location choices etc.) are required to develop improved models.
• These data can be obtained in three fundamental ways:
  – Accessing existing datasets, typically collected for other purposes. Census data, assessment data and regional economic forecast data are a few typical examples of such datasets, but the workshop participants believe that much more can be done to exploit such datasets.
  – Non-intrusive (‘passive’) data collection methods that collect information concerning agents’ behaviour and outcomes with little to no explicit effort on the part of the agent (except typically consent for the data collection to occur). IRMs to date typically have not exploited such datasets, but considerable opportunities to do so exist.
  – ‘Active’ survey methods in which agents are directly queried concerning their attributes, actions, attitudes, preferences, decision processes etc. As in travel behaviour research and modelling, these have been the traditional ‘work horse’ for residential location modelling, among other IRM-specific applications. The workshop participants believe that considerable scope exists within both traditional and new/emerging survey methodologies to significantly improve the database for IRM development, particularly with respect to providing the basis for improved process models.
The central IRM challenge with respect to active surveys is the extent to which multiple, inter-related decision processes can be tracked within a single survey (or coordinated package of surveys).
• Longitudinal surveys of all types (retrospective, panel, repeated cross-sections, prospective) are relevant tools for IRM database development, particularly given the long-term, dynamic nature of these model systems, which (ideally) require process-based decision models.
• The usual survey methodological challenges of response burden (and resulting response rates), sample representativeness, cost and data management (among others) all exist within the IRM domain and tend to be particularly challenging given the ‘hyper-space’ of attributes and decision processes included within IRMs. Recognition of these issues ran throughout the workshop’s discussions, in particular as a continual reminder of the need to be ‘practical’, especially with respect to response burden. These issues are addressed as required in the subsequent discussion within this summary, but they were not themselves the subject of a detailed, focussed discussion within the workshop.


• The level of both spatial and temporal (dis)aggregation in both IRM datasets and models is a major design problem, with no ‘one size fits all’ solution. While IRMs are following the general trend in transportation modelling towards increasingly detailed agent-based microsimulation as the emerging standard, many technical, practical and theoretical challenges persist, to the point that it is still an open question what the ‘best’ level of spatial and temporal precision is for operational, cost-effective, policy-sensitive models. This issue is not explored in greater detail within this summary paper, but there is a general implicit assumption that we are interested in ‘pushing the envelope’ with respect to data disaggregation (so as to support behavioural agent-based modelling as well as possible), while maintaining feasibility and cost-effectiveness in our data collection efforts, as well as keeping an eye firmly on matching our data collection efforts to the task at hand. In particular, the option of more ‘system-level’ modelling of urban land markets (among others), which may eliminate the need for highly disaggregate, individual-level market participation data, should be carefully considered in model and data collection design.
• The opportunity to exploit ICT (Internet, social networking etc.) in the development of IRM-relevant databases was discussed at length within the workshop. Such technologies have not played a major role in IRM development to date, but, as in all elements of travel behaviour research and model development, their potential contribution is huge.

Subsequent sections of this summary deal with the issues that were the primary focus of the workshop’s deliberations. These are building longitudinal databases, exploiting available databases, non-intrusive data collection methods, developing an integrated IRM survey design strategy, and ICT applications within the IRM data collection tool-kit.

9.5. Building Longitudinal Databases

Longitudinal data are essential both for the development of models and for the testing of the model system over historical time periods. All of the data collection methods discussed in the subsequent sections of this report can and should be used to construct longitudinal datasets. In terms of surveys, three classic approaches to longitudinal data collection exist:

• repeated cross-section surveys;
• retrospective surveys; and
• panel surveys.

Repeated cross-section surveys are the most common and easiest to undertake. They can provide excellent ‘snapshot’ data concerning system states at various points in time. They are not, however, generally suitable for constructing ‘biographies’ of individual agents that provide the basis for building dynamic, process-based models (Figure 9.2). For this purpose, respondents must either be queried about their past behaviour (retrospective surveys) or followed forward through time (panel surveys). Both approaches are used in IRM applications and both have well-known strengths and weaknesses.

[Figure 9.2: Observing inter-related biographies. Parallel biographies for Demographics, Housing and Employment along a Time axis, annotated with the questions ‘Lags/leads in decisions?’ and ‘Interconnection (triggering) between events?’]

Retrospective surveys are relatively inexpensive and efficient to undertake and have been shown to yield useful information about past ‘big’ decisions, such as housing (re)location decisions, labour market participation etc. As such, they are attractive options for building dynamic models of such processes. Important concerns with such surveys, however, include the limited ability of respondents to accurately recall details concerning past actions and their tendency to retroactively ‘justify’ their past decisions.

Panel surveys are generally more expensive to undertake. Retaining panel respondents throughout the length of the panel survey period can also be a challenge. In addition, researchers and modellers are often unwilling to wait for panel results for long-term processes such as housing markets, which may take years or even decades to play out. On the other hand, panels provide the opportunity for a rich and highly controlled observation environment and can work well for shorter-term processes.

In both the retrospective and panel cases, a major challenge is to collect the full set of contextual data required for model building. These might include transport network data and performance measures; housing market supply, vacancy and price data; labour market unemployment and wage data etc. Panel approaches may provide an advantage in this regard in that such data, in principle, can be gathered in ‘real time’ during the course of the panel. Unfortunately, for a variety of reasons, this often has not happened, leaving researchers to construct these ancillary datasets retrospectively, as in the retrospective survey case. This can prove to be a very onerous process and in many instances may be effectively impossible to do adequately.
Constructing IRM-relevant biographies is particularly challenging since, as is illustrated in Figure 9.2, multiple, co-temporal biographies are ideally required so that the interactions between events of different types (household structure changes, labour market changes, housing market changes) can be studied and correlated.
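To make the idea of co-temporal biographies concrete, the sketch below stores events from several domains against a common agent identifier and computes the lag from each event in one domain to the same agent's next event in another. It is an illustrative sketch only: the `Event` record, the domain names, the yearly time resolution and the sample household are assumptions, not part of the workshop's recommendations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    agent_id: str
    domain: str  # e.g. "demographics", "housing", "employment"
    year: int
    label: str


def lags_between(events, trigger_domain, response_domain):
    """Years from each trigger event to the same agent's next response-domain event.

    Trigger events with no later (or same-year) response event are skipped.
    """
    by_agent = {}
    for e in sorted(events, key=lambda e: e.year):
        by_agent.setdefault(e.agent_id, []).append(e)

    lags = []
    for agent_events in by_agent.values():
        for trigger in agent_events:
            if trigger.domain != trigger_domain:
                continue
            gaps = [
                r.year - trigger.year
                for r in agent_events
                if r.domain == response_domain and r.year >= trigger.year
            ]
            if gaps:
                lags.append(min(gaps))
    return lags


# Hypothetical biographies for two households, for illustration only.
events = [
    Event("h1", "demographics", 2005, "birth of child"),
    Event("h1", "housing", 2006, "moved to larger dwelling"),
    Event("h1", "employment", 2010, "changed job"),
    Event("h1", "housing", 2011, "moved closer to work"),
    Event("h2", "demographics", 2008, "birth of child"),
]

print(lags_between(events, "demographics", "housing"))
print(lags_between(events, "employment", "housing"))
```

A lag computed this way shows only that one event followed another; whether the first event triggered the second is a modelling question that also needs the contextual data discussed above.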


9.6. Exploiting Existing Datasets

Given both the increasingly limited funding available to obtain survey data and the increasing ubiquity of data from a number of sources, much discussion was held about the potential to use existing datasets as inputs to transportation models. It was noted that large amounts of data are currently being collected and stored in a disaggregate manner on topics ranging from natural events (such as weather or natural disasters) to incidents (such as traffic crashes or terrorism) to individual indicators (such as social media and health information). Workshop participants noted that, while many of these data are not currently being archived in useful ways, the potential exists to form alliances and agreements between agencies and organizations that would allow for access to such datasets. By allowing access, there would be potential to greatly improve the quality of these datasets, as well as their potential use in combination with other archived data. Some key needs identified as part of this discussion were:

• adequate staff and resources to access, collect and maintain archives of collected data;
• accurate and consistent metadata, including collection methods, definitions of key terms, time frames for data collection etc.; and
• inter-agency and inter-organization alliances to allow for the compilation of useful datasets.
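The metadata need listed above can be illustrated with a simple record that an archive might require before accepting a dataset, together with a check for gaps. This is a hypothetical sketch: the field names, the `missing_metadata` helper and the sample record are assumptions chosen to mirror the items named in the text (collection method, definitions of key terms, time frame), not an established metadata standard.

```python
from dataclasses import dataclass, field, fields
from datetime import date
from typing import Dict, Optional


@dataclass
class DatasetMetadata:
    name: str
    custodian: str                       # agency or organization holding the data
    collection_method: str               # e.g. "smartcard transactions", "census"
    period_start: Optional[date] = None  # time frame of data collection
    period_end: Optional[date] = None
    definitions: Dict[str, str] = field(default_factory=dict)  # key terms


def missing_metadata(record):
    """Return the names of metadata fields that are empty or not yet supplied."""
    return [
        f.name
        for f in fields(record)
        if getattr(record, f.name) in (None, "", {})
    ]


# Hypothetical archive entry, for illustration only.
record = DatasetMetadata(
    name="Regional crash records",
    custodian="Example Road Agency",
    collection_method="police reports",
    period_start=date(2005, 1, 1),
)

print(missing_metadata(record))
```

Running such a check when a dataset is deposited, rather than when it is later requested, is one way to keep definitions and time frames from being lost.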

9.7. Non-Intrusive (Passive) Data Collection

Also discussed was the data collection potential of more person-based ubiquitous technologies, including GPS-enabled devices, transit smartcards and other individualized devices. Such methods, classified as ‘passive’ by workshop participants, were discussed both for the potential to use the collected data on their own and for the potential to combine such passive data collection methods with ongoing ‘active’ methods of survey data collection. Among the passive data collection methods discussed were the following:

• Transactions: Data from the use of smartcards and credit cards
• Passive location tracking: GPS, WiFi, GSM and accelerometer data from such technologies as cell phones and GPS navigation devices
• Resource use: Information collected from smart meters, compiled data on energy, water and waste collected by utility groups and others
• Communication logs: Internet, mobile and landline phone use
• Vehicle computer logs: Passenger vehicles and freight
• Emerging technologies: Methods including peer-to-peer vehicle communications and RFID tags.

While such methods and datasets may provide large amounts of data that would be useful to the transportation modeller, a number of issues will need to be addressed to ensure the availability and validity of the datasets. Such issues may include the need to obtain consent from users to allow for data collection; issues associated with the ability to protect the privacy of persons on whom passive data are collected; and access to individualized data. Even if these issues are adequately addressed, there may still be questions related to the ability to combine data streams to generate individualized datasets, the commercial value of such datasets, and the ability to tie data streams to contextual information.

9.8. An Integrated IRM Survey Design Strategy

A number of suggestions were discussed for methods by which the amount of data desired for transportation models and other uses could be gathered. A strategy was developed and recommended, as shown in Figure 9.3. Under this recommendation, a ‘core’ survey (such as the census or a household travel survey) would be undertaken using a large sample to gather basic information for purposes of characterizing the population. Such a sample would then be used as a foundation around which other ‘satellite’ surveys could be undertaken. These satellite surveys would be special-purpose instruments designed to address specific elements of the larger picture, and would use smaller sample sizes. While most participants would only be asked to take part in one or two satellite surveys, a select group (whether of volunteers or of identified ‘self-exposers’) would be asked to participate in all, or at least most, of the satellites (the ‘full Monty’). While such a group would likely not be representative of the population as a whole, their participation would establish a rich dataset of attractive personal data. Observations of the datasets of such persons could be used for hypothesis generation and process testing. A variety of response methods should be utilized for such surveys, including face to face, phone, web based and paper based.

[Figure 9.3: An integrated survey design strategy.]
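The core-and-satellite strategy described above can be sketched as an assignment of core-survey respondents to satellite surveys. The sketch is illustrative only: the function name, the `invite_share` and `per_person` parameters, the satellite names and the use of simple random selection are assumptions, and a real design would draw satellite samples according to its own sampling plan.

```python
import random


def assign_satellites(core_ids, satellites, full_ids=(), per_person=2,
                      invite_share=0.5, seed=0):
    """Assign core-survey respondents to satellite surveys.

    Respondents in `full_ids` are asked to complete every satellite.
    Every other respondent is invited with probability `invite_share`
    and, if invited, is given `per_person` randomly chosen satellites
    (fewer if fewer satellites exist).
    """
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    full = set(full_ids)
    k = min(per_person, len(satellites))
    assignment = {}
    for pid in core_ids:
        if pid in full:
            assignment[pid] = list(satellites)
        elif rng.random() < invite_share:
            assignment[pid] = rng.sample(satellites, k)
        else:
            assignment[pid] = []
    return assignment


# Hypothetical core sample and satellite instruments, for illustration only.
core_ids = ["p1", "p2", "p3", "p4", "p5", "p6"]
satellites = ["housing history", "energy use", "attitudes"]

assignment = assign_satellites(core_ids, satellites, full_ids=["p1"])
for pid, surveys in assignment.items():
    print(pid, surveys)
```

Because the full-participation group is selected rather than sampled, analyses that pool its answers with those of randomly assigned respondents would need to keep the two groups distinguishable, in line with the caution above about representativeness.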

9.9. ICT Applications

Finally, advances in data-gathering technologies, such as smartphones and tablet computers, have the potential to significantly affect the ways in which travel surveys are conducted in the near and distant future. While GPS devices have been increasingly used to conduct transport surveys, discussion often centred on the potential to expand upon these base methods as technologies advance and spread.

References

Bonsall, P. (2011). So what is all this data for? Paper presented at the 9th International Conference on Transport Survey Methods (ISCTSC). Termas de Puyehue, Chile, November 14–18.
de Abreu e Silva, J., & Martinez, L. (2011). The application of Internet based surveys in the Lisbon Metropolitan Area. Paper presented at the 9th International Conference on Transport Survey Methods (ISCTSC). Termas de Puyehue, Chile, November 14–18.
Garcia, C., de Abreu e Silva, J., Abou-Zeid, M., Ben-Akiva, M., Choudhury, C., Pereira, F., & Silva, M. (2011). Integrated transportation and energy activity-travel web-based survey. Paper presented at the 9th International Conference on Transport Survey Methods (ISCTSC). Termas de Puyehue, Chile, November 14–18.
Lee, D. B. (1973). Requiem for large scale models. Journal of the American Institute of Planners, 39, 163–178. doi: 10.1080/01944367308977851.
Miller, E. J. (2004). Integrated land-use/transport model requirements. In D. A. Hensher, K. J. Button, K. E. Haynes & P. R. Stopher (Eds.), Handbook of transport geography and spatial systems, Handbooks in transport (Vol. 5, Chapter 10, pp. 147–166). Amsterdam: Elsevier Science.
Miller, E. J. (2009). Integrated urban models: Theoretical prospects (invited resource paper, Chapter 14). In R. Kitamura, T. Yoshii & T. Yamamoto (Eds.), The expanding sphere of travel behaviour research: Selected papers from the 11th International Conference on Travel Behaviour Research (pp. 351–384). Bingley, UK: Emerald.
Wegener, M. (2004). Overview of land use transport models. In D. A. Hensher, K. J. Button, K. E. Haynes & P. R. Stopher (Eds.), Handbook of transport geography and spatial systems, Handbooks in transport (Vol. 5, Chapter 10, pp. 127–146). Amsterdam: Elsevier Science.

THEME 2 IMPROVING RESPONDENT INTERFACES

Chapter 10

Web-Based Travel Survey: A Demo

Pierre-Léo Bourbonnais and Catherine Morency

Abstract

Purpose: This paper presents the process of creating a web-based travel survey tool. It aims to identify the advantages and disadvantages of using the web as a survey tool, and to explain the methodology involved in conducting online travel surveys and the technologies used in the tool described.

Methodology/approach: This paper presents a web-based origin-destination travel survey tool that was developed to assess the potential of this medium to complement the usual large-scale phone surveys conducted regularly in the province of Quebec. The first tool, developed for a generator-based survey and since updated twice to meet new needs (a person-based regional survey and a household-based regional survey), is presented and discussed. In particular, the paper describes the technology used as well as the specific functions developed for both respondents and administrators. Particularities of the tools are introduced.

Findings: The experiments conducted using the web-based survey tool reveal that the key components of the tool that influence response rates and quality of responses are the ease of use of the multiple elements of the questionnaire, such as maps and form fields, and the overall design quality of the user interface. While the presentation of actual results from surveys conducted using the tool is not the main goal of this paper, some preliminary results such as response rates reveal that between 10% and 20% of the entire community of trip generators like universities responded to the person-based version, and around 10% of the sampled households from the general population of a specific region completed the household-based version.


Research limitations/implications: Web-based survey tools in the transportation domain are still new and need a much larger research base before results and findings can be generalized further.

Practical implications: The tool presented uses up-to-date technology and a refined questionnaire design. In this respect, it tries to push the development of web-based travel surveys further in order to increase response rates and quality of responses.

Originality/value: This paper is one of the first to present a tool that was used both for person-based and household-based surveys, and both for trip generator communities and for households sampled for regional surveys. It also presents in great detail the interface used in the questionnaire and the administration toolkit accompanying the web application.

Keywords: Web; survey; tool; travel; behavior; interface

10.1. Introduction

There is an increasing body of research on the importance of combining multiple survey instruments to increase the capacity of survey processes to reach a more complete and representative sample of individuals and households (Bonnel, Morency, & Bayart, 2009b). Typically, travel surveys have been conducted using face-to-face interviews, phone interviews or self-completion surveys. In recent years, with the increasing number of cellular phones, the mistrust of phone surveys and the spread of call-display devices on phones, it has become harder to reach respondents using this medium and to convince them to participate in surveys. At the same time, access to the Internet has become widespread in many industrialized countries (75% of Quebec residents used the Internet at least once a week in 2009 (CEFRIO, 2009)). As a consequence, authorities have started to envision the web as a potential instrument to complement and even replace classical travel survey instruments. The tool presented in this paper was developed as part of the sustainability plan process of a main trip generator of the Greater Montreal Area. As part of its initiative to increase its sustainability, this institution decided to conduct a survey to understand the current travel behaviours of its users and to formulate various transportation strategies to move towards more sustainable behaviours. The survey hence needed to be representative of the typical behaviours of the community, which required a rigorous sampling plan and a user-friendly interface.

10.2. Background

As researchers have to cope with decreases in response rates (Bonnel, Lee-Gosselin, Madre, & Zmud, 2009a) and the high cost of conducting surveys face-to-face or by telephone interview, the use of new technologies like the web as a mitigation solution is promising. The biggest problems lie in the quality and ease of use of web survey interfaces and in the emergence of new biases in results obtained from online self-administered questionnaires. Table 10.1 presents the pros and cons of using the web mode when conducting travel surveys. In developed countries, Internet access is high and still rising, but the technological capabilities of Internet users and the amount of time they spend on their computer browsing the web differ greatly from one population group to another (CEFRIO, 2009). However, in the relatively near future, Internet access, usage and literacy should reach levels comparable to those of the telephone in households. On the technical side, browsers and operating systems are following web standards a little more every year, so differences in the technical ability of equipment should diminish. Some biases induced by web surveys could also be mitigated with new interfaces and mixed-mode integration methods, but more research is needed on the subject. The biggest challenges (those that will get harder to solve) are the decrease in response rates, the increase in multitasking when respondents are in front of their computer (which could compromise data quality) and the poor quality of email directories (which makes it difficult to recruit representative samples), as shown in Table 10.1.

10.3. Objectives

There is still much work to do to understand the usability and relevance of web interfaces for gathering complex spatio-temporal data on daily travel. The main objective of this paper is to demonstrate the usability of the web for gathering precise daily travel behaviours of a large trip generator community located in the Greater Montreal Area. In fact, this paper describes and analyses what could be seen as a pilot web survey for the conduct of large-scale travel surveys in metropolitan areas and for large trip generators like universities. It presents the development and implementation process of a web-based travel survey tool as well as various statistics on the way respondents interact with the tool. It also proposes a formal description of the tool in terms of data fields (automatic filling-in, choice set, database/map-assisted coding or free coding) and interaction level with the respondent, as well as a comparison with the current CATI tool used in the Greater Montreal Area (see Morency, 2008 for the formal definition of the CATI tool used in Montreal). The underlying objective is to assess the potential of such a tool to complement the current travel survey approach in the Greater Montreal Area, namely to reach population segments that are less inclined to answer phone-based surveys but that would be willing to provide information using the web because of its flexibility.

Table 10.1: Advantages and disadvantages of using the web in travel surveys.

Advantages (trend legend: m/mm, should become even better with time; -, should stay the same)

• Low marginal cost [a]. Trend: -
• Less restrictive (the respondent chooses when to respond) [a]
• Rapid collection of data
• Real-time validation of data [a, b] (this is also true of most computer-based survey tools)
• Adaptability and flexibility (the questionnaire can be adapted to the respondent by analyzing previous answers) (this is also true of most computer-based survey tools)
• Allows the study of respondents’ behaviour during the interview
• Ability to change quickly while conducting the survey (to reduce ambiguity, add new questions or promote/mitigate or encourage certain behaviours)
• Makes administration of the survey less arduous
• Allows the dissemination of real-time results and the presentation of preliminary statistical data to respondents at the end of their interview
• Allows a large number of simultaneous interviews (provided the equipment and connection are able to cope with higher traffic). Trend: m
• Opportunity to ask random questions (this is also true of most computer-based survey tools). Trend: -
• Increases the level of interactivity and visualization [a, e]. Trend: mm
• Capability to survey hard-to-reach groups [f]. Trend: mm

Disadvantages (trend legend: m/mm, should mitigate with time; k, could become even worse with time)

• Uneven Internet access and connection speed
• Uneven usability of respondents [d]
• Response rates are low and declining (high solicitation for a wide variety of surveys) [a]
• The concentration of respondents is decreasing (because of multitasking)
• The questionnaire must be adapted to a web interface, which makes it more difficult to compare with other survey modes [c]
• Email directories are of poor quality and/or difficult to obtain
• Recruitment must often be performed using another mode
• Technical differences from one respondent to another (browser, operating system, equipment, etc.) [a]
• Several biases can be induced (coverage bias, sampling error, measurement error, non-response bias) [d]
• Different understanding of some questions by respondents can cause poor-quality or biased data for some groups with different education, age or social profiles. Trend: -

[a] Armoogum, Axhausen, and Madre (2009). [b] Timmermans and Hato (2009). [c] Braunsberger et al. (2007). [d] Alsnih (2007). [e] Bonnel, Lee-Gosselin, Madre, and Zmud (2009a). [f] Riandey and Quaglia (2009).

10.4. Project’s Milestones

The preliminary version of the online survey tool was used for the first mobility survey of École Polytechnique de Montréal in 2010. Since then, it has drawn the attention of one of the MOBILITE Chair partners, the Quebec Ministry of Transportation, which was interested in using it as a prototype for the 2011 Trois-Rivières regional survey. As the tool was created for a person-based survey, it could not be modified on such short notice to be compatible with a regional household survey, but it was decided to test it anyway with some soft refusals, with a part of the complete sample and with people listed in a cell phone directory that was purchased by the Ministry with the intention of reaching some younger households that could not have been contacted by conventional means. The only section added to the questionnaire was a small form asking the respondent to enter basic information about his or her household members, such as age group and gender. In that manner, some basic comparisons could be performed with the parallel CATI survey results.

Since both École Polytechnique and the Ministry of Transportation were pleased with the tool, especially because of its ease of use and its design qualities, it was decided to pursue the project by using the survey tool for the University of Montreal’s first mobility survey and for the Ministry of Transportation’s Quebec City regional survey, both scheduled for late 2011. To be ready for the latter, the household survey module had to be integrated. In fact, the tool had to replicate the CATI questionnaire almost entirely. In that sense, it had to allow one person to enter trip information for each of his or her household members. In addition, with an online questionnaire, it would also be possible for each individual member of the household to respond to the survey on his or her own, based on availability or motivation. For the University of Montreal survey, the questionnaire was to stay person-based, but some parts were added to allow the selection of a precise building inside the campus, thus allowing the analysis to include internal trips too. For these two surveys, the interface was improved to allow an even easier flow when responding to the questionnaire. Both surveys are to start in October 2011.

10.5. Methodology

The web-based travel survey tool was developed using up-to-date technology and benefited in particular from the spatial location capabilities of online maps. In the case of the École Polytechnique 2010 travel survey, the overall survey design process involved the following tasks, both technical and procedural (inspired by Richardson, Ampt, & Meyburg, 1995):

• Preliminary planning: The first step was to obtain support from the administrative sector of the metropolitan region or of the trip generator, namely to ensure access to datasets describing the universe and to facilitate the weighting process.
• Survey method: It was decided to develop the tool on the web since a majority of the trip generator users (students and workers) were already connected to the web and had a login and password that linked them to the administrative server.
• Questionnaire design and survey tool: The questionnaire used for the large-scale travel survey was used as a reference, and the data usually collected at the individual level were selected as required data.
• Sampling plan (and respondent convocation using sampling lots): Official administrative datasets were used to assess the survey universe and to develop sampling lots to invite people to participate in multiple successive waves covering multiple days. No sample was selected a priori and the whole universe, some 8000 people, was invited to participate in the survey. The goal was to collect data for an average weekday, so people responding from Saturday to Monday had to report their Friday trips.
• Pilot survey: Prior to the official launch of the survey, panels of individuals of various types were asked to fill in the survey and provide comments on their experience.
• Logo and promotion strategy: Official convocation from the president, announcement in the weekly activity calendar, posters.
• Conduct of the interview: Supervised sessions of self-completion interviews, email for interaction with respondents facing issues.
• Data gathering post-processing (validation, weighting, analysis and result diffusion).

The strategy adopted was to aim for travel data similar to those collected during the regional large-scale travel survey, but only for a single person. Hence, the questionnaire was not directly transposed to the web, but data requirements were introduced in the interface to ensure comparability of output data.
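The reference-day rule from the sampling plan (respondents answering from Saturday to Monday report their Friday trips) can be sketched as follows. This is a minimal illustration in Python, not the tool's actual code: the function name `reference_day` is hypothetical, and the sketch assumes that respondents answering from Tuesday to Friday report the previous day and that public holidays are ignored.

```python
from datetime import date, timedelta


def reference_day(response_date: date) -> date:
    """Return the most recent weekday strictly before response_date.

    Assumption (not stated in the chapter): holidays are not handled.
    """
    day = response_date - timedelta(days=1)
    # date.weekday() returns 5 for Saturday and 6 for Sunday.
    while day.weekday() >= 5:
        day -= timedelta(days=1)
    return day


# A respondent answering on Monday 18 October 2010 reports Friday 15 October.
print(reference_day(date(2010, 10, 18)))
```

Keeping the rule in one small function would make it easy to apply the same reference day both in the interface wording and in later validation.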

10.6. Technology

The application built for the online survey tool was first created using a PHP web development framework called Symfony (Sensio Labs, 2011). A new version of the application that uses the Ruby on Rails framework (Rubyonrails, 2012) is currently in development and should provide better performance as well as a faster development process. For the interactive part of the questionnaire, jQuery (The jQuery Project, 2010), also free, was used as it is currently the most compatible and flexible JavaScript library. The application is very flexible and allows the administrator to add or remove questions easily. Some of these can use geocoding maps, choices of images, multiselect dropdown menus and sliders, in addition to the standard elements used in conventional web forms. The use of HTML5 (the latest version of HTML, the markup language of the web) will help make the tool even easier to adapt to almost any kind of situation in the future. As it is easy with online surveys to modify questions or entire sections in real time to adapt to each individual respondent (according to personal profile or household data), there is virtually no limit to the level of customization that can be achieved in this context.
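As a rough illustration of how such per-respondent customization might be expressed, the sketch below attaches an optional display condition to each question. It is written in Python for brevity, whereas the tool itself was built with Symfony and jQuery; the `Question` class, the `show_if` field, the question keys and the `survey_type` answer are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional

Answers = Dict[str, Any]


@dataclass
class Question:
    key: str
    label: str
    widget: str = "text"  # e.g. "number", "map", "slider" (illustrative values)
    show_if: Optional[Callable[[Answers], bool]] = None  # None means always shown


def visible_questions(questions: List[Question], answers: Answers) -> List[str]:
    """Return the keys of the questions to display given the answers so far."""
    return [q.key for q in questions if q.show_if is None or q.show_if(answers)]


QUESTIONS = [
    Question("household_size", "How many persons live in your household?", "number"),
    Question("cars", "How many cars are available to your household?", "number"),
    # Shown only for household-based surveys (hypothetical condition).
    Question("members", "Basic information about each household member", "form",
             show_if=lambda a: a.get("survey_type") == "household"),
    Question("transit_pass", "Do you hold a transit pass?", "choice"),
]

print(visible_questions(QUESTIONS, {"survey_type": "person"}))
```

Declaring conditions next to the questions, rather than in the page templates, is one way an administrator could add or remove questions without touching the rest of the application.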

10.7. Questionnaire

The current version of the questionnaire contains four major sections: home/household, personal profile, personal trips and opinions.

10.7.1. Home/Household

The respondent is first asked to locate his or her home by entering the address (Figure 10.1). The online map provider then tries to geolocate the household (retrieve latitude and longitude coordinates). At this stage, coordinates have to be as precise as possible to allow further validations. Questions about the number of persons in the household (household size) and the number of available cars are also included. If the survey is household-based, or if desired by the survey promoters, a small section asking for basic information about each person is added.

10.7.2. Profile

In the next section, respondents enter their profile information, such as age or age group (if exact age is considered too sensitive), gender and main occupation, as well as transportation-related information such as driver’s license or transit passes.


Figure 10.1: Entering home location.

10.7.3. Trips

The trips section is separated into three parts: places visited on the interview day, schedule and modes.

10.7.3.1. Visited places

Respondents are asked to detail their trips for the last working day. To simplify the concept of a trip and to make sure the respondent understands exactly what information the analyst needs to recreate his or her trips for that particular day, the term “places where you went” is used as a replacement for the usual “trips you made”. That way, the respondent only needs to list all the places he or she visited during that day in chronological order (Figure 10.2). Also, to make sure that trips back home are not forgotten, the respondent is asked after each added visited place whether he or she went back home. When adding a new visited place, the interface displays a map with shortcuts like “Home”, “Find an address”, “Find an intersection” or “Find a place” (Figure 10.3). Each time a new place is added, it can be used again as a shortcut when entering the next visited places. For household-based surveys, the visited places of other members of the household could also be added to the list of shortcuts as a great time-saving option for families.


Figure 10.2: Declaring list of visited places.

Figure 10.3: Adding a new visited place.

10.7.3.2. Schedule

As soon as the respondent has finished entering the places visited during the interview day, the arrival and departure times for each of these places are to be set. Tests conducted with the first prototype showed that people tend to be more accurate when asked about the time they arrived at a place and the time they left it than when asked, later on, to provide the departure and arrival times of individual trips. This approach also helps reduce bias due to mode choice (people taking public transit usually overestimate their journey length and people using their cars underestimate it). A timeline (Figure 10.4) allows the respondent to visualize the schedule along with the places visited.

Figure 10.4: Timeline appearing when entering arrival and departure time for each place.

10.7.3.3. Modes

The last step in the trips section is to choose which modes were used for each trip. The trips are automatically generated by the application by pairing consecutive visited places and by assigning schedules to trips instead of places. Respondents can choose between single mode or multi-mode and, when choosing the car driver or car passenger modes, have to indicate the household members who were present in the car (Figure 10.5). When selecting a public transit mode, they are asked to select the specific routes taken. A separate module using GTFS (General Transit Feed Specification, also known as Google Transit Feed Specification) data for local agencies can be included at this stage to verify the validity of the chosen routes. For instance, if more than one route is taken, the GTFS module verifies that the transfer is indeed possible (that is, that travelling on foot between the lines is feasible) and that service schedules are compatible for the interview day.
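The automatic generation of trips from the ordered list of visited places might look like the sketch below. It is an illustration only: the `VisitedPlace` and `Trip` structures and the `build_trips` function are assumed names, and the real application also attaches modes, household passengers and transit routes to each trip.

```python
from dataclasses import dataclass
from datetime import time
from typing import List, Optional


@dataclass
class VisitedPlace:
    name: str
    arrival: Optional[time]    # None for the first place of the day
    departure: Optional[time]  # None for the last place of the day


@dataclass
class Trip:
    origin: str
    destination: str
    departure: time
    arrival: time


def build_trips(places: List[VisitedPlace]) -> List[Trip]:
    """Pair consecutive places; each pair becomes one trip with its schedule."""
    trips = []
    for origin, destination in zip(places, places[1:]):
        if origin.departure is None or destination.arrival is None:
            raise ValueError(f"missing time between {origin.name} and {destination.name}")
        if destination.arrival < origin.departure:
            raise ValueError(f"arrival at {destination.name} precedes departure")
        trips.append(Trip(origin.name, destination.name,
                          origin.departure, destination.arrival))
    return trips


day = [
    VisitedPlace("Home", None, time(7, 30)),
    VisitedPlace("Campus", time(8, 15), time(17, 0)),
    VisitedPlace("Home", time(17, 50), None),
]
for trip in build_trips(day):
    print(trip)
```

The departure time of a trip is simply the departure from its origin place and its arrival time is the arrival at the destination place, which is why the respondent never has to think in terms of trips.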

10.7.4. Opinions

The last section can be used by the trip generator or the regional administrator to ask custom questions and to gather general comments. In the current version, a slider allows the respondent to grade the degree of clarity of the survey and its overall length. Also, more difficult questions can be asked at the end, such as personal/household income and questions regarding environmental and/or sensitive issues.


Figure 10.5: Choosing transportation mode(s) and bus line(s).

When all sections have been answered, the respondent can see basic statistics on the sample of people who have already finished their interviews. It is also possible to show respondents animated maps of the latest survey results as a way to advertise the next scheduled survey and to increase their motivation to stay informed on the subject of transportation. The respondent can also go back to any section and change his or her information and trips.

10.8. Survey Administration

Survey stakeholders and administrators can benefit from the application since it includes configuration files to define parameters such as default and allowed languages, default map centre location, default address components (country, region, city, etc.) and many others. During the survey period, they can follow results in real time using tools like a map of all respondent home locations (Figure 10.6) and several graphics and statistics on demographics, trips, modes, etc. For a trip generator-based survey, it is also possible for an administrator to see the number of interviews started and/or finished during each hour of the day and to compare these with the time the recruiting emails were sent to the community, in order to optimize response rate and server performance.

During the interview, all mouse clicks and values entered in most non-sensitive fields are saved in the database along with the time of each action. It is therefore possible, for instance, to find the most error-prone questions and to investigate failed geocoding requests with the intention of increasing success rates in subsequent versions of the application. It could also help identify the most difficult parts and the longest sections, in order to optimize them in the future. As an example of the many respondent behaviour visualization tools, a Gantt-like chart of each section duration is created for every respondent (Figure 10.7).

Figure 10.6: Real-time administration visualization tool: households map (up-to-date, real-time spatial dispersion of the sample by home location; one click provides linked activity locations).

Figure 10.7: Gantt-like chart of each section duration for one respondent.
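Section durations of the kind shown in the Gantt-like chart could be derived from the timestamped action log roughly as follows. This is a sketch under stated assumptions: the log is represented as hypothetical `(timestamp, section)` pairs, and the time elapsed between two consecutive actions is attributed to the section of the earlier action, so that returning to a section adds to its total.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, List, Tuple

Event = Tuple[datetime, str]  # (time of the action, questionnaire section)


def section_durations(events: List[Event]) -> Dict[str, float]:
    """Total seconds spent in each section for one respondent."""
    ordered = sorted(events)
    totals: Dict[str, float] = defaultdict(float)
    for (t0, section), (t1, _next_section) in zip(ordered, ordered[1:]):
        totals[section] += (t1 - t0).total_seconds()
    return dict(totals)


log = [
    (datetime(2010, 10, 18, 10, 0, 0), "home"),
    (datetime(2010, 10, 18, 10, 1, 30), "home"),
    (datetime(2010, 10, 18, 10, 2, 0), "profile"),
    (datetime(2010, 10, 18, 10, 3, 0), "trips"),
    (datetime(2010, 10, 18, 10, 8, 0), "trips"),
]
print(section_durations(log))
```

With this convention the time after the very last recorded action is not counted, which seems reasonable when the final action is the submission of the questionnaire.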

10.9. Diffusion and Visualization

Using an online route-finding service like Google Directions and open source routing tools like OSRM (OSRM, 2012), most trips collected during the survey can be simulated on the road network and then traced on an animated map that recreates the generation of trips for a typical weekday (Animated Map of Polytechnique Mobility Survey, 2010). The result is a great demonstration opportunity for the stakeholders, both to advertise the survey and to show what can be done with the data. In the future, it could also serve as a virtual laboratory for simulations and modelling, though real-time navigation inside the simulations would require great processing power that is not currently available in small organizations or on a single desktop machine.
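To give an idea of how a routed trip might be turned into animation frames, the sketch below estimates where a traveller is along a route at a given time. It is not the authors' implementation: it assumes the routing service has already returned the route as a list of (latitude, longitude) points, that speed is constant over the trip, and that linear interpolation within a segment is accurate enough; `haversine_m` and `position_at` are hypothetical names.

```python
from bisect import bisect_right
from math import asin, cos, radians, sin, sqrt
from typing import List, Tuple

Point = Tuple[float, float]  # (latitude, longitude) in decimal degrees


def haversine_m(a: Point, b: Point) -> float:
    """Great-circle distance in metres between two points."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371000.0 * asin(sqrt(h))


def position_at(route: List[Point], depart_s: float, arrive_s: float, t_s: float) -> Point:
    """Position on the route at time t_s (seconds), assuming constant speed."""
    if t_s <= depart_s or len(route) == 1:
        return route[0]
    if t_s >= arrive_s:
        return route[-1]
    # Cumulative distance travelled at each vertex of the route.
    cumulative = [0.0]
    for p, q in zip(route, route[1:]):
        cumulative.append(cumulative[-1] + haversine_m(p, q))
    target = cumulative[-1] * (t_s - depart_s) / (arrive_s - depart_s)
    # Index of the vertex that ends the segment containing the target distance.
    i = min(bisect_right(cumulative, target), len(route) - 1)
    segment = cumulative[i] - cumulative[i - 1]
    f = 0.0 if segment == 0 else (target - cumulative[i - 1]) / segment
    (lat0, lon0), (lat1, lon1) = route[i - 1], route[i]
    return (lat0 + f * (lat1 - lat0), lon0 + f * (lon1 - lon0))


route = [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0)]
print(position_at(route, depart_s=0, arrive_s=100, t_s=25))
```

Calling such a function for every trip at each animation time step would yield the set of points to draw on the map for that frame.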

10.10. Specificities of the Tool

The paper aims to contribute to the body of knowledge and experience on the design and implementation of web-based travel surveys. The particularities of the design rely on the following choices:

• Moving from trips to activity locations: To limit the underreporting of short trips and of trips linked to non-constrained activities, it was decided to first ask respondents to declare the set of activity locations visited the previous day. After all locations were geocoded, respondents were asked to identify the temporal and modal attributes of the trips made to move between locations.
• Using freely accessible geocoding functions from online maps: It was decided to use Google Maps for the geocoding of spatial locations instead of relying on databases, since it has become a common tool for the identification of places, addresses and itineraries, particularly among population segments that have higher access to the Internet than the typical resident. Future versions may include other online map providers such as OpenStreetMap, OpenLayers, Microsoft Bing or Yahoo Maps.
• Providing main statistics to the respondent in real time (at the end of the interview): At the end of the interview, the respondent can view summary statistics for the entire survey.
• Allowing for multiple visits to the questionnaire by the respondent: A respondent is allowed to go back to his or her questionnaire for the whole duration of the survey. As soon as respondents log in to the web interface, they can view their previous answers as well as statistics on the progress of the survey (number of respondents, for instance).
• Providing multiple statistics to the administrator, in real time: Mapping of the home location and activity locations of each respondent, response rate by population segment, average interview duration, daily distribution of questionnaire completion (at what time respondents chose to fill in the questionnaire), basic statistics on the attributes of respondents.
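One of the administrator statistics listed above, the daily distribution of questionnaire completion, amounts to counting interview start times by hour of the day. A minimal sketch, assuming the start times are available as Python `datetime` values and using the hypothetical function name `starts_per_hour`:

```python
from collections import Counter
from datetime import datetime
from typing import Iterable, List


def starts_per_hour(start_times: Iterable[datetime]) -> List[int]:
    """Number of interviews started in each hour of the day (index 0 to 23)."""
    counts = Counter(t.hour for t in start_times)
    return [counts.get(hour, 0) for hour in range(24)]


starts = [
    datetime(2010, 10, 18, 9, 5),
    datetime(2010, 10, 18, 9, 40),
    datetime(2010, 10, 18, 14, 0),
]
distribution = starts_per_hour(starts)
print(distribution[9], distribution[14])
```

Comparing this distribution with the times at which recruiting emails were sent is the kind of check the administration tools are described as supporting.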

10.11. Preliminary Findings (Response Rates)

While actual results are beyond the scope of this paper, we can nonetheless present some preliminary information, such as response rates for the surveys conducted in 2010 and 2011 as part of our research (Tables 10.2 and 10.3).


Table 10.2: Web response rates for 2010 and 2011 surveys.

| Survey | Sampling unit | Persons contacted | Started interviews | Completed interviews |
| Polytechnique (trip generator), Fall 2010; respondents contacted by email | Person-based | 8618 | 1932 (22.4%) | 1655 (19.2%) |
| Polytechnique (trip generator), Fall 2011; respondents contacted by email | Person-based | 8576 | 1930 (22.5%) | 1679 (19.6%) |
| University of Montreal (trip generator), Fall 2011; respondents contacted by email | Person-based | 58483 | 7951 (13.6%) | 6610 (11.3%) |
| Trois-Rivières (regional survey), Spring 2011; respondents were contacted on their cell phone | Person-based | 333 out of the 1644 persons contacted accepted to participate | 109 (32.7% of volunteers; 6.6% of persons contacted) | 96 (28.8% of volunteers; 5.8% of persons contacted) |
| Quebec City (regional survey), Fall 2011 | Household-based | See Table 10.3 | See Table 10.3 | See Table 10.3 |

Table 10.3: Web response rates for Quebec City Fall 2011 household-based regional survey.

| Sample | Households contacted | Started household interviews (at least one household member started the interview) | Completed household interviews (every household member aged 5 years or older has a completed interview) |
| Respondents contacted by regular mail | 749 reachable addresses (1000 letters were sent) | 139 (18.6% of households reached by mail) | 83 (59.7% of started household interviews; 11.1% of households reached) |
| Households that refused to respond to the conventional CATI telephone survey | 63 refusals accepted to respond on the web | 0 | 0 |

For both École Polytechnique surveys, response rates were nearly the same, with almost 20% of interviews completed. The University of Montreal survey, with a more diverse population, obtained an 11% completed response rate, which is not bad considering the low costs involved. On the other hand, the Trois-Rivières regional survey, which used a cell phone directory in parallel to the conventional CATI survey to contact possible web respondents, suffered from a lower response rate of 6%. A better analysis of recruiting call durations and of the representativeness of the samples is to be conducted before evaluating the cost-effectiveness of the process. The Quebec City regional survey obtained a response rate of 11%, after removing the 251 letters that were returned by the postal service. This result is the most representative of what can be obtained from the general population, since the 1000 households were taken from the large conventional CATI survey sample based on the latest Canadian census. Finally, of the 63 households that refused to participate in the Quebec City conventional regional survey but agreed to participate on the web, none even started the interview. While the sample size is small, it clearly shows that conventional survey refusals are not easily recovered by offering them a web version instead. However, a greater number of refusals from a larger regional survey would provide a better sample to further analyse this subject.
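The percentages reported in Tables 10.2 and 10.3 can be reproduced from the counts with a few lines of Python. The counts below are taken from the tables; the helper name `rate` and the dictionary layout are only illustrative.

```python
def rate(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal, as reported in the tables."""
    return round(100.0 * numerator / denominator, 1)


# (persons contacted, started interviews, completed interviews), Table 10.2
person_based = {
    "Polytechnique, Fall 2010": (8618, 1932, 1655),
    "Polytechnique, Fall 2011": (8576, 1930, 1679),
    "University of Montreal, Fall 2011": (58483, 7951, 6610),
    "Trois-Rivieres, Spring 2011": (1644, 109, 96),
}

for survey, (contacted, started, completed) in person_based.items():
    print(f"{survey}: started {rate(started, contacted)}%, "
          f"completed {rate(completed, contacted)}%")

# Quebec City mail sample (Table 10.3): 1000 letters sent, 251 returned.
reachable = 1000 - 251
print(f"Quebec City: started {rate(139, reachable)}%, "
      f"completed {rate(83, reachable)}% of households reached")
```

Using reachable addresses rather than letters sent as the denominator is what yields the 11% figure quoted for the Quebec City survey.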

10.12. Challenges and Future Development

The next big step in the project is to create a mobile version of the application for cell phones and small tablets. A growing part of the population uses these devices, and allowing them to fill in the questionnaire on a small screen could help increase response rates even further, especially among young people and busy individuals. This is a great challenge for such a complex survey, since reducing the size of on-screen elements will be difficult, to say the least, if flexibility and ease of use are to be maintained.


Another challenge is to maintain simplicity and clarity when using the household-survey version of the tool. It is in fact difficult to know whether only one person will fill in the questionnaire for all household members or whether each person will respond separately. If each person responds on his or her own, how can we obtain valid and precise information on trips made on exactly the same day if some household members access the survey a long time after the first interview of the household has been finished? On the other hand, if one person answers for the whole family, questions are hard to formulate accordingly and we may get incomplete trip information, as in the current CATI survey.

In another vein, technology is changing so fast in the web industry that it is difficult to keep a design and a framework working for a long time. Most survey administrators think they can reduce their costs by using web surveys. The main challenge, though, is to keep the questionnaire up to date and “fashionable”. One must make sure to allocate the right amount of resources to updating the application and redesigning the interface on a regular basis, and the cost of doing that properly is by no means decreasing.

As for demographics, elderly people and very young persons (children and young teenagers) are difficult to reach, for very different reasons. Older individuals tend to have limited access to the Internet and often lack the technical knowledge to answer this kind of online questionnaire. Other media have to be used if they are to be reached and a representative sample is to be obtained. With children, there is a need to make sure their parents can answer for them during regional and metropolitan area surveys, as the trips in which they are involved are useful to the analysis, since they represent an increasing part of rush hour travel. For children and adolescents under 18 years old (or 21 in some regions), laws usually prohibit survey organizers from including them in their sample and/or from collecting sensitive information about their behaviours. Also, the motivation of young people to get involved is an issue and, though their parents could answer for them, the exact whereabouts of their offspring are not always known.

Despite these challenges, web surveys are a great way of collecting travel behaviours and they are becoming an integral part of the survey manager’s toolbox. They help reduce costs on the one hand and reach more population groups on the other. However, it is essential that enough effort is made to refine the questionnaire and reduce the burden on the respondent by clarifying questions and simplifying the process. One cannot just copy a conventional CATI questionnaire to the web and get satisfying and valuable data from it. The web is a different medium and has to be treated as such: it is flexible and evolves fast, but it does not forgive going around in circles.

Acknowledgements

The authors wish to acknowledge the contributions of Hubert Verreault, Louiselle Sioui and Julien Faucher from École Polytechnique in developing, documenting and testing the interface. They also wish to acknowledge Mathieu Decoste and Stéphane Béranger, sustainability counsellors at École Polytechnique and University of Montreal, for supporting the team in the planning and conduct of the survey, as well as Pierre Tremblay, from MTQ, for his interest and support at the Quebec level.

References

Alsnih, R. (2007). Characteristics of web based surveys and applications in travel research. In P. R. Stopher & C. C. Stecher (Eds.), Travel survey methods: Quality and future directions (pp. 569–592). Oxford: Elsevier.

Animated Map of Polytechnique Mobility Survey. (2010). Retrieved from http://www.youtube.com/v/Fk_ObOxIwKk

Armoogum, J., Axhausen, K. W., & Madre, J.-L. (2009). Lessons from an overview of national transport surveys, from Working Group 3 of COST 355: “Changing behavior toward a more sustainable transport system.” In P. Bonnel, M. Lee-Gosselin, J. Zmud & J.-L. Madre (Eds.), Transport survey methods: Keeping up with a changing world (pp. 621–634). Bingley, UK: Emerald.

Bonnel, P., Lee-Gosselin, M., Madre, J.-L., & Zmud, J. (2009a). Keeping up with a changing world: Challenges in the design of transport survey methods. In P. Bonnel, M. Lee-Gosselin, J. Zmud & J.-L. Madre (Eds.), Transport survey methods: Keeping up with a changing world (pp. 3–13). Bingley, UK: Emerald.

Bonnel, P., Morency, C., & Bayart, C. (2009b). Survey mode integration and data fusion: Methods and challenges. In P. Bonnel, M. Lee-Gosselin, J. Zmud & J.-L. Madre (Eds.), Transport survey methods: Keeping up with a changing world (pp. 587–612). Bingley, UK: Emerald.

Braunsberger, K., Wybenga, H., & Gates, R. (2007). A comparison of reliability between telephone and web-based surveys. Journal of Business Research, 60, 758–764. doi: 10.1016/j.jbusres.2007.02.015

CEFRIO. (2009). Retrieved from http://blogue.cefrio.qc.ca/2009/12/resultats-de-decembrenetendances-2009/. Accessed on November 12, 2010.

Morency, C. (2008). Enhancing the travel survey process and data using the CATI system. Transportation Planning and Technology, 31(2), 229–248. doi: 10.1080/03081060801948241

OSRM. (2012). Open Source Routing Machine. Retrieved from http://project-osrm.org/. Accessed on August 3, 2012.

Riandey, B., & Quaglia, M. (2009). Surveying hard-to-reach groups. In P. Bonnel, M. Lee-Gosselin, J. Zmud & J.-L. Madre (Eds.), Transport survey methods: Keeping up with a changing world (pp. 127–144). Bingley, UK: Emerald.

Richardson, A., Ampt, E., & Meyburg, A. (1995). Survey methods for transport planning. Melbourne, Australia: Eucalyptus Press.

Rubyonrails. (2012). Ruby on rails. Retrieved from http://www.rubyonrails.org. Accessed on August 3, 2012.

Sensio Labs. (2011). Symfony. Retrieved from http://www.symfony.com. Accessed on September 25, 2011.

The jQuery Project. (2010). jQuery: The Write Less, Do More, Javascript Library. Retrieved from http://www.jquery.com. Accessed on September 25, 2011.

Timmermans, H. J. P., & Hato, E. (2009). Electronic instrument design and user interfaces for activity-based modeling. In P. Bonnel, M. Lee-Gosselin, J. Zmud & J.-L. Madre (Eds.), Transport survey methods: Keeping up with a changing world (pp. 437–461). Bingley, UK: Emerald.

Chapter 11

Web versus Pencil-and-Paper Surveys of Weekly Mobility: Conviviality, Technical and Privacy Issues

Marius Thériault, Martin Lee-Gosselin, Louis Alexandre, François Théberge and Louis Dieumegarde

Abstract Purpose — In the context of evaluating transportation and carbon emission policies, improve weekly activity and mobility scheduling survey methodology in order to enhance data quality while reducing costs and decreasing respondent burden for designing continuous self-administered surveys that are predominantly passive (or computer-assisted). Approach — Evaluate a set of functionalities deployed in a web travel survey interface (2009) and compare with a pencil-and-paper survey (2002–2003) deployed in Quebec City that sought similar data about weekly mobility. The first used a pencil-and-paper approach complemented by interviews and telecommunications. The second used applets developed in Java, and Google Maps in order to assist geocoding of activity places and the reporting of actual trips into a relational database, while using email to recruit and support respondents. Implications — Both of these surveys had to address specific technical and privacy challenges during deployment, making their comparison relevant for discussing some of the impacts of information technologies on spatiotemporal data quality, conviviality of survey procedure, respondents’ motivation and privacy protection.

Transport Survey Methods: Best Practice for Decision Making
Copyright © 2013 by Emerald Group Publishing Limited
All rights of reproduction in any form reserved
ISBN: 978-1-78190-287-5


Limitations — While neither of these surveys employed movement-aware mobile devices, such as GPS loggers, some of the lessons learnt are relevant to the design issues raised by the increasing deployment of such devices in travel surveys, and by the growing need to manage complex surveys over extended observation periods. Keywords: Weekly mobility surveys; web surveys; survey interface; GUI; technical issues; privacy

11.1. Introduction

A major target of sustainable development is the lowering of greenhouse gas (GHG) emissions from urban transportation. Understanding the role of changes in mobility behaviour is of fundamental importance, and this in turn rests on the improvement of survey data. Such data are complementary to those traditionally gathered to guide the design of transportation infrastructure, seeking viable urban and regional development while avoiding the detrimental impacts of traffic congestion. While the traditional preoccupation of transportation planning with peak-hour commuting can, most of the time, be served by travel diary surveys collected on one or two working days, a comprehensive understanding of mobility behaviour requires information for longer periods of time, generally a minimum of 7 days, in order to capture the diversity of activity planning strategies and travel decisions. However, surveys collecting data for more than a couple of days are expensive and error-prone, and maintaining respondents’ motivation can be problematic. To address this dilemma, transport survey methodologists are increasingly turning to information technologies and geomatics in order to enhance data quality, to decrease respondent burden, to lower costs and, eventually, to design continuous self-administered surveys that are predominantly passive (Axhausen & Gärling, 1992; Doherty & Miller, 2000; Ettema & Timmermans, 1997; Greaves, 2004; Lee-Gosselin & Harvey, 2006; Roorda, Doherty, & Miller, 2005; Stopher, 2004).

The main objective of this paper is to evaluate a new package of computerised functionalities that were developed for a detailed web-based survey of daily activities and travel, in the light of experience with a pencil-and-paper instrument whose content it was intended to replicate.
The opportunity to do this was a side benefit of a study to compare estimates of GHG emissions for two groups of car users in Quebec City, Canada — those in ‘conventional’ car-owning households versus those who were members of a car-sharing association. A suitable data source was already available for car-owning households from an in-depth survey that observed one week of activities and personal mobility in 2002–2003. However, its sample included only one car-sharing member, and so an additional survey targeting car-sharers was required, and this was completed in 2009. The pencil-and-paper approach used by the 2002–2003 survey was complemented by interviews and telecommunications (phone, fax and some email). The 2009 web-based survey took advantage of recent developments in information technologies, notably involving applets developed in


Java, the Internet, and Google Maps in order to assist recording of activity locations (geocoding) via a Graphic User Interface (GUI), and the reporting of actual trips directly into a relational database, while email became standard practice for respondent recruitment and support.

11.2. The Pencil-and-Paper Survey

The pencil-and-paper survey (known as OPFAST; see note 1) was carried out in Quebec City from 2001 to 2005, during the PROCESSUS research programme, which was funded by the Social Sciences and Humanities Research Council of Canada, the GEOIDE Network and the Quebec Ministry of Transportation (see note 2). This was a three-wave panel survey. The first wave sample comprised 250 households (400 adults; 12,840 trips) chosen mostly at random from the phone directory. The first-wave instrument package focussed on activity planning and travel during a 7-day observation period. In-home interviews were held just before and just after the 7-day period to enable personalised instruction, in-depth validation and interpretation of the week’s data. Analysing activity planning and choice processes requires simultaneous observation of decisions at the individual and household levels in order to handle interactions among persons, considering other activity opportunities and spatio-temporal constraints of all household members (Kim & Kwan, 2003; Kwan, 1998; McCray, Lee-Gosselin, & Kwan, 2005). Communication with the respondents (all adults over the age of 15 living as members of the same household) was organised on a daily basis using phone calls and fax in order to transmit information and control quality. For the purposes of this paper, we retain only the first wave, although second (200 households) and third (171 households) waves were also conducted, giving an overall retention rate of 68%.

OPFAST records both anticipated and executed activities, both at home and out of home. It uses a design intentionally aligned to the computer-aided instrument known as CHASE, developed by Doherty and Miller (2000). Unlike CHASE, OPFAST involved multiple pencil-and-paper instruments:

1. At startup: to gather basic information on household structure, residential mobility, vehicle ownership etc. during an in-home interview of the household that also introduced the self-administered instruments to be used in the coming week, and provided training;
2. During a 7-day period: to observe prospective activity planning, and executed activities and travel, using person-level paper instruments that respondents were instructed to complete and send to the research team using a provided fax machine (Figure 11.1);

1. OPFAST = Observed and Perceived Flexibility of Activities in Space and Time.
2. PROCESSUS carried out a similar survey in Toronto using CAPI procedures during the same time period (CHASE), but this approach is not examined in this paper.

Figure 11.1: Self-administered activity/travel planning and logging instruments used in the OPFAST pencil-and-paper survey (adapted from Lee-Gosselin and Doherty, 2005). NB: The activity planning diary on the left is a ‘zoom’ showing 3 days, but the sheet displays all seven; the activity-travel log on the right shows the first seven activities on a particular day, but continuation sheets allowed as many entries as needed.


3. Soon after the 7-day period: during an in-depth home interview to validate the transmitted data and carry out a retrospective exploration of activity and travel patterns using sorting exercises and other approaches.

For this paper we consider only our experience with the activity/travel log included in the second point above. In addition to recording activities and trips actually undertaken, this complex survey also solicited information on activity planning, perceptions of spatio-temporal flexibility and the overall decision process, event histories and many other elements. The activity/travel log was thus embedded in an in-depth multi-instrument package involving an unusually high level of face-to-face and telephone contact with respondents. Moreover, through the three panel waves, anticipated observer and learning effects could not be avoided and were instead incorporated into a ‘reflexive’ design in which respondents and interviewers engaged in joint discovery. Effects on observed retention rates are attributable to the whole package and not just the activity/travel log, but we can nevertheless draw useful lessons from this application of a pencil-and-paper instrument (PAPI) activity/travel log.

The location of logged activities was recorded by asking either for the address or for the usual name of schools, commercial outlets, offices etc. Data were later structured into a relational database, using geocoding post-processing tools in order to locate activity places in a geographic information system. Figure 11.1 shows an example of the self-administered paper instrument package, which was made up of two main components. On the left is an extract from a single-sheet prospective activity planning diary: activity planning started ahead of Day 1, initially for the whole 7-day observation period, but updates covered only the remaining day(s) of the observation period on each subsequent day up to and including Day 6. On the right is the log of executed activity and travel.
Households were requested to fax the current state of the planning diary, and the most recently completed activity/travel log, once per day. Support was provided by telephone based on material received, and a human-based check of suitability and completeness. Further detail on OPFAST can be found in Lee-Gosselin (2005).

11.3. The Internet-Based Survey

The 2009 Internet-based survey took a very different approach. It was needed to obtain actual mobility data from users of a car-sharing service (operated by Communauto) in Quebec City in order to estimate their weekly GHG emissions and to compare their results with those of a subset of the OPFAST respondents (providing a control group). The purpose of such a study is to test whether or not car-sharing users produce similar GHG emissions to car owners with similar travel needs, while simultaneously considering the effects of urban form (Cervero & Kockelman, 1997). A difference is expected based on the hypothesis that car-sharing users, having little or no investment in car ownership and a pay-per-use billing structure that


aggregates fixed and variable costs, may favour a more diversified range of transportation modes to access activity locations. In contrast, car owners are thought likely to overuse their vehicle because the perceived marginal cost per kilometre is low and they are trying to rationalise their high fixed costs. The overall hypothesised outcome would be that car-sharing users would travel fewer kilometres by car and produce lower GHG emissions. Weekly mobility data are of paramount importance for this type of study because the main uses of shared cars are for leisure and shopping, with demand peaking on Fridays and at weekends, justifying higher hourly rental charges (Martin, 2007).

The Communauto survey was deployed to car-sharing users recruited among 371 voluntary respondents. This project was carried out with funding from the Canadian Network of Centres of Excellence in Geomatics (GEOIDE) and FQRSC (see acknowledgements). In order to explore the mobility behaviour of car-sharing users, we developed a web-based mobility survey in cooperation with Communauto as a follow-up to its 2008 customer satisfaction survey. The computer-operated instrument was based on the 7-day OPFAST log book so that the two data sets would remain comparable. For ethical reasons, it was operated using a secure website that prevented unauthorised persons from accessing the data entered by respondents, a choice that can have many consequences for efficiency and user-friendliness. Based on addresses provided by Communauto, potential respondents (those who explicitly agreed to be contacted at the end of the 2008 customer survey) were contacted by email with an invitation to participate.
This procedure was successful in recruiting 147 adults for a three-part online survey:

Part 1: Identification of the respondent, his/her socio-economic status and household structure;
Part 2: Geocoding of anchor and activity locations using street addresses, business names or clicking on a map (aerial photograph in Google Maps; Figure 11.2);
Part 3: Retrospective reporting of daily trips using an on-screen log (Figure 11.3).

The last operation had to be repeated for each of the 7 days, usually requiring multiple visits to the website and incremental updating of a relational database handling both data and information about the last completed stage in the survey procedure. Moreover, to avoid concerns about privacy, each respondent was assigned a private workspace where his/her information was stored, which also eased reporting of trips to anchor locations and frequently visited places. Finally, the user interface kept track of the state of completion at the end of each work session in order to return there automatically at logon of the next session.

The survey was implemented using a combination of applets (in Java) to handle menus in the questionnaires, to interact with Google Maps and to populate the database located in a secure environment. Using this approach, it was possible to design a system that minimised the risk of errors (generally using dropdown menus), provided means for validating data before they were posted to the database, and offered several automated help functions directly in the interface. These help features were generally activated when the cursor was positioned over question marks. Efforts were

Figure 11.2: Computer-assisted location of activity places during the web survey among car-sharing users.


Figure 11.3: Computer-assisted log book of daily travel (seven reports needed for weekly period).


made to design a user-friendly and flexible environment. For example, the interface allows for ad hoc addition of new locations when needed in the log (Figure 11.3), simply by reactivating the dialog in Figure 11.2 (including the Google Maps device) in order to add a new location to the list, which is named by the respondent, saved for the remainder of the survey and used to populate the dropdown menus. With such an approach providing three methods to locate activity places (point on Google Maps, select from an activity places index, select by street address or postal code), no trip can be reported without a location (in both space and time), easing data validation. It is even possible to move an activity place on the map (drag and drop) when a mistake is discovered, and this simple action, even if it is done several days after the actual trips are reported, immediately updates the location in the relational database.

In order to study the user-friendliness of the web tools, and differences in respondents’ preferences, information on choices made by respondents (including which tool was used) was recorded in the relational database. Activity place identification menus (entered in Part 2; Figure 11.2) were also available during Part 3 (Figure 11.3), and a means of adding new locations was provided, keeping a trace of any that were defined during trip reporting, thus making it possible to distinguish between locations that were preset before reporting trips (mostly anchor places) and those added when needed.

Finally, user support was provided to respondents by email. A total of 93 emails asking for help were received from 62 respondents and answered during the survey, generally on the same day. The answering process was monitored with daily verification of the database status, and gentle recall (reminder) emails were sent to respondents who had begun answering the survey but had stalled somewhere in the weekly process.
Finally, this complex procedure was able to secure 53 households (57 respondents; 1,423 trips) who completed the 7-day survey — substantially thanks to the recall procedure that was responsible for motivating 23 persons to finalise their trip report.
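As a rough illustration of the data model described in this section, the following sketch uses Python's built-in sqlite3 module to show how a relational schema can refuse a trip that has no located origin or destination, and how moving a place is immediately reflected in every trip that refers to it. All table, column and function names are hypothetical: the text does not describe the actual schema, and the original system was built with Java applets, not Python.

```python
import sqlite3

# Hypothetical schema: names and columns are illustrative assumptions,
# not the structure of the database used in the 2009 survey.
SCHEMA = """
CREATE TABLE respondent (
    respondent_id INTEGER PRIMARY KEY,
    last_completed_step TEXT NOT NULL DEFAULT 'enrolled'
);
CREATE TABLE place (
    place_id INTEGER PRIMARY KEY,
    respondent_id INTEGER NOT NULL REFERENCES respondent(respondent_id),
    name TEXT NOT NULL,
    lat REAL NOT NULL,
    lon REAL NOT NULL,
    tool TEXT NOT NULL CHECK (tool IN ('map', 'address', 'keyword')),
    added_during_trip_report INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE trip (
    trip_id INTEGER PRIMARY KEY,
    respondent_id INTEGER NOT NULL REFERENCES respondent(respondent_id),
    survey_day INTEGER NOT NULL CHECK (survey_day BETWEEN 1 AND 7),
    origin_id INTEGER NOT NULL REFERENCES place(place_id),
    destination_id INTEGER NOT NULL REFERENCES place(place_id),
    departure TEXT NOT NULL,
    arrival TEXT NOT NULL,
    mode TEXT NOT NULL
);
"""


def open_survey_db(path=":memory:"):
    """Create the tables and switch on foreign-key enforcement."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(SCHEMA)
    return conn


def add_place(conn, respondent_id, name, lat, lon, tool, during_trip_report=False):
    """Store a located place and remember which tool was used to locate it."""
    cur = conn.execute(
        "INSERT INTO place (respondent_id, name, lat, lon, tool, "
        "added_during_trip_report) VALUES (?, ?, ?, ?, ?, ?)",
        (respondent_id, name, lat, lon, tool, int(during_trip_report)),
    )
    return cur.lastrowid


def add_trip(conn, respondent_id, survey_day, origin_id, destination_id,
             departure, arrival, mode):
    """Store a trip; fails if either end is not an existing place."""
    cur = conn.execute(
        "INSERT INTO trip (respondent_id, survey_day, origin_id, "
        "destination_id, departure, arrival, mode) VALUES (?, ?, ?, ?, ?, ?, ?)",
        (respondent_id, survey_day, origin_id, destination_id,
         departure, arrival, mode),
    )
    return cur.lastrowid


def move_place(conn, place_id, lat, lon):
    """Equivalent of a drag and drop on the map: only the place row changes."""
    conn.execute("UPDATE place SET lat = ?, lon = ? WHERE place_id = ?",
                 (lat, lon, place_id))


def trip_destination_coords(conn, trip_id):
    """Coordinates of a trip's destination, read through the place table."""
    return conn.execute(
        "SELECT p.lat, p.lon FROM trip t "
        "JOIN place p ON p.place_id = t.destination_id WHERE t.trip_id = ?",
        (trip_id,),
    ).fetchone()
```

Because trips store only references to places, correcting a place several days after the trips were reported needs a single update, which is the behaviour the text attributes to the drag-and-drop correction.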

11.4. Results and Comparison

These two types of surveys have strengths and weaknesses. While collecting similar data, they imply very different communication, technical and privacy issues as well as strategies to sustain motivation of respondents during the week. Both surveys had to go through a university ethics committee. While the objectives (obtain free consent and ensure confidentiality of data) are similar, the actions needed are very different (e.g. keeping both faxed paper forms and computer media under lock and key, versus implementing computer security with user IDs and passwords to allow access to a database over the Internet). Geocoding procedures are also very different, leading to strong differences in spatial reference quality and workload needed to validate coordinates of anchor and activity places. The GHG emission estimation study (Alexandre, 2010) used both data sources in order to compare the emissions of car-sharing users with those of a control group


drawn from the majority of those living in the same neighbourhoods who were car users (OPFAST respondents were retained for inclusion in the control group if they were living less than 1.5 km away from the nearest car-sharing survey respondent). The operations needed to validate the data are obviously very different for the two surveys, despite the fact that both provide the same basic information on travel and are operationalised using a relational database. Moreover, for the GHG estimation study, there was a requirement for activity location accuracy that was considerably more stringent than had been established in the original OPFAST design. This is because it was essential in the GHG study to model actual trips using a GIS (TransCAD), estimating travel duration based on network distance and impedance. The main problem with nevertheless using the pencil-and-paper survey as input to a travel duration model was that geocoding of the first wave took place long after data validation, so that a number of classes of incomplete or dubious descriptions of spatial location survived in the database, such as an intersection of parallel streets or unusable information like ‘my brother’s house’. During OPFAST, validation of travel data had been carried out manually. Verification therefore bore mostly on completeness, and plausibility was rather difficult to assess. Data completeness criteria had to be set in order to select OPFAST respondents providing sufficient accuracy for joining the control group. We also discovered that about 20% of declared trip durations were dubious (implying travelling at more than 120 km/h or less than 5 km/h by car, or walking at more than 8 km/h, or travelling for more than 1 hour by car over a total distance of less than 10 km etc.).
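The plausibility rules quoted above lend themselves to automation. The sketch below is a minimal illustration and not the procedure used in the study: the `Trip` record, the mode labels and the treatment of non-positive durations are assumptions, and only the numerical thresholds come from the text.

```python
from dataclasses import dataclass


@dataclass
class Trip:
    mode: str            # "car" or "walk" (illustrative labels)
    distance_km: float   # e.g. network distance computed in a GIS
    duration_min: float  # duration declared by the respondent


def dubious_reasons(trip):
    """Return the list of plausibility rules that a declared trip violates."""
    if trip.duration_min <= 0:
        # Assumption: a zero or negative duration is always treated as dubious.
        return ["non-positive duration"]
    speed_kmh = trip.distance_km / (trip.duration_min / 60.0)
    reasons = []
    if trip.mode == "car":
        if speed_kmh > 120:
            reasons.append("car faster than 120 km/h")
        if speed_kmh < 5:
            reasons.append("car slower than 5 km/h")
        if trip.duration_min > 60 and trip.distance_km < 10:
            reasons.append("more than 1 hour by car for less than 10 km")
    elif trip.mode == "walk":
        if speed_kmh > 8:
            reasons.append("walking faster than 8 km/h")
    return reasons


def share_dubious(trips):
    """Proportion of trips that break at least one rule."""
    flagged = sum(1 for t in trips if dubious_reasons(t))
    return flagged / len(trips) if trips else 0.0
```

Applied to a full trip table, `share_dubious` would give the kind of overall proportion reported in the text, provided the distances come from the same network model.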
Even when carried out by trained personnel, careful manual validation had been unable to detect complex spatio-temporal weaknesses in the data (inconsistencies based on speed, trip duration, travelled distance, transportation mode and accuracy of geographical coordinates for geocoded activity places) that became obvious using computer cross-validation and GIS simulation. Because geocoding was delayed and data validation was done on a day-by-day basis, several activity places (with different names) shared identical coordinates, which led to a lack of data uniqueness, a basic feature needed for the efficient operation of a relational database. Therefore, for this task, we were able to summarise the 4385 locations used in the OPFAST survey into a mere 2841 different coordinate sets (places), avoiding duplication. Moreover, subsequent tests were able to identify unique locations with several sets of coordinates, locations without coordinates (geolocation failure) and locations with wrong coordinates (seen only when displayed on a map). Considering the workload needed to verify all trip data reported by the 250 households and to correct the problems using computer-assisted procedures, it was decided to limit efforts and to amend data only for the 82 households retained in the control group, avoiding those with more than 5% of activity places with fuzzy location within a radius of 40 km from the city centre. Finally, of the 250 households in the OPFAST survey (first panel), 139 had their home located less than 1.5 km from a respondent of the car-sharing survey. Among those households, only 82 (122 respondents) yielded sufficient accuracy for inclusion in the GHG emission estimation control group.
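The two selection criteria for the control group (a home less than 1.5 km from a car-sharing respondent, and no more than 5% of fuzzily located activity places within 40 km of the city centre) can be expressed compactly. The sketch below assumes great-circle (haversine) distances and a per-place "fuzzy" flag; the text does not state which distance measure was used, and all names and the data layout are illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def eligible_for_control_group(home, carshare_homes, places, city_centre,
                               max_home_km=1.5, radius_km=40.0,
                               max_fuzzy_share=0.05):
    """Apply both criteria to one household.

    places is a list of ((lat, lon), is_fuzzy) pairs, where is_fuzzy marks
    a location whose description was too vague to geocode precisely
    (hypothetical flag, assumed to be set during data cleaning).
    """
    nearest = min(haversine_km(home, h) for h in carshare_homes)
    if nearest >= max_home_km:
        return False
    # Only places within the given radius of the city centre are counted.
    local = [fuzzy for coords, fuzzy in places
             if haversine_km(coords, city_centre) <= radius_km]
    fuzzy_share = sum(local) / len(local) if local else 0.0
    return fuzzy_share <= max_fuzzy_share
```

A network distance from the GIS could replace `haversine_km` without changing the rest of the logic.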


The Internet-based survey generally provides completeness in the answers and some first-level validation of data consistency. However, using a computer to communicate with respondents has its pitfalls. The first one is related to response rate and to abandonment linked to tiredness and misunderstanding of the survey interface. Considering the risks associated with a newly developed interface, we decided to deploy the survey progressively in order to avoid a massive loss among our 360 potential respondents. Deployment was done in four steps: (1) among 19 households asked (through email) to answer a 7-day survey, 6 households accepted; (2) 20 households were asked to fill in a 4-day travel log (7 accepted) and 20 households were invited to fill in 7 days (7 accepted); (3) 40 households were invited to fill in a 7-day questionnaire (10 accepted); and (4) finally, 263 households were invited for a 7-day period and 107 accepted. These four steps took place between March 30 and May 15, leaving some time to fix bugs in the interface before its full deployment. All in all, 147 adults living in 137 households registered in the survey. However, it was very difficult to recruit more than one adult in each household: 68 households had 2 adults; 5 households had 3 adults or more. By contrast, there was more than one respondent in about half of the 82 households in the OPFAST control group, a rate slightly lower than in the full useable OPFAST sample. This was expected given the more stringent recruiting criteria of OPFAST (see note 3). Thus, we had to accept this reality as a limitation for the GHG emission estimation study.

Despite an active (but gentle) follow-up procedure by email, the retention rate of the web survey was far lower than the inter-wave retention rate for the pencil-and-paper survey (80% between the first and second waves; see note 4). In the web survey, 39% of the 147 enrolled individuals completed a 7-day travel log.
Table 11.1 presents results at the end of each part of the questionnaire, distinguishing between the gender and age groups of respondents. The retention rate is better for women and in the 34–49 age group. The lowest retention rate is among those aged 50 and over, probably related to a lower familiarity with the computer interface and Google Maps. Such an explanation is compatible with the outcome of a survey by Adler, Rimmer and Carpenter (2002), in which respondents using the Internet were younger and wealthier than those choosing traditional survey instruments. The survey interface was built with typical web tools and provided nothing equivalent to OPFAST’s in-home training just before the 7-day observation period; instead, it relied on the online help functions and on emails to recover from any problems that arose. Finally, our results for recruiting and retaining respondents with self-administered survey instruments are

3. As noted in Section 11.2, all adults over the age of 15 living as members of the same household had to agree to participate for a household to be eligible. 4. The inter-wave rate is the only comparison available with OPFAST, because completion of the 7-day logs was a condition for inclusion in the Wave 1 sample. Approximately 30 additional households participated in a variety of early field test versions of the OPFAST package but were not retained for the study because of subsequent changes made in the instruments, or in some cases because of unsuccessful implementation.


Table 11.1: Attrition and retention rate of respondents during the web survey.

| Age group | Enrolled | Part 1 (General) | Part 2 (Figure 11.2) | Part 3 (Figure 11.3) | Retention rate |
|---|---|---|---|---|---|
| Women | 83 | 71 | 49 | 37 | 0.44 |
| 21–34 | 35 | 31 | 20 | 15 | 0.43 |
| 34–49 | 26 | 22 | 17 | 14 | 0.54 |
| 50+ | 22 | 18 | 12 | 8 | 0.36 |
| Men | 64 | 51 | 29 | 20 | 0.31 |
| 21–34 | 20 | 16 | 9 | 5 | 0.25 |
| 34–49 | 29 | 23 | 15 | 13 | 0.45 |
| 50+ | 15 | 12 | 5 | 2 | 0.13 |
| Total | 147 | 122 | 78 | 57 | 0.39 |

Table 11.2: Number of respondents according to the number of daily travel log completed (web).

| Days completed | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| Number of households | 72 | 62 | 60 | 58 | 55 | 53 | 50 |
| Number of respondents | 78 | 67 | 65 | 62 | 59 | 57 | 54 |

Table 11.3: Number of trips reported according to daily logs completed (6- and 7-day web respondents).

| Survey days | 1st | 2nd | 3rd | 4th | 5th | 6th | 7th |
|---|---|---|---|---|---|---|---|
| Number of respondents | 57 | 57 | 57 | 57 | 57 | 57 | 54 |
| Number of trips | 216 | 221 | 196 | 198 | 194 | 203 | 195 |
| Average trips per person | 3.8 | 3.9 | 3.4 | 3.5 | 3.4 | 3.6 | 3.6 |

in line with those of Sharp and Murakami (2005), who report low response rates, particularly when trying to convince every member of each household to take part. Attrition occurred mostly after the general questionnaire (Part 1) and the coding of anchor and activity locations (Part 2). Respondents who completed more than one daily log of travel were generally willing to continue during the entire week (Table 11.2). Moreover, our persistence results (Table 11.3) are very similar to those of Kreitz and Doherty (2002), who did not observe a fatigue effect (variation in the daily number of trips reported over time) during their CASI survey involving 40 respondents, while fatigue was observed during two PAPI surveys. Similarly, Golob


Figure 11.4: Number of places defined by respondents by the number of days of travel reported (web).

and Meurs (1986) observed a fatigue effect in their 7-day survey (lower number of trips reported during the last days). Figure 11.4 shows the diversity of activity places identified by respondents during the survey. The distribution of results clearly indicates that most of the persistent respondents had to define new activity locations while they were reporting trips (Table 11.4), meaning that this option is an important feature of this type of survey. Moreover, offering the list of previously defined locations (with a name freely chosen by respondents) greatly helped to avoid duplication. Finally, we observe that individuals who completed few (if any) daily logs of travel located up to 10 activity places. That is an indication that Part 2 of the interface operates properly (Figure 11.4).

Table 11.5 again compares those who completed seven daily logs of travel to those who did not, in this case by gender and then by educational level. The most striking result is that the proportion of locations that were geocoded by clicking on a map was much higher for men than for women, and increased with education. However, recall from Table 11.1 that men dropped out of the survey more often than women, especially in the 21–34 age group; it is interesting to note that the men who persisted with the survey seemed as comfortable with graphic computer interfaces as those who dropped out. A similar pattern is seen for respondents with any university education.

A fundamental concern with travel log surveys is the possibility of overlooked or intentionally unrecorded trips. Figure 11.5 shows frequency distributions for the number of respondents reporting different numbers of trips during the week using the two survey methods, and Table 11.6 compares the trip rates per day of the week.
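The retention rates of Table 11.1, recalled in the discussion above, follow directly from the counts at each stage of the questionnaire. The short sketch below recomputes them and also shows where respondents were lost. The counts are transcribed from Table 11.1; the function names and the data layout are illustrative.

```python
# Counts per group transcribed from Table 11.1:
# (enrolled, completed Part 1, completed Part 2, completed Part 3)
TABLE_11_1 = {
    ("Women", "21-34"): (35, 31, 20, 15),
    ("Women", "34-49"): (26, 22, 17, 14),
    ("Women", "50+"): (22, 18, 12, 8),
    ("Men", "21-34"): (20, 16, 9, 5),
    ("Men", "34-49"): (29, 23, 15, 13),
    ("Men", "50+"): (15, 12, 5, 2),
}


def totals(table, gender=None):
    """Column sums over all groups, or over one gender if given."""
    rows = [counts for (g, _), counts in table.items() if gender in (None, g)]
    return tuple(sum(column) for column in zip(*rows))


def retention_rate(counts):
    """Share of enrolled respondents who completed Part 3."""
    return counts[-1] / counts[0]


def stage_losses(counts):
    """Respondents lost between consecutive stages."""
    return [before - after for before, after in zip(counts, counts[1:])]
```

For the whole sample, `stage_losses(totals(TABLE_11_1))` gives losses of 25, 44 and 21 respondents across the three stages. Recomputed this way, the rate for all women is 37/83, about 0.446, which Table 11.1 prints as 0.44; the other printed rates match the recomputed values to two decimals.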

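Table 11.4: Tools used by households for locating places by activity and web survey completion.

53 households completing the car-sharing survey

| Activity type | Nb. Places | Address/ZIP code | Names/Keywords | Click on map |
|---|---|---|---|---|
| Home | 53 | 53 | | |
| Workplace | 73 | 35 | 6 | 32 |
| School/Day care centre | 13 | 5 | 6 | 2 |
| Grocery | 166 | 61 | 29 | 76 |
| Other location | 59 | 21 | 6 | 32 |
| Location added during travel report (Part 3) | 295 | 89 | 32 | 174 |
| Transit | 30 | 9 | | 21 |
| Total | 689 | 273 | 79 | 337 |
| Proportion (%) | 100 | 39.6 | 11.5 | 48.9 |

62 households abandoning the car-sharing survey

| Activity type | Nb. Places | Address/ZIP code | Names/Keywords | Click on map |
|---|---|---|---|---|
| Home | 61 | 61 | | |
| Workplace | 39 | 26 | 4 | 9 |
| School/Day care centre | 8 | 1 | 2 | 5 |
| Grocery | 114 | 18 | 15 | 81 |
| Other location | 47 | 8 | 9 | 30 |
| Location added during travel report (Part 3) | 30 | 10 | 10 | 10 |
| Transit | 1 | | | 1 |
| Total | 300 | 124 | 40 | 136 |
| Proportion (%) | 100 | 41.3 | 13.3 | 45.4 |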


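Table 11.5: Tools used by respondents for locating places by gender, education and completion (web).

Household with only one respondent

Respondents completing the car-sharing survey (564 activity locations)

| | Nb. persons | Address/ZIP code | Names/Keywords | Click on map |
|---|---|---|---|---|
| Gender | | | | |
| Women | 33 | 44% | 11% | 44% |
| Men | 16 | 26% | 8% | 66% |
| Total | 49 | 37% | 11% | 52% |
| Education level | | | | |
| High school and less | 4 | 68% | 16% | 16% |
| College (CEGEP) | 8 | 52% | 17% | 31% |
| University first cycle | 20 | 39% | 9% | 52% |
| University second and third cycles | 17 | 27% | 12% | 61% |
| Total | 49 | 37% | 12% | 51% |

Respondents abandoning the car-sharing survey (less than 6 days of travel; 197 activity locations)

| | Nb. persons | Address/ZIP code | Names/Keywords | Click on map |
|---|---|---|---|---|
| Gender | | | | |
| Women | 33 | 36% | 24% | 40% |
| Men | 23 | 19% | 12% | 69% |
| Total | 56 | 30% | 19% | 51% |
| Education level | | | | |
| High school and less | 2 | 15% | 28% | 57% |
| College (CEGEP) | 15 | 38% | 13% | 49% |
| University first cycle | 19 | 24% | 12% | 64% |
| University second and third cycles | 20 | 29% | 25% | 46% |
| Total | 56 | 29% | 19% | 52% |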


Figure 11.5: Number of trips reported during a week: OPFAST (PAPI) survey vs. car-sharing (CASI) survey.

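Table 11.6: Number of trips reported per weekday.

| Weekday | Monday | Tuesday | Wednesday | Thursday | Friday | Saturday | Sunday | Total |
|---|---|---|---|---|---|---|---|---|
| OPFAST survey respondents | 122 | 122 | 122 | 122 | 122 | 122 | 122 | 122 |
| Number of trips (OPFAST) | 498 | 484 | 551 | 559 | 573 | 468 | 385 | 3518 |
| Mean trips per day (OPFAST) | 4.1 | 4.0 | 4.5 | 4.6 | 4.7 | 3.8 | 3.2 | 4.1 |
| Car-sharing survey respondents | 57 | 57 | 57 | 56 | 56 | 57 | 56 | 57 |
| Number of trips (car-sharing) | 196 | 202 | 222 | 211 | 239 | 219 | 193 | 1482 |
| Mean trips per day (car-sharing) | 3.4 | 3.5 | 3.9 | 3.8 | 4.3 | 3.8 | 3.4 | 3.7 |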


It is clear that the web survey observed fewer trips overall, and produced consistently lower daily trip rates, except at weekends when the rates were close. The weekday means were of the order of 10% higher for OPFAST. In part, this may be explained by the more explicitly ‘activity-based’ structure of the OPFAST logs, which query the nature and timing of activities first, and then proceed to query the characteristics of intervening trips, an approach reported by Stopher (1992) to increase the capture of trips. Conversely, part of the explanation could also be that car-sharing users, being more conscious of the price of car travel, plan their travel needs more stringently, leading to fewer trips (see Section 11.3). Both factors are likely to be present in our results, and this remains an open research question that could be answered definitively only by using the same survey instrument with both groups.
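The daily means in Table 11.6, and the relative gap between the two surveys, can be recomputed from the reported counts. The sketch below does so; the figures are transcribed from Table 11.6, while the variable and function names are illustrative.

```python
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

# Respondents and trips per weekday, transcribed from Table 11.6.
OPFAST = {
    "respondents": [122, 122, 122, 122, 122, 122, 122],
    "trips": [498, 484, 551, 559, 573, 468, 385],
}
CAR_SHARING = {
    "respondents": [57, 57, 57, 56, 56, 57, 56],
    "trips": [196, 202, 222, 211, 239, 219, 193],
}


def mean_trips_per_day(survey):
    """Mean number of trips per respondent for each weekday."""
    return [trips / n for trips, n in zip(survey["trips"], survey["respondents"])]


def relative_gap(reference, other):
    """Relative difference of the reference means over the other survey's means."""
    return [(r - o) / o
            for r, o in zip(mean_trips_per_day(reference), mean_trips_per_day(other))]


# Map each weekday to the relative gap (OPFAST over car-sharing).
gap_by_day = dict(zip(DAYS, relative_gap(OPFAST, CAR_SHARING)))
```

Rounded to one decimal, the recomputed means reproduce the printed rows of Table 11.6, and `gap_by_day` shows the weekday by weekday difference discussed above without relying on the rounded figures.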

11.5. Ethical Issues

Both of the survey methodologies evaluated in this paper collect a significant amount of microdata on the timing and location of most significant activities in a week of the life of each respondent. There are generally two primary concerns about such data: the possibility of violating norms of privacy by the accidental or intentional release of personal information, and the abuse of data by its use for commercial or even criminal purposes. For the most part, these can be satisfactorily addressed by well-established procedures for the security and handling of paper and digital documents, and by guidelines for the release of results, e.g. by aggregation or the ‘fuzzying’ of spatial details. But there are additional challenges.

In the case of OPFAST, the PAPI activity/travel log is embedded in an overall survey design that specifies an unusual amount of contact with respondents and their households to establish a relationship of trust and ‘joint discovery’, over several years in the panel application. There is some risk of undesirable outcomes in such situations, e.g. if a respondent is (or feels) intimidated by normative challenges that an interviewer might make to his/her pattern of consumption or its effects, such as carbon emissions. This can be addressed best by training interviewers how to probe while remaining non-judgemental, and also to maintain a ‘healthy distance’ from any household decisions, related to the survey subject, that happen to be in progress.

In the case of the web survey, the use of the Internet introduces additional challenges. On top of the requirement to build a secure interface using the HTTPS protocol with IDs and passwords, and to secure database operation on the server, the question arises of how to let several members of a household share a list of common activity places without disclosing the travel logs of one person to the others. This was among the issues raised by the Ethics Committee. Our answer to that concern was twofold.
First, we relied on several references to discuss the issue with the Ethics Committee: Council of American Survey Research Organizations (2007), ESOMAR World Research (2005), Marketing Research Association Inc. (2007), Tierney et al. (1996) and Stopher and Alsnih (2004). Second, part of the concern was resolved by the fact that a
respondent did not have backward access to his or her daily travel log after it was completed, thus impeding access by others, but also restricting the data validation process and lowering the conviviality of the survey. Managing confidentiality among household members and providing a user-friendly interface, while avoiding potential database corruption (especially traces of completed steps in the survey), could become a highly challenging programming task. This is why we decided to deploy the survey with no provision for backward editing of completed steps. Nevertheless, some very collaborative respondents sent emails asking for corrections to reports they had already submitted; these were made using in-site data management procedures. While it is essential to handle data over the Internet during the web survey (a few weeks), the database must be moved to a more secure computing environment (an isolated computer network) for processing, especially when the need arises to complement user-provided information, e.g. modelling trips using GIS procedures, computing GHG emissions and relating respondents to their neighbourhoods in order to characterise their living environments. Moreover, for this particular application we had to handle another confidentiality concern because the survey was conducted in collaboration with a private firm. The firm was not willing to transmit customers’ IDs and we did not have permission from the Ethics Committee to transmit individual data back to the company. Thus, a set of specially generated ID keys had to be handled to permit one-way data transmission from Communauto to the survey team, and the results of the study were disclosed to Communauto and its customers only in highly aggregated format (e.g. summaries by neighbourhood type).
This has to be handled cautiously in view of the high accuracy of spatial location, the current state of jurisprudence on privacy and strong differences in legislation from country to country (Pedreschi et al., 2008; Verykios, Damiani, & Goulalas-Divanis, 2008), and between private and public sector activities, all of which impede the immediate transfer of rules and best practices from one application to another.
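
The one-way transmission of specially generated ID keys described above can be illustrated with a keyed hash. The following is a minimal sketch and not the procedure actually used in the study: it assumes that the data provider holds a secret key and derives a study key from each customer ID with HMAC-SHA256, so that the survey team can link records without being able to recover the customer IDs. The function name `make_study_key`, the secret and the sample IDs are hypothetical.

```python
import hashlib
import hmac


def make_study_key(customer_id: str, secret: bytes) -> str:
    """Derive a pseudonymous study key from a customer ID (hypothetical helper).

    The secret stays with the data provider; without it, the survey team
    cannot recompute or reverse the mapping.
    """
    digest = hmac.new(secret, customer_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


if __name__ == "__main__":
    secret = b"kept-by-the-provider-only"  # illustrative value, never shared
    for cid in ["C-0001", "C-0002"]:       # made-up customer IDs
        print(cid, "->", make_study_key(cid, secret)[:16], "...")
```

Because the same customer ID always yields the same key, weekly logs can still be linked to the provider's usage records on the survey team's side, while the reverse lookup remains possible only for whoever holds the secret.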

11.6. Concluding Discussion

This paper sought to evaluate a new web-based design to observe daily mobility as an alternative to PAPI. In particular, it has examined some of the major concerns associated with web-based interfaces for collecting spatialised data, and presented a solution-based package of functionalities that became increasingly feasible using information technologies which were readily available by 2009. In the light of the difficulties experienced with resolving problems in post-geocoding a medium-size PAPI database using the tools available in 2002, and in view of the high cost of any delays in data validation, priority was given in the web interface to maximising flexibility offered to respondents when identifying activity locations. The solution allows them to enter an address or a postal code, to choose from a list of places identified by name or keyword, or to click on a map. Getting this identification ‘right the first time’ was additionally important in order to satisfy some concerns about
unauthorised access to an individual respondent’s data, including by members of the same household, which was achieved by blocking all online access (even for the respondent) to data entered from an earlier day. Nevertheless, activity locations entered via the clickable map, and later discovered by the respondent to be incorrect, can be dragged to the correct position, leading to the automatic updating of the coordinates wherever in the database that unique location is cited, retrospectively or prospectively. There are some interesting nuances about potential item non-response in the relative popularity of the web survey’s sub-interfaces (the three methods of entering locations) and differences across groups. For example, men more often than women clicked on maps, but this gender difference was similar between those who completed the 7-day log and those who dropped out. A similar pattern is shown for respondents with any university education. This implies that the deletion of respondents with incomplete logs appears, from this study, unlikely to seriously bias results. It is interesting to compare the performance of this interface to that of the PAPI activity/travel log, which from a respondent support perspective was something of a best case, as the amount of human help that was offered with the entire OPFAST multi-instrument package far exceeded that provided to respondents of most PAPI logs. Weekly trip frequencies were reasonably close, as were daily trip rates, which were also relatively stable across days. The car-sharing sample had, as expected, lower weekly frequencies and daily rates of a plausible magnitude. Some instrument effects may have intervened, but for the purposes of the underlying GHG estimation study (which of course takes trip length into account as well), there seems little cause for concern about such effects. The abundant human help in the OPFAST survey, however, was undoubtedly associated with its high level of retention.
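
The automatic updating of coordinates wherever a unique location is cited follows naturally if each place is stored once and activity records refer to it by identifier. The sketch below is an assumption about how such a design could look, not the actual database schema of the web survey; all class and field names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Place:
    place_id: int
    label: str
    lon: float
    lat: float


@dataclass
class Activity:
    day: int        # day of the 7-day log (1..7)
    purpose: str
    place_id: int   # reference to a Place, not a copy of its coordinates


@dataclass
class Log:
    places: dict = field(default_factory=dict)      # place_id -> Place
    activities: list = field(default_factory=list)

    def move_place(self, place_id: int, lon: float, lat: float) -> None:
        """Correct a place once; every activity citing it sees the change."""
        place = self.places[place_id]
        place.lon, place.lat = lon, lat

    def coords(self, activity: Activity) -> tuple:
        """Look up the current coordinates of the place an activity cites."""
        place = self.places[activity.place_id]
        return (place.lon, place.lat)


if __name__ == "__main__":
    log = Log()
    log.places[1] = Place(1, "Work", -71.25, 46.80)  # made-up coordinates
    log.activities += [Activity(1, "work", 1), Activity(3, "work", 1)]
    log.move_place(1, -71.27, 46.78)  # respondent drags the marker
    print([log.coords(a) for a in log.activities])
```

Under this design, a correction made on any day applies to earlier and later entries alike, without reopening the completed daily logs themselves.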
While neither of these surveys employed movement-aware mobile devices, such as GPS loggers, some of the lessons learnt are relevant to the design issues raised by the increasing deployment of such devices in travel surveys. In particular, the type of updateable map click that is incorporated into the web interface would be very useful for prompted recall surveys in which an automatic algorithm may yield false positives or false negatives for activity stops. Some of the findings about the relative popularity of the sub-interfaces for activity locations among different respondent sub-groups could help avoid a ‘monolithic’ approach to prompted recall, or to the active learning phase of a movement-aware, device-aided survey that changes to a passive phase once calibrated for a particular respondent (see Lee-Gosselin, Doherty, & Shalaby, 2010). In a more general sense, the construction of an increasing variety of web-based survey interfaces should provoke a more meaningful debate about the art of using both mobile and static information and sensing technologies to engage respondents in observing their own mobility behaviour. Overall, the results of this evaluation of a new interface are encouraging, particularly for its multi-faceted approach to observing activity locations. Even in the case that a pencil-and-paper method is indicated despite the disadvantages of post-geocoding, a portable version of the interface would be a very promising tool for prompt in-person or telephone follow-up where this is required. In our ongoing
survey research, the lessons learnt in this study have been an important input to designs for the management of complex surveys, notably those that focus on behavioural change through the use of extended observation periods. Nevertheless, several challenges remain when implementing a web-based questionnaire as one of several simultaneously offered options for responding to household travel surveys. In several experiments in the United States, web-based reporting accounted for close to 20% of total completes (e.g. Greater Minnesota HTS in 2011–2012) and response rates for web-based approaches to prompted recall for multi-day GPS-only surveys have been less than 30% (Stopher et al., 2012), leaving ample room for mail returns of activity-based diaries, while the phone response rate has decreased. Thus, all those approaches should be combined in order to offer appropriate conviviality to different demographic groups, which leads to a supplemental need to ensure equivalent quality of mobility and location data. Future investigation should address the issue of the increased costs and reduced efficiency of developing several instruments to reach specific segments of the population, an issue that is even more acute when a complete household response is required. Finally, web-based household travel surveys are in their initial stages of development and there is still a need to investigate whether data quality could be improved when responses are entered into advanced web-based forms by highly trained interviewers and geocoding specialists. In other words, is it more efficient to deploy the web tools (and CATI) for operation by respondents or by trained interviewers, the latter likely implying lower costs for system development?

Acknowledgements

This paper was supported by the Social Sciences and Humanities Research Council of Canada through a Major Collaborative Research Initiative, 2000–2005 (Access to Activities and Services in Urban Canada: Behavioural Processes That Condition Equity and Sustainability) and an ordinary research grant, 2008–2012 (Accessibilité, valeurs et mobilité des citadins: Les dynamiques actuelles favorisent-elles l’équité ?). Moreover, the authors acknowledge the important support of the Canadian Network of Centres of Excellence in Geomatics (GEOIDE), the Quebec Ministry of Transportation, the Fonds Québécois de Recherche sur la Société et la Culture (FQRSC, 2009–2013: Accès à la Cité: Développement urbain: compétitivité, équité de l’accès aux ressources, qualité et durabilité des milieux de vie) and Communauto, a private car-sharing company operating in the province of Quebec.

References

Adler, T., Rimmer, L., & Carpenter, D. (2002). Use of Internet-based household travel diary instrument. Transportation Research Record, 1804, 134–143.
Alexandre, L. (2010). La mobilité des abonnés au service d’autopartage de Québec (Communauto) et leurs émissions de gaz à effet de serre. Mémoire de maîtrise en ATDR, École supérieure d’aménagement du territoire et de développement régional, Université Laval, Québec.
Axhausen, K. W., & Gärling, T. (1992). Activity-based approaches to travel analysis: Conceptual frameworks, models, and research problems. Transportation Reviews, 12(4), 323–341. doi:10.1080/01441649208716826
Cervero, R., & Kockelman, K. (1997). Travel demand and the 3Ds: Density, diversity, and design. Transportation Research Part D, 2(3), 199–219. doi:10.1016/S1361-9209(97)00009-6
Council of American Survey Research Organizations. (2007). Code of standards and ethics for survey research. New York, NY: Port Jefferson.
Doherty, S., & Miller, E. (2000). A computerized household activity scheduling survey. Transportation, 27(1), 75–97. doi:10.1023/A:1005231926405
ESOMAR World Research. (2005). Conducting market and opinion research using the Internet. Retrieved from http://www.esomar.org/uploads/pdf/ESOMAR_Codes&Guideline-Conducting_research_using_Internet.pdf
Ettema, D., & Timmermans, H. (1997). Theories and models of activity patterns. In D. Ettema & H. Timmermans (Eds.), Activity-based approaches to travel analysis (pp. 1–36). Oxford, UK: Pergamon.
Golob, T. F., & Meurs, H. (1986). Biases in response over time in a seven-day travel diary. Transportation, 13, 163–181. doi:10.1007/BF00165546
Greaves, S. (2004). GIS and the collection of travel survey data. In D. Hensher, K. J. Button, K. E. Haynes & P. R. Stopher (Eds.), Handbook of transport geography and spatial systems (pp. 375–390). Amsterdam: Elsevier.
Kim, H. M., & Kwan, M. P. (2003). Space-time accessibility measures: A geocomputational algorithm with a focus on the feasible opportunity set and possible activity duration. Journal of Geographical Systems, 5(1), 71–91. doi:10.1007/s101090300104
Kreitz, M., & Doherty, S. T. (2002). Spatial behavioural data collection and use in activity scheduling models. Transportation Research Record, 1804, 126–133. doi:10.3141/1804-17
Kwan, M. P. (1998). Space-time and integral measures of individual accessibility: A comparative analysis using a point-based framework. Geographical Analysis, 30, 191–216. doi:10.1111/j.1538-4632.1998.tb00396.x
Lee-Gosselin, M. E. H. (2005). A data collection strategy for perceived and observed flexibility in the spatio-temporal organisation of household activities and associated travel. In H. J. P. Timmermans (Ed.), Progress in activity-based analysis (pp. 355–371). Amsterdam: Elsevier.
Lee-Gosselin, M. E. H., & Doherty, S. T. (Eds.). (2005). Integrated land-use and transportation models: Behavioural foundations. Amsterdam: Elsevier.
Lee-Gosselin, M. E. H., Doherty, S. T., & Shalaby, A. (2010). Data collection on personal movement using mobile ICTs: Old wine in new bottles? In M. Wachowicz (Ed.), Movement-aware applications for sustainable mobility (pp. 1–15). Hershey, PA: IGI Global. doi:10.4018/978-1-61520-769-5.ch001
Lee-Gosselin, M. E. H., & Harvey, A. S. (2006). Non-web technologies. In P. R. Stopher & C. Stecher (Eds.), Travel survey methods: Quality and future directions (pp. 561–568). Elsevier.
Marketing Research Association Inc. (2007). The code of marketing research standards. Retrieved from http://www.mra-net.org/resources/documents/expanded_code.pdf
Martin, B. (2007). Caractérisation du système d’autopartage dans l’agglomération montréalaise et analyse spatiotemporelle de ses différents objets: usagers, stationnements et véhicules. Mémoire de maîtrise, École polytechnique de Montréal, Montréal.
McCray, T. M., Lee-Gosselin, M., & Kwan, M. P. (2005). Measuring activity and action space/time: Are our methods keeping pace with evolving behaviour patterns. In M. Lee-Gosselin & S. Doherty (Eds.), Integrated land-use and transportation models: Behavioural foundations (pp. 101–132). Amsterdam: Elsevier.
Pedreschi, D., Bonchi, F., Turini, F., Verykios, V. S., Atzori, M., Malin, B., … Saygin, Y. (2008). Privacy protection: Regulations and technologies, opportunities and threats. In F. Gianotti & D. Pedreschi (Eds.), Mobility, data mining and privacy (pp. 101–119). Berlin: Springer. doi:10.1007/978-3-540-75177-9_5
Roorda, M. J., Doherty, S. T., & Miller, E. J. (2005). Operationalising household activity scheduling models: Addressing assumptions and the use of new sources of behavioural data. In M. Lee-Gosselin & S. Doherty (Eds.), Integrated land-use and transportation models: Behavioural foundations (pp. 61–85). Amsterdam: Elsevier.
Sharp, J., & Murakami, E. (2005). Travel surveys: Methodological and technology-related considerations. Journal of Transportation and Statistics, 8(3), 97–113.
Stopher, P. R. (1992). The use of an activity-based diary to collect household travel data. Transportation, 19, 159–176. doi:10.1007/BF02132836
Stopher, P. R. (2004). GPS, location and household travel. In D. Hensher, K. J. Button, K. E. Haynes & P. R. Stopher (Eds.), Handbook of transport geography and spatial systems (pp. 433–450). Amsterdam: Elsevier.
Stopher, P. R., & Alsnih, R. (2004). Standards for household travel surveys – Some proposals. Working Paper. Institute of Transport Studies, University of Sydney, Australia.
Stopher, P., Wargelin, L., Minser, J., Tierney, K., Rhindress, M., & O’Connor, S. (2012). GPS-based household interview survey for the Cincinnati, Ohio region. Ohio Department of Transportation and U.S. Department of Transportation, 64 pp.
Tierney, K., Decker, S., Proussaloglou, K., Rossi, T., Ruiter, E., & McGuckin, N. (1996). Travel survey manual. Washington, DC: Federal Highway Administration.
Verykios, V. S., Damiani, M. L., & Goulalas-Divanis, A. (2008). Privacy and security in spatiotemporal data and trajectories. In F. Gianotti & D. Pedreschi (Eds.), Mobility, data mining and privacy (pp. 213–240). Berlin: Springer. doi:10.1007/978-3-540-75177-9_9

Chapter 12

WORKSHOP SYNTHESIS: DESIGNING NEW SURVEY INTERFACES

Marcelo G. Simas Oliveira and Mark Freedman

12.1. Purpose and Introduction

Transportation survey methodologists are increasingly turning to information technologies and geomatics to enhance data quality, to decrease respondent burden, to lower costs and, eventually, to design continuous self-administered surveys that are predominantly passive. There are still considerable challenges to understanding the usability and relevance of these survey interfaces to gather complex spatial-temporal data on daily travel. Good designs depend on a strong understanding of web technologies and an excellent sense of graphic design, layout and style to build high performance front-end user interface components that engage the users. The very growth in computing power and design options for the latest systems also means that there are more opportunities to get it wrong. The goal is to design systems that comprise an effectively integrated information system that works well as a whole. Much can be learned from models of recent integrated survey systems and the challenges and solutions that were a part of their development.

This paper summarizes the topics and discussions that took place during Workshop A4, titled ‘Designing New Survey Interfaces’, of the 2011 ISTSC conference in Chile. The workshop was attended by 14 persons¹ from 8 different countries and included the following topics:

• Multimodal survey collection
• Interface design challenges
• Motivating participants to start, continue and complete surveys

1. Participants: Elizabeth Ampt (Australia), Patrick Bonnel (France), Rinaldo A. Cavalcante (Canada), Mr. David (The Urban Transport Institute), David Richardson (Australia), Mark Freedman (USA), Stefan Hubrich (Germany), Uwe Kleinemas (Germany), Michael Meschik (Austria), Marcelo Oliveira (USA) (Chair), Zbigniew Smoreda (France), Marius Thériault (Canada), Rico Wittwer (Germany), Junyi Zhang (Japan).

Transport Survey Methods: Best Practice for Decision Making Copyright © 2013 by Emerald Group Publishing Limited All rights of reproduction in any form reserved ISBN: 978-1-78190-287-5


This chapter also considers the challenges associated with conducting surveys with multiple modes of data collection. The discussions and recommendations followed from presentations on four unconventional survey approaches and two case studies in computer survey interface design and improvement. These case studies were derived from recently concluded household travel surveys (HTS) conducted in Israel and the United States. This opened up the topics of discussion to include emerging technologies and paradigm shifts expected in the coming years. The workshop concluded with the participants going over a series of questions suggested by the conference chairs which were designed to extract conclusions and directions for future research.

12.2. Examples of New Survey Interfaces and Their Implications

Although traditional travel behaviour surveys usually are based on a formal sample design and well-defined survey questionnaires, less formal methods may present lower cost options for special topics and populations. Protocol analysis may offer a low-cost transport surveying method to understand travel behaviour as an alternative to traditional survey formats (Ampt, Paez, & Munro, 2011). The case study observed the conflicts between cyclists and drivers in the city of Melbourne, Australia. Data was captured in the field with the use of recorded voice and video data of cyclists and drivers; these two were combined and reviewed in order to determine the factors underlying the hypothesized tensions. In this case, despite the current climate of tight budgets, where there is little allowance for experimentation and using new methods to collect data, protocol analysis provided clear guidance on next steps for policy making at a reasonable cost, in cases where a small sample would be acceptable. However, the relatively high cost of intensive manual data reduction and coding required for such a protocol and the challenges of obtaining a representative sample would make this method less attractive for larger data collection requirements.

Participatory self-declaring (PSD) is a survey method that allows respondents to personally declare the information required by public policy decisions in a continuous way, completely based on their own will and convenience (Zhang, Tsuchiya, & Fujiwara, 2011). This method was developed in order to deal with the fact that several local governments in Japan lack the funding to conduct a formal HTS every 10 years. The PSD survey is continuous and does not select respondents in advance.
The problem of self-selection in the proposed method is a concern and is an area where further research is needed to develop methods to ensure that a representative sample is obtained to support unbiased estimates.

The experiences gathered on the development, improvement and deployment of a Canadian web-based origin-destination travel survey tool provide insight into the challenges of such methods (Bourbonnais & Morency, 2011). The tool was originally developed to assess the potential of the web medium to complement large-scale phone surveys conducted regularly in the province of Quebec, Canada, with the
population invited to opt into the survey rather than sampled in a systematic manner. Special care was taken to develop user interface elements that were visually appealing and intended to be friendly to participants. The authors described key aspects of the technology used to develop the tool, as well as the particular functions that were developed for both respondents and administrators. The collection of transit trips was one of the most challenging aspects of using the system and is an area with large potential for improvement. Other challenges in designing the system interface included deciding how much flexibility should be allowed in the input versus the amount of up-front data validation. The main concern here is that users may get frustrated and decide not to complete the survey. As in other surveys where respondents opt in, there is a problem of potential sample bias due to self-selection, which needs to be addressed when deploying web surveys.

A fourth paper compared two surveys performed in Quebec City in 2002–2003 and 2009 that sought very similar data on weekly activity planning and mobility. These provide an opportunity to understand the advantages and disadvantages of different survey methods on measuring mobility behaviour relevant to greenhouse gas emissions (Thériault, Lee-Gosselin, Alexandre, Théberge, & Dieumegarde, 2011). The first used a pencil-and-paper approach, complemented by phone interviews, while the second consisted of a web-based survey for which participants received individual user names and passwords. Respondent help requests and feedback were used to continuously improve the survey’s user interface. Even though the online survey showed much better spatial data than the PAPI version, it suffered from a low response rate and a high drop-out rate. Problems with the sampling of participants for the web survey are an issue.
Another concern is the potential impact on survey data quality and compatibility due to modifications made to the survey instrument interface or protocol during the data collection period.
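
The trade-off between input flexibility and up-front validation noted above for the web-based origin-destination tool can be made concrete by separating blocking errors from non-blocking warnings. This is an illustrative sketch only: the rules, the 3-hour threshold and the function name `check_trip` are assumptions, not features of the tool described.

```python
from datetime import datetime, timedelta


def check_trip(depart: datetime, arrive: datetime, mode: str):
    """Return (errors, warnings) for one reported trip.

    Errors block submission; warnings only ask the respondent to confirm,
    so that an unusual but true answer is not rejected.
    """
    errors, warnings = [], []
    if mode.strip() == "":
        errors.append("Travel mode is missing.")
    if arrive <= depart:
        errors.append("Arrival must be later than departure.")
    elif arrive - depart > timedelta(hours=3):  # assumed threshold
        warnings.append("Trip lasts more than 3 hours; please confirm.")
    return errors, warnings


if __name__ == "__main__":
    errs, warns = check_trip(
        datetime(2009, 5, 4, 8, 0), datetime(2009, 5, 4, 12, 0), "bus"
    )
    print("errors:", errs)
    print("warnings:", warns)
```

Keeping the list of blocking rules short is one way to limit the frustration that may lead respondents to abandon the survey, at the cost of more checking after collection.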

12.3. State of the Practice: Multimodal Surveys and User Interfaces

Multimodal survey efforts are becoming more prevalent as segments of the population become harder to reach using traditional approaches. In addition to the growing market penetration of the Internet in both developed and developing countries, there has been a recent surge of connected devices (e.g. Smartphones, tablets, iPads, etc.), which provide additional means of contacting and engaging survey participants. Three aspects of multimodal surveys that relate to new survey interfaces are incentives, response bias and non-response.

12.3.1. Tailoring Incentives

The question of how to tailor incentives to different survey modes is of key concern. Incentives should reflect local culture and the level of burden expected from the survey. The researcher (or survey designer) has the responsibility to advise the client
on what to do with limited resources. This should be done based on previously reported experiences and other information available during survey design. Encouraging public involvement is a related topic that is, in some scenarios, more important than incentives to the success of a project. This is despite the fact that there is general agreement within the state of the practice that there is a need to provide incentives in most HTS. Planning and implementing a good outreach and public involvement campaign is a very critical element in order to have a successful survey. Another important aspect in the incentive determination process is to make use of pre-test and pilot efforts in order to test different incentive scenarios before making decisions for the full survey. It is also critical to avoid making changes once an approach that works is found, unless there is an opportunity to retest before going into the full survey collection phase. Last minute changes to incentive values and structure are reported to have led to unforeseen negative impacts later during the survey. The question of whether incentives must always be of a monetary nature is important. The general agreement was that a monetary incentive is not always the appropriate choice, as it might introduce additional bias on the sample composition, especially in self-administered surveys. The real incentive is ‘what is of value to the respondent’. The use of advance materials and online resources (e.g. websites and apps with information about the survey effort) and their relationship with incentives is of intense interest, including the availability of advance information in the media and the source of advance materials.

12.3.2. Response Bias

The user interface element of computerized self-complete surveys can introduce biases, especially in complex data gathering scenarios such as the ones found in HTS. Guidance on use and completion of the survey should be given without influencing the provided answers. Specific interface aspects identified as having the potential to lead to bias include the order of the options provided, an inadequate (e.g. too narrow) set of response choices, varying support levels for different browsers in web surveys and inadequate accessibility (e.g. to aging populations, children, persons with special needs). Response bias was identified as a key issue in stated preference (SP) survey designs where several attributes of competing alternatives need to be compared by the respondent. In all these cases the consensus was that bias should be dealt with through extensive pilot testing and data reviewing before main collection starts.

The discussion on bias also relates to the use of GPS technology to assist HTS. In this specific case, the use of prompted-recall interviews in the context of GPS-based travel surveys can help reduce biases by providing a basic structure and starting point to participants. Recent comparisons between diary and GPS-based prompted-recall samples from HTS have shown that the latter group tends to have higher trip rates (Wolf, Wilhelm, & Simas Oliveira, 2012). It may be concluded that there is
no guaranteed approach, or recipe, to handle all potential user interface biases when compiling a single GPS-based dataset. However, the recommended way to address them is to conduct a carefully planned pilot test, accompanied by extensive evaluation of its results before going into the field.
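
One commonly used safeguard against the option-order effect mentioned above is to randomise the order of response choices for each respondent while keeping it stable across page reloads. The sketch below assumes a string respondent identifier and is not drawn from any of the surveys discussed; `ordered_options` is a hypothetical helper.

```python
import hashlib
import random


def ordered_options(options, respondent_id: str, question_id: str):
    """Shuffle response options reproducibly for one respondent and question."""
    seed_text = f"{respondent_id}:{question_id}".encode("utf-8")
    seed = int.from_bytes(hashlib.sha256(seed_text).digest()[:8], "big")
    rng = random.Random(seed)   # local generator; global random state untouched
    shuffled = list(options)    # copy, so the caller's list is not modified
    rng.shuffle(shuffled)
    return shuffled


if __name__ == "__main__":
    modes = ["Car", "Bus", "Bicycle", "Walk"]
    print(ordered_options(modes, "resp-017", "Q5"))  # made-up identifiers
```

If such a scheme is used, the order actually shown should be stored with each answer so that any remaining order effect can be examined during the pilot evaluation.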

12.3.3. Item Non-response

There was general agreement that the problem of item non-response is one that affects all data collection modes and accompanying user interfaces. It is something that needs to be taken into consideration during the survey interface design phase to ensure that adequate options and controls are built into it. For example, a significant portion of target populations have limited reading and writing skills, so it is necessary to be careful when phrasing question and answer options to ensure clarity and ease of comprehension at a lower reading level. Providing different collection possibilities (PAPI, CATI, web-based) and encouraging respondents to choose the form best suited to their needs can increase response rates.
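
Comparing item non-response across collection modes requires a consistent measure. The following is a minimal sketch, assuming each response is a dictionary in which an unanswered item is either absent or recorded as `None`; the field names and sample records are invented for illustration.

```python
from collections import defaultdict


def item_nonresponse_rates(responses, item):
    """Share of responses per mode in which `item` is unanswered."""
    asked = defaultdict(int)
    missing = defaultdict(int)
    for r in responses:
        asked[r["mode"]] += 1
        if r.get(item) is None:  # absent key or explicit None
            missing[r["mode"]] += 1
    return {mode: missing[mode] / asked[mode] for mode in asked}


if __name__ == "__main__":
    sample = [  # made-up records
        {"mode": "web", "income": None},
        {"mode": "web", "income": "40-60k"},
        {"mode": "PAPI", "income": "20-40k"},
        {"mode": "CATI"},
    ]
    print(item_nonresponse_rates(sample, "income"))
```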

12.4. Examples of State of the Practice: User Interfaces and Experiences

Examples of two recent and noteworthy interface design, development and improvement tasks provide insight into the challenges and state of the art of interface design. The first example is of a desktop-based GPS prompted-recall retrieval tool that was developed to support the Jerusalem HTS (Oliveira et al., 2011), while the second one shows how the system originally developed for a face-to-face interview scenario in Jerusalem was adapted for the web and CATI environments to conduct a regional travel survey for the New York Metropolitan Transportation Council (NYMTC), the North Jersey Transportation Planning Authority (NJTPA) and their partners (Chaio et al., 2011). The survey portrayed in the second case study featured a 10% GPS prompted-recall subsample; it included a web-based self-completion mode, and the interface for collecting trip information was first developed for this effort. Feedback collected from pilot participants and through a focus group exercise revealed areas of the user interface that needed improvement, including making it look more like the diary by providing visual cues between the screen and paper materials, incorporating reminder messages into the user interface, and using colour and geometric shapes to drive focus to areas where attention and input were needed.

User interfaces (UIs) and user experiences (UXs) are two additional and critical dimensions of designing survey interfaces. The latter has to do with how a person feels about using an interface and is a topic of intense discussion in the information technology arena. Given the increase in the use of Internet connected devices with highly sophisticated UXs by target populations, it is necessary that new user interfaces developed for travel surveys step up the level of their experiences.
The traditional approach of the first generation of survey interfaces has been to replicate paper questionnaire layouts, but this may not be the most effective strategy. Each collection mode and technology should be evaluated and leveraged to its full interactive potential. A first step in this direction is to avoid violating end user expectations on each platform. This adds costs to the design and development phases of a project, but it can pay off in terms of better response rates and shorter survey fielding times.

Another topic of intense interest is that of the ergonomic aspects of survey user interfaces. The consensus was that better consideration of ergonomics and accessibility during the design phase is needed. In addition, there is a need to conduct additional, as well as targeted, testing when deploying new user interfaces. Potential methods for conducting these evaluations are usability testing, diagnostic testing, focus groups and cognitive interviews. There is an opportunity now to integrate communication technologies such as screen-sharing, instant messaging (IM), voice over Internet protocol (VoIP) and video conferencing tools (e.g. Skype) into online surveys as a means to provide instant assistance and thus improve response rates and data quality.

It is clear that better interaction, as well as a balance, between the needs of technologists, survey scientists and modellers is needed. Survey scientists need to engage the technologists so that the user designs developed perform as needed. At the same time, survey scientists also need to become more technology literate in order to better interact with technologists. This has to happen while we manage complexity so as to limit item non-response and keep respondent burden under control, while meeting the increasing data demands of activity-based models.
A common thread across all aspects of new survey interface design and application is the need to better test all aspects of new survey interfaces and technologies. The state of the practice is now at a point where several projects have demonstrated the viability of emerging technologies to deploy user interfaces for travel surveys, ranging from web applications, to Smartphones and other connected Internet devices. However, more work is necessary to better understand how design decisions impact the overall performance of the survey user interface. This is made especially important due to the fact that there are few documented resources on the side-effects of using new web 2.0 applications and connected Internet devices (i.e. Smartphones and tablets) to conduct survey research across different target populations. Consequently, an effort should be made to bring in potential end users to help evaluate the survey design before committing any project’s full resources to it. There is also a need for survey scientists to cooperate with researchers to document the usability of various user interfaces used in travel surveys. This is an area where the travel survey community could benefit from research that has been done in the area of human computer interaction (HCI).2

2. A good source for information on HCI is the website of the Association for Computing Machinery Special Interest Group on Computer-Human Interaction (SIGCHI), available at http://www.sigchi.org

Workshop Synthesis: Designing New Survey Interfaces


12.5. Summary and Recommendations

The workshop discussions addressed the use and implications of new user interfaces in the context of travel surveys. One of the main conclusions was that there needs to be a better understanding of the impact of new user interfaces on bias, response rates and accessibility, as well as how they can benefit from incentives. One method of improving this understanding is through controlled testing and comparisons with traditional approaches. The group largely agreed that survey user interfaces are not going through a sudden paradigm shift, but rather a steady growth in the role of technology through the addition of multiple modes and their continuous evolution. The challenge is to add new interactive platforms while making an effort to stay compatible, or equivalent, with previous survey efforts. This compatibility is needed to generate datasets that are comparable with historical data, which enables longitudinal analyses and an understanding of changes that occur over time.

12.5.1. Short-Term Research Needs

The key short-term issues identified by the group were as follows:

• Maintaining consistency in datasets collected using new technologies while ensuring compatibility with previously collected data: More research needs to be done to understand the biases introduced by new user interfaces that integrate online mapping and other tools to assist respondents.
• Dealing with inclusion when designing new user interfaces: The research community should make efforts to minimize and/or bridge technological gaps that prevent surveys from reaching important population segments. The emergence of Smartphones provides opportunities for connecting with some of these hard-to-reach populations, but at the same time it poses accessibility challenges that deserve careful attention. There is a need to better deal with an aging population with more varied activity patterns (e.g. leisure travel may become more prevalent as populations age). This can be done by taking ergonomics into greater consideration when designing user interfaces and conducting cognitive testing to evaluate competing designs. In addition, options should be provided to participants through the use of multimodal surveys, including the use of traditional strategies such as face-to-face and telephone interviews. A related issue identified during the discussions is that most self-complete HTS in the United States provide access using a household-level personal identification number (PIN), and this may not provide adequate privacy to household members when used in the context of a panel survey.
• Improve the experience: It is not enough to make interfaces usable; we should also strive to make them enjoyable and less burdensome to participants and interviewers. It is time to move away from trying to replicate paper survey designs on a computer screen. This has been made very clear by recent success stories coming from the Smartphone industry, which has made concerted efforts to improve user experiences overall.³ Positive responses to travel survey user experiences are likely to be manifested in terms of better participation and response rates. Conducting usability testing as part of the development process and reporting on findings should help improve the state of the practice of travel survey user interfaces.
• Response rates and item non-response: Representativeness of the sample and non-response problems should be evaluated and understood for different collection modes and user interfaces. Having multiple avenues for collecting data is making sample frames and sample expansion (weighting) more complex and also increasing the difficulty of managing surveys.

12.5.2. Long-Term Research Needs

When asked about long-term research needs, the group identified the following items:

• Develop user interface design approaches that generate comparable data while taking advantage of technology shifts: A crucial part of this is to conduct controlled experiments that compare data collected using traditional paper-and-pencil methods with telephone data collection and computerized user interfaces. This topic also includes achieving a more seamless integration of data from tracking sensor technology into user interfaces in surveys, which can help respondents by providing basic framing data on their travel activities. This integration can come in the form of easier retrieval of sensor data through the use of telecommunications.
• Managing potential reduction of response rate as a consequence of having too many report modes: As data collection modes and platform options increase, fragmenting the sample into survey mode groups, considerable attention should be devoted to their contributions to the overall sample. New survey management techniques can help with this problem.
• Develop and document methods to conduct usability and interface testing for travel survey user interfaces: This is an area where more interaction with the HCI community could benefit survey designers. A sub-topic in this area is to research the impact of different user interface widgets (e.g. lists, dropdown menus, radio button controls) on response rates, and to simplify the input of data through the use of technology (e.g. allowing open-ended responses as text parsing, classification and interpretation become automated).
• Investigate application of new hybrid collection modes that combine self-complete user interfaces with on-demand assistance using instant messaging and teleconferencing technologies.
• Improve inclusion of hard-to-reach population groups through the use of technology, and avoid exclusion of other groups that can occur if the survey is overly dependent on technology.
• Changing and evolving household structures and how that affects existing survey designs (e.g. proxy reporting and single versus individual access points for HTS).
• Potential uses of technology that can improve the experience of longitudinal surveys by, for example, keeping respondents connected and engaged.

3. This is not a perfect analogy for travel surveys, but is nevertheless an indication that the public will respond positively if an engaging experience is provided.

References

Ampt, E., Paez, D., & Munro, C. (2011). Protocol analysis: A low cost alternative to understand interaction between transport modes. Paper presented at 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile, November 14–18.
Bourbonnais, P., & Morency, C. (2011). Web-based personal travel survey: A demo. Paper presented at 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile, November 14–18.
Chaio, K.A., Argote, J., Zmud, J., Hilsenbeck, K., Zmud, M., & Wolf, J. (2011). Continuous improvement in regional household travel surveys. The NYMTC Experience. Presented at the 90th Annual Meeting of the Transportation Research Board, Washington, DC.
Oliveira, M. G. S., Vovsha, P., Wolf, J., Birotker, Y., Givon, D., & Paasche, J. (2011). Global positioning system-assisted prompted recall household travel survey to support development of advanced travel model in Jerusalem, Israel. Transportation Research Record, 2246, 16–23. doi:10.3141/2246-03
Thériault, M., Lee-Gosselin, M., Alexandre, L., Théberge, F., & Dieumegarde, L. (2011). Web versus pencil-and-paper surveys of weekly mobility: Conviviality, technical and private issues. Paper presented at 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile, November 14–18.
Wolf, J., Wilhelm, J., & Simas Oliveira, M. (2012, January). Applications of GPS-based prompted recall methods in two household travel surveys. Presented at the 91st Annual Meeting of the Transportation Research Board, Washington, DC.
Zhang, J., Tsuchiya, Y., & Fujiwara, A. (2011). Proposing a participatory self-declared survey and examining its public acceptance based on stated preference survey. Paper presented at 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile, November 14–18.

Chapter 13

Shipper/Carrier Interactions Data Collection: Web-Based Respondent Customized Stated Preference (WRCSP) Survey

Rinaldo A. Cavalcante and Matthew J. Roorda

Abstract

Purpose: The main objective of this survey is to collect data for the development of six models in a freight modeling framework. The framework aims to simulate the interactions between shippers and carriers in a freight market.

Methodology/approach: A web-based survey was designed using stated preference methods and experimental auctions, to collect information about shipper and carrier behavior when facing hypothetical situations. Hypothetical situations were constructed using information collected during the survey.

Findings: The modeling results are available for one model, the carrier selection model. In this model, data were collected using stated preference (SP) methods. Nine SP designs were developed using D-designs and an approach to minimize the nonattendance problem. A multinomial probit model was used. No bias was found due to the position of alternatives on the screen, signs of the parameters are as expected, and level of service attributes are relevant in the carrier selection process.

Research limitations/implications: The final response rate was low (about 9%), which is not uncommon in surveys with freight managers. This response rate might result in nonresponse bias of the estimates, which is the subject of future research.

Practical implications: Since freight transport is the output of a freight market, the application of the freight modeling framework presented in this chapter has potential to improve forecasts of freight flows.

Originality/value of chapter: To the best of our knowledge, the survey presented in this chapter consists of an innovative data collection procedure for the development of an original freight modeling framework.

Keywords: Freight model; SP survey; web survey; discrete choice model; economic theory of contracts; agent-based model

Transport Survey Methods: Best Practice for Decision Making. Copyright © 2013 by Emerald Group Publishing Limited. All rights of reproduction in any form reserved. ISBN: 978-1-78190-287-5

13.1. Introduction

Conventional approaches for freight transportation modeling include 3- or 4-stage modeling approaches (e.g., Cambridge Systematics, 1996; Pendyala, Shankar, & McCullough, 2000) and commodity-based approaches (e.g., Cambridge Systematics, 1998; Fischer, Ang-Olsen, & La, 2000). These approaches are primarily based on passenger 4-step modeling, even though there are fundamental differences between passenger and freight travel, such as (Friedrich, Haupt, & Nökel, 2003):

• Freight is entirely passive and therefore may require specific infrastructure for loading and unloading;
• Items being transported range from an urgent single parcel to nonurgent bulk shipments of thousands of tons;
• Several factors influence the travel itinerary of freight items;
• Service frequency and transport costs for shipments are often undefined until a potential shipper makes an enquiry.

To overcome the limitations of using passenger modeling approaches in freight modeling, an agent-based microsimulation framework was presented by Roorda, Cavalcante, McCabe, and Kwan (2010). This framework explicitly represents the diversity of roles and functions that business establishments play, how they interact through markets, and how both long- and short-term interactions between business establishments occur in the market through contracts. One of the main limitations of this approach is that information about logistics services contracts is usually considered private by companies. This type of information can be strategic and generate a competitive advantage over other companies. To overcome these limitations in data collection, a web-based survey was designed to collect information about shipper and carrier behavioral decisions using Stated Preference (SP) methods and Experimental Auctions (EA). This chapter presents the survey and preliminary results.

This chapter is organized as follows.
First, a literature review of the theory used in the design of the survey is presented, including the conceptual framework (Roorda et al., 2010) and the economic theory of contracts. Second, the survey design, the sampling procedure, the models that are expected to be developed with the data, and the theories used in the design (tailored design method, stated preference methods, and experimental auctions) are presented. Third, a discussion of some results obtained from the data collected in the survey is provided.

13.2. Background

13.2.1. Conceptual Framework

The main objective of this survey is to estimate models that would be used in the logistics services contracts stage of an agent-based microsimulation model of logistics services. The conceptual framework for this model was presented in Roorda et al. (2010), and some of the relevant elements are summarized below. There are three main parts of this microsimulation model: commodities market, shipper–carrier market, and transport operations (see Figure 13.1). In the first two parts, market interactions between agents (business establishments) result in simulated contracts. In the commodities market, a commodity contract identifies a vendor, a customer, a price and a list of shipments from vendor to customer. In the shipper–carrier market, a logistics contract identifies an agent responsible for shipments, an agent (a carrier) that executes shipments, a price and a list of shipments to be transported. In the last part, transport operations, logistics decisions are simulated to represent operational decisions of the logistics service provider during the movement of the shipments. These three parts can be associated with traditional freight modeling stages (see Figure 13.1): commodity generation, commodity distribution, mode split/vehicle loading, and network assignment.

Figure 13.1: Conceptual framework shipment stages.

13.2.2. Economic Theory of Contracts

The economic theory of contracts was initially developed in the 1970s (Bolton & Dewatripoint, 2005) to study economic relationships using a more realistic approach than the theory of general equilibrium. The models were developed using noncooperative game theory with asymmetric information, where the equilibrium concepts belong to the family of perfect Bayesian equilibrium (Salanié, 1997). The main improvements of this theory are related to the impact of private information available to economic agents on market dynamics. Private information is by nature asymmetric: firms know more about their own attributes (e.g., costs) than do the government and other firms. When informational asymmetries occur in a market, general equilibrium cannot be used for analysis (Bolton & Dewatripoint, 2005).

There are two main groups of models in this theory: adverse selection models (signaling models and screening models) and moral hazard models. Adverse selection models represent the situation when a company (principal) wants to select another company (agent) to perform a task and the principal does not have precise information about the suitability of the agent to perform the task. For example, if the principal is a shipper and wants to select a carrier to deliver its shipments, the shipper does not know a priori which carrier will maximize its utility because carrier attributes are not observable by the shipper before the logistics service is provided. There are two types of adverse selection models:

• Screening models: shippers (principals) may screen on the basis of information they have about the carriers (agents) and/or can perform an auction for the contract;
• Signaling models: carriers (agents) may signal their qualities to shippers (principals), for example through advertisement, to show that they are the best alternative.

Moral hazard models are used to address the situation when the execution of the task by the agent cannot be observed closely by the principal and this execution would change the final result of the task.
For example, shipment tracking systems using on-board technologies (e.g., GPS, Bluetooth) can be considered a mechanism to reduce the moral hazard problem, which arises because shippers are otherwise unable to follow the delivery of their shipments. Adverse selection models and moral hazard models are closely related, and the choice between them depends on the type of application. A combination of adverse selection and moral hazard models can be used to represent interactions in the shipper–carrier market and to extend the structure of agent interactions presented by Roorda et al. (2010). The main objective is to develop a more realistic simulation framework that results in more accurate predictions. To do so, it is necessary to specify payoff (utility) functions for the agents that incorporate agents' private information (attributes) and how their utility is affected by the results of interaction with other agents.
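As a reading aid for Section 13.2.1, the two simulated contract types can be pictured as plain records. The sketch below is only an illustration of the fields named in the text (vendor, customer, responsible agent, carrier, price, list of shipments); the class names, the `Shipment` attributes and the example values are hypothetical and are not taken from the framework of Roorda et al. (2010).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Shipment:
    # Hypothetical shipment attributes, chosen only for illustration
    origin: str
    destination: str
    weight_kg: float


@dataclass
class CommodityContract:
    # Outcome of the commodities market: who sells what to whom, and at what price
    vendor: str
    customer: str
    price: float
    shipments: List[Shipment] = field(default_factory=list)


@dataclass
class LogisticsContract:
    # Outcome of the shipper-carrier market: who arranges and who executes the shipments
    responsible_agent: str
    carrier: str
    price: float
    shipments: List[Shipment] = field(default_factory=list)


# Illustrative values only
load = Shipment(origin="Toronto", destination="Hamilton", weight_kg=1200.0)
sale = CommodityContract(vendor="Vendor A", customer="Customer B", price=50000.0, shipments=[load])
haul = LogisticsContract(responsible_agent=sale.vendor, carrier="Carrier C", price=850.0, shipments=sale.shipments)
```

Keeping the shipment list shared between the two records mirrors the idea that the same shipments pass from the commodities market to the shipper–carrier market.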

13.3. Survey Design

13.3.1. Sampling

The survey population in this project is composed of shippers and carriers that participate in the shipper–carrier market and are located in the Greater Toronto and Hamilton Area (GTHA). Some companies might be shippers but are not active in the shipper–carrier market because they have their own private fleet and deliver their own shipments. These companies are not part of the survey population. This survey focused on three groups of companies: manufacturing (mainly shippers), wholesale (shippers or carriers) and motor-freight carrier (mainly carriers). These groups are classified using two digits of the Standard Industrial Classification (SIC) codes. The selection of these groups is based on the fact that they generate a large share of the total commodity flow in the GTHA. A database with the survey population, obtained in the summer of 2009 from InfoCanada, a Canadian provider of business and consumer databases, was available. Based on this database, the maximum population size for each group was: (1) Manufacturing: 12,957; (2) Wholesale: 12,665; (3) Motor-freight carrier: 2499.

A stratified sampling procedure was implemented using annual sales volume and number of employees to define the strata. The minimum sample size was calculated using the formulation¹ for minimum sample size in proportion (assuming maximum variance, p = 0.50) with 95% confidence (Z = 1.96) and 5% error (e) (Cochran, 1977): 384 observations per type of respondent (shipper or carrier). This result can also be used for models with continuous data if the coefficient of variation is lower than or equal to 50%² (Cochran, 1977). In each model (stated preference or experimental auction), we expected at least four observations (decisions) per respondent; therefore, a minimum sample size of 100 companies per type of respondent (shipper and carrier), 200 companies in total, was defined. The data were collected using a self-administered web survey. Based on Dillman, Smyth, and Christian (2009), the response rate was expected to be between 5% and 25%.
Using an effective response rate of 10% (i.e., the rate based only on companies that are successfully contacted) and assuming that 50% of the companies in the database would not be contacted or would not be eligible to participate in the survey (not active in the shipper–carrier market), the minimum number of companies that should be in the sample was 4000, including 2000 shippers and 2000 carriers. To guarantee a good representation of each group and avoid underrepresenting motor-freight carriers, we divided the sample equally between the three groups: about 1333 companies in each group. The sample of companies was distributed among strata using the same sample fraction (sample size/population size) for each stratum in a group: (1) Manufacturing: 1333/12,957 = 10.3%; (2) Wholesale: 1333/12,665 = 10.5%; (3) Motor-freight carrier: 1333/2499 = 53.3%. We rounded up the number of companies in the sample in each stratum to guarantee at least one company for each stratum, resulting in a different sample fraction per stratum. The total number of companies in the sample in this project was 4043. A weighting procedure is used in the estimation of the models to represent the sampling strategy.

1. $n_p = \frac{Z^2 \cdot p \cdot (1-p)}{e^2} = \frac{1.96^2 \cdot 0.5 \cdot 0.5}{0.05^2} = 384$

2. $n_C = \frac{Z^2 \cdot CV^2}{e^2} = \frac{1.96^2 \cdot CV^2}{0.05^2} = 384 \;\therefore\; CV = 0.5$ (or 50%)
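The sample-size arithmetic in this section can be retraced with a few lines of code. The sketch below uses only the figures quoted in the text; the function names are illustrative, and the stratum sizes for the motor-freight group are hypothetical, since the actual strata are not reported here.

```python
import math


def min_sample_size_proportion(z: float, p: float, e: float) -> float:
    """Minimum sample size for a proportion: Z^2 * p * (1 - p) / e^2."""
    return z ** 2 * p * (1 - p) / e ** 2


def allocate_to_strata(stratum_sizes, sample_fraction):
    """Apply one sample fraction to every stratum, rounding up so no stratum is empty."""
    return [math.ceil(size * sample_fraction) for size in stratum_sizes]


# Figures quoted in the text
n_obs = min_sample_size_proportion(z=1.96, p=0.50, e=0.05)  # about 384 observations
target_completes = 100   # companies per respondent type, as adopted in the text
response_rate = 0.10     # effective response rate among contacted companies
reachable_share = 0.50   # share assumed to be contactable and eligible

per_type = round(target_completes / (response_rate * reachable_share))
total_sample = 2 * per_type      # shippers plus carriers
per_group = total_sample // 3    # equal split across the three industry groups

populations = {"Manufacturing": 12957, "Wholesale": 12665, "Motor-freight carrier": 2499}
fractions = {group: per_group / size for group, size in populations.items()}

# Hypothetical stratum sizes for the motor-freight group (they sum to 2499)
carrier_strata = [1500, 700, 250, 49]
carrier_sample = allocate_to_strata(carrier_strata, fractions["Motor-freight carrier"])

print(round(n_obs, 2), per_type, total_sample, per_group)
for group, fraction in fractions.items():
    print(f"{group}: {fraction:.1%}")
print(carrier_sample, sum(carrier_sample))
```

Because each stratum is rounded up, the allocated total ends slightly above the nominal group size, which is in line with the final sample (4043) exceeding the nominal 4000.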

13.3.2. Modeling Framework

The logistics services contract stage of the conceptual framework simulates shipper–carrier markets (see Figure 13.1). The input to this stage is a list of shipments between agents. For each shipment, the agent responsible for arranging logistics is assumed to be the vendor of the commodity. The outputs of this stage are the carriers selected for each bundle (or lane) of shipments. To generate this output we specified six models. Five models were specified to permit the implementation of this stage in the conceptual framework and one model was specified to synthesize agents from a list of companies (see Figure 13.2). The modeling framework and how the survey was used to collect data for the models are explained in more detail below.

Figure 13.2: Modeling framework.

The implementation starts with the synthesis of agents. Using a list of companies (the agents) from a database provider, we use a classification model to classify the companies. We first classify companies as active or not active in the shipper–carrier market. An active company is a company that performs transactions in this market. If a company is active in the market, this company can be a shipper or a carrier. If a company is not active, it can be a shipper or neither a shipper nor a carrier. We assumed that all carriers are active in this market. The next classification is to define the strategy of the company. We defined two types of strategies: minimizing cost/price or maximizing level of service. The former strategy is associated with the concept of push logistics (e.g., bulk shipments) and the latter strategy is associated with the concept of pull logistics (e.g., technology parts shipments). These two logistics strategies can occur in the same supply chain (Simchi-Levi, Kaminsky, & Simchi-Levi, 2003), with the push strategy associated with the beginning and the pull strategy with the end of the supply chain (see Figure 13.3). After the synthesis of agents, we classify the companies in the dataset into seven types: shippers are fed into the Commodity Contract Formation Stage (four types), carriers are fed into the Logistics Services Contract Formation Stage (two types) and the other companies are not included in the simulation. Data for this model were collected in the recruitment phase and also in the web survey with binary (Yes/No) questions.

The first model in the Logistics Services Contract Formation stage represents the bundling of shipments, or the creation of lanes (a set of shipments in a contract), to reduce freight rates or increase profit. Based on Song and Regan (2003), the lanes would be created as a function of the relation between the price of delivering a bundle with n shipments, $p(s_1, s_2, \ldots, s_n)$, and the sum of the prices of delivering the n shipments individually, $p(s_1) + p(s_2) + \cdots + p(s_n)$. Using this relationship, a set of shipments can be classified in three ways: (1) Complementary (e.g., backhaul shipments): $p(s_1, s_2, \ldots, s_n) < p(s_1) + p(s_2) + \cdots + p(s_n)$; (2) Substitute: $p(s_1, s_2, \ldots, s_n) > p(s_1) + p(s_2) + \cdots + p(s_n)$; or (3) Additive: $p(s_1, s_2, \ldots, s_n) = p(s_1) + p(s_2) + \cdots + p(s_n)$. A survey question was presented to shippers and carriers that asked how often a shipper would combine two shipments with different origins/destinations (presented in the question) to reduce the price of delivery.
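The three-way classification of a set of shipments follows directly from comparing the bundle price with the sum of the individual prices. A minimal sketch, assuming both prices are known; the function name, the tolerance used for the equality case and the example prices are illustrative choices, not part of the survey or of Song and Regan (2003):

```python
from math import isclose
from typing import Sequence


def classify_bundle(bundle_price: float, individual_prices: Sequence[float]) -> str:
    """Compare the bundle price with the sum of the prices of separate deliveries."""
    separate_total = sum(individual_prices)
    if isclose(bundle_price, separate_total, rel_tol=1e-9):
        return "additive"       # bundling changes nothing
    if bundle_price < separate_total:
        return "complementary"  # bundling is cheaper, e.g. a backhaul
    return "substitute"         # bundling is more expensive


# Hypothetical prices for two shipments that each cost 500 when delivered separately
cheaper = classify_bundle(900.0, [500.0, 500.0])
dearer = classify_bundle(1100.0, [500.0, 500.0])
same = classify_bundle(1000.0, [500.0, 500.0])
```

Only complementary sets give a shipper or carrier a price incentive to form a lane, which is the behavior the bundling question in the survey tries to capture.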
Thirteen cities/metropolitan regions close to the GTHA (see Figure 13.4) were selected and seven pairs of shipments were randomly generated from all possible combinations (see Figure 13.5). The characteristics of each shipment (e.g., weight, value) were based on information about a recent shipment provided by each respondent (shipper or carrier) in another question (see Figure 13.6). These seven combinations were presented to a randomly selected sample of the respondents (shippers and carriers), and the respondents were asked how often they thought these combinations would be bundled in one contract to reduce the freight rate, using a 7-point Likert scale from “never” to “always” on which each point was presented with the corresponding odds in percentage (see Figure 13.5), e.g., “Frequently (80–99%).” The responses to these questions are well suited to be analyzed using an ordered logit structure.

Figure 13.3: Push–pull strategy in the supply chain.

Figure 13.4: Cities/metropolitan regions GTHA: bundling model.

Figure 13.5: Bundling model of shipments: survey question.

Figure 13.6: Shipment example: survey question.

After this stage, the shipments are transformed into bundles of shipments. The next step is to identify whether the shipper would use a private carrier or a for-hire carrier. If the shipper is classified as not active in the market, the shipper will use a private carrier (decision “Type of Shipper?” in Figure 13.2). If the shipper is active in the market, the shipment bundle might be outsourced or not. For that we expect to develop a shipper outsourcing model. In the survey, we included one question to collect information about one shipment of shippers and we asked “How often does your company outsource the delivery of this shipment?” (see Figure 13.6). The responses to this question are also suited to be analyzed using an ordered logit structure.

Using the previous two models, we identify which shipments would enter the shipper–carrier market modeling framework. In this market modeling framework, there are three models (see Figure 13.2). These models will be implemented using the framework of the economic theory of contracts. The first model represents the selection of carriers. The objective of this model is to simulate how shippers select carriers based on carrier attributes. The data for this model were obtained using a traditional SP survey design (Louviere, Hensher, & Swait, 2000). First, 10 carrier


Figure 13.7: Carrier attributes — level of importance: survey question. attributes were included in the SP experiment based on a literature review (Crum & Allen, 1997; Lambert, Lewis, & Stock, 1993; Meixell & Norbis, 2008; Murphy & Daley, 1997). Attributes in the SP experiment were defined to avoid the nonattendance problem (Hensher & Greene, 2010) (occurs when an attribute is included in the SP experiment but it is not considered by the respondent in the decision process reducing the significance of the parameters). To accomplish this, shippers were asked for a level of importance they assign to each attribute: from ‘‘Unimportant’’ to ‘‘Very Important’’ (see Figure 13.7). Only the attributes that were selected as ‘‘Important’’ or ‘‘Very Important’’ by shippers were included in the SP survey. To dynamically customize the survey for each respondent and using predefined levels for the attributes (see Table 13.1), nine different SP designs were developed to cover all the combinations of attributes selected by the respondent. One additional question was included, before the question in Figure 13.7, to ask shippers if they select carriers based only on price or based on price and other attributes (this information was also used in the agents classification model, see Figure 13.2). If the shipper answered that they selected carriers based only on price, the SP experiment would not be presented. The SP designs were constructed using a Bayesian efficient design procedure in the software SAS3 assuming the null hypothesis for the values of the parameters (b1 ¼ y ¼ bk ¼ 0). With the selection of the attributes that should be included in the SP experiment (Figure 13.7) and levels of the attributes (Table 13.1), a SP experiment that was customized for each shipper respondent (see Figure 13.8) was presented. The next model is the carrier proposal model. The objective of this model is to correlate price with other service attributes. In the survey, one question was used to

3. http://support.sas.com/resources/papers/tnote/tnote_marketresearch.html.

Shipper/Carrier Interactions Data Collection: WRCSP Survey

267

Table 13.1: Carrier attributes levels. Attribute Carrier reputation Response to problems

Quality of drivers Competitive pricing Follow-up on service complaints Billing accuracy Equipment availability Delivery reliability Loss/damage of products Past experience

Scale

Low

Medium

High

From 1 (Very Poor) to 7 (Exceptional) Percentage of unexpected problems solved without impacting the operation From 1 (Very Poor) to 7 (Exceptional) Compared to the expected price Time to give a follow-up

3

5

7

85%

90%

95%

3

5

7

10% below

Same

10% above

1 day

1 week

1 month

85%

90%

95%

85%

90%

95%

85%

90%

95%

2%

4%

6%

3

5

7

Percentage of bills accurate Percentage with equipments available Percentage of pickup/ delivery on time Average lost/damage in shipment value (%) From 1 (Very Poor) to 7 (Exceptional)

ask carriers the values of their service attributes see Figure 13.9). This question was only presented to carriers that answered that their strategy was based on maximizing the level of service. The price of the carrier was measured in a scale that represents the different between the price of the carrier and the average in the market (in percentage). The responses to this question are also suited to be analyzed using an ordered logit structure. The last, and most challenging, part of the survey was composed of questions to model how the freight rate is defined in the market. We used two questions in the survey to collect data for this model. The first one is the question in Figure 13.6 that was also presented to carriers, without the outsourcing question. In the other question, carriers were presented with seven hypothetical shipment contracts in an experimental auction situation (Lusk & Shogren, 2007) and asked to identify (a) if they would be interested in bidding on the contract or not, and (b) a bid value and


Rinaldo A. Cavalcante and Matthew J. Roorda

Figure 13.8: Customized SP experiment.

Figure 13.9: Carrier attribute values: survey question.

the probability of winning the contract. We expect to use the responses to these two questions together to estimate this model (see Figure 13.10).

13.3.3. Implementation

The first stage was the development of a programming code for the survey since the customization requirements exceeded the limitations of available survey design web


Figure 13.10: Experiment auction: survey question.

sites. Several web technologies were used in the website design. JavaScript provided dynamic behavior in the webpage, PHP and Ajax exchanged information between the server and the webpage efficiently and dynamically, and MySQL stored the data. Various drafts were developed and pilot tests were conducted to improve the quality of the web survey.

The second stage was the recruitment process, which had two parts: postcards and phone calls. Postcards were prepared and sent to all 4043 companies in the database to inform them about the survey before they received phone calls, an approach used to increase the response rate (Dillman et al., 2009). A marketing research company conducted the phone calls within one week after the postcards were sent. A script was provided to the marketing research company so that they would be able to: find the appropriate respondent in the company (the person responsible for logistics/supply chain services); screen the respondent (if a shipper, check whether the shipper delivers or outsources its shipments, i.e., whether the company is active in the market); and inform the respondent about the survey and ask for their participation. If the respondent agreed to participate, an e-mail was sent with more information and a link to the survey. Because an ID was embedded in the web link, companies that did and did not answer the survey could be identified.
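The ID-in-link mechanism can be illustrated with a short sketch. The survey itself was built with PHP and MySQL; the Python below only mirrors the idea, and the base URL, the query parameter name `id`, and the function names are hypothetical choices made for this illustration, not details taken from the survey.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical survey address; the real survey URL is not given in the text.
BASE_URL = "https://survey.example.org/start"


def build_survey_link(company_id: int) -> str:
    """Return a personalized link that carries the company ID as a query parameter."""
    return f"{BASE_URL}?{urlencode({'id': company_id})}"


def company_id_from_link(link: str) -> int:
    """Recover the company ID from a personalized link."""
    query = parse_qs(urlparse(link).query)
    return int(query["id"][0])


def split_by_response(recruited_ids, visited_links):
    """Split recruited companies into those that opened the survey and those that did not."""
    responded = {company_id_from_link(link) for link in visited_links}
    recruited = set(recruited_ids)
    return sorted(recruited & responded), sorted(recruited - responded)


# Illustrative data only: three recruited companies, two of which followed their link.
recruited = [101, 102, 103]
links = {cid: build_survey_link(cid) for cid in recruited}
visited = [links[101], links[103]]
answered, not_answered = split_by_response(recruited, visited)
print(answered, not_answered)
```

Keeping the ID in the link, rather than asking respondents to identify themselves, is what allows nonrespondents to be matched back to the recruitment database for later nonresponse analysis.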


13.4. Results

13.4.1. Response Rates

The survey started in mid-April 2011 and was completed in June 2011. The results of the phone calls are presented in Table 13.2. In 49% of the cases, the marketing research company was not able to contact the company or to identify the respondent in the company. A further 21% of the companies, for which a respondent was identified, were not included because they were not part of the survey population (mainly shippers that were not active in the market). About 30% of the companies (1194) were successfully contacted and were part of the survey population. Thirty-six percent of those (431 companies) agreed to participate and received an e-mail. Some of these companies did not start or complete the survey.

Using the approach presented by Bethlehem, Cobben, and Schouten (2011), the response to the recruitment process and web survey can be analyzed in the following stages: (1) Contact: companies that the marketing research company was able to contact; (2) Eligible: companies that are in the survey population; (3) Agrees to participate: companies that were recruited; (4) Participates: companies that started the survey; (5) Able: companies that were able to complete the survey. Based on the outcome of these stages, companies can be classified as nonresponse, which might introduce error in the models, or as overcoverage, which indicates that the sampling frame contained more records than needed. A flowchart with the response results of the survey is presented in Figure 13.11. The percentages show the share of companies per group in the output of each stage (a cumulative percentage is presented in the last box, "Response"). With the definitions used by Bethlehem et al. (2011), the response rate can be defined as the proportion of eligible contacts in the sample that completed the questionnaire: (1) Manufacturing: 9.1%; (2) Wholesale: 9.1%; (3) Motor-freight carrier: 8.8%.
In the end, 120 companies fully completed the questionnaire. The final number of observations available for the development of the models is 83 shippers and 66 carriers.

Table 13.2: Recruitment results.

Description | Number of companies | %
Recruited | 431 | 11%
Refused to participate | 763 | 19%
Wrong phone number | 1,049 | 26%
Active calls and call backs: not able to talk to the appropriate respondent | 955 | 23%
Non-qualifier | 845 | 21%
Total | 4,043 |
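The shares quoted in the text can be retraced from the counts in Table 13.2. The sketch below is only an arithmetic check; the grouping of outcomes into "not contacted" and "eligible and contacted" follows the reading of the text above, and the variable names are chosen for this illustration.

```python
# Recruitment outcomes as reported in Table 13.2.
outcomes = {
    "Recruited": 431,
    "Refused to participate": 763,
    "Wrong phone number": 1049,
    "Not able to talk to the appropriate respondent": 955,
    "Non-qualifier": 845,
}
total = sum(outcomes.values())


def share(count: int, base: int) -> float:
    """Percentage of `count` relative to `base`."""
    return 100.0 * count / base


# Companies that could not be reached or whose respondent could not be identified.
not_contacted = (outcomes["Wrong phone number"]
                 + outcomes["Not able to talk to the appropriate respondent"])
# Companies that were reached and belong to the survey population.
eligible_contacted = outcomes["Recruited"] + outcomes["Refused to participate"]

pct_not_contacted = share(not_contacted, total)
pct_eligible = share(eligible_contacted, total)
pct_recruited_of_eligible = share(outcomes["Recruited"], eligible_contacted)

print(total, eligible_contacted)
print(round(pct_not_contacted, 1), round(pct_eligible, 1), round(pct_recruited_of_eligible, 1))
```

The three shares come out at roughly 49.6%, 29.5%, and 36.1%, in line with the 49%, "about 30%", and 36% quoted in the text.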


Even though the response rate is at the lower end of what was expected (5–25%), low response rates for web-based SP surveys in freight are not uncommon. In fall 2005, Patterson, Ewing, and Haider (2010) obtained a total response rate of 5.4% (392 responses out of 7229 companies) for a more restricted survey population (companies with more than 50 employees) in the Quebec–Windsor corridor. Patterson et al. (2010) did not report whether all 7229 companies were eligible.

13.4.2. Survey Model Results: Carrier Selection Model

Only the results for the carrier selection model are currently available. The other models are still being developed. The carrier selection model results are presented below.

One major problem in self-administered surveys is nonresponse (or selectivity) bias. This bias exists when respondents and nonrespondents would give different responses to the same survey questions (Bethlehem et al., 2011). A well-known procedure for correcting nonresponse bias in regression models is the Heckman correction (Greene, 2003). However, this correction applies only to linear models. Few references analyze remedies for this bias in the discrete model case (Dubin & Rivers, 1989). The approach proposed by Dubin and Rivers (1989) will be adopted to estimate the final models of this survey. The preliminary results that follow are therefore presented assuming that the nonresponse bias is negligible.

The first result is the importance level of the carrier attributes. Based on the literature review, we expected to find that service attributes are more important than price. The data collected in the survey showed that price ranks fourth among the attributes in terms of the mean rating (1 for the lowest rating and 5 for the highest). About 30% of shippers rated all ten attributes as important or very important in carrier selection.

A multinomial probit model was adopted to estimate the parameters in the utility function presented in Eq. (13.1). The alternative "None of Them" was fixed as the reference by setting its utility level to zero (see Eq. (13.2)). The multinomial probit model was selected for the early stage of this model because it allows the inclusion of the option "None of Them" in the estimation process and it permits analysis of bias due to the layout of the survey question (see Figure 13.8). The software Stata was selected because it has a program called asmprobit (see footnote 4), which simplifies the estimation of multinomial probit models. This program uses two normalizations (location and scale). The normalizations are performed by setting the variance and all the correlations of two alternatives equal to 1 and 0, respectively.

4. See help for this program at http://www.stata.com/help.cgi?asmprobit.


Figure 13.11: Survey response.


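\begin{align}
U_{\text{CARRIER}} ={} & \beta_{\text{CarRep}}\,\text{CarRep} + \beta_{\text{RespProb}}\,\text{RespProb} + \beta_{\text{QualDriv}}\,\text{QualDriv} + \beta_{\text{CompPric}}\,\text{CompPric} \notag\\
& + \beta_{\text{FollServ}}\,\text{FollServ} + \beta_{\text{BillAccur}}\,\text{BillAccur} + \beta_{\text{EquipAvail}}\,\text{EquipAvail} \notag\\
& + \beta_{\text{DelTimeRel}}\,\text{DelTimeRel} + \beta_{\text{LossDam}}\,\text{LossDam} + \beta_{\text{PastExp}}\,\text{PastExp} \tag{13.1}\\
U_{\text{NONE}} ={} & 0 \tag{13.2}
\end{align}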

This model was estimated assuming the following specification (Train, 2009):

• The utility of the "None of Them" alternative is zero;
• The variance of the "None of Them" alternative is different from that of the other alternatives;
• The variances of the left-most alternative (called alternative 1) and the right-most alternative (called alternative 4) in the webpage (see Figure 13.4) are allowed to differ from those of the other alternatives;
• The correlations (ρ) between the alternative "None of Them" and alternatives 1 and 4 are assumed to be the same and are estimated in the model;
• The variances of the alternatives in the middle of the webpage (called alternatives 2 and 3) are fixed to 1 and their correlations with the other alternatives are fixed to zero.

These assumptions are based on the hypothesis that the alternatives located at the center of the question (see Figure 13.8) are not correlated with the alternatives located at the far left (alternative 1) or far right (alternative 4) of the question. The assumption of the same correlation between alternatives 1, 4 and "None of Them" is adopted because they might be correlated, being located close to the limits of the set of columns of alternatives (see Figure 13.8). All these assumptions are included to test for a bias due to the location of alternatives in the question. The final variance-covariance matrix of the random terms (ε1, ε2, ε3, ε4, and εNONE) used in the estimation is presented in Eq. (13.3). This covariance matrix specification would have permitted the application of a mixed logit model, which was not adopted because the Stata version available (seven) does not have a program for the mixed logit.

\begin{equation}
\begin{bmatrix}
\sigma_1 & 0 & 0 & \rho\sqrt{\sigma_1\sigma_4} & \rho\sqrt{\sigma_1\sigma_{\text{NONE}}} \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
\rho\sqrt{\sigma_1\sigma_4} & 0 & 0 & \sigma_4 & \rho\sqrt{\sigma_4\sigma_{\text{NONE}}} \\
\rho\sqrt{\sigma_1\sigma_{\text{NONE}}} & 0 & 0 & \rho\sqrt{\sigma_4\sigma_{\text{NONE}}} & \sigma_{\text{NONE}}
\end{bmatrix}
\tag{13.3}
\end{equation}
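To make the structure of Eq. (13.3) concrete, the following sketch builds the matrix from the point estimates reported in Table 13.3 and checks that it is a valid covariance matrix. It assumes that σ1, σ4, and σNONE denote variances, which is consistent with the off-diagonal terms of the form ρ√(σiσj). The function names and the hand-written Cholesky routine are illustrative and are not part of the Stata estimation described in the chapter.

```python
import math

# Point estimates as reported in Table 13.3 (sigma values are treated as variances).
SIGMA_1 = 1.30599
SIGMA_4 = 1.406598
SIGMA_NONE = 246.6657
RHO = 0.1827277


def covariance_matrix(s1, s4, s_none, rho):
    """Covariance of (e1, e2, e3, e4, eNONE) with the structure of Eq. (13.3)."""
    c14 = rho * math.sqrt(s1 * s4)
    c1n = rho * math.sqrt(s1 * s_none)
    c4n = rho * math.sqrt(s4 * s_none)
    return [
        [s1, 0.0, 0.0, c14, c1n],
        [0.0, 1.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0, 0.0],
        [c14, 0.0, 0.0, s4, c4n],
        [c1n, 0.0, 0.0, c4n, s_none],
    ]


def cholesky(matrix):
    """Return lower-triangular L with L L^T = matrix.

    Raises ValueError if the matrix is not positive definite.
    """
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - partial
                if pivot <= 0.0:
                    raise ValueError("matrix is not positive definite")
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


cov = covariance_matrix(SIGMA_1, SIGMA_4, SIGMA_NONE, RHO)
chol = cholesky(cov)  # succeeds only for a valid (positive definite) covariance matrix
for row in cov:
    print(["%.3f" % value for value in row])
```

A Cholesky factor of this kind is also what a simulation-based probit estimator would typically use to draw correlated error terms, which is why positive definiteness is worth checking.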

The result of the estimation process is presented in Table 13.3. All parameter signs are as expected and 8 of the 10 attributes had significant parameter values even though the final number of observations was only 482. Apparently the design, which avoids the nonattendance problem and minimizes the D-error, yields satisfactory estimates despite the small sample size. Using the


Table 13.3: Carrier selection model: Multinomial probit.

Parameter | Estimate | Standard error | z | P>|z|
β_CarRep | 0.1449611 | 0.04749 | 3.05 | 0.002
β_QualDrivers | 0.1467066 | 0.0431037 | 3.40 | 0.001
β_PastExp | 0.1986017 | 0.0562621 | 3.53 | 0.000
β_FollowServComp | -0.0437271 | 0.0104744 | -4.17 | 0.000
β_RespProb | 1.396444 | 1.501718 | 0.93 | 0.352
β_Price | -3.855451 | 0.9470553 | -4.07 | 0.000
β_BillAccur | 0.4961581 | 1.407426 | 0.35 | 0.724
β_EquipAvail | 3.101123 | 1.534114 | 2.02 | 0.043
β_DelTimeRel | 5.632263 | 1.669615 | 3.37 | 0.001
β_LossDamProd | -21.90458 | 5.830124 | -3.76 | 0.000

Parameter | Estimate | Standard error | Confidence interval (lower) | Confidence interval (upper)
σ1 | 1.30599 | 0.4105773 | 0.7052436 | 2.41847
σ4 | 1.406598 | 0.4447217 | 0.756915 | 2.613924
σNONE | 246.6657 | 209.472 | 46.69346 | 1303.051
ρ | 0.1827277 | 0.3759291 | -0.5208137 | 0.7384501

L(0) = -774.41; L(β) = -664.95; ρ² = 0.1413; R = 83 respondents; N = 482 observations

confidence intervals in Table 13.3, we observe that a nested logit structure with partial degeneracy (Hunt, 2000) can be used for this model since: (1) the variances of the carrier alternatives are not significantly different from one; (2) the correlations between the alternative "None of Them" and alternatives 1 and 4 are not significantly different from zero; and (3) the variance of the alternative "None of Them" is significantly different from one. The last result is a consequence of setting the utility level of the alternative "None of Them" to zero. Since there is no information to predict the level of utility of this alternative, the variance of its random term is very high (about 246, i.e., a standard deviation of about 15.7).
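The three observations above, and the reported goodness of fit, can be retraced from the numbers in Table 13.3. The sketch below treats a parameter as significantly different from a reference value when its reported confidence interval excludes that value. It also assumes that the reported ρ² is the likelihood ratio index 1 − L(β)/L(0), an assumption that reproduces the reported 0.1413. Variable and function names are illustrative.

```python
# Confidence intervals as reported in Table 13.3.
intervals = {
    "sigma_1": (0.7052436, 2.41847),
    "sigma_4": (0.756915, 2.613924),
    "sigma_NONE": (46.69346, 1303.051),
    "rho": (-0.5208137, 0.7384501),
}
# Reference values: variances are compared with 1, the correlation with 0.
reference = {"sigma_1": 1.0, "sigma_4": 1.0, "sigma_NONE": 1.0, "rho": 0.0}


def differs_from(value: float, interval: tuple) -> bool:
    """True when the interval excludes the reference value."""
    lower, upper = interval
    return not (lower <= value <= upper)


significant = {name: differs_from(reference[name], ci) for name, ci in intervals.items()}

# Log-likelihoods as reported in Table 13.3.
LOGLIK_NULL = -774.41
LOGLIK_MODEL = -664.95
# Assumed definition of the fit measure: likelihood ratio index.
rho_squared = 1.0 - LOGLIK_MODEL / LOGLIK_NULL

print(significant)
print(round(rho_squared, 4))
```

Only the variance of "None of Them" is flagged as different from its reference value, which matches points (1) to (3) above.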

13.5. Conclusions/Future Directions

The objective of this research effort was to implement a customized freight data collection method to capture the behavior of agents (shippers and carriers) in a freight market. For that, we used stated preference methods and an experimental auction in a web-based survey. The results showed that the response rate for this type of survey is low. Dillman et al. (2009) mentioned that the response rate for surveys with company professionals is usually in the range of 5–25%. The response rate for this survey was about


9%, at the lower end of the expected range. One possible reason is that the stated preference survey and especially the experimental auction pose a bigger challenge to respondents than traditional poll-type surveys. Using the identification of the respondent, it was observed that about 10% of shippers skipped the stated preference survey and 43% of carriers skipped the experimental auction survey. Nevertheless, the experimental auction approach can still be considered a promising data collection method to capture the behavior of agents in competitive scenarios. However, it might not be appropriate for web-based surveys or cases where the respondents are carriers because of low response rates. Since the rules of the auction have to be explained during the web survey and carriers are constrained in time, it might be more appropriate to apply this survey in another environment, such as in-person interviews at a conference, meeting, or exposition, or in a multiple-phase survey where the explanation of the rules and the participation in the auction would happen in different phases.

The stated preference survey was intended to develop a carrier selection model and the survey was designed to minimize the nonattendance problem and the D-error. The results (Table 13.3) were satisfactory, with all parameters having the expected signs and eight of ten parameters having significant z-values. The ρ² value is apparently low (0.1413), but the value of ρ² is influenced by the number of alternatives in the SP survey: a higher number of alternatives constrains the increase of ρ².

Overall, this research provides evidence that the simulation of agent (shipper and carrier) interactions in the freight market is feasible, and that supporting data can be successfully collected. The research demonstrates the potential to use agent-based simulation to forecast freight transport given commodity flows, by representing the characteristics of freight transport as an output of two economic markets: the commodity market and the freight market. Representing these markets in a simulation framework would allow for more insightful analysis of the effects of economic policy, regulation, infrastructure investment, and changes in industry structure on the movement of goods.

The future directions of this research are: (1) analyze the nonresponse bias in the carrier selection model; (2) develop the other models for which the survey was designed; (3) extend the conceptual framework of Roorda et al. (2010) to incorporate the economic theory of contracts in the modeling of agent interactions; and (4) apply the models in a practical application setting to demonstrate the enhanced sensitivities of agent-based simulation models.

Acknowledgment

We would like to thank Metrolinx and NSERC for their financial support and the McMaster Institute for Transportation and Logistics for providing the database, especially Mark Ferguson, who contributed directly to the results of this research effort.


References

Bethlehem, J., Cobben, F., & Schouten, B. (2011). Handbook of nonresponse in household surveys. Hoboken, NJ: Wiley.
Bolton, P., & Dewatripont, M. (2005). Contract theory. Cambridge: The MIT Press.
Cambridge Systematics. (1996). Quick response freight manual: Final report. Prepared for the Federal Highway Administration, Cambridge.
Cambridge Systematics. (1998). Collection and analysis of commodity flow information in the Portland metropolitan area: Compendium of technical memoranda and other key documents. Prepared for Portland Metro and the Port of Portland with ICF Kaiser Consulting Group and Nelson/Nygaard Consulting Associates, Cambridge.
Cochran, W. G. (1977). Sampling techniques (3rd ed.). New York, NY: Wiley.
Crum, M. R., & Allen, B. J. (1997). A longitudinal assessment of motor carrier-shipper relationship trends, 1990 vs. 1996. Transportation Journal, 37(1), 5–17.
Dillman, D. A., Smyth, J. D., & Christian, L. M. (2009). Internet, mail, and mixed-mode surveys: The tailored design method (3rd ed.). Hoboken, NJ: Wiley.
Dubin, J. A., & Rivers, D. (1989). Selection bias in linear regression, logit and probit models. Sociological Methods & Research, 18(2/3), 360–390.
Fischer, M., Ang-Olsen, J., & La, A. (2000). External urban truck trips based on commodity flows: A model. Transportation Research Record: Journal of the Transportation Research Board, 1707, 73–80. doi:10.3141/1707-09
Friedrich, M., Haupt, T., & Nökel, K. (2003). Freight modelling: Data issues, survey methods, demand and network models. Proceedings of the 10th International Conference on Travel Behaviour Research, Lucerne (August 10–15).
Greene, W. H. (2003). Econometric analysis (5th ed.). Upper Saddle River, NJ: Prentice Hall.
Hensher, D. A., & Greene, W. H. (2010). Non-attendance and dual processing of common-metric attributes in choice analysis: A latent class specification. Empirical Economics, 39(2), 413–416. doi:10.1007/s00181-009-0310-x
Hunt, G. L. (2000). Alternative nested logit model structures and the special case of partial degeneracy. Journal of Regional Science, 40(1), 89–113. doi:10.1111/0022-4146.00166
Lambert, D. M., Lewis, M. C., & Stock, J. R. (1993). How shippers select and evaluate general commodities LTL motor carriers. Journal of Business Logistics, 14(1), 131–143.
Louviere, J. J., Hensher, D. A., & Swait, J. D. (2000). Stated choice methods: Analysis and application. Cambridge: Cambridge University Press. doi:10.1017/CBO9780511753831
Lusk, J. L., & Shogren, J. F. (2007). Experimental auction: Methods and application in economic and marketing research. Cambridge: Cambridge University Press. doi:10.1017/CBO9780511611261
Meixell, M. J., & Norbis, M. (2008). A review of the transportation mode choice and carrier selection literature. The International Journal of Logistics Management, 19(2), 183–211. doi:10.1108/09574090810895951
Murphy, P. R., & Daley, J. M. (1997). Carrier selection: Do shippers and carriers agree, or not? Transportation Research Part E, 33(1), 67–72. doi:10.1016/S1366-5545(96)00003-8
Patterson, Z., Ewing, G. O., & Haider, M. (2010). How different is carrier choice for third party logistics companies? Transportation Research Part E, 46(5), 764–774. doi:10.1016/j.tre.2010.01.005
Pendyala, R. M., Shankar, V. N., & McCullough, R. G. (2000). Freight travel demand modelling: Synthesis of approaches and development of a framework. Transportation Research Record: Journal of the Transportation Research Board, 1725, 9–16. doi:10.3141/1725-02
Roorda, M. J., Cavalcante, R., McCabe, S., & Kwan, H. (2010). A conceptual framework for agent-based modelling of logistics services. Transportation Research Part E, 46(1), 18–31. doi:10.1016/j.tre.2009.06.002
Salanié, B. (1997). The economics of contracts: A primer. Cambridge: The MIT Press.
Simchi-Levi, D., Kaminsky, P., & Simchi-Levi, E. (2003). Designing and managing the supply chain: Concepts, strategies, and case studies (2nd ed.). The McGraw-Hill/Irwin series in Operations and Decision Sciences. The McGraw-Hill Companies Inc.
Song, J., & Regan, A. (2003). Combinatorial auctions for transportation service procurement. Transportation Research Record: Journal of the Transportation Research Board, 1833, 40–46. doi:10.3141/1833-06
Train, K. (2009). Discrete choice methods with simulation (2nd ed.). Cambridge: Cambridge University Press. doi:10.1017/CBO9780511805271

Chapter 14

WORKSHOP SYNTHESIS: ALTERNATIVE APPROACHES TO FREIGHT SURVEYS

Jesse Casas and Matthew J. Roorda¹

14.1. Introduction and Scope

Although goods movement has always had an impact on network infrastructure, air quality, emissions, and logistics planning, primary data collection from shippers and carriers has, until recently, not been at the forefront of survey research efforts compared with passenger travel. Conducting primary data collection from shippers and carriers is highly complex due to the wide variation in the methods of shipment transport, the size, weight and/or volume of shipments, the shipment handling along the logistics chain, shipment scheduling, and shipment timing (as some shipments are preplanned while others are based on real-time scheduling, such as those for single parcel shipments). The vast majority of goods movement is made by own-account and for-hire carriers that have developed proprietary methods or hold confidential information for efficient and cost-competitive logistics planning.

However, due to increased pressure to regulate greenhouse gases, emissions, and air quality, policy-makers worldwide are stressing the importance of collecting and analyzing freight data to better understand its impact on the environment. As fuel prices continue to rise, shippers and carriers have also become more cognizant of the importance of data to optimize logistics planning. The increased importance of collecting freight data is evident in the growing number of freight surveys being conducted worldwide.

Because of the many challenges facing policy-makers and logistics planners in conducting primary data collection efforts that emulate the traditional passenger travel behavior research model, alternative methods were discussed in this

1. Participants: Jesse Casas (USA) (Chair), Ken Casavant (USA), Rinaldo Cavalcante (Canada), Oli Madsen (Denmark), Bongi Mpondo (South Africa), Jon Newkirk (USA), Ledile Nong (South Africa), Christophe Rizet (France), Matt Roorda (Canada) (Rapporteur), Katlego Setshogoe (South Africa), Valentina Sichel (Chile).

Transport Survey Methods: Best Practice for Decision Making. Copyright © 2013 by Emerald Group Publishing Limited. All rights of reproduction in any form reserved. ISBN: 978-1-78190-287-5


workshop. The focus was on the feasibility and benefits of alternative survey methods that leverage informatics such as roadway sensors, on-board global positioning systems (GPS), or weigh-in-motion (WIM) data sources with links to survey data. Two papers and one poster were presented, which engaged the delegates in a useful and productive discussion. The group of 11 delegates, representing 6 countries, engaged in discussions based on these presentations as well as their own personal experiences. The different international perspectives revealed both common challenges in the collection of freight data and a few unique challenges.

14.2. Presentation of Related Papers

Cavalcante and Roorda's (2011) presentation addressed the need for data to support an alternative approach to the conventional three- or four-stage modeling and commodity-based approaches to freight transportation modeling. Their presentation described a web-based stated preference (SP) survey to estimate a modeling system of shipper–carrier interactions in the logistics services market. The survey addressed three important decisions made in the process of contract formation between shippers and carriers: (a) shipment bundling (i.e., which shipments a shipper decides to include in a single contract); (b) carrier bidding (i.e., what price a carrier would ask to move shipments); and (c) carrier selection (i.e., which carrier a shipper would select given price and other attributes). The survey, conducted on 431 respondents with multiple SP experiments per respondent, used a D-design to minimize the general standard error of the parameters. They showed the outcomes of a preliminary carrier selection model to demonstrate the value of the data.

Casavant's presentation (Casavant, Jessup, & Lawson, 2011) recognized that although there is a considerable body of existing research addressing the specific data needed for passenger transportation models and state-wide freight truck movement, there is little research that focuses on methods for collecting data on urban freight transportation. The presentation identified the data needs for both urban truck modeling and freight planning and evaluated alternative data collection methods for providing these data. Results of two pilot studies testing truck trip data collection methods implemented in the Portland, Oregon area were presented. One pilot study consisted of a roadside intercept survey method and the second pilot study tested various combinations of mail and fax methods. Although each of the two methods produced more complete information for certain data elements than others, no single method could be considered optimal by itself.

Rizet's poster (Rizet, Browne, & Leonardi, 2011) provided a summary of the advantages and disadvantages of how specific data variables are collected by various surveys in Europe for calculating energy consumption. Ongoing data collection programs as well as past surveys on freight transport energy consumption were reviewed. First, an ongoing survey using the vehicle or fleet approach (i.e., trucks) in France provides a micro-level approach for quantifying energy consumption. However, the survey data lack the linkage between energy consumption and the


economy because data gaps exist along the transport chain or on shippers. Second, a shipment or transport chain approach was reviewed, which traces the transport chain of up to three shipments per shipper through to the consignee. An advantage of this approach is that optimized, or efficient, segments along the transport chain can be identified, providing a good link to economic activity. The disadvantage of this approach is the cost. Third, a supply chain approach was reviewed that links the different steps along the chain from raw materials to the consumer. Some of these surveys were designed to allow analysis of the energy efficiency of the supply chain relative to its logistical organization. The main challenge of this approach is the ability to collect data from shippers and transport operators, which affects the cost of undertaking such an endeavor. Fourth, a life-cycle analysis approach was reviewed, which takes into account the "life" of the product from manufacturing, through product use, to the end of product life. Other components of the life cycle are also included, such as the energy consumed in vehicle construction, infrastructure building, and maintenance.

14.3. Preliminary Discussion

Subsequent to the presentations, delegates related top-of-mind successes and challenges in the conduct of freight surveys. Topics were generated based on delegates' home country experiences regarding various methods used, data needs, and challenges faced. Several topics surfaced, which included:

• Survey Data Collection
• Response Rates
• Use of Secondary, Passive, and Linked Data Sources.

To address these topics, the chair asked delegates to frame the discussion around the challenges faced, recent changes in the industry, and recommendations for overcoming these challenges.

14.3.1. Survey Data Collection

Conducting a telephone survey (or a combination of telephone and mail) of shippers/carriers is one method of collecting freight data. As Larson and Poist (2004) note in their article on improving response rates, Williams Walton (1997) calls for telephone surveying as an alternative to only a mail survey as a way to address nonresponse bias that is inherent in mail surveys. Williams concludes that "The telephone survey method is most appropriate for meeting the challenge of the Seven Rs of logistics research" which includes "the challenge of contacting the right person with the right information at the right time in order to ask the right questions using the right instrument for the collection of the right data at the right cost."


Telephone recruitment with mail-out/mail-back surveying of shippers/carriers involves recruiting an establishment to participate in the survey, collecting company profile data, sending the survey instruments (additional company background, fleet characteristics, etc.) to be completed, and collecting the completed surveys. As can be imagined, the burden placed on shippers and carriers in providing the necessary information is significant. When shippers/carriers are asked to participate in a survey, participation and the provision of data are not usually high on their list of priorities, and the level of effort expended by the surveying agency can add significant survey cost. Furthermore, more than one person within an organization is typically needed to complete the survey, depending on the data variables required.

When surveys are conducted, there are often missing, inaccurate, or illogical data. Data quality can be negatively affected when mail-out/mail-back or web-based self-reporting survey methods are used because self-administered surveys give the respondent a chance to inadvertently skip a question. By contrast, in a telephone survey, the interviewer has more control over proper skip patterns, the collection of required data elements, and checks for illogical responses (i.e., automatically identifying an illogical response to a question based upon a response to a previous question).

As detailed by Beagan, Fischer, and Kuppam (2007), there are other, more passive methods used to collect truck traffic counts. These include traffic counters (pneumatic tubes), loop detectors, manual observational truck counts, and video counts. Although these methods are useful for observing traffic volumes and for cordon point flows (i.e., internal–external, external–external, external–internal), each is limited for collecting more details on commodity type, freight volume, and local trip making behavior (i.e., internal–internal trips), among others. Data variability is also an important issue that should be addressed by any vehicle classification count program. In addition to time-of-day variations, truck volumes can have significant day-of-week and seasonal variations, which have not been as well established as time-of-day truck traffic distributions.

An alternative to establishment-based surveys is roadside interviewing. This method can obtain details about the freight type and volume as well as specific origin–destination points. However, this method typically collects origin–destination data only for the previous trip and the next trip (i.e., no identification of trip chaining activities). Furthermore, these interviews can be costly depending on the number of cordon points in the region, can generate ill-will among drivers because of traffic disruptions, can involve safety hazards to the interviewer, and are limited to collecting data only on individual vehicles rather than the entire shipper/carrier fleet.
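The kind of automated control that an interviewer-administered instrument offers (required data elements, skip patterns, and checks of one answer against an earlier one) can be sketched as a small validation routine. The field names and rules below are hypothetical and chosen only for illustration; they are not drawn from any survey discussed in the workshop.

```python
def check_response(response: dict) -> list:
    """Return a list of problems found in one (hypothetical) establishment response."""
    problems = []

    # Required data elements: these must always be answered.
    for field in ("company_type", "outsources_deliveries"):
        if response.get(field) is None:
            problems.append(f"missing required field: {field}")

    outsources = response.get("outsources_deliveries")
    outsourced_share = response.get("outsourced_share_pct")

    # Skip pattern: the share question is only asked when the company outsources.
    if outsources is True and outsourced_share is None:
        problems.append("outsourced_share_pct skipped although company outsources deliveries")

    # Logic check against a previous answer.
    if outsources is False and outsourced_share not in (None, 0):
        problems.append("outsourced_share_pct is positive although company does not outsource")

    # Range check on the percentage itself.
    if outsourced_share is not None and not 0 <= outsourced_share <= 100:
        problems.append("outsourced_share_pct outside 0-100")

    return problems


# Illustrative responses only.
consistent = {"company_type": "shipper", "outsources_deliveries": True, "outsourced_share_pct": 60}
illogical = {"company_type": "shipper", "outsources_deliveries": False, "outsourced_share_pct": 40}
print(check_response(consistent))
print(check_response(illogical))
```

In a web-based instrument the same rules could run as the respondent moves between pages, which narrows part of the data-quality gap between self-administered and telephone surveys described above.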

14.3.2. Response Rates

Shippers and carriers operate using proprietary systems, operate using confidential route data for security purposes, and collect and maintain confidential client data.


To maintain a competitive edge, companies may not be willing to provide or even see the relevancy of providing such data — that is, ‘‘What’s in it for me?’’ Such perspectives have led to a decline in response rates over the past 20 years. Even when companies agree to participate, a large percentage drop out after receiving the survey because of the amount of detail requested in the survey. Various strategies used in past surveys to maximize response rates were discussed in the workshop, including incentives, strategies to emphasize the legitimacy of the survey, advance notices, and appeals to altruism. These strategies have led to relative success in some data collection efforts while having no impact in others. It was agreed that new strategies in recruiting were needed to appeal to the business interests of respondents. It was also suggested that new methods are needed to ensure representativeness/remove bias in situations where low response rates are inevitable.

14.3.3. Use of Secondary, Passive, and Linked Data Sources

The delegates discussed the use of secondary data sources, passively collected data (e.g., from GPS tracking systems or radio frequency identification (RFID)), and how survey data can be linked to these sources to enhance understanding of goods movement. Some delegates noted the challenges of database incompatibility and inconsistent variable definitions (e.g., industry, commodity, or vehicle classifications), especially when databases are obtained from individual businesses, each with a different database management system. Secondary sources are also more likely to be missing data or may lack the level of detail needed at the local level for modeling purposes. Among the data sources mentioned by the delegates, the four most commonly used sources of freight data are Global Insight's TRANSEARCH data, the Federal Highway Administration's (FHWA) FAF1 and FAF2, and the Commodity Flow Survey (CFS) of the U.S. Census Bureau and the Bureau of Transportation Statistics (BTS). Although there are many advantages to using these data sources, most urban freight planning efforts require detailed localized data. Some of these secondary sources, such as the privately maintained TRANSEARCH database, may provide a respectable level of local detail. However, the database has limitations, such as in mode information for certain areas, so only conservative mode distribution estimates are provided. Furthermore, to build a more customized profile of a specific geographic area, multiple databases are often combined, but caution must be taken (or one should have a complete understanding of the data) when conducting data analysis. For example, if one or more data sources lack responses to a particular variable, or if the response categories for a variable are inconsistent among the data sources, assumptions about data accuracy must be made.

14.4. What Has Changed Over the Past Three Years?

Many of the challenges and opportunities in freight data collection have been around for some time. Yet the workshop delegates were able to identify some rapidly changing aspects of freight survey methods that are occurring in response to changes in freight modeling and the real-world context for data collection. Since the last ISCTSC gathering three years ago:

• Low emissions technology, fuel efficiency, emissions modeling, and emissions policy have continued to be a focus. Programs and policies have progressed (e.g., Marco Polo and SuperGreen in the EU, EPA SmartWay in the United States, GreenFreight in China, and ecoFreight in Canada).
• Model advancements are demanding larger quantities of more accurate data. These advancements include the development of supply chain models and corridor performance measurement models that measure how particular supply chains perform in a corridor to identify problem areas, bottlenecks, and investment needs. There have also been developments in agent-based micro-simulation of freight systems, which require detailed data for the estimation of highly disaggregate models.
• Surveys are getting more difficult to conduct successfully, with diminishing response rates being a primary challenge.
• GPS technology is becoming more widely utilized in the freight industry and continues to evolve, particularly in toll collection technology. This evolution suggests a need for research in finding new ways to exploit GPS data.
• Online tools for exchanges between shippers and carriers are becoming more prevalent. Such tools provide a potential data source for learning about market interactions and the relationship between commodity flows and truck flows.
• There has been movement toward cooperation and centralized data organization. Examples include data harmonization efforts in Europe, data consolidation efforts in the United States, the National Freight Databank in South Africa, and the efforts of the Southern African Development Community, in which 14 member countries are voluntarily working toward sharing administrative data on goods movement across nations.

14.5. Recommendations to Improve Freight Surveys

Most delegates experience similar challenges regardless of their home country. From North America to Argentina to France to South Africa, attempting to survey large establishments is often difficult because the most knowledgeable people in the establishment are also the busiest people, and a single person doesn't know or have access to all of the required data. This has led to reduced response rates. Surveys of multiple establishments in supply chains experience the additional challenge of data losses because of nonresponse down the chain.

To overcome these challenges and to increase data accuracy, the ISCTSC delegates recommend the following:

• Involve local authorities and industry associations as much as possible. Local authorities can add legitimacy to the survey, stress the importance of collecting such information to improve transportation efficiency for the region, and communicate how such efficiency can have a positive impact on the environment, marketing the effort as "for the greater good."
• Encourage coordination of data, including:
  ◦ Obtaining data from a variety of sources for analysis purposes (e.g., new surveys, GPS, RFID, WIM)
  ◦ Enhancing administrative datasets (e.g., for safety/truck weight inspections/tolling/regulation/hours of service) for other purposes like modeling/forecasting
  ◦ Increasing accessibility to freight survey micro-data in a controlled environment (to ensure confidentiality), for analysis by responsible researchers
• When conducting surveys (including intercept surveys, web-based surveys, etc.), take advantage of newly evolving technologies (e.g., tablet PCs with GIS capabilities) to reduce respondent burden and increase data accuracy, specifically when origin–destination data are collected.
• Continue to evolve toward a freight-specific framework for modeling and data collection, rather than continuing to apply methods that were designed primarily for passenger travel.
• Utilize secondary data sources (e.g., GPS, roadside counts, data from online freight auctions) to calibrate, validate, or augment surveys. However, using secondary data as a replacement for or supplement to survey data must be approached with caution to ensure reliability, accuracy, and applicability.
• Appropriate use of secondary datasets needs much more research. There is a need to build upon company participation and trust, which requires solving the data confidentiality issue, and to "sensitize" the private sector to the needs of freight planning agencies.
• Concerted efforts are needed on consolidating and maintaining freight data for regional planning. If other data such as land-use or economic data are appended, companies may see value in the entire dataset for their own planning purposes as well and be more willing to participate.
• If a consolidated dataset can be developed, a process is needed to share the data but still maintain confidentiality. Perhaps creating a freight data center for sharing data would be beneficial. One suggestion to address confidentiality is to estimate models in the data center, borrow the model and outputs, but leave the data at the center.
• Research is needed on linking passive datasets (e.g., GPS, RFID, WIM), administrative datasets, vehicle identifiers, and engine and vehicle control information to survey data. This could involve research on data integration or data fusion.
• Develop a "Best Practices Handbook" for engaging the private sector. The handbook should include sections on incentives, guidelines on reporting back how information was used, methods for in-person contact, confidentiality guidelines, etc.
• Develop a general framework for collectively prioritizing and selecting the frequency of data collection (since companies are overburdened), sharing data (so companies don't have to provide the same information repeatedly), and preventing overlap between surveys.
• Promote the increased involvement of shippers/carriers/receivers at the annual TRB conference and other relevant conferences. Encourage them to present the data that they collect, engage them in the data collection conversation, and hear about how the information is useful to them.

THEME 3 COMPARING SURVEY MODES AND METHODS

Chapter 15

Analysis of PAPI, CATI, and CAWI Methods for a Multiday Household Travel Survey

Martin Kagerbauer, Wilko Manz and Dirk Zumkeller

Abstract

Purpose: In this chapter, the three household travel survey methods PAPI (paper and pencil interview), CATI (computer-assisted telephone interview), and CAWI (computer-assisted web interview) are compared in order to show well-known and new methodological effects.

Methodology/approach: The survey concept in the Stuttgart region with the three methods (PAPI, CATI, and CAWI) offers the possibility to analyze the differences between these methods. This approach offers various possibilities to compare the subsamples and to evaluate the effects of the different survey methods in order to ensure high data quality.

Findings: The results show a clear tendency for retired people to prefer the CATI design over CAWI, while younger persons prefer the CAWI design. The PAPI design seems to cover all parts of the population to the same extent and also achieves the same response levels as CATI and CAWI.

Originality/value of chapter: On the one hand, the three different survey methods within one survey allow methodological analyses without distortion of results by different framework conditions. On the other hand, the CATI and CAWI survey methods are relatively new in the field of multiday surveys, especially in Germany.

Keywords: Survey methods; comparison; multiday survey; paper and pencil interview; web interview; telephone interview

15.1. Introduction

In the region of Stuttgart a multiday survey was carried out, which combined PAPI (paper and pencil interview), CATI (computer-assisted telephone interview), and CAWI (computer-assisted web interview) methods in different subsamples. This survey offers the opportunity to examine response rates and key figures of the different survey methods in comparison. The results show a clear tendency that retired people prefer the CATI design instead of CAWI, while younger persons prefer the CAWI design when offered the choice between both methods. These preferences for CATI and CAWI also affect the key figures of mobility of both subsamples, so that a combination of different methods is strongly recommended to give participants the choice, especially if CAWI is to be used. The PAPI design seems to cover all parts of the population to the same extent and also achieves the same response levels as CATI and CAWI.

15.2. Background of the Survey

This large-scale household survey was commissioned by the Association of the Region Stuttgart in order to obtain a suitable transportation database on the basis of which the transport model for the Region of Stuttgart can be improved. The Region of Stuttgart is a conurbation of 179 cities and boroughs with a total of about 2.7 million inhabitants. It is thus the third largest conurbation in Germany. The sample size of the household travel survey was 5,000 households. The objective of the survey was to observe mobility patterns and mode use in a longitudinal perspective and to gather information about the variability and stability of travel behavior. The survey was carried out as a multiday survey over the course of one week. Each participant was asked to record each trip, including departure time, trip purpose, modes used, time of arrival, the estimated trip length, and the locations of origin and destination, which were then geocoded. Additionally, participants were asked to give socio-demographic data concerning the household and all persons living in it. The design of the household travel survey in Stuttgart is based on the German Mobility Panel (MOP), which has been conducted in Germany as an annual PAPI survey since 1994 and is thus the most renowned multiday panel survey in the field of transportation (Zumkeller et al., 2009). The experiences gained regarding characteristics and survey design (Kagerbauer & Manz, 2009) in that nationwide survey were transferred to this regional survey and applied to its conceptual design (see Zumkeller, Chlond, & Kagerbauer, 2008). A multiday survey with a reporting period of seven consecutive days is well established in research but not very common in regional transport surveys. A multiday survey requires several contacts with the participants and their steady collaboration.
Due to the high response burden, the conditions of this survey approach are not comparable to those of a single-day survey. The combination of telephone- and web-based surveys in standard transport survey approaches is well documented in the literature (e.g., Potoglou & Kanaroglou, 2008, or Braunsberger, Wybenga, & Gates, 2007). There is almost no experience with combining the different survey methods (PAPI, CATI, and CAWI) in multiday transport surveys.

15.3. The Three Different Methods of the Survey

The survey sample of 5,000 households in the Stuttgart region was divided into two subsamples: a PAPI subsample following the design of the German Mobility Panel (MOP) and another subsample using both the CATI and the CAWI method. Participants were allocated to one of these groups and methods in a two-step process. First, the households initially drawn from the register were randomly divided into the PAPI group and the CATI and CAWI group. Second, at the first contact, participants in the CATI and CAWI draw were asked whether they wanted to be interviewed by phone or in a web survey. The PAPI subsample covered 1,000 households and served as the verification sample in the context of the project. The survey method of this subsample was a paper and pencil interview (PAPI) with a diary following the same design as the MOP survey. As the survey design was very close to that of the German Mobility Panel, it offered the possibility to compare mobility figures and allowed benchmarking without methodological differences. Furthermore, this approach has been applied in many surveys, and attrition bias has been analyzed in several publications (e.g., Kitamura & Bovy, 1987; Meurs, van Wissen, & Visser, 1989). The effects of the German Mobility Panel design are documented and well known because the data have been analyzed for more than 15 years. There is a great deal of experience with this type of dataset, so the PAPI subsample can be used as a valid verification sample. The participants in the PAPI method were asked to fill in a questionnaire for the household (number of persons/children living in the household, number of cars in the household, information about the cars, data about the parking situation and public transport connection, and so on) and for all persons in the household (age, sex, profession, mode use, and so on).
The main part of the survey consisted of filling in a trip diary in which participants were asked to give details on each trip (start and end time of the trip, the modes used, the trip purpose, and the duration and the trip length). After the questionnaires were sent back to the fieldwork company, the data were checked for plausibility. The participants of the second subsample, comprising 4,000 households (main sample), were given two options for participating in the survey. They could choose to answer the questions in a computer-assisted telephone interview (CATI) or in a computer-assisted web interview (CAWI). While the PAPI method is very common practice in Germany for surveying travel behavior in a longitudinal perspective, both the CATI and the CAWI methods are more innovative in multiday surveys. These two methods are often used and well known in cross-sectional surveys, but there is almost no experience with them in longitudinal surveys.

In the CATI method the participants are asked to report their trips by phone. Two or three times during the survey period all participants are called and asked to report their trips. The reported trips are entered into the computer directly. During the telephone call, software checks all entered trips for plausibility. If there are mistakes in the data or missing trips (e.g., if there is a time gap between two activities or trips), the interviewer can ask the participant and clarify the situation immediately. On the one hand, the datasets obtained with this method should therefore be of higher quality than those from a PAPI survey. On the other hand, participants may forget some trips in the haste of a telephone call.

In the CAWI method all participants report their trips via the internet. The datasets are checked by an interviewer one day at a time. If trip data are erroneous or missing, the interviewer contacts the participants and asks them to supply the missing information. The advantage of this method over the CATI survey is that the participants can take part in the survey at any time they want. During the fieldwork a telephone hotline was set up so that the participants of all subsamples had the opportunity to ask any questions related to the survey. The CAWI subsample was additionally contacted by e-mail every second day in case of incorrect or missing values or nonresponse.

The survey concept in the Stuttgart region with the three methods (PAPI, CATI, and CAWI) offers the possibility to analyze the differences between these methods. Furthermore, it is possible to compare the results of the CATI and CAWI methods with those of the PAPI method in order to show well-known and new methodological effects (Arentze, 2005; Axhausen, Löchl, Schlick, Buhl, & Widmer, 2007). This approach offers various possibilities to compare the subsamples and to evaluate the effects of the different survey methods in order to ensure high data quality.
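The plausibility checks described for the CATI and CAWI methods can be illustrated with a minimal sketch. The chapter does not specify the rules used by the survey software, so the `Trip` record, the `check_diary` function, and the three rules below (arrival after departure, no overlapping trips, each trip starting where the previous one ended) are assumptions chosen for illustration only.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Trip:
    # Hypothetical trip record; field names are not taken from the survey.
    origin: str
    destination: str
    departure: datetime
    arrival: datetime


def check_diary(trips):
    """Return plausibility warnings for one person's reported trips."""
    warnings = []
    ordered = sorted(trips, key=lambda t: t.departure)
    for i, trip in enumerate(ordered):
        # A trip must end after it starts.
        if trip.arrival <= trip.departure:
            warnings.append(f"trip {i}: arrival is not after departure")
    for i in range(1, len(ordered)):
        prev, cur = ordered[i - 1], ordered[i]
        # Two trips by the same person cannot overlap in time.
        if cur.departure < prev.arrival:
            warnings.append(f"trip {i}: departs before trip {i - 1} arrives")
        # A trip should start where the previous one ended; otherwise a
        # trip in between may be missing.
        if cur.origin != prev.destination:
            warnings.append(
                f"trip {i}: starts at {cur.origin!r}, previous trip ended at "
                f"{prev.destination!r} (possible missing trip)"
            )
    return warnings


if __name__ == "__main__":
    day = datetime(2009, 10, 5)  # arbitrary example date
    diary = [
        Trip("home", "work", day.replace(hour=8), day.replace(hour=8, minute=30)),
        Trip("shop", "home", day.replace(hour=17), day.replace(hour=17, minute=20)),
    ]
    for warning in check_diary(diary):
        print(warning)
```

In a telephone interview, warnings of this kind would let the interviewer ask about a possibly forgotten trip while the respondent is still on the line; in the web design they would trigger the follow-up contact described above.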

15.4. Comparison of the Different Survey Methods

The survey carried out in the Region of Stuttgart gives an excellent opportunity to compare different survey approaches and their impacts. The following tables show some key characteristics of households (types of households and car ownership), persons (age, profession, and the possession of season ticket for public transport), and trips (mobility key figures, modal split, and trip purpose) in the different survey methods. Also, some figures concerning the response rates of the different methods are included.

15.4.1. Response Rates

In a first step, persons were randomly drawn from the municipal register of the registration office. The register only provides mailing addresses, not telephone numbers. As the response burden of a seven-day longitudinal survey is high for the participants, it is helpful to have an initial contact by phone after mailing the information about the survey (see Zumkeller et al., 2008). Therefore an attempt was made to match a telephone number to the persons drawn.[1] Then the households drawn from the register were divided into the two groups, the PAPI sample and the CATI/CAWI sample. An official letter from the government of the Stuttgart Region was sent to all addresses drawn from the register, introducing the survey and asking the persons/households to take part. If the phone number of the household was available, the person was also called and asked to participate in the survey.[2] Participants of the PAPI sample were added to the survey with the PAPI design. Households allocated to the CATI/CAWI sample were asked to choose between CATI and CAWI. After this setup process was completed, the survey was carried out in the second step according to the methods described above. Table 15.1 shows the response rates in the different survey methods.

Table 15.1: Response rates in the different survey methods (Stuttgart region survey 2009).

| Response rates | CATI | CAWI | CATI + CAWI | PAPI |
|---|---|---|---|---|
| Ratio of households successfully surveyed (%) (= useful questionnaires/sample size) | Not applicable due to the survey method | Not applicable due to the survey method | 8.1% | 8.8% |
| Ratio of households willing to participate (%) (= households willing to participate/sample size) | Not applicable due to the survey method | Not applicable due to the survey method | 16.7% | 15.1% |
| Ratio of households successfully surveyed (%) (= useful questionnaires/households willing to participate) | 64.1% | 39.7% | 51.8% | 63.0% |

1. This could be done successfully for nearly half of the households.
2. Persons/households without a valid phone number were contacted by an initial letter announcing the survey and asking for participation. Using an enclosed reply card, these persons were asked to report the number of persons in the household and to sign an agreement to participate. Households that replied were added to the sample accordingly.

The response rate at the household level was calculated as the number of all usable household questionnaires (households having completed the travel diaries) divided by the number of all households drawn from the municipal register. This response rate is only available for the PAPI sample and the combined CATI/CAWI sample, not separately for CATI and CAWI, because participants chose their preferred method only after agreeing to participate (households that were not willing to participate were not asked which method they would have chosen). The response rate of the combined CATI and CAWI sample was 8.1%; for the PAPI survey it was 8.8%. Considering the substantial effort required of participants, this response rate is at a normal level for a longitudinal survey including a seven-day trip report. The response rates of other surveys with a similar design, e.g., regional panel surveys (see Zumkeller et al., 2008), are at the same level. Thanks to the two-step survey process, it is possible to analyze the share of households willing to take part in the survey relative to the total number of households drawn. The ratio of households willing to participate is slightly higher for the CATI and CAWI method (16.7%) than for the PAPI method (15.1%), even though the final response rate (useful questionnaires/sample size) is slightly higher for the PAPI method. Another interesting result is that the actual participation rate among households that had agreed to participate is more than ten percentage points higher for the PAPI method (63.0%) than for the CATI and CAWI method (51.8%). The difference between the CATI method and the CAWI method is even larger. For the CATI method, the actual response rate (useful questionnaires/households willing to participate) is 64.1%, nearly the same as for the PAPI method (63.0%), while the participation rate after agreeing to participate is only 39.7% for the CAWI method. This means that the CATI and PAPI methods have approximately the same response rate; in terms of response rates, it does not matter whether the survey is done by phone or by paper questionnaire. In the CAWI method many households that had agreed to take part abandoned the survey during the reporting period. One reason might be that participants feel more obliged when taking part in the CATI survey than in a web-based self-interview. The high level of participation in the PAPI survey leads to the conclusion that this design best fits participants' need for an easy and uncomplicated mobility survey.
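The three ratios reported in Table 15.1 can be written down as a short sketch. The function name `response_rates` and the household counts in the example are hypothetical and are not taken from the Stuttgart survey; they serve only to show how the ratios are defined and that, when all three are computed from the same counts, the overall rate is the product of the willingness rate and the completion rate.

```python
def response_rates(drawn, willing, usable):
    """Compute the three household-level ratios used in a two-step survey.

    drawn   -- households drawn from the register (sample size)
    willing -- households that agreed to participate
    usable  -- households that returned useful questionnaires
    """
    if not (0 <= usable <= willing <= drawn) or willing == 0:
        raise ValueError("expected 0 <= usable <= willing <= drawn and willing > 0")
    return {
        # useful questionnaires / sample size
        "overall": usable / drawn,
        # households willing to participate / sample size
        "willingness": willing / drawn,
        # useful questionnaires / households willing to participate
        "completion": usable / willing,
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    rates = response_rates(drawn=10_000, willing=1_500, usable=900)
    for name, value in rates.items():
        print(f"{name}: {value:.1%}")
```

Separating the willingness rate from the completion rate is what makes it possible to see where households are lost: at the recruitment stage or during the seven-day reporting period.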

15.4.2. Sample Size

The numbers of households taking part in the different survey methods are shown in Table 15.2. The CATI/CAWI sample comprises a total of 4,465 households, split about equally between the two survey methods (CATI: 2,215 households; CAWI: 2,250 households). The number of households in the PAPI method is 1,096. Altogether, all approaches cover 5,561 households with 14,167 persons having reported their trips during a period of seven days.

Table 15.2: Sample sizes in the different survey methods (Stuttgart region survey 2009).

| Sample size | CATI | CAWI | CATI + CAWI | PAPI | All (CATI + CAWI + PAPI) |
|---|---|---|---|---|---|
| Total number of households with useful questionnaires | 2,215 | 2,250 | 4,465 | 1,096 | 5,561 |

Table 15.3: Ratio of household types in the different survey methods (Stuttgart region survey 2009).

| Types of households | CATI | CAWI | CATI + CAWI | PAPI | All (CATI + CAWI + PAPI) |
|---|---|---|---|---|---|
| Sample size (households) | 2,215 | 2,250 | 4,465 | 1,096 | 5,561 |
| Single household | 7.1% | 12.2% | 9.7% | 8.0% | 9.4% |
| Couple household without children | 11.2% | 22.0% | 16.7% | 20.0% | 17.3% |
| Family or single parent household with children <19 years | 19.6% | 33.9% | 26.8% | 25.5% | 26.5% |
| Senior household | 49.7% | 17.5% | 33.5% | 28.2% | 32.4% |
| Multi-person household without children <19 years | 12.5% | 14.4% | 13.4% | 16.8% | 14.1% |
| No classification | 0.0% | 0.0% | 0.0% | 1.6% | 0.3% |

15.4.3. Types of Households

Table 15.3 shows the preferred survey methods of the different types of households. Participants in the CATI/CAWI sample of the survey could choose between the CATI and CAWI methods. The classification "single household" contains households with one person aged between 18 and 60 years. A "couple household without children" means two adults aged between 18 and 60 years without any children. A "family or single parent household with children under 19 years" is a household with one or more children under 19 years and parent(s) aged under 60 years. A "senior household" describes a household with one or two persons older than 59 years. If there are three or more persons in a household and all persons are older than 18 years, it is a "multi-person household without children <19 years." Other types of households were not classified. Table 15.3 shows the share of the types of households which took part in the different surveys.

The CATI survey method shows a disproportionately high share of senior households in comparison to the PAPI method. This chapter assumes that the PAPI method is the benchmark, because it is a well-known method in Germany and a well-proven concept implemented in many surveys. In the PAPI method the population is well represented. Nearly half of the CATI sample consists of senior households. One reason why so many seniors take part in the telephone interview might be that elderly people are more likely to be at home than other people, so that the interviewer can reach them more easily. Another reason is that seniors are less used to dealing with the internet. In contrast, single households and couples without children are disproportionately rare in the CATI sample, because they are very hard to reach by phone. In the CAWI method, there is a disproportionately high ratio of single households, couple households without children, and family or single parent households, whereas the ratio of senior households is very low. The reason is that most elderly people are not very familiar with the internet and do not want to, or are not able to, deal with this medium. The comparison of the combined CATI/CAWI sample with the PAPI sample shows that the shares of the household types in these two subsamples are very similar. However, it can also be shown that there is a strong tendency of elderly people to join a telephone interview and a high affinity of young persons for using web technology in such a survey. It can be expected that these groups may tend not to participate in such a survey when their preferred survey method is not available. The PAPI method seems to suit all types of households, which is important for achieving a high response rate in all parts of the population.
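The household typology defined at the start of this section can be sketched as a small classification function. The stated age bounds overlap at the edges (for example, a person aged 60 fits both "aged between 18 and 60" and "older than 59"), and the chapter does not say how such cases were resolved. The order of the checks below, the treatment of everyone under 19 as a child and everyone else as a potential parent, and the function name `classify_household` are therefore assumptions made for illustration.

```python
SINGLE = "single household"
COUPLE = "couple household without children"
FAMILY = "family or single parent household with children <19 years"
SENIOR = "senior household"
MULTI = "multi-person household without children <19 years"
NONE = "no classification"


def classify_household(ages):
    """Classify a household from the ages (in years) of its members."""
    n = len(ages)
    if n == 0:
        return NONE
    # Assumption: the senior rule is checked first, so a one- or two-person
    # household whose members are all older than 59 is a senior household.
    if n <= 2 and all(age > 59 for age in ages):
        return SENIOR
    if n == 1:
        return SINGLE if 18 <= ages[0] <= 60 else NONE
    children = [age for age in ages if age < 19]
    adults = [age for age in ages if age >= 19]
    if children:
        # At least one child under 19 and all other members under 60.
        if adults and all(age < 60 for age in adults):
            return FAMILY
        return NONE
    if n == 2:
        # Two adults without children, both aged at most 60.
        return COUPLE if all(age <= 60 for age in ages) else NONE
    # Three or more persons, all older than 18.
    return MULTI


if __name__ == "__main__":
    for household in ([34], [30, 32], [40, 38, 10], [70, 68], [50, 48, 22], [65, 50]):
        print(household, "->", classify_household(household))
```

Households that fit none of the rules, such as a two-person household with one member over 60 and one under, fall into "no classification" in this sketch.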

15.4.4. Car Ownership

The availability of a car is one of the most important factors in individual mobility behavior. This survey differentiates between households with no car, with one car, and with two or more cars. Table 15.4 shows the results. The ratio of nonmotorized households is comparatively high in the CATI method: about 14% of all CATI households live without a car of their own. The CAWI method has a low ratio of households with no car (7.2%), and this ratio is even lower for the PAPI interview (4.8%). In the population of the Stuttgart region, about 12.1% of households have no car. The ratio of low-motorized households (no car or one car) is also comparatively high in the CATI method: about 70% of all CATI households own one car or none. In addition, the ratio of households with two or more cars is comparatively low (30.2%). The CAWI method shows the highest ratio of households with two or more cars (44.5%). The ratio of households with two or more cars is 37.4% in the combined CATI and CAWI sample and 44.3% in the PAPI interview. In the population of the Stuttgart region, about 31.9% of all households own more than one car. That means that the combination of the CATI and CAWI methods shows a ratio which is close to the figures for the Stuttgart region population. In the PAPI method, the responses show a higher percentage of households with more than one car and a lower percentage of households with no car compared to the population in the Stuttgart region. It can be assumed that households with a low level of mobility tend not to participate in a PAPI interview, while it is more likely that they can be convinced to join the survey in a telephone interview. These findings have already been identified in the selectivity study of the German Mobility Panel (Zumkeller, Chlond, Kuhnimhof, & Manz, 2002).

Table 15.4: Ratio of car ownership in the households in the different survey methods (survey columns: Stuttgart region survey 2009).

| Car ownership | CATI | CAWI | CATI + CAWI | PAPI | All (CATI + CAWI + PAPI) | Stuttgart region population parameter |
|---|---|---|---|---|---|---|
| Sample size (households) | 2,215 | 2,250 | 4,465 | 1,096 | 5,561 | 2,679,000 |
| No car | 14.1% | 7.2% | 10.6% | 4.8% | 9.5% | 12.1% |
| One car | 55.7% | 48.3% | 52.0% | 49.5% | 51.5% | 56.0% |
| Two and more cars | 30.2% | 44.5% | 37.4% | 44.3% | 38.8% | 31.9% |
| Not specified | 0.0% | 0.0% | 0.0% | 1.4% | 0.3% | 0.0% |
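The comparison of each subsample with the population can be made explicit with a short sketch that uses the shares from Table 15.4. The percentage-point deviations follow directly from the table. The summary measure (half the sum of absolute deviations, often called a dissimilarity index) is not used in the chapter and is added here only as one possible way to condense the comparison into a single number; the function names are likewise hypothetical.

```python
# Shares in percent, transcribed from Table 15.4 ("not specified" omitted).
POPULATION = {"no car": 12.1, "one car": 56.0, "two or more cars": 31.9}
SAMPLES = {
    "CATI": {"no car": 14.1, "one car": 55.7, "two or more cars": 30.2},
    "CAWI": {"no car": 7.2, "one car": 48.3, "two or more cars": 44.5},
    "CATI + CAWI": {"no car": 10.6, "one car": 52.0, "two or more cars": 37.4},
    "PAPI": {"no car": 4.8, "one car": 49.5, "two or more cars": 44.3},
}


def deviations(sample, population):
    """Percentage-point difference (sample minus population) per category."""
    return {key: sample[key] - population[key] for key in population}


def dissimilarity(sample, population):
    """Half the sum of absolute deviations, in percentage points."""
    return 0.5 * sum(abs(d) for d in deviations(sample, population).values())


if __name__ == "__main__":
    for name, shares in SAMPLES.items():
        diffs = deviations(shares, POPULATION)
        formatted = ", ".join(f"{k}: {v:+.1f}" for k, v in diffs.items())
        print(f"{name}: {formatted} | index = {dissimilarity(shares, POPULATION):.1f}")
```

Because the PAPI column contains 1.4% of households with unspecified car ownership, its three shares do not sum to 100%, which slightly affects any such summary measure.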

15.4.5. Age

For the comparison by age, four age groups were defined: persons under 19 years of age, young adults aged 19 to 34, persons aged 35 to 64, and persons aged 65 and older. The results of these analyses are shown in Table 15.5.

Table 15.5: Ratio of the age (in classes) of the persons in the different survey methods (Stuttgart region survey 2009).

| Age | CATI | CAWI | CATI + CAWI | PAPI | All (CATI + CAWI + PAPI) | Stuttgart region population parameter |
|---|---|---|---|---|---|---|
| Sample size (persons) | 5,166 | 6,149 | 11,315 | 2,852 | 14,167 | 2,679,000 |
| 0–18 years | 20.3% | 27.2% | 24.0% | 23.6% | 23.9% | 18.8% |
| 19–34 years | 8.7% | 15.6% | 12.5% | 11.3% | 12.2% | 19.6% |
| 35–64 years | 38.3% | 46.5% | 42.7% | 46.7% | 43.5% | 41.9% |
| 65 years and older | 32.7% | 10.7% | 20.8% | 17.5% | 20.1% | 19.7% |
| Not specified | 0.0% | 0.0% | 0.0% | 1.0% | 0.2% | 0.0% |

Again, the CATI sample shows a disproportionate share of retired persons aged 65 and over. One reason is that these people are less used to dealing with a computer and the internet than younger people. Another is that elderly people are at home most of the time and are therefore easy to reach by phone. With the CATI method it is hard to reach people aged 19 to 34, either because they are not at home very often or because they only have cell phones and are not listed in official telephone registers. This age group is underrepresented in all the methods applied; the CAWI method reaches it best. Conversely, the CAWI method reaches very few elderly people. A combination of both methods (CATI and CAWI) is therefore beneficial, even though the group of persons aged 19–34 remains underrepresented. A comparison of the combined CATI and CAWI sample with the PAPI sample shows that both results are similar.
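One common way to compensate for this kind of age selectivity is post-stratification, in which each age group receives the weight "population share divided by sample share". The chapter does not describe the weighting actually applied, so the following Python sketch is only an illustration based on the shares in Table 15.5; the function name is hypothetical.

```python
# Age-group shares in percent, copied from Table 15.5.
POPULATION = {"0-18": 18.8, "19-34": 19.6, "35-64": 41.9, "65+": 19.7}
SAMPLES = {
    "CATI":        {"0-18": 20.3, "19-34": 8.7,  "35-64": 38.3, "65+": 32.7},
    "CAWI":        {"0-18": 27.2, "19-34": 15.6, "35-64": 46.5, "65+": 10.7},
    "CATI + CAWI": {"0-18": 24.0, "19-34": 12.5, "35-64": 42.7, "65+": 20.8},
}


def poststratification_weights(sample_shares, population_shares):
    """Weight per age group = population share / sample share."""
    return {g: population_shares[g] / sample_shares[g] for g in population_shares}


if __name__ == "__main__":
    for method, shares in SAMPLES.items():
        weights = poststratification_weights(shares, POPULATION)
        print(method, {g: round(w, 2) for g, w in weights.items()})
```

A weight above 1 marks an underrepresented group. In the combined CATI and CAWI sample, for instance, the 19–34 group would receive a weight of about 1.57.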

15.4.6. Profession

The sample was analyzed by the profession of the participants using the following categories: persons working full-time or part-time, persons with no job (not working or jobless), persons in education, and retired persons (Table 15.6). As shown above, retired persons were least likely to respond via the CAWI method. In contrast, the percentage of elderly persons is distinctly higher in the CATI sample. There are more working people (full-time or part-time) in the CAWI sample than in the CATI sample. The share of working people in the merged CATI and CAWI sample is about the same as in the PAPI sample. It is thus reasonable to assume that the combined CATI and CAWI sample leads to nearly the same results as the PAPI method.

15.4.7. Season Tickets for Public Transport

To test the selectivity of the samples with regard to the use of public transport, the share of participants owning a season ticket was analyzed. Table 15.7 shows the shares of persons with and without a season ticket.


Table 15.6: Ratio of the profession of the persons in the different survey methods (Stuttgart region survey 2009).

| Profession | CATI | CAWI | CATI + CAWI | PAPI | All (CATI + CAWI + PAPI) |
|---|---|---|---|---|---|
| Sample size (persons) | 5,166 | 6,149 | 11,315 | 2,852 | 14,167 |
| Fully working | 22.2% | 32.8% | 28.0% | 29.2% | 28.2% |
| Partly working | 10.8% | 13.6% | 12.3% | 14.1% | 12.7% |
| Not working/Jobless | 6.2% | 7.7% | 7.0% | 7.6% | 7.1% |
| In education | 18.8% | 23.9% | 21.6% | 23.9% | 22.1% |
| Retired person | 35.9% | 11.9% | 22.9% | 19.5% | 22.2% |
| Not specified | 6.1% | 10.1% | 8.3% | 5.8% | 7.8% |

Table 15.7: Ratio of the season ticket in public transport of the persons in the different survey methods (Stuttgart region survey 2009).

| Season ticket | CATI | CAWI | CATI + CAWI | PAPI | All (CATI + CAWI + PAPI) |
|---|---|---|---|---|---|
| Sample size (persons) | 5,166 | 6,149 | 11,315 | 2,852 | 14,167 |
| No season ticket | 78.5% | 74.3% | 76.3% | 69.5% | 74.9% |
| Season ticket | 21.5% | 25.7% | 23.7% | 23.2% | 23.6% |
| Not specified | 0.0% | 0.0% | 0.0% | 7.3% | 1.5% |

The figures show that there is hardly any difference in season-ticket ownership between the survey methods. In the CAWI sample there are slightly more persons with a season ticket than in the CATI sample. This is consistent with the fact that retired people, who are more strongly represented in the CATI sample, normally own season tickets less often because they do not make daily trips to work. Overall, there are no significant differences in the ownership of season tickets between the survey methods. In general, the results also show that CATI and CAWI surveys help to avoid missing values in the data, as all details can be asked and checked during the interviews. In the PAPI method such data gaps can only be eliminated with enormous effort.


15.4.8. Mobility Key Figures, Modal Split, and Trip Purpose

The preceding sections examined the structure of the sample and its participants. This section analyzes the key figures of mobility (see Table 15.8). In the CATI sample the share of trip makers is slightly lower than in the CAWI and PAPI samples, which is due to the higher share of retired people taking part in the CATI method. The number of trips per person and day is very similar in all three methods. Kilometers per person and day are lower in the CATI sample than in the other samples, again because of the higher share of elderly people in this subsample. As the trips of retired persons are usually shorter (trip purposes: shopping, leisure), this result is consistent with the findings above. The statistical analyses (t-test of means) show that kilometers per person and day differ significantly between the CATI and CAWI methods and between the PAPI and CATI methods (figures marked with * in Table 15.8).

The highest daily travel time was measured in the CAWI method (77 minutes) and the lowest in the PAPI method (72 minutes). This is clearly related to the number of kilometers traveled per day measured in the CAWI and CATI methods. The t-test (means) shows that travel time per person and day also differs significantly between the CATI and CAWI methods and between the PAPI and CATI methods (figures marked with * in Table 15.8). The different levels of speed and trip length are a result of the mobility key figures explained above.

Table 15.8: Mobility key figures in the different survey methods (Stuttgart region survey 2009).

| Mobility key figures | CATI | CAWI | CATI + CAWI | PAPI | All (CATI + CAWI + PAPI) |
|---|---|---|---|---|---|
| Sample size (person-days) | 34,496 | 39,585 | 74,081 | 16,653 | 90,734 |
| Trip makers (%) | 84% | 87% | 86% | 90% | 87% |
| Trips per person and day (no.) | 3.0 | 3.1 | 3.0 | 3.1 | 3.0 |
| Kilometers per person and day (km) | 29.5* | 34.9* | 32.3 | 35.3* | 32.9 |
| Minutes per person and day (min.) | 74* | 77* | 76 | 72* | 75 |
| Average speed (km/h) | 24.1 | 27.0 | 25.7 | 29.4 | 26.4 |
| Average trip length (km) | 9.9 | 11.4 | 10.7 | 11.2 | 10.8 |
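The last two rows of Table 15.8 appear to follow from the rows above them. As a plausibility check, the Python sketch below assumes that average speed is daily distance divided by daily travel time, and that average trip length is daily distance divided by the number of trips per day. The chapter does not state these formulas, so they are assumptions, and small deviations from the published values (presumably caused by rounding of the inputs) are to be expected.

```python
# method: (trips per day, km per day, minutes per day,
#          reported speed in km/h, reported trip length in km), from Table 15.8
KEY_FIGURES = {
    "CATI":        (3.0, 29.5, 74, 24.1, 9.9),
    "CAWI":        (3.1, 34.9, 77, 27.0, 11.4),
    "CATI + CAWI": (3.0, 32.3, 76, 25.7, 10.7),
    "PAPI":        (3.1, 35.3, 72, 29.4, 11.2),
    "All":         (3.0, 32.9, 75, 26.4, 10.8),
}


def average_speed_kmh(km_per_day, minutes_per_day):
    """Assumed definition: daily distance divided by daily travel time in hours."""
    return km_per_day / (minutes_per_day / 60.0)


def average_trip_length_km(km_per_day, trips_per_day):
    """Assumed definition: daily distance divided by daily number of trips."""
    return km_per_day / trips_per_day


if __name__ == "__main__":
    for method, (trips, km, minutes, speed, length) in KEY_FIGURES.items():
        print(method,
              round(average_speed_kmh(km, minutes), 1), "vs reported", speed,
              "|", round(average_trip_length_km(km, trips), 1), "vs reported", length)
```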


Table 15.9: Modal split in the different survey methods (Stuttgart region survey 2009).

| Modal split | CATI | CAWI | CATI + CAWI | PAPI | All (CATI + CAWI + PAPI) |
|---|---|---|---|---|---|
| Sample size (person-days) | 34,496 | 39,585 | 74,081 | 16,653 | 90,734 |
| Foot | 27.6% | 22.9% | 25.0% | 22.3% | 24.5% |
| Bike | 4.3% | 5.5% | 4.9% | 4.6% | 4.9% |
| Motorcycle and car as driver | 42.6% | 44.1% | 43.5% | 47.2% | 44.1% |
| Car as passenger | 13.8% | 13.4% | 13.6% | 13.2% | 13.5% |
| Public transport (bus, rapid transit) | 10.5% | 12.3% | 11.4% | 10.6% | 11.3% |
| Railway | 1.1% | 1.6% | 1.4% | 1.8% | 1.5% |
| Others | 0.1% | 0.2% | 0.2% | 0.3% | 0.2% |

Considering the individual reporting days, all mobility key figures decrease over the seven-day reporting period. The regression analyses show that the number of trips decreases significantly in all methods. The distance per day also decreases, but not significantly, owing to the high standard deviation. The decline over the seven reported days is smaller in the CATI method than in the other two methods, probably because of the telephone calls and the resulting obligation to answer. The PAPI method shows the largest decline. It should be noted that in the survey design all days of the week had the same likelihood of being chosen as a participant's starting day.

Table 15.9 shows the modal split of the different survey methods. The results are similar for all survey methods. Small differences were found for the mode "by foot": in the CATI method the modal share of trips on foot is relatively high compared with the CAWI and PAPI methods. A comparison of trips on foot across age groups shows that elderly people report slightly more such trips in the CATI method than in the other two methods. Among younger people the number of trips on foot is nearly the same in all methods. Overall, elderly people make more trips on foot than younger people, but the difference is not significant. The higher mode share for walking therefore appears to stem not from the elderly participants but from the survey method. We assume that it is explained by the opportunity, during the telephone call, to ask the person about forgotten trips and especially about short trips, which are normally made on foot. Participants are thus able to recall forgotten trips, and in most cases they even remember short trips made on foot. The high share of short trips on foot is therefore regarded as a method effect.


Table 15.10: Trip purpose in the different survey methods (Stuttgart region survey 2009).

| Trip purpose | CATI | CAWI | CATI + CAWI | PAPI | All (CATI + CAWI + PAPI) |
|---|---|---|---|---|---|
| Sample size (person-days) | 34,496 | 39,585 | 74,081 | 16,653 | 90,734 |
| Work | 7.7% | 11.6% | 9.8% | 10.3% | 9.9% |
| Business trips | 1.2% | 2.1% | 1.7% | 2.4% | 1.8% |
| Education | 3.7% | 5.1% | 4.5% | 3.9% | 4.4% |
| Leisure | 18.5% | 16.1% | 17.2% | 15.8% | 16.9% |
| Shopping | 11.6% | 9.5% | 10.4% | 9.9% | 10.3% |
| Visitation | 12.5% | 9.9% | 11.1% | 11.8% | 11.2% |
| Service | 4.3% | 4.8% | 4.6% | 4.6% | 4.6% |
| Others | 0.2% | 0.5% | 0.4% | 0.8% | 0.5% |
| Back home | 40.3% | 40.4% | 40.4% | 40.5% | 40.4% |

The shares of the trip purposes in the different methods are shown in Table 15.10. In the CATI method the share of trips with the purposes work and business is lower than in the other methods because of the relatively high share of retired persons. No relevant differences concerning these purposes can be found between the CAWI and the PAPI method. The share of the purpose "education" is comparatively high in the CAWI method because of the high share of young persons under 19 years of age (see Table 15.5). In all other methods the purpose education is at a comparable level. All other purposes also have similar shares across the methods. It is remarkable that the results of the combined CATI and CAWI subsamples are at the same level as those of the PAPI method.

15.5. Conclusions and Outlook

In the Stuttgart region, a multiday survey covering a reporting period of one week was conducted to obtain current and reliable data on mobility behavior needed for planning and modeling. The survey combined an innovative survey design with proven methods in one setup, using three survey methods (PAPI, CATI, and CAWI) in parallel. With its seven-day trip report, it followed the design of the German Mobility Panel (MOP), a well-established annual multiday survey that has been conducted in Germany using the PAPI design since 1994.

Participants were recruited on the basis of data from the municipal registration offices. As these data do not contain telephone numbers, it was necessary to set up two different recruitment designs. Persons with an available phone number were first contacted by phone. Persons without an available phone number were contacted by mail and asked to return a postal reply card to confirm their participation. As expected, an initial contact by phone led to much higher response rates than recruitment by an initial letter with an enclosed postal reply card.

The survey showed in general that the PAPI method works very well and is accepted in all parts of the population. Using the PAPI design, it was therefore possible to obtain reliable results without unexpected nonresponse effects. Besides the PAPI-based sample, a second sample was set up using the CATI and the CAWI design in parallel, so that participants could choose between a telephone and a web-based interview. The results of this part of the survey show a clear difference between the two methods: younger people have a strong preference for the CAWI survey, while elderly or retired people clearly favor the telephone interview. As a result, most key figures of household and person attributes as well as of mobility behavior are affected in both subsamples. Analyzing the data of the CATI and CAWI subsamples in combination nevertheless gives appropriate results, comparable to those of the PAPI sample.

It has to be stated, however, that the CAWI method in particular, when used as a stand-alone survey, bears the risk of a massive overrepresentation of younger persons and thus calls for a precise weighting and verification strategy before the survey data are used. It therefore seems necessary to combine the CAWI method with more established methods (PAPI or especially CATI) in a survey to prevent problems with nonresponse in the older parts of the population. A combination of different survey methods always increases the costs and complexity of a survey. It therefore has to be considered carefully whether a multi-method survey using the PAPI, CAWI, and CATI methods in parallel is the most efficient way to obtain reliable data on mobility behavior. This is especially true for local surveys covering only a small or medium-sized sample.

References

Arentze, T. (2005). Internet-based travel surveys: Selected evidence on response rates, sampling bias and reliability. Transportmetrica, 1(3), 193–207.

Axhausen, K. W., Löchl, M., Schlick, R., Buhl, T., & Widmer, P. (2007). Fatigue in long-duration travel diaries. Transportation, 34, 143–160.

Braunsberger, K., Wybenga, H., & Gates, R. (2007). A comparison of reliability between telephone and web-based surveys. Journal of Business Research, 60, 758–764. doi:10.1016/j.jbusres.2007.02.015

Kagerbauer, M., & Manz, W. (2009). Anforderungen an Mobilitätsdaten aufgrund heterogener Entwicklung der Verkehrsnachfrage. Mobiles Leben – Festschrift für Prof. Dr.-Ing. Dirk Zumkeller, S. 84–101.

Kitamura, R., & Bovy, H. L. (1987). Analysis of attrition biases and trip reporting errors for panel data. Transportation Research A, 21(4/5), 287–302. doi:10.1016/0191-2607(87)90051-3


Meurs, H., van Wissen, L., & Visser, J. (1989). Measurement biases in panel data. Transportation, 16, 175–194. doi:10.1007/BF00163114

Potoglou, D., & Kanaroglou, P. (2008). Comparison of phone and web-based surveys for collecting household background information. Paper presented at the 8th International Conference on Survey Methods in Transport, May 25–31, Annecy, France.

Zumkeller, D., Chlond, B., & Kagerbauer, M. (2008). Regional panels against the background of the German mobility panel – An integrated approach. 8th International Conference on Survey Methods in Transport, May 25–31, Annecy, France.

Zumkeller, D., Chlond, B., Kuhnimhof, T., Kagerbauer, M., Schlosser, C., Wirtz, M., & Ottmann, P. (2009). Deutsches Mobilitätspanel (MOP) – wissenschaftliche Begleitung und erste Auswertungen: Bericht 2008 im Auftrag des Bundesministers für Verkehr, Bau und Stadtentwicklung. Forschungsprojekts FE Nr. 70.0813/2007.

Zumkeller, D., Chlond, B., Kuhnimhof, T., & Manz, W. (2002). Selektivität des Mobilitätspanels. Schlussbericht zu FE96.0732/2002 für das BMV.

Chapter 16

Comparing Trip Diaries with GPS Tracking: Results of a Comprehensive Austrian Study

Birgit Kohla and Michael Meschik

Abstract

Purpose: In order to analyse the applicability, comparability and limitations of GPS technology in travel surveys, different mobility survey techniques were tested in an Austrian pilot study.

Methodology/approach: Four groups of voluntary respondents recorded their travel behaviour over a period of three consecutive days. The groups were assigned to three different and combined methods of data collection: paper–pencil trip diaries, passive GPS tracking, active GPS tracking and prompted recall interviews.

Findings: The resulting mobility parameters show that self-reported paper–pencil surveys yield accurate sociodemographic information on the respondents as well as trip purposes and modes of transportation, although too few trips are reported. Passive GPS-based methods minimize the strain for respondents. Methods that combine GPS-based data collection with a questionnaire currently provide the most reliable mobility data.

Research limitations/implications: Due to funding restrictions the sample had to be relatively small (235 participants). Further development in research methodology will increase the effectiveness of automated data analysis, for example through more accurate detection of activities and transport modes. The usefulness of GPS-based data collection in large-scale surveys is planned to be tested in the next Austrian national travel survey.

Transport Survey Methods: Best Practice for Decision Making. Copyright © 2013 by Emerald Group Publishing Limited. All rights of reproduction in any form reserved. ISBN: 978-1-78190-287-5


Originality/value of paper: The pilot study allows a detailed comparison of traditional and GPS-based travel survey methods for the first time, due to data collection combined with prompted recalls.

Keywords: Travel survey; travel behaviour; GPS-tracking; trip diary; prompted recall survey; individual mobility parameters

16.1. Introduction

GPS tracking of trips is nowadays expected to replace traditional paper–pencil and telephone travel surveys. The latter have several disadvantages, such as the underreporting of trips. In an Austrian travel survey (MobiFIT), different techniques were used over three days with a total of 235 participants to assess the pros and cons of both survey approaches. The mobility parameters show that self-reported paper–pencil surveys yield accurate sociodemographic information about the respondents as well as trip purposes and modes of transportation, whereas too few trips are reported. Automated GPS-based methods have weaknesses in identifying trips, modes and purposes. With GPS, the length and duration of trips can be determined more accurately, provided the trips are identified correctly. Average trip distances estimated and self-reported by respondents almost match the values from the GPS evaluation, although the standard deviation is higher, as some respondents overestimate and others underestimate their trip lengths. The daily trip duration was overestimated by the respondents.

GPS tracking records behaviour but cannot reveal motives. Consequently, we had to interview GPS respondents in prompted recall surveys right after the mobility recording to collect sociodemographic data and to verify trips, modes of transport and trip purposes. Passive GPS-based methods minimize the strain for respondents. Automated analysis of GPS data sets can become more economical as soon as good post-processing algorithms are developed and the costs of devices decrease. Methods that combine GPS-based data collection with a questionnaire currently provide the most reliable mobility data. Current evaluations of the recorded acceleration data promise improved mode detection.

16.2. State-of-the-Art

Traditional paper–pencil or telephone travel surveys are commonly used for medium and large travel surveys. They have proved reliable, and meticulous guidelines have been developed over the years on how to conduct all necessary steps to obtain significant and reliable results. Yet mobility researchers are well aware that these techniques also have several disadvantages, such as:

• strain for respondents;
• non-response or trips omitted due to privacy concerns, low acceptance, misinterpretation etc.;
• incomplete representation of the mobility behaviour (unknown extent of missing, wrong or inaccurate data);
• difficulties with reading or understanding the questionnaires and the accompanying instructions.

Current travel survey studies are trying to eliminate these problems by implementing new technologies, especially GPS tracking (e.g. Wolf & Lee, 2008). Expectations are high regarding the number of trips recorded, the accuracy of the trip lengths and the exact locations of individual trip origins and destinations. Marchal et al. (2008) conducted a GPS-based survey as a 500-respondent sub-sample of the French National Travel Survey 2007–2008. They used passive tracking GPS devices as well as prompted recall face-to-face interviews before and after the tracking period. Stopher, FitzGerald, and Xu (2007) tested a mobility survey technique with GPS technology and automated data processing within the Sydney Household Travel Survey. In this study, origins and destinations of 'trips' are detected by a defined stop time (inactivity/no movement) of two minutes or more. Further data processing algorithms were developed (Stopher, 2008; Stopher, Clifford, Zhang, & FitzGerald, 2008). Further research has addressed multiday GPS surveys (Stopher, Kockelman, Greaves, & Clifford, 2008) and the comparison of GPS surveys and trip diaries (Stopher & Greaves, 2010). Wolf and Bricka (2011) included a GPS-based survey in the Chicago regional household travel inventory 2007. A first large-scale GPS household travel survey with web-based prompted recall interviews was conducted between 2009 and 2010 in the Cincinnati, Ohio region (Stopher et al., 2012). Table 16.1 gives an overview of response issues in these studies. Doherty and Lee-Gosselin (2006) found 20–30% more trips with a combined GPS survey and prompted recall than with a traditional survey.
Schüssler and Axhausen (2008) calculated 52% more trips with automated trip detection, compared to the Swiss Micro Census survey 2005 (Swiss Federal Statistical Office, 2006). Stopher et al. (2007) noted 7.4% fewer trips in the traditional survey, and that trip length and trip duration are overestimated by respondents. More recent research (Stopher & Greaves, 2010) shows an underreporting of trips in trip diaries of 18.6%. The same authors found self-reported durations in trip diaries about 10% longer than those measured with GPS. Trip length was underestimated by respondents by about 22%. This discrepancy presumably results from respondents' misjudgement as well as from the algorithm used to calculate distances from GPS positions. The GPS-based approach also promises less strain for the respondents.

Although these new data collection methods can deliver more accurate trip-location data, thus remedying some of the drawbacks mentioned above, not all problems are solved. New issues arise, such as extensive post-processing of data, technical difficulties resulting in missing information, and difficulties collecting sociodemographic data and information on mode of transportation, trip purpose, car occupancy, ticket costs etc. There are hardly any studies comparing the completeness and accuracy of data collected with different survey methods.

Table 16.1: Sample sizes and response rates in current GPS surveys.

| Survey | Sample for recruitment | Responding sample | Response rate | Participating and useable sample | Drop-out rate during the survey |
|---|---|---|---|---|---|
| Sydney HTS (Stopher et al., 2007) | 5,000 H (households of traditional survey) | (70 H selected) | – | 51 H | 27% (during GPS-tracking); 36% [a] (during prompted recall) |
| Netherlands (Bohte & Maat, 2008) | – | 1,200 P | – | 1,104 P | 8% [b] |
| French NTS (Marchal et al., 2008) | 19,000 P (participants of traditional survey) | 6,100 P (800 selected) | 32% | 500 P | 38% |
| Victoria ISTA07 (Stopher & Greaves, 2010) | 306 H (households of traditional survey) | 98 H | 32% | 58 H (+18 partially completed H) | 41% (22%) [b] |
| Chicago RHTI (Wolf & Bricka, 2011) | 2,741 P (participants of traditional survey) | 1,217 P (160 selected) | 44% [b] | 115 P | 28% [b] |
| Cincinnati HTS (Stopher et al., 2012) | 4,000 H | 2,608 H | 65% [b] | 601 H | 17% (during GPS-tracking); 74% [b] (during prompted recall) |

H, households; P, participants.
[a] Mean value of drop-out rates; Stopher et al. (2007) found different rates according to the method of prompted recall: telephone 24%, internet 65%, conventional mail 23%, face-to-face 43%.
[b] Results from own calculation.

It can be concluded that the classical definition of a trip (out-of-home mobility between two different places with specific purposes) does not apply to some of the GPS-detected 'trips': here a 'trip' is often identified as movement over distance and time between two places of assumed immobility, regardless of whether a purpose was fulfilled or not. Consequently, GPS-based surveys 'detect' up to seven trips per mobile person and day (Stopher, Clifford, & Halling, unpublished), compared to about four trips per day in paper- or telephone-based surveys, which makes the results difficult to compare. GPS-based trip detection needs technical criteria and currently diverges from the originally intended trip definition. From automatically analysed GPS data we can measure something that looks like a trip, but is this what we intend to measure, or more likely just something that satisfies a technology hype? As old-fashioned survey techniques are increasingly replaced with modern (GPS) technology, those issues should at least be discussed. Both types of surveys have pros and cons, and important questions are: Which results do we expect, and how can we measure and assess them in the most efficient manner? Which (combinations of) survey techniques are best suited in terms of quality (representativeness, response rates etc.), minimal burden for the respondents and reasonable costs?
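To make this technical trip definition concrete, the Python sketch below splits a GPS track into 'trips' wherever the device shows no movement for at least two minutes, the stop time reported above for the Sydney Household Travel Survey. It is a simplified illustration and not the algorithm of any of the cited studies: the speed threshold, the planar coordinates in metres, the synthetic test track and all function names are assumptions made for this example.

```python
from math import hypot

STOP_SECONDS = 120      # dwell time that ends a trip (two minutes, as cited above)
SPEED_THRESHOLD = 0.5   # m/s below which a fix counts as "not moving" (assumed value)


def split_into_trips(points, stop_seconds=STOP_SECONDS,
                     speed_threshold=SPEED_THRESHOLD):
    """Split a track into trips.

    points: list of (t_seconds, x_metres, y_metres), sorted by time.
    Returns a list of (start_time, end_time) tuples, one per detected trip.
    """
    trips = []
    trip_start = None   # start time of the trip currently in progress
    last_move = None    # time of the most recent fix reached while moving
    for (t0, x0, y0), (t1, x1, y1) in zip(points, points[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue    # skip duplicate or out-of-order fixes
        speed = hypot(x1 - x0, y1 - y0) / dt
        if speed > speed_threshold:
            if trip_start is None:
                trip_start = t0
            last_move = t1
        elif trip_start is not None and t1 - last_move >= stop_seconds:
            # The device has been stationary long enough: close the trip.
            trips.append((trip_start, last_move))
            trip_start = None
    if trip_start is not None:
        trips.append((trip_start, last_move))   # track ended during or soon after a trip
    return trips


def make_track(moving_intervals, end_t, speed=10.0):
    """Synthetic 1 Hz track along the x axis, for demonstration only."""
    track, x = [], 0.0
    for t in range(end_t + 1):
        track.append((t, x, 0.0))
        if any(a <= t < b for a, b in moving_intervals):
            x += speed
    return track


if __name__ == "__main__":
    # Two movement phases separated by a four-minute stop: two trips.
    print(split_into_trips(make_track([(60, 160), (400, 440)], 499)))
    # A one-minute pause (e.g. at a traffic light) does not split the trip.
    print(split_into_trips(make_track([(0, 100), (160, 200)], 260)))
```

A rule of this kind knows nothing about trip purposes, which illustrates why GPS-detected 'trips' need not coincide with trips in the classical sense.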

16.3. The Austrian Travel Survey MobiFIT

In a recent research project (Herry et al., 2011), different travel survey techniques were combined to analyse the applicability and limitations of GPS technology in travel surveys. The integration of new technologies, particularly GPS technology, was tested to track people of all age groups and social levels on all their trips and all modes of transportation. Experience was gained in technical aspects (technologies, devices, data collection, data transfer and data processing) as well as in methodological issues (passive tracking and active tracking compared to traditional paper–pencil trip diaries). The pilot surveys took place between November 2009 and February 2010 in two Austrian regions, one rural region ('Tullnerfeld', about 41,000 inhabitants) and one urban region (city of Graz, about 260,000 inhabitants). Four groups of voluntary respondents recorded their mobility over a period of three consecutive days. Table 16.2 gives an overview of the groups, which were assigned to three different and combined methods of data collection:

• In the first group (A1), respondents were asked simply to carry small GPS devices (see Figure 16.1) during the data collection period (passive tracking).
• In a second group (A2), respondents also carried these GPS devices, but were additionally asked to push the corresponding button on the device whenever they changed to another mode of transportation (active tracking).
• Participants in the third group (B) had to carry GPS devices (passively, as in group A1) and were also asked to fill in the same trip diaries as group C for the same time period.
• Respondents in the fourth group (C) filled in standard paper–pencil trip diaries for three consecutive days; they were used as the control group. The questionnaire was conventionally mailed to all households of the sample, and all households were reminded of the date by a phone call before the first survey day.

Table 16.2: Groups of respondents and numbers of participants in MobiFIT.

| Group | Methods of data collection | Number of participants |
|---|---|---|
| A1 | Passive GPS tracking with prompted recall | 35 |
| A2 | Active GPS tracking with prompted recall | 41 |
| B | Passive GPS tracking with prompted recall and trip diary | 58 |
| C | Paper–pencil trip diary | 101 |

Figure 16.1: MobiTest-device; the buttons were used for identifying different transport modes in MobiFIT. Note: "MitfahrerIn" = passenger; "LenkerIn" = driver.

Participants in all GPS groups (A1, A2 and B) were interviewed at home shortly after the data collection in order to complete the details (origin and destination, transport mode and travel purpose) of each trip. In those prompted recall surveys, respondents were shown their geocoded mobility (identified trips) on a digital map to help them recollect their past activities. The intention was to retrieve calibration data for the comparison with trip diary data as well as with data from the automated GPS-evaluation process.

16.4. Results

16.4.1. Technical Issues

It can be assumed that the technology of GPS-based mobility recording devices develops rapidly: every year newer models of GPS data-loggers become available, featuring higher accuracy, a quicker first GPS fix, more memory space, longer battery life etc. Most devices are, however, optimized for the huge market in outdoor sports activities rather than for the tiny market of mobility research. Literature is not a very reliable reference for choosing suitable devices: as soon as a paper on tested devices is published, the next generation of devices is already on the market. To determine the devices best suited for trip detection, several (then) current GPS data-loggers were tested. Some of the criteria underlying the selection were: size, data quality, battery life, memory capacity, sleep mode, minimum time to the first GPS fix, available buttons for active tracking, additional functions, ease of handling and recharging, data transfer and processing. Finally, the MobiTest device of MGE Data (2011) (www.mobitest.eu), which has no display, was used for the pilot survey. One important reason was the possibility of leasing the devices rather than having to buy them, with prices starting from €300 apiece (see Figure 16.1).

Besides GPS data, this device stores additional information on the status of the device (e.g. error codes, reasons for deactivation, pushed buttons) and acceleration data in three directions (Strnad, 2008). Position data are recorded every second, acceleration data ten times per second. There are six buttons on the device: a power button and five freely assignable buttons. Those five were used to enter the mode of transportation in the active tracking group (A2). Respondents were not supposed to turn off the devices during the survey; however, some of them did, mostly during night-time.

Devices were handed out to participating respondents during a personal household interview prior to the three-day survey period. All participants received a brief introduction to the devices plus an introductory guide. Consequently, only 0.8% of them had problems with the handling. An advantage of this device is that it is of no use for any purpose other than mobility data recording, and data can only be downloaded with the producer's special software. None of the 50 devices used during the survey were lost or broken. Even six-year-old children carried the devices without problems. Collected data were downloaded during the prompted recall surveys in the households and further processed with the special software of MGE Data (2011), which is based on Google Maps. Trips were examined, verified, corrected and completed during the interview.


16.4.2. Recruitment and Response Issues

Participants' contact data for both regions were acquired from a random sample of addresses, drawn from a register of residents in Graz and from a commercial register in Tullnerfeld. The corresponding telephone numbers were looked up in the public telephone directory. Methodological problems with representativeness in all groups resulted from unlisted telephone numbers (about 50%). However, this was not a core issue in this survey on different survey techniques. One hundred and twenty-one households were recruited from a random sample of 646 contacted households in both regions. Data from 235 persons above six years of age, covering 705 reference days, were collected from these households. The response rates in the GPS groups A1 (17.9%) and A2 (15.2%) tended to be lower than in the control group C with trip diaries (28.1%), although participants rated filling out questionnaires as more annoying than carrying GPS data-loggers (see Figure 16.2). In comparable paper–pencil surveys in Austria with up to six reminders, response rates above 50% have been achieved (Sammer, 1995).

Reasons given for non-participation were: perceived lack of suitability due to old age, illness, 'uninteresting' mobility behaviour or other personal circumstances (18%), too much bother and effort (10.2%), not interested (7.3%), unfamiliarity with technical devices (3.6%), privacy concerns (1.9%) and other reasons (1.9%); 56.9% of contacted households gave no reason for refusing. Participating respondents in the GPS groups (A1 and A2) and in the combined group B were asked about the effort required by the applied methods and their reluctance towards them. They preferred passive data collection with GPS (87%) and perceived everything else as more laborious (filling in trip diaries and questionnaires on sociodemographic data, as well as pushing buttons to enter the mode of transport in the active group A2; see Figure 16.2).
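The headline figures of the recruitment can be reproduced with a few lines of arithmetic. The Python sketch below uses only numbers quoted in this section; the variable names are illustrative, and the household-level recruitment rate and the number of persons per household are derived values that are not stated in the text.

```python
CONTACTED_HOUSEHOLDS = 646
RECRUITED_HOUSEHOLDS = 121
PARTICIPANTS = 235
SURVEY_DAYS = 3

# Reasons for non-participation in percent, as quoted in the text.
REFUSAL_REASONS = {
    "perceived lack of suitability": 18.0,
    "too much bother and effort": 10.2,
    "not interested": 7.3,
    "unfamiliarity with technical devices": 3.6,
    "privacy concerns": 1.9,
    "other reasons": 1.9,
    "no reason given": 56.9,
}

# Share of contacted households that took part (derived, not stated in the text).
recruitment_rate = RECRUITED_HOUSEHOLDS / CONTACTED_HOUSEHOLDS
# Each participant reports three consecutive days.
reference_days = PARTICIPANTS * SURVEY_DAYS
# Average number of participants per recruited household (derived).
persons_per_household = PARTICIPANTS / RECRUITED_HOUSEHOLDS
# The quoted reasons should add up to roughly 100%.
reasons_total = sum(REFUSAL_REASONS.values())

if __name__ == "__main__":
    print(f"recruitment rate: {recruitment_rate:.1%}")
    print(f"reference days: {reference_days}")
    print(f"persons per household: {persons_per_household:.2f}")
    print(f"sum of refusal reasons: {reasons_total:.1f}%")
```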

16.4.3. Item Non-Response
In the GPS groups (A1 and A2) only a small share of the reference days completely lacked usable mobility data: 0.4% of days because of interviewer errors, 0.3% because devices were forgotten at home and 2.3% because of technical issues. No data were missing because of empty batteries. Among the collected trips, trips with missing details (item non-response) are more frequent in GPS-based mobility data (about 30% of trips, although the prompted recall surveys had already revealed and filled in quite a few missing details) than in trip diary data (about 4% of trips). Table 16.3 compares the percentages of missing trip details. In 13% of the GPS trips the purpose of the trip could not be ascertained. In contrast, almost all trips recorded in trip diaries came with trip purposes. It can be assumed that people remember trips best by means of activities; therefore trip purposes are almost always reported with the trips in trip diaries. Missing length and duration in GPS trips mainly resulted from missing or inaccurate GPS data. Most of these data gaps are a consequence of initial positioning problems (cold start or first fix). The manufacturer

Comparing Trip Diaries with GPS Tracking: An Austrian Study


Figure 16.2: Rating the respondents’ effort for distinct tasks within the surveys (percentage of participating respondents) in MobiFIT.

Table 16.3: Differences in missing trip data between GPS-based and trip diary-based data collection in MobiFIT.

Percentage of trips with missing    GPS with prompted recall    Trip diary
                                    (groups A1, A2, B)          (groups B, C)
Trip duration                       7.4%                        0.5%
Trip length                         6.1%                        3.3%
Trip purpose                        13.0%                       0.4%
Mode of transportation              5.9%                        0.3%
Quota of incomplete trips           29.1%                       3.8%


of the GPS components specifies the time required for such a cold start (with valid A-GPS data) as 32 seconds. In practice, once a participant has left a place where GPS signals cannot be received properly (e.g. a house or a tunnel), much more time passes before the GPS logger acquires the minimum of four satellites and records the location correctly (from several minutes up to 53 minutes).

16.4.4. Active Tracking
In survey group A2 (‘active tracking’) respondents were required to press the corresponding button (five different modes of transportation: walking, bicycle, car driver, car/taxi/coach passenger, public transport) whenever they changed the transport mode. This should result in modes of transportation assigned to individual trip stages. The results of assessing all buttons pressed before or even during a trip stage are disappointing but not surprising. A button was pressed for only 45% of trip stages, and 18% of these buttons were pressed incorrectly. In summary, the transport mode could be correctly detected from pressed buttons in only 27% of trip stages. Participating respondents were asked to report problems with pressing the buttons. Nineteen per cent of them had difficulties handling the buttons, 47% stated that they had sometimes forgotten to press a button and 17% felt that they had sometimes pushed the wrong button.

16.4.5. Key Mobility Figures
A comparison of the different data collection methods reveals that (as was expected) the number of mobile persons, tours (all trips from leaving home until returning) and trips is underreported in trip diaries. We found that in the mixed group B respondents had 1.6 tours and 4.5 trips per day, gathered by GPS tracking and prompted recall interview, compared to 1.5 tours and 3.9 trips per day self-reported in paper–pencil trip diaries (see Table 16.4). Respondents in the control group C, who had to complete trip diaries only, without parallel GPS tracking, reported even fewer tours (1.3) and trips (3.4) per day. These results indicate an underreporting of about one trip per day in trip diaries (up to 32%). It can be assumed that participants in group B conscientiously reported all their trips in the trip diary because they felt controlled by the ‘big-brother-device’ they were made to carry with them all day. Several other differences in the mobility characteristics, attributable to the distinct methods of data collection, could be detected in the data evaluation. For example, the reported trip lengths and trip durations also differed. The analyses revealed longer average trips in trip diaries than in GPS data collection. Differences in the average trip duration and trip length result from omitted short trips (up to 2.5 km) in the trip diary groups. The daily trip length measured by GPS (33 km) seems to be slightly longer than self-reported, whereas the measured daily trip duration (65 min) is notably shorter than reported by the respondents (see Table 16.4).
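The underreporting figures quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the trips-per-day values printed in Table 16.4; the dictionary keys and function names are hypothetical, and the pairing of groups behind the ‘up to 32%’ figure is an assumption (it is reproduced by comparing the GPS groups A1 and A2 with the diary-only group C).

```python
# Trips per mobile person and day, as printed in Table 16.4 (MobiFIT).
# The key names are hypothetical labels chosen for this sketch.
TRIPS_PER_DAY = {
    "gps_groups_a": 5.0,   # GPS with prompted recall, groups A1 & A2
    "gps_group_b": 4.5,    # GPS with prompted recall, group B
    "diary_group_b": 3.9,  # trip diary, group B
    "diary_group_c": 3.4,  # trip diary, group C
}


def underreporting(reference, reported):
    """Return (gap in trips per day, gap as a share of the reference)."""
    gap = reference - reported
    return gap, gap / reference


def underreporting_table(trips=TRIPS_PER_DAY):
    """Compare every GPS-based figure with every diary-based figure."""
    rows = {}
    for gps_key in ("gps_groups_a", "gps_group_b"):
        for diary_key in ("diary_group_c", "diary_group_b"):
            rows[(gps_key, diary_key)] = underreporting(
                trips[gps_key], trips[diary_key]
            )
    return rows


if __name__ == "__main__":
    for (gps_key, diary_key), (gap, share) in underreporting_table().items():
        print(f"{gps_key} vs {diary_key}: {gap:.1f} trips/day ({share:.0%})")
```

Within group B, where the same respondents were surveyed with both methods, the gap is smaller (0.6 trips per day) than in the comparison across unrelated groups.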


We are well aware that (due to budget constraints) the sample size is too small to allow statistically significant conclusions. Yet in group B the same respondents and their mobility were surveyed with two different techniques, whereas groups A and C are unrelated, with different respondents and different mobility patterns compared to group B. Weighting the results for age groups and gender proved unsatisfactory, as some age and gender groups contained too few respondents; we had to use unweighted data for the results shown in Table 16.4. Comparison of groups B and C shows considerable differences in the reported mobility data of the respondents. On the one hand this can be explained by differences in mobility behaviour due to different sociodemographic distributions of respondents in the compared groups. Participants of GPS groups in the Tullnerfeld region are, for example, significantly older than those of the trip diary group C. On the other hand differences result from methodological differences. In group B respondents were aware that their trip diaries were being checked by parallel GPS tracking. The intensive support and personal interviews also contributed to more accurate results in group B (and A), compared to the self-completed questionnaires in group C. The main reason why significantly fewer trips were self-reported in group B than gathered with the combined GPS and prompted recall method may also be that participants find the definition of a trip not applicable in practice, especially for short trips. For example buying a snack on your

Table 16.4: Comparison of key mobility figures by survey methods and participant groups in MobiFIT.

                                                        GPS with prompted recall        Trip diaries
                                                        Groups A1 & A2    Group B       Group B    Group C
Mobile persons (%)                                      90.4              91.3          91.3       87.2
Tours per mobile person and day (tours)                 1.7               1.6           1.5        1.3
Trips per mobile person and day (trips)                 5.0               4.5           3.9        3.4
Daily trip duration per mobile person (min.)            65                65            73         75
Average trip duration (min.)                            12.9              14.3          18.6       21.7
Daily trip length per mobile person (km)                30                33            30         31
Average trip length (km)                                5.9               7.2           8.0        9.0
Trips per mobile person and day with trip length
  up to 2.5 km (trips)                                  3.0               2.7           2.1        1.4
Trips per mobile person and day with trip length
  over 2.5 km (trips)                                   2.0               1.8           1.8        2.0


Figure 16.3: Modal split based on trips in survey group B in MobiFIT with mixed survey methods (n = 710 trips).

work trip should by definition split this trip into two trips, yet minor errands on a longer trip are ignored most of the time. Figure 16.3 shows the modal split resulting from trip diary data compared to data from the GPS survey with prompted recall, both in survey group B. Differences result mainly from trips not reported in the trip diary. Most of the unreported trips seem to have been made by bicycle or as car passenger, some as car driver. Walking trips and trips by public transport are rarely forgotten. Almost all of the missing trips are short trips (up to 1 km for vulnerable road user trips, up to 2.5 km for car trips). It can be assumed that respondents regard short trips as unimportant or that they simply do not remember them. Concerning trip purposes, shopping trips, business trips and trips for recreation tend to be underreported in trip diaries, whereas respondents filled in more commuter trips (see Figure 16.4).

16.4.6. Trip Length and Duration
In the project, trip length has been acquired by three different methods: reported length in trip diaries, calculated values of GPS tracks and calculated values of


Figure 16.4: Distribution of trip purpose by method of survey in MobiFIT (percentage of trips by trip purpose; vertical axis from 0% to 25%; series: all GPS groups with prompted recall (A + B), n = 596 trips, and all travel diary groups (B + C), n = 501 trips).

manually routed GPS tracks with Google Maps (manual map-matching). Self-reported daily trip lengths in group B (n = 38 days) showed overestimations of 4.4 km/d (186% at maximum) as well as underestimations of 4 km/d (50% at maximum), compared to routed GPS-based trip lengths. On average there is hardly any discrepancy between reported and routed trip lengths, with a standard deviation of 2 km/d. In the GPS groups A and B, lengths of trip stages (part of a trip with the same mode of transportation) calculated from GPS positions are on average 3% shorter than routed lengths (n = 2207 trip stages). Concerning trip duration, respondents in group B both overestimated (up to 26 min or 275%) and underestimated (up to 19 min or 27%) their daily trip duration (n = 23 days). On average, the daily trip duration was overestimated by about 3 min/d, with a standard deviation of 8.6 min/d.

16.4.7. Automated Data Processing
First attempts of automated data analyses in cooperation with MGE Data (2011) tested how many ‘pseudo-trips’ could be found with the following criteria (indicating


an activity): minimum stop duration of 120 seconds of staying within an area of 50 × 50 m. As expected, this algorithm confuses short activities with traffic-related stops (e.g. due to traffic signals). On average we found 5.5 pseudo-trips per mobile person and day with these criteria, compared to 4.7 trips per mobile person and day from the prompted recall in all GPS groups. Compared to the real number of trips detected by GPS combined with prompted recall, this algorithm yielded many more pseudo-trips per day in some cases (up to 47 more) and up to 12 fewer pseudo-trips per day in other cases (for some delivery persons who actually made dozens of trips per day). The correct number of trips was detected on only 13% of the 363 survey days. Further post-processing algorithms of MGE Data (2011), which assume that a significant stop takes more than 30 minutes without changing to a different street link, found up to 20 more pseudo-trips per day than reported in some cases and up to 12 fewer in others. The correct number of trips was detected on 25% of the 363 days. Further research on algorithms to improve trip detection has to be done. Some approaches with or without map-matching algorithms are described by Marchal, Madre, and Yuan (2011), Stopher, Clifford et al. (2008) as well as by Schüssler and Axhausen (2008). So far, the detection of trip purposes depends on respondent reporting. Further research is in progress on map-matching with commonly visited locations reported by respondents as well as with public points of interest listed in a commercial register. Up to now automated determination of transport modes has been done mainly by evaluating velocity (and velocity characteristics) from the GPS data. A basic evaluation by MGE Data (2011) yields the modes of transportation ‘slow, fast and unknown’. It is hardly possible to determine detailed modes of transportation such as public transport or cycling from these results. European travel surveys based on GPS seem to be more complicated than surveys conducted in the United States and Australia, where people use their private cars for 80–90% of their trips and hardly walk, cycle or use public transport. Car-based mobility is the easiest to detect with GPS: speeds are high and characteristic, and GPS-signal reception is excellent in the middle of streets. In Europe the modal split has high proportions of walking, bicycling and public transport. These modes are difficult to detect with GPS, as travellers sometimes move underground or near building fronts where the GPS signals get distorted, or the modes have similar characteristics (e.g. bicycle vs. bus in built-up areas). Recent research combines analysis of GPS data and data from a 3D-acceleration sensor for mode detection. New algorithms produce quite good results, even with a typical European modal split.
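The first stop criterion described above (at least 120 seconds within an area of 50 × 50 m) can be sketched as follows. This is an illustrative sketch only, not the MGE Data implementation: the point format (time in seconds, x and y in metres in a projected coordinate system), the function names and the greedy scanning strategy are all assumptions made for the example.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]  # (time in s, x in m, y in m); assumed format


def detect_stops(points: List[Point],
                 min_duration_s: float = 120.0,
                 box_m: float = 50.0) -> List[Tuple[float, float]]:
    """Return (start_time, end_time) of periods spent inside a box_m x box_m area
    for at least min_duration_s. Points must be sorted by time."""
    stops = []
    i, n = 0, len(points)
    while i < n:
        t0, x0, y0 = points[i]
        min_x = max_x = x0
        min_y = max_y = y0
        j = i
        # Extend the window while the bounding box of its points stays small.
        while j + 1 < n:
            _, x, y = points[j + 1]
            lo_x, hi_x = min(min_x, x), max(max_x, x)
            lo_y, hi_y = min(min_y, y), max(max_y, y)
            if hi_x - lo_x > box_m or hi_y - lo_y > box_m:
                break
            min_x, max_x, min_y, max_y = lo_x, hi_x, lo_y, hi_y
            j += 1
        if points[j][0] - t0 >= min_duration_s:
            stops.append((t0, points[j][0]))
            i = j + 1  # continue after the stop
        else:
            i += 1     # no stop starting here; slide the window
    return stops


def count_pseudo_trips(points: List[Point],
                       stops: List[Tuple[float, float]]) -> int:
    """Count movement segments before, between and after the detected stops."""
    if not points:
        return 0
    if not stops:
        return 1
    trips = len(stops) - 1
    if stops[0][0] > points[0][0]:
        trips += 1  # movement before the first stop
    if stops[-1][1] < points[-1][0]:
        trips += 1  # movement after the last stop
    return trips
```

Because the rule looks only at duration and extent, a wait of more than two minutes at a traffic signal is indistinguishable from a short activity, which is the confusion reported above.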

16.5. Discussion and Outlook
Travel survey methods that combine GPS-based data collection and questionnaires currently provide the most reliable results in mobility data. Trip diaries have proven reliable with transport modes and trip purposes, but show weaknesses in assessing the correct number, length and duration of trips. Automated GPS-based methods


determine accurate location, trip length and duration, but have weaknesses in detecting the number of trips, as well as the transport mode and purpose, depending on the quality of GPS data and additional information (e.g. geospatial data, accelerometer data, information on availability of transport mode). Automated analysis of GPS data sets can become more economical as soon as good post-processing algorithms are developed and the costs of devices decrease. Trips from GPS-based surveys are not directly comparable with trip diary data. The most reliable way to collect sociodemographic data is to ask respondents. Passive GPS-based methods minimize the strain on respondents. Prompted recall surveys with face-to-face interviews and manual map-matching impose an additional burden but improve data quality. The next steps in research will increase the effectiveness of automated data analysis, for example in the detection of activity and transport mode, and will improve GPS data quality and accuracy, for example with differential GPS or further developments in technology. The successful implementation of GSM technology and accelerometer devices already fills gaps in trip recording and helps detect modes of transportation; compasses have already been built into some devices. Processing of acceleration data brings a significant improvement in automated mode detection (Kohla, 2012; Wally, 2012). We are currently assessing in depth the recordings of the acceleration sensors from the MobiFIT surveys, in order to identify the respective trip-stage modes as well as the starts and stops of trip stages more accurately. The next challenge is assessing the usefulness of GPS-based data collection in large-scale surveys, such as the next Austrian national travel survey.

Acknowledgements
The authors of this paper want to thank all partners in the MobiFIT project, especially Gerd Sammer and Rene Wally from the Institute for Transport Studies, University of Natural Resources and Life Sciences, Vienna, as well as Max Herry and Rupert Tomschy from Herry Consult, Vienna, without whose contribution this paper would not have been possible.

References
Bohte, W., & Maat, K. (2008). Deriving and validating trip destinations and modes for multiday GPS-based travel surveys: A large-scale application in the Netherlands. Paper presented at the 8th international conference on survey methods in transport, Annecy, France.
Doherty, S. T., & Lee-Gosselin, M. (2006). An internet-based prompted recall diary with automated GPS activity-trip detection: System design. Paper presented at the 85th Annual Meeting of the Transportation Research Board, Washington, DC.
Herry, M., Tomschy, R., Sammer, G., Meschik, M., Kohla, B., Wally, R., & Fürdös, A. (2011). MobiFIT – Mobilitätserhebungen basierend auf Intelligenten Technologien, Endbericht. IV2Splus – intelligente Verkehrssysteme und Services plus – ways2go 1. Ausschreibung, Projekt 819267, gefördert aus Mitteln des BMVIT, Vienna, Austria.
Kohla, B. (2012). MODE – Verfahren zur automatisierten Identifikation von Verkehrsmitteln aus technologiegestützten Mobilitätsdaten, Endbericht. IV2Splus – intelligente Verkehrssysteme und Services plus – ways2go 3. Ausschreibung, gefördert aus Mitteln des BMVIT, Vienna, Austria.
Marchal, P., Madre, J. L., & Yuan, S. (2011). Postprocessing procedures for person-based global positioning system data collected in the French National Travel Survey 2007–2008. Transportation Research Record: Journal of the Transportation Research Board, 2246/2011, 47–54.
Marchal, P., Roux, S., Yuan, S., Hubert, J. P., Armoogum, J., Madre, J. L., & Gosselin, M. L. (2008). A study of non-response in the GPS sub-sample of the French National Travel Survey 2007–08. Paper presented at the 8th international conference on survey methods in transport, Annecy, France.
MGE Data. (2011). Datenbearbeitungs-Prozess BOKU. Prague, Czech Republic.
Sammer, G. (1995). Problems and solutions in urban travel survey. In P. Bonnel (Ed.), Les enquêtes de déplacement urbains: mesurer le present, simuler le futur. Programme Rhône-Alpes de Recherche en Sciences Humaines, Lyon.
Schüssler, N., & Axhausen, K. W. (2008). Identifying trips and activities and their characteristics from GPS raw data without further information. Paper presented at the 8th international conference on survey methods in transport, Annecy, France.
Stopher, P. R. (2008). Collecting and processing data from mobile technologies. Paper presented at the 8th international conference on survey methods in transportation, Annecy, France.
Stopher, P. R., Clifford, E., & Halling, B. (unpublished). Evaluating a voluntary travel behaviour change by means of a 3-year GPS panel. Travel Demand Management 2008, Vienna, Austria.
Stopher, P. R., Clifford, E., Zhang, J., & FitzGerald, C. (2008). Deducing mode and purpose from GPS data. Working Paper ITLS-WP-08-06, University of Sydney.
Stopher, P. R., FitzGerald, C., & Xu, M. (2007). Assessing the accuracy of the Sydney Household Travel Survey with GPS. Transportation, 34, 723–741. doi:10.1007/s11116-007-9126-8
Stopher, P. R., & Greaves, S. (2010). Missing and inaccurate information from travel surveys: Pilot results. Working Paper ITLS-WP-10-07, University of Sydney.
Stopher, P. R., Kockelman, K., Greaves, S. P., & Clifford, E. (2008). Reducing burden and sample sizes in multi-day household travel surveys. Journal of the Transportation Research Board, 2064/2008, 12–18.
Stopher, P. R., Wargelin, L., Minser, J., Tierney, K., Rhindress, M., & O’Connor, S. (2012). GPS-based household interview survey for the Cincinnati, Ohio Region – Final Report for the Ohio Department of Transportation, Office of Research and Development and the U.S. Department of Transportation, Federal Highway Administration.
Strnad, D. (2008). Enhanced POSTAR Mobility Measurement Technology. Prague: Czech Republic.
Swiss Federal Statistical Office. (2006). Ergebnisse des Mikrozensus 2005 zum Verkehrsverhalten. Neuchatel: Swiss Federal Statistical Office.
Wally, R. (2012). MOTION-FF – Analyse GPS basierter Mobilitätsdaten zur Etappen- und Verkehrsmittelidentifikation für Fahrrad und Fußgänger, Endbericht. IV2Splus – intelligente Verkehrssysteme und Services plus – ways2go 3. Ausschreibung, gefördert aus Mitteln des BMVIT, Vienna, Austria.
Wolf, J., & Bricka, S. (2011). Chicago regional household travel inventory – GPS final report. Prepared for Chicago Metropolitan Agency for Planning.
Wolf, J., & Lee, M. (2008). Synthesis of and statistics for recent GPS-enhanced travel surveys. Paper presented at the 8th international conference on survey methods in transport, Annecy, France.

Chapter 17

Correcting Biographic Survey Data Biases to Compare with Cross-Section Travel Surveys

Francis Papon

Abstract
Purpose: The purpose of the chapter is to make retrospective data from biographic surveys comparable with traditional cross-section travel surveys, by correcting some biases attached to the biographic collection method. This is applied to a biographic survey conducted in France within the 2007–2008 national travel survey.
Methodology/approach: The methodology deals with three specific biases: the general survey sampling and response rate; the survival bias, due to survival rates that differ between generations; and the geographical bias, as biographies were not collected in all regions. All biases were corrected by computing specific weightings.
Findings: One main finding is that with these three corrections, biographic data can yield modal shares for commuting trips to work and for commuting trips to education that are similar to those derived from the historical cross-section surveys about regular trips.
Research limitations/implications: Though biographic collection suffers from the memory effect, this effect remains low and does not disturb the modal shares derived from biographies. The most challenging issue is that of missing generations that contributed to past mobility. But they can be replaced by modeling with an age-period model.

Transport Survey Methods: Best Practice for Decision Making
Copyright © 2013 by Emerald Group Publishing Limited
All rights of reproduction in any form reserved
ISBN: 978-1-78190-287-5


Practical implications: The chapter provides a methodology for correcting biographic data in order to reconstitute historical behavior.
Social implications: Exploring the memory of living people is essential to save data about the past that could otherwise be lost, although they may be useful for understanding present behavior and likely future trends.
Originality/value of chapter: Investigating biographic surveys is a new topic in the field of transport survey methods.
Keywords: Biographies; survey; comparison; bias; weighting; travel behavior

17.1. Introduction
Biographic surveys provide retrospective data over respondents’ life courses. Bertaux (1980) studied their methodology and validity. As they rely on the respondents’ memory, one of the problems is date error. All research on the event retention function agrees that retention decreases with time. Dating events raises further issues: a host of publications describe telescoping (‘time compression toward the present’), whereby past events are perceived by the interviewee as having occurred more recently than was actually the case. The risk of dating errors is all the greater when events occurred at more remote dates (Courgeau & Lelièvre, 1993). But dating errors do not affect the order of events (Courgeau, 1991). Biographic surveys have been implemented to collect information in different fields such as social links, social change, or migrations. In the field of travel behavior, they were conducted in the United Kingdom (Pooley & Turnbull, 2000), Switzerland (Axhausen, 2006), Germany (Schoenduwe, Mueller, Peters, & Lanzendorf, 2009), and France (Papon, Hubert, & Armoogum, 2007; Papon, Roux, & Marchal, 2009). This last biographic survey was included in the 2007–2008 French National Travel Survey (FNTS). It yielded 1133 usable completed questionnaires. It gives information about respondents’ place of residence, household size, number of people below 18, stock of four-wheeled and two-wheeled motor vehicles, main activity, activity location, and main travel mode. Each change in any of these items during the respondent’s life is reported on a grid with one column per item and one line per year. From these data, it is not possible to assess the amount of travel performed, but it is possible to estimate the modal split for commuting trips to work or education as well as the commuting distance. It is also possible to derive the modal split for people without an activity. These results can be computed for all years from the time when a sufficient number of respondents had been born: from the 1930s for education trips and from the 1950s for work trips. It is interesting to compare these results with those


of the historical cross-section travel surveys that were conducted in France (1966–1967, 1973–1974, 1981–1982, 1993–1994, and 2007–2008). Nevertheless, a number of differences between the two kinds of results occur, due to several biases that affect biographic results. The purpose of this chapter is to describe how some of these biases have been corrected. It deals with the general sampling and non-response rate of the 2007–2008 FNTS, with survival biases, and with geographical biases. Comparisons between biographic survey results and historical cross-section survey results are also given.

17.2. Correcting Biographical Survey Biases

17.2.1. Survey Sampling Bias
The first bias is not specific to biographic collection. It derives from the general sampling of the 2007–2008 FNTS, within which the biographic questionnaire was administered, and from the general response rate of this survey. The 2007–2008 FNTS sampling scheme oversampled households with two cars or more on the one hand, and households living in rural areas on the other. Moreover, in some regions, regional extensions were added, according to a sampling scheme meeting local needs. These sampling schemes can be corrected by weighting underrepresented or overrepresented categories. Besides, response rates varied according to household characteristics: in particular, households living in single-family units have a better response rate than those living in multifamily units. Following a long-established methodology (Armoogum, 2002), a margin correction was performed to take these response mechanisms and sampling errors into account, using the following variables: social position category of the reference person, gender*age group of the reference person, household type, building type, reference person’s nationality, living zone, number of individuals by gender*age group, household car ownership, survey month. This correction dealt with households’ and individuals’ responses to the first survey visit. But the biographic questionnaire was administered at the end of the second visit, and only to one individual in each household, randomly selected according to the “Kish” method. In addition, non-response could occur for this second visit. Therefore, a second margin correction was applied to some variables to take this second selection and the response rate distortions into account.
The variables were: social position category of the “Kish” individual, gender*age group of the “Kish” individual, household size of the “Kish” individual, “Kish” individual’s living zone, “Kish” individual’s household car ownership, survey month, survey day. This second weighting led to a weighting coefficient for the “Kish” individual, denoted pondki. This coefficient is used to correct biographies with respect to the sampling and non-response of the FNTS. In general, the survey was answered by the “Kish”


individual, but sometimes by another person (contrary to survey instructions). In either case, the pondki coefficient is used for these grids, as it corrects the FNTS biases at the household level.

17.2.2. Survival Bias
The second bias is closely attached to the biographic data collection method. In fact, only individuals that had survived could be interviewed. For a given year n in the past, the generations that had been present in that year had undergone attrition due to mortality between year n and the survey year (2008). This attrition differs according to generations. It is possible to correct this selection bias by assuming that those persons who passed away had had the same behavior as the persons from the same generation who had survived (and hence could answer the survey). For that, survival tables are used (Vallin & Meslé, 1989, 2001). For each past year n, each generation is weighted in inverse proportion to the survival rate of this generation between past year n and the survey year. If S(n, a) represents the number of surviving persons aged a in year n per 100,000 births of generation n–a, then S(2008, a + 2008–n) represents the surviving persons of generation n–a in 2008 (the survey year), and the size of this generation was divided by S(n, a)/S(2008, a + 2008–n) between year n and 2008. If it is assumed that those who passed away before 2008 had in year n the same mobility behavior as those from the same generation who survived, observations of surveyed survivors must be weighted with this factor to correct this bias. That has been done for each year and each age. The computation can be improved by distinguishing between genders. But the main problem is that this weighting gives high importance to rare individuals, namely the oldest survivors of the earliest generations, with weights exceeding 500. Unfortunately, this method can only compensate for losses within generations that had survived until the survey date.
For previous generations that had completely disappeared, no data are available. Yet those bygone generations also contributed to mobility in past years. Consequently, biographic data can only provide an estimate of past mobility for the surviving generations, that is to say the youngest ones. The further back in time, the more generations are missing. For example, in 1950, the oldest individual was a one-hundred-year-old respondent, who was then only 43. The behavior of individuals who were older at that time is unknown. To compensate for this lack of information, only reconstruction methods based on modeling and extrapolation can be developed.
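The survival weighting described in this section can be illustrated with a short sketch. It assumes that the survival table is available as a mapping from (year, age) to the number of survivors per 100,000 births, mirroring the S(n, a) notation; the function name, the table layout and the numbers in the example table are hypothetical and are not taken from Vallin and Meslé.

```python
SURVEY_YEAR = 2008


def survival_weight(survivors, year, age, survey_year=SURVEY_YEAR):
    """Weight for an observation of a respondent aged `age` in past `year`.

    `survivors[(n, a)]` is S(n, a): survivors aged a in year n per 100,000
    births of generation n - a. The weight is S(n, a) / S(survey_year, a + survey_year - n),
    i.e. the inverse of the survival rate between `year` and the survey year.
    """
    alive_then = survivors[(year, age)]
    alive_at_survey = survivors[(survey_year, age + survey_year - year)]
    return alive_then / alive_at_survey


if __name__ == "__main__":
    # Invented values for one generation (born 1940), for illustration only.
    table = {(1970, 30): 95_000, (2008, 68): 76_000}
    # Each surveyed survivor stands for 1.25 persons of that generation in 1970.
    print(survival_weight(table, year=1970, age=30))
```

The ratio grows quickly for the oldest respondents, because the denominator becomes very small at high ages; this is the mechanism behind the very large weights mentioned above.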

17.2.3. Geographical Bias
The third bias considered is geographical. Biographies were collected in only 17 of the 22 metropolitan French regions. But many surveyed individuals


had moved from one region to another during their lives, which gives information on the past for all regions. If it is assumed that the persons who had left a region had had the same behavior as those who had stayed in that region, the region selection bias can be corrected by giving underrepresented regions a compensating weight. The geographical level retained is that of the ZEAT (study and territorial planning zones). The Ile-de-France ZEAT and the North ZEAT each consist of a single French region that was not surveyed (this is more problematic for the North than for Ile-de-France, as many people move from the Paris region into the provinces). The West ZEAT includes two regions without biographies (Brittany and Pays de la Loire), and one region with biographies (Poitou-Charentes). The weight attached to each ZEAT at a given period is set proportionally to the whole ZEAT population at this period, though only the youngest persons (those who survived) are present in the biographic survey. To be more accurate, only the surveyed age groups should be taken into account, but as the largest bias is the lack of the oldest age groups, distinguishing the age structure of the surviving age groups is of secondary importance. The periods considered are decades (it is possible to consider years, but this would not change the resulting weights much, and would add variations due to sample size). Data come from 14 censuses from 1906 to 1999 (1906, 1911, 1921, 1926, 1931, 1936, 1946, 1954, 1962, 1968, 1975, 1982, 1990, 1999), from estimates from the continuous census for 2007 and from projections for 2009. These data are at the region level. For years between censuses, linear interpolations were made. Then the results were summed by decade and ZEAT (Table 17.1). These data were compared to the distribution obtained in the biographic grids by region and decade, after weighting to correct for the FNTS sampling and non-response, and for the survival bias.
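The interpolation between censuses and the summation by decade can be sketched as follows. The sketch assumes that the population of one zone is given as a mapping from census year to population in millions; the function names and the figures in the example are hypothetical and serve only to show how yearly values add up to person-years per decade, the unit used in Table 17.1.

```python
def interpolate_population(census, year):
    """Population in `year`, linearly interpolated between the two nearest censuses.

    `census` maps census years to populations (here in millions).
    """
    if year in census:
        return census[year]
    years = sorted(census)
    for y0, y1 in zip(years, years[1:]):
        if y0 < year < y1:
            share = (year - y0) / (y1 - y0)
            return census[y0] + share * (census[y1] - census[y0])
    raise ValueError(f"year {year} lies outside the census range")


def decade_person_years(census, decade_start):
    """Sum of the yearly populations over the ten years of a decade."""
    return sum(
        interpolate_population(census, year)
        for year in range(decade_start, decade_start + 10)
    )


if __name__ == "__main__":
    # Invented figures for a single zone, in millions of inhabitants.
    zone = {1990: 10.0, 1999: 19.0}
    print(decade_person_years(zone, 1990))  # million person-years for 1990-1999
```

Summing such decade totals over the regions of a ZEAT would give one cell of the table; years before the first or after the last available data point raise an error rather than being extrapolated.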
Correcting weights were computed so as to reproduce the historical distribution. The reference decade was the last full one (1990–1999). For this reference decade, the regional weight is computed as the ratio of the actual population, as given in Table 17.1, to that derived from biographies. Thus, the weight attached to one year of a grid is the number of persons who are represented by this single grid year. For other decades, the weight is likewise chosen so as to obtain the same regional structure as in the historical distribution, but the average weight across regions is made equal for all decades. So, only the synchronic geographical biases are corrected, not the differences over time: correcting the latter would imply giving high weights to past decades to compensate for the lack of the oldest generations in these decades, and this would bias the age structure by giving too much weight to younger age groups. The population in past decades is therefore not totally reconstructed; only the generations that were surveyed in biographies are. As a result, the weighted total covers only a part of the total population: the missing part of the population amounts to 11% in the 1970s, 21% in the 1960s, 33% in the 1950s, 54% in the 1940s, 71% in the 1930s, and 86% in the 1920s. Only metropolitan French regions were weighted. This geographical weighting completely erases life years spent abroad or in unidentified places. The resulting


Francis Papon

Table 17.1: Distribution of metropolitan French population by ZEAT, for different decades (million person-years).

ZEAT             1910–1919  1920–1929  1930–1939  1940–1949  1950–1959  1960–1969
Ile-de-France         54.5       60.2       67.3       66.9       74.4       87.7
Paris basin           82.2       78.7       79.1       77.7       81.3       88.0
North                 29.4       30.1       32.1       31.3       33.9       37.1
East                  39.1       37.6       39.1       37.2       40.6       45.3
West                  62.4       59.4       59.0       59.2       60.8       63.9
South-West            52.0       49.3       49.0       49.2       49.6       52.2
Central-East          49.5       48.7       49.2       47.7       49.3       54.7
Mediterranean         37.4       38.8       43.0       40.3       41.9       49.2
All                  406.5      402.8      417.7      409.7      431.7      478.2

ZEAT             1970–1979  1980–1989  1990–1999  2000–2009        All
Ile-de-France         97.7      102.7      108.0      113.9      833.5
Paris basin           95.7      100.5      103.6      106.1      892.9
North                 39.0       39.4       39.8       40.1      352.2
East                  48.7       49.9       50.9       52.7      441.1
West                  68.6       72.8       76.1       81.6      663.7
South-West            55.4       57.9       60.6       65.3      540.6
Central-East          60.6       64.5       68.1       72.6      565.1
Mediterranean         57.1       62.9       68.4       75.1      514.1
All                  522.8      550.6      575.6      607.5    4,803.2

Source: INSEE. Population censuses 1906, 1911, 1921, 1926, 1931, 1936, 1946, 1954, 1962, 1968, 1975, 1982, 1990, 1999, and estimates for 2000–2009.

weights vary from 2 to 8 for the ZEAT best represented in biographies (East), from 13 to 149 for Ile-de-France, from 11 to 678 for the North, from 21 to 102 for the West, and from 18 to 41 for the Mediterranean ZEAT. The average is 13.
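The regional weighting rule can be sketched as follows. The census figures are taken from Table 17.1, but the biography totals are hypothetical, the function name is illustrative, and reading "the average weight across regions" as a mean weighted by biography life years is an assumption made for this sketch.

```python
def regional_weights(census, bio, reference):
    """Regional weights by decade and ZEAT.

    census[decade][zeat]: census person-years (Table 17.1).
    bio[decade][zeat]: biography life years after earlier weighting (hypothetical).
    The regional structure of each decade follows the census, and the average
    weight of each decade is rescaled to that of the reference decade.
    """
    ref_mean = sum(census[reference].values()) / sum(bio[reference].values())
    weights = {}
    for decade, regions in census.items():
        raw = {z: regions[z] / bio[decade][z] for z in regions}
        mean = sum(regions.values()) / sum(bio[decade].values())
        scale = ref_mean / mean            # equals 1 for the reference decade
        weights[decade] = {z: raw[z] * scale for z in raw}
    return weights


census = {
    "1990-1999": {"Ile-de-France": 108.0, "East": 50.9},   # Table 17.1
    "1960-1969": {"Ile-de-France": 87.7, "East": 45.3},    # Table 17.1
}
bio = {                                                     # hypothetical totals
    "1990-1999": {"Ile-de-France": 2.0, "East": 10.0},
    "1960-1969": {"Ile-de-France": 1.0, "East": 6.0},
}
weights = regional_weights(census, bio, "1990-1999")
```

In this toy case the underrepresented region (Ile-de-France) receives a much larger weight than the well-represented one (East), which mirrors the ordering of the weight ranges reported above.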

17.2.4. Overall Weighting

Finally, the resulting weight pondes is the product of all three preceding weights, which account for the sampling and non-response biases, the survival bias, and the geographical biases. The value was nevertheless capped to avoid giving an excessive weight to some individual years, since rare generations and rare regions were attributed very high weights. The maximum weight is set to 750,000 person years per life year. This cap affects less than 1% of life years. 99% of life years show a weight below 477,951,


90% below 124,019, 75% below 58,007 (third quartile), 50% below 26,790 (median), 25% below 10,615 (first quartile), and a little more than 13% of life years have a zero weight (years spent abroad or in unidentified places). The average weight is 53,332. The heaviest life year is 14 times as heavy as the average life year. The resulting weight does not correct for household or individual characteristics other than region. However, the frequencies obtained show that the structures by gender, social category, household size, and car ownership level are relatively satisfactory. The age structure is also correct, except of course for the absence of bygone generations.
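The combination and capping rule can be illustrated with a short sketch. The cap and the average weight come from the text; the function name and the component weights are hypothetical.

```python
MAX_WEIGHT = 750_000  # cap stated in the text


def overall_weight(sampling_w, survival_w, geographic_w, cap=MAX_WEIGHT):
    """Product of the three weights, limited to the cap."""
    return min(sampling_w * survival_w * geographic_w, cap)


# Hypothetical component weights for three life years.
ordinary = overall_weight(1_000, 2.0, 13)      # product stays below the cap
rare = overall_weight(5_000, 20.0, 149)        # product exceeds the cap
abroad = overall_weight(1_000, 2.0, 0)         # zero geographic weight

# Ratio of the cap to the average weight reported in the text (53,332).
ratio_to_average = MAX_WEIGHT / 53_332         # roughly 14
```

The last line reproduces the statement that the heaviest life year is about 14 times as heavy as the average one.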

17.3. Comparison of Biography Results with Travel Survey Data

To check the validity of modal shares obtained from biographical data, it is necessary to compare them with historical cross-section travel surveys. In the biographical inset of the latest FNTS (2007–2008), data about travel modes are limited to one main mode per person per year: for commuting trips only, for those who have a place of work or education, and for other trips for other people. These data can be used to draw modal splits: modal shares by period and age group, for commuting trips to education or to work. In all FNTS since 1973–1974, one section is devoted to the description of regular trips, that is, commuting trips to work or education (education from FNTS 1981–1982 only), for all individuals in the household. While this section provides many more details about commuting alternatives than the biographies do, it also gives for each individual a main mode used for commuting, which is the closest variable to that derived from biographies. To increase the number of life years involved in the comparison, each historical survey is compared with several biography years around the survey year (the midpoint between two successive FNTS is used to assign a year to one survey or the other). So, years 1978–1987 from biographies are compared with FNTS 1981–1982; years 1988–2000 from biographies are compared with FNTS 1993–1994; and years 2001–2008 from biographies are compared with FNTS 2007–2008. Besides, modes are aggregated to get relevant comparisons. Apart from the previously described difference in definitions, and the biases investigated in this chapter, the main difference between the two data sets lies in the memory process: in biographies, respondents have to recall behavior in the remote past, while in cross-section surveys, they only describe their current behavior. A previous tentative comparison was made (Papon et al., 2009) without using weights for biographies, but some differences appeared that could be attributed to one of the three biases that were corrected in the previous section. So, now, modal shares from weighted biographies are compared with those from fully representative historical cross-section travel surveys.

As regards commuting trips to education in the 1978–1987 period (Tables 17.2a and 17.2b), walk is overestimated in biographies, at the expense of some of the other


Table 17.2a: Modal split of regular trips to education by age group, FNTS 1981–1982.

Modes (%)                   Age: 0–5    6–14   15–17     18+
Walk                              53      57      26      14
Two wheelers                       1       5      20      13
Car driver                         0       0       0      21
Car passenger                     40      18       6       4
Public transport, other            7      20      48      48

Source: INSEE-INRETS-OEST, FNTS 1981–82, regular trips.

Table 17.2b: Modal split of regular trips to education by age group, biographies years 1978–1987.

Modes (%)                   Age: 0–5    6–14   15–17     18+
Walk                             68a      59     33a     29a
Two wheelers                       0       6      5b      6b
Car driver                         1       1       1     11b
Car passenger                    25b     13b       5       7
Public transport, other            5      20      45     43b
No travel and don’t know           0       1     11a       4

Source: SOeS-INSEE-INRETS, FNTS 2007–2008, biographies years 1978–1987.
Notes:
a Cases where the value computed after biographical surveys exceeds by five points or more that after the historical cross section survey.
b Cases where the value computed after biographical surveys is lower by five points or more than that after the historical cross section survey.
In other cases both values are similar.

modes (car passenger until age 14), but for the main age group, 6–14, differences are small. For commuting trips to education in the 1988–2000 period (Tables 17.3a and 17.3b), differences are observed for those aged 0–5 (rare in biographies) and for walk for those aged 18 and over. Obvious coding errors in biographies between car driver and car passenger explain the gaps observed for the 6–14 age group. Otherwise, both tables remain close.
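The "a" and "b" flags used in the biography tables can be reproduced with a few lines of code. The sketch below uses the 15–17 column of Tables 17.2a and 17.2b; the function name is illustrative, and treating a category absent from the cross-section survey ("No travel and don't know") as a zero share is an assumption of this sketch.

```python
def flag_differences(cross_section, biographies, threshold=5):
    """Flag biography shares that differ from cross-section shares.

    "a": biography share higher by `threshold` points or more.
    "b": biography share lower by `threshold` points or more.
    "":  both values are considered similar.
    """
    flags = {}
    for mode, bio_share in biographies.items():
        diff = bio_share - cross_section.get(mode, 0)  # missing mode counted as 0
        if diff >= threshold:
            flags[mode] = "a"
        elif diff <= -threshold:
            flags[mode] = "b"
        else:
            flags[mode] = ""
    return flags


# Age group 15-17, shares in percent (Tables 17.2a and 17.2b).
fnts_1981 = {"Walk": 26, "Two wheelers": 20, "Car driver": 0,
             "Car passenger": 6, "Public transport, other": 48}
bio_1978_1987 = {"Walk": 33, "Two wheelers": 5, "Car driver": 1,
                 "Car passenger": 5, "Public transport, other": 45,
                 "No travel and don't know": 11}

flags = flag_differences(fnts_1981, bio_1978_1987)
```

The published tables were presumably flagged from unrounded shares, so a cell whose rounded difference is exactly five points may not always match this rule.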


Table 17.3a: Modal split of regular trips to education by age group, FNTS 1993–1994.

Modes (%)                   Age: 0–5    6–14   15–17     18+
Walk                              42      42      22      14
Bicycle                            1       3       2       2
Powered two wheelers               0       0       5       2
Car driver                         0       0       2      29
Car passenger                     51      31      13       8
Public transport, other            7      24      57      44

Source: INSEE-INRETS-OEST, FNTS 1993–1994, regular trips.

Table 17.3b: Modal split of regular trips to education by age group, biographies years 1988–2000.

Modes (%)                   Age: 0–5    6–14   15–17     18+
Walk                             55a      45      20     22a
Bicycle                            0       2       2       1
Powered two wheelers               0       0       5       3
Car driver                        8a      5a       1      26
Car passenger                    23b     23b      8b       5
Public transport, other          14a      23     62a      42
No travel and don’t know           0       0       2       3

Source: SOeS-INSEE-INRETS, ENTD 2007–2008, biographies years 1988–2000.
Notes:
a Cases where the value computed after biographical surveys exceeds by five points or more that after the historical cross section survey.
b Cases where the value computed after biographical surveys is lower by five points or more than that after the historical cross section survey.
In other cases both values are similar.

As far as commuting trips to work in the 1978–1987 period are concerned (Tables 17.4a and 17.4b), biographies overestimate car driver for those aged 45–64 to the detriment of all other modes, and walk for those aged 25–34 to the detriment of public transport and car passenger. But overall, no particular mode is overestimated.


Table 17.4a: Modal split of regular trips to work by age group, FNTS 1981–1982.

Modes (%)                   Age: 18–24   25–34   35–44   45–54   55–64
Walk                                15      12      15      18      22
Two wheelers                        14       8       9      12      17
Car driver                          41      53      50      43      33
Car passenger                        9       9       8       8       6
Public transport, other             22      18      18      20      23

Source: INSEE-INRETS-OEST, FNTS 1981–1982, regular trips.

Table 17.4b: Modal split of regular trips to work by age group, biographies years 1978–1987.

Modes (%)                   Age: 18–24   25–34   35–44   45–54   55–64
Walk                                18     19a      15     13b     11b
Two wheelers                        11      11       8      12      9b
Car driver                          41      52      54     53a     71a
Car passenger                       2b      4b       7      3b      0b
Public transport, other             23     13b      14     13b      4b
No travel and don’t know             4       2       2       7       4

Source: SOeS-INSEE-INRETS, FNTS 2007–2008, biographies years 1978–1987.
Notes:
a Cases where the value computed after biographical surveys exceeds by five points or more that after the historical cross section survey.
b Cases where the value computed after biographical surveys is lower by five points or more than that after the historical cross section survey.
In other cases both values are similar.

Now, considering commuting trips to work in the 1988–2000 period (Tables 17.5a and 17.5b), differences are scarcer and do not concern a particular mode. This means that the modal structure is relatively well reconstructed by biographies, without any specific bias. For the most recent period, 2001–2008 (Tables 17.6a and 17.6b), differences in commuting trips to work are again scarce, and no mode stands out. As it is a


Table 17.5a: Modal split of regular trips to work by age group, FNTS 1993–1994.

Modes (%)                   Age: 18–24   25–34   35–44   45–54   55–64
Walk                                13       9       9      12      18
Bicycle                              2       2       3       3       4
Powered two wheelers                 5       2       2       1       2
Car driver                          52      64      63      63      55
Car passenger                        8       7       7       6       5
Public transport, other             19      15      15      15      17

Source: INSEE-INRETS-OEST, FNTS 1993–1994, regular trips.

Table 17.5b: Modal split of regular trips to work by age group, biographies years 1988–2000.

Modes (%)                   Age: 18–24   25–34   35–44   45–54   55–64
Walk                                15      12      13      11     11b
Bicycle                              1       2       2       1       6
Powered two wheelers                 1       6      7a       4       1
Car driver                          53     59b      59     55b     64a
Car passenger                        8       4       4       5       3
Public transport, other             18      15      13     21a     12b
No travel and don’t know             4       2       2       2       2

Source: SOeS-INSEE-INRETS, FNTS 2007–2008, biographies years 1988–2000.
Notes:
a Cases where the value computed after biographical surveys exceeds by five points or more that after the historical cross section survey.
b Cases where the value computed after biographical surveys is lower by five points or more than that after the historical cross section survey.
In other cases both values are similar.

recent period, it is normal that differences are smaller, as fewer memory errors are expected in the retrospective survey. Finally, this comparison shows that biographies can reconstruct past modal splits without any particular bias, as far as commuting trips to work or education are concerned.
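The rule used throughout this section to attach each biography year to a survey (the midpoint between two successive FNTS) can be sketched as a nearest-survey assignment. Placing the centre of each two-year survey at the turn of its two years (for example 1981.5 for FNTS 1981–1982) is an assumption of this sketch, and the names are illustrative; with that assumption the rule reproduces the year ranges given in the text.

```python
# Assumed centre of each survey: the turn of its two calendar years.
SURVEY_CENTRES = {
    "1973-1974": 1973.5,
    "1981-1982": 1981.5,
    "1993-1994": 1993.5,
    "2007-2008": 2007.5,
}


def nearest_survey(year):
    """Return the FNTS whose centre is closest to a biography year."""
    return min(SURVEY_CENTRES, key=lambda s: abs(SURVEY_CENTRES[s] - year))


# Group the biography years 1978-2008 by the survey they are compared with.
groups = {}
for year in range(1978, 2009):
    groups.setdefault(nearest_survey(year), []).append(year)
```

Because the centres fall on half years, an integer biography year is never equidistant from two surveys, so no tie-breaking rule is needed.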


Table 17.6a: Modal split of regular trips to work by age group, FNTS 2007–2008.

Modes (%)                   Age: 18–24   25–34   35–44   45–54   55–64
Walk                                13       9       8      10      14
Bicycle                              2       2       2       2       2
Powered two wheelers                 5       3       4       2       1
Car driver                          54      68      72      70      67
Car passenger                        8       3       3       3       3
Public transport, other             17      16      11      12      13

Source: SOeS-INSEE-INRETS, FNTS 2007–2008, regular trips.

Table 17.6b: Modal split of regular trips to work by age group, biographies years 2001–2008.

Modes (%)                   Age: 18–24   25–34   35–44   45–54   55–64
Walk                                14      3b     13a       7      8b
Bicycle                              1       5       3       0       1
Powered two wheelers                 1       2       3       6       2
Car driver                          53     74a      73      71     59b
Car passenger                        9       2       1       2       6
Public transport, other             21      14      4b      12     21a
No travel and don’t know             3       1       3       2       3

Source: SOeS-INSEE-INRETS, FNTS 2007–2008, biographies years 2001–2008.
Notes:
a Cases where the value computed after biographical surveys exceeds by five points or more that after the historical cross section survey.
b Cases where the value computed after biographical surveys is lower by five points or more than that after the historical cross section survey.
In other cases both values are similar.

17.4. Conclusion

Biographical travel surveys can be used to get data on past mobility. To process these data quantitatively, it is necessary to overcome some biases attached to biographies. This chapter describes the weighting of biographic life years to correct three biases: the sampling and response rate of the survey, the survival rate, and the geographical bias. With such a weighting, it was shown that biographic data could be used to reconstruct past modal splits for commuting trips that were similar to actual modal splits as measured by historical cross-section surveys. This comparison will be enhanced in a coming Ph.D. thesis.

Figure 17.1: Modal split and confidence intervals of main modes after an age-period model, France, 1920–2008. [Plot of modal shares, from 0% to 60%, by decade from 1920–29 to 2000–09.] Source: SOeS-INSEE-INRETS, FNTS 2007–2008, biographies, and Papon (2011) modeling. Note: The total of shares is 100%. The mode codes and confidence intervals are: bicycle BIC [BICmin, BICmax], powered two-wheelers DRM [DRMmin, DRMmax], don’t know NSP [NSPmin, NSPmax], no travel PDD [PDDmin, PDDmax], public transport and others TCA [TCAmin, TCAmax], car driver VCO [VCOmin, VCOmax], car passenger VPA [VPAmin, VPAmax], reference: walk MAP = MAPmin = MAPmax.

The biographic data were also used to tabulate some 60,000 life years from 1907 to 2008 with respect to a number of other variables: generation, decade, age, gender, residence region, social category, home to activity distance, household size, number of persons under 18 in household, number of cars in household, number of powered two wheelers, main activity, usual travel mode. These tabulations can be done with or without the computed weights. Cross-tabulations between two of these variables, for some relevant pairs of variables, were also computed. The one-dimension tables have no direct physical meaning, as they mix results for different years: they in fact give an average distribution over one century, with a very specific weighting.


But directly understandable results are obtained by cross-tabulating these variables with one (or two) of the three temporal dimensions (period, generation, and age group), provided that the sample is sufficient. For example, the share of each mode was computed in two-dimension tables according to two of the three temporal dimensions. In each of these tables, a triangular part of the table is missing, for different reasons. In the period-age tables, the upper right triangle corresponding to old ages in earlier periods is missing because the corresponding generations did not survive until the survey. In the period-generation tables, the upper right triangle corresponding to recent generations in earlier periods is missing because it corresponds to ages before birth. In the generation-age tables, the lower right triangle corresponding to older ages of recent generations is missing because it corresponds to future periods. Such tables allow a study of the evolution of mode use over nearly one century. To take into account all generations in past periods, including those who did not survive until the survey date, different generalized logistic models were tested (Papon, 2011) to estimate the shares of the main modes. Some models use a number of explanatory variables, but many variables are difficult to know for the past. Generation has a small effect, while period has a large effect. The three variables age, period, and generation are of course correlated. For these reasons, the age-period model appeared to be the easiest model with which to estimate past modal shares, taking into account the shares of past age groups. Walk is the reference mode. Confidence intervals are computed for other modes (Figure 17.1). In particular, the share of the “don’t know” category decreases with time, but even for the remotest decade, this share is below 10%. As this “don’t know” share is a sort of measure of the memory effect that was not corrected in this chapter, this effect should not disturb the relative shares of other categories too much, at least somewhat less than the statistical error shown in the confidence intervals. The survival bias is more critical: as no car driver and no motorcycle driver from the 1920s and 1930s survived until the survey, the model can only give a nil share to these modes, against historical sources that describe motorcars and motorcycles in those periods.
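The chapter does not spell out the model, which is documented in Papon (2011). As a hedged illustration only, a generic age-period multinomial (generalized) logit with walk as the reference mode could be written as below. The symbols are notation introduced here, not taken from the chapter: m is a mode, a an age group, p a period, and the weights π are the shares of each age group in the population of period p.

```latex
\[
P(m \mid a, p) =
\frac{\exp\left(\alpha_m + \beta_{m,a} + \gamma_{m,p}\right)}
     {\sum_{k} \exp\left(\alpha_k + \beta_{k,a} + \gamma_{k,p}\right)},
\qquad
\alpha_{\mathrm{walk}} = \beta_{\mathrm{walk},a} = \gamma_{\mathrm{walk},p} = 0,
\]
\[
S_m(p) = \sum_{a} \pi_{a,p}\, P(m \mid a, p),
\qquad
\sum_{a} \pi_{a,p} = 1 .
\]
```

Under this form, the share of mode m in a past period is obtained by averaging the predicted probabilities over all age groups of that period, including those that did not survive until the survey, which is the role the text assigns to the age-period model.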

Acknowledgment

Raw data were processed thanks to funding from ADEME, the French agency for environment and energy.

References

Armoogum, J. (2002). Correction de la non-réponse et de certaines erreurs de mesures dans une enquête par sondage: Application à l’enquête Transports et Communications 1993–1994. Rapport INRETS, no. 239, 173pp. INRETS, Arcueil.

Axhausen, K. W. (2006, May). New survey items for a fuller description of traveller behavior (Biographies and social networks). TRB Travel Demand Forecasting Conference, Austin.

Bertaux, D. (1980). L’approche biographique. Sa validité méthodologique, ses potentialités. Cahiers Internationaux de Sociologie, vol LXIX, n° spécial “Histoire de vie et vie sociale”, P.U.F., pp. 197–225.

Courgeau, D. (1991). Analyse des données biographiques erronées. Population, 46(1), 89–104. doi:10.2307/1533611

Courgeau, D., & Lelièvre, E. (1993). Nouvelles perspectives de l’analyse démographique. Cahier québécois de démographie, 22, 23–43.

Papon, F. (2011). The evolution of bicycle mobility in France. XXII International cycling history conference (ICHC), Paris (May 25–28).

Papon, F., Hubert, J. P., & Armoogum, J. (2007). Biography and primary utility of travel: New issues in the measurement of social contexts in the next French National Travel Survey. WCTR conference, Berkeley, CA (June 24–28, 29pp).

Papon, F., Roux, S., & Marchal, M. (2009). A biographic survey to be compared with past travel surveys in France. 12th IATBR conference, Jaipur, India (December 13–18).

Pooley, C. G., & Turnbull, J. (2000). Modal choice and modal change: The journey to work in Britain since 1890. Journal of Transport Geography, 8(2000), 11–24. doi:10.1016/S0966-6923(99)00031-9

Schoenduwe, R., Mueller, M., Peters, A., & Lanzendorf, M. (2009). Analyzing mobility biographies with the life course calendar: Evaluation of a quantitative instrument for collecting retrospective longitudinal data. 12th IATBR conference, Jaipur, India (December 13–18).

Vallin, J., & Meslé, F. (1989). Reconstitution de tables annuelles de mortalité pour la France au XIXe siècle. Population, 44(6), 1121–1158.

Vallin, J., & Meslé, F. (2001). Tables de mortalité françaises pour les XIXe et XXe siècles et projections pour le XXIe siècle. Éditions de l’INED N° 4-2001.

Chapter 18

WORKSHOP SYNTHESIS: COMPARATIVE RESEARCH INTO TRAVEL SURVEY METHODS*

Jimmy Armoogum and Marco Diana

* Participants: Jimmy Armoogum (France) (Chair), Marco Diana (Italy) (Rapporteur), Flavio Devillaine (Chile), Bernhard Fell (Germany), Mark Freedman (USA), Stefan Hubrich (Germany), Tuuli Järvi (Finland), Michael Meschik (Austria), Claudine Moutou (Australia), Francis Papon (France), Toky Randrianasolo (France), Gerd Sammer (Austria), Herrie Schalekamp (South Africa), Alan Thomas (Chile), Rico Wittwer (Germany), Dirk Zumkeller (Germany).

Transport Survey Methods: Best Practice for Decision Making. Copyright © 2013 by Emerald Group Publishing Limited. All rights of reproduction in any form reserved. ISBN: 978-1-78190-287-5

18.1. Introduction

The historical evolution of survey methods has led to a wide array of protocols that are nowadays available for the transport sector. Technological developments are in fact enabling a growing number of ways to collect data related to mobility behaviours. There is thus an emerging need to understand (1) which uses each method is best suited to and (2) how comparable travel-related data collected through different survey instruments are. Given the complexity and breadth of this topic, the workshop focused on some of the key issues at stake. It drew on the conference papers that fuelled the initial discussion and that are reported in separate chapters of this book (Kagerbauer, Manz, & Zumkeller, 2013; Kohla & Meschik, 2013; Papon, 2013), and on some of the workshop-associated papers (Diana, 2011; Hubrich & Wittwer, 2011; Moutou, Greves, & Puckett, 2011), whose main findings were also discussed during the sessions. In particular, workshop participants debated the following aspects:

(1) Why do we need comparative studies of survey methods? This may seem a somewhat provocative question, but it is the preliminary step to understanding the relevance and the practical implications of what seems a theoretically interesting and even intriguing task, at least for researchers.

(2) Why should we change methods for collecting mobility data? In many countries, there is by now a consolidated practice related to travel surveys, which in some cases dates back several decades. Opportunities linked to technological advances should be weighed against the risks of abandoning well-established paths that have in most cases given satisfactory results up to now.

(3) Given a series of typical features concerning mobility surveys, which survey methods give the best results and which are less suitable for them? In answering this question, workshop participants structured their discussion on the state of the art and on some of the most recent findings of comparative research into travel survey methods.

18.2. Why do We Need Comparative Studies of Survey Methods?

There are at least two good reasons to develop such comparative studies. The first reason takes the point of view of decision makers, who are increasingly paying attention to the new opportunities offered by technology. These are sometimes naively seen as a panacea and as a perfect, even less expensive, substitute for more traditional surveying techniques. Researchers have to check whether the stakeholders’ wish to switch, for example, to passive data collection through GPS can yield data that are comparable with past ones and can be used in the same way for modelling purposes. Conversely, it is also true that an overly conservative attitude among analysts, who merely try to stick to well-consolidated analytic methods, could miss new opportunities and ways of exploiting data from innovative survey protocols more efficiently and effectively.

The second main reason is that different traveller profiles are better suited to answering specific survey instruments. This was largely shown in some of the papers associated with this workshop (Kagerbauer et al., 2013, p. 293) and could prompt a shift to multi-protocol surveys in which each respondent is given more than one option to answer a questionnaire. In such a case, there is a need to understand how comparable data collected in different ways are, since different methods are likely to induce different biases.

In more general terms, it was pointed out during the discussion that comparability issues are crucial for data quality, an aspect that is regrettably not much valued by decision makers. The choice of the method also depends on the specific business case: one recommended practice for agencies funding travel surveys is to include quality criteria in calls for tenders, so that each offer can propose the survey methods that would most efficiently achieve the sought quality level.

Finally, it was noted that comparative research is needed because the best method to collect any kind of data also depends on the purpose of the study, namely the way the resulting data will be exploited. In some cases, data are collected only to study availability and use in transport or to compile descriptive statistics; in other cases they may feed trip-based, activity-based or micro-simulation models. In addition, survey methods are changing rapidly, as is well attested by the online travel survey manual of the TRB travel survey methods committee (http://www.travelsurveymanual.org/), which is in a wiki format so that it can be constantly updated as research progresses.

18.3. Why Should We Change Methods for Collecting Mobility Data?

Travel surveyors are facing increasing pressure to compress their budgets, and new technologies are often simply seen as a way to save money. Yet they have not always saved money, at least in the experience of some of the workshop participants. For example, passive (GPS) data collection methods generate huge amounts of information that are very resource-consuming to process. Data losses due to technical problems are common. If the data need to be manually analysed, as is sometimes done, for example to geocode trip ends or assign travel modes and trip purposes, costs per person-day or per trip are sometimes even higher than for standard travel diaries, and moreover these costs are less easily predictable.

To sum up, changing methods is an opportunity mainly offered by technological development; however, one needs to preserve data quality and comparability while implementing such a shift. Data comparability can be seen both from a longitudinal perspective, that is, between different surveys made over time on the same territory, and from a transversal perspective, to allow the comparison of the same mobility figures across different territories. The latter aspect is quite relevant, for example, in the EU, in order to have harmonized statistics among member states (EUROSTAT is planning an EU-wide travel survey), whereas in the United States local surveys are mainly done for modelling and planning purposes within a given time line. It was suggested that one way to achieve transversal comparability is to make survey funding from a central agency to local authorities conditional on compliance with some standards and protocols, as is the case, for example, in France. Concerning longitudinal comparability, one of the workshop papers triggered a discussion on the extent to which it is possible to ‘mimic’ a panel survey by using retrospective biographic grids (Papon, 2013).

Some recommendations to preserve data comparability when changing survey instrument emerged during the discussion:

• It is recommended to have subsamples that use both the old and the new method or technology and to perform statistical tests for the significance of any observed differences.

• One should be careful to keep the same definitions of the variables to be measured when changing methods. For example, trip definitions should not change simply because it is easier with a GPS to detect any physical movement; otherwise the resulting descriptive statistics would not be comparable across different methods.

• Consider shifting to new technologies only when all the knowledge is there, in particular when it is possible to automatically post-process data in a reliable way. If this cannot be ensured, it could be better to have a mixed approach in which, for example, GPS increases the accuracy of old measures. Creative uses of new technologies can also be envisaged: since it has been observed that survey respondents with a GPS tend to fill in their travel diaries more accurately, one could think about a GPS placebo, that is, an empty box given to respondents and alleged to be a GPS device.

• Beyond technology-induced changes, one should consider that questionnaire contents need in any case to be periodically updated as well, so that comparability issues are a more general problem that arises sooner or later in any case. Therefore, even if variable definitions necessarily evolve (e.g. horses are generally no longer considered a travel mode, and quads have appeared in more recent years), one should try to keep a core structure of the questionnaire that does not change over time.

Another interesting and related issue, mainly discussed in relation to the paper by Hubrich and Wittwer (2011), is the choice between household and personal travel surveys (and possibly the switch from one kind of survey to the other). In Western countries there are many examples of surveys of both kinds, and some countries have changed from household to personal surveys. Yet there is no clear-cut answer as to which is the best method. Household surveys make it easier to capture interaction effects among household members due, for example, to the joint use of cars. Given the importance of this, whenever personal travel surveys are implemented one needs to add questions to gather information on such effects (travelling alone or with other family members, car ownership and use across the whole household, etc.). This is likely to increase the respondent burden in personal surveys, especially when the demographics of household members are not known in advance, and increases the chances of having proxies who give responses for other members. On the other hand, response rates for personal surveys are usually higher.

Concerning household surveys, there are also different sampling strategies regarding the household members to interview: should all of them be interviewed, only a few, or just one? Usually only a subsample is interviewed, even if this is a clustering technique that, like any other, usually worsens estimations. Ultimately, it was agreed that the choice of survey design depends on the likely data use, in particular on the importance of investigating household interaction effects (which are, for example, quite relevant for activity-based models, whereas other techniques focus more on personal behaviours).
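The first recommendation in this section (parallel subsamples with the old and new methods, followed by a significance test) can be illustrated with a two-proportion z-test. This is only a sketch: the counts are hypothetical, the function name is illustrative, and the test assumes simple random subsamples, ignoring survey weights and household clustering, which a real analysis would need to handle.

```python
from math import erf, sqrt


def two_proportion_z_test(x_old, n_old, x_new, n_new):
    """Two-sided z-test for the difference between two proportions.

    x_*: respondents showing the characteristic; n_*: subsample sizes.
    Returns the z statistic and the two-sided p-value.
    """
    p_old, p_new = x_old / n_old, x_new / n_new
    pooled = (x_old + x_new) / (n_old + n_new)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_old + 1 / n_new))
    z = (p_new - p_old) / std_err
    normal_cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))   # standard normal CDF at |z|
    return z, 2 * (1 - normal_cdf)


# Hypothetical example: share of respondents reporting at least one car trip,
# 200 of 400 with the old method (diary) and 230 of 400 with the new one (GPS).
z, p_value = two_proportion_z_test(200, 400, 230, 400)
```

A small p-value would suggest that the two instruments do not measure the indicator in the same way, so that a correction or bridging factor may be needed before comparing results over time.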

18.4. Advantages and Disadvantages of Different Survey Methods under Some Key Aspects Workshop participants preliminarily identified and defined a set of aspects that are related both to the implementation of travel surveys and to the variables that typically need to be measured. Concerning the travel surveys implementation, we considered the following three items:  Sample reach: the first step that is needed to implement a survey is to define the sample. This implies the availability of a sampling frame to adequately represent

Workshop Synthesis: Comparative Research into Travel Survey Methods


the universe, that is, the target population in statistical terms. The possibility of building such a frame and of correctly drawing a random sample (i.e. with known probabilities) is affected by the survey method, although this aspect has probably been less investigated up to now.
- Recruitment: once the sample has been designed, it is necessary to be able to contact the observation units, which for travel surveys are typically either households or individuals. As with the previous item, greater attention should be paid to the implications of different survey methods.
- Response rate: enlisting the cooperation of survey respondents and ensuring their willingness to answer the questionnaire thoroughly and completely has traditionally been one of the key missions of research into travel survey methods. Non-response can affect the representativeness of the sample and should be addressed, the more so when response rates are low.

The items typically investigated in a travel survey, which were also considered in the workshop evaluation exercise, are the following:

- Demographic characteristics, customarily needed for planning purposes to link observed behaviours with more general socioeconomic trends.
- Trips, understood as physical movements between two different places, not only where movement comes to a standstill, but also where (trip) purposes are accomplished.
- Trip ends, that is, the physical locations of trip origins and destinations.
- Trip purposes, as the reasons for making the trips, distinguished by the activities taking place at both origins and destinations.
- Trip lengths, in terms of travelled distance.
- Trip durations or travel times.
- Trip stages or legs, into which a single trip is decomposed when more than one travel means is used.
- Trip fares/tolls, whose consideration is fundamental to correctly model observed behaviours.

The above eleven aspects served as the basis for a comparative evaluation of the following five survey methods:

- PAPI (paper and pencil self-administered interview).
- CAPI (computer-assisted personal interview).
- CATI (computer-assisted telephone interview).
- CAWI (computer-assisted self-administered interview through the web).
- GPS (passive data collection through a global positioning system device).

It should be mentioned that most of these survey methods only work reliably with additional prompted recall strategies; sometimes face-to-face interviews are also needed, for example to retrieve trip purposes in passive GPS-tracking surveys. The overall outcome of the group discussion is represented in the following table, where three stars indicate that the method is well suited in relation to a specific aspect,


Jimmy Armoogum and Marco Diana

Table 18.1: Evaluating the most important survey methods according to some key aspects.

Aspect                        PAPI   CAPI   CATI   CAWI   GPS
Sample reach                  ***    ***    **     *      ***
Recruitment                   **     ***    ***    **     **
Response rate                 *      ***    **     *      **
Demographic characteristics   **     ***    ***    **     –
Trips                         *      **     **     **     ***
Trip ends                     ***    ***    ***    ***    **
Trip purpose                  ***    ***    ***    ***    *
Trip length                   *      **     **     **     ***
Trip duration                 *      **     **     **     ***
Trip stages and modes         *      ***    ***    **     ***
Trip fares/tolls              *      ***    ***    ***    –

*** The method is well suited in relation to the specific aspect.
**  Potential pitfalls of the method in relation to the specific aspect.
*   More serious shortcomings of the method in relation to the specific aspect.
–   Not relevant.

two stars point to some potential pitfalls and one star is assigned when more serious shortcomings were detected. A dash is placed where the aspect is not relevant to the method. Some of the items in the table need no further comment because they are more or less obvious, whereas for others the discussion highlighted aspects that are often overlooked or less well known. In the following we focus on the most insightful and important issues that were pointed out (Table 18.1).

Concerning sample reach, the availability of a proper sampling frame varies widely both across methods and with the local context in which the survey is organized. In most European countries, surveyors can rely on household and/or person registers, so that the sampling frame for PAPI and CAPI methods is easily available, but this might not be the case in other countries. Sampling frames for CATI are typically based on landline phone directories and are therefore readily available virtually anywhere, but in this case the sample reach is declining over time owing to the declining penetration of landlines. One could think about integrating or substituting landline samples with cell phone samples; however, this could cause problems if the sample design departs from simple random sampling with equal probabilities (as is virtually always the case in mobility surveys). In particular, weighting observations could become very complicated, since cell phone lists are not associated with geography. Moreover, despite the higher penetration of cell phones compared with landlines in most countries, sampling biases due to excluding individuals or households that do not have cell phones could be higher than those induced by considering only landline subscribers, at least for some key mobility figures such as car ownership (Diana, 2011).


Sample reach issues related to the use of CAWI are a serious matter, since no sampling frame, such as a list of e-mail addresses for the whole population, is available. Beyond the still low penetration levels in some population segments, which could become a lesser issue in the more or less near future, this appeared to be an almost insurmountable problem of CAWI, making it difficult to implement a survey exclusively through such means. Nevertheless, the advantages of CAWI are well known and were amply debated in previous ISCTSC conferences, making this survey protocol a very attractive and promising option whenever the universe is not the general population but rather specific groups of individuals for which e-mail lists are available (e.g. when organizing mobility surveys for the employees of a given firm to set up travel plans for commuting). For surveys targeting the general population, a strategy that has been explored in recent years is to use CAWI as a secondary mode, that is, giving survey respondents who were sampled with other methods the option to answer an online questionnaire. One of the workshop papers reported on the results of a survey with such an option (Kagerbauer et al., 2013). Finally, concerning GPS, only a subsample is usually provided with such devices in current practice, so that sample reach is not yet an issue. However, at least in perspective, a GPS-based travel survey could draw a sample from the most convenient available frame depending on the specific context (person registries, phone directories, ...), so that this kind of survey protocol should not pose problems in this respect.

Recruitment issues, according to the previous definition, show different patterns compared to sample reach across the considered methods. In general terms, self-administered survey modes (PAPI and CAWI) are not very effective, whereas assisted interviews (CAPI and CATI) are clearly better suited to enlisting cooperation. GPS devices might raise privacy concerns. On the other hand, both the plenary discussion and some research presented during the workshop (Moutou et al., 2011) stressed that the shortcomings of self-administered protocols concerning recruitment can be strongly reduced by getting in touch with the sampling units through the right communication channel. Traditionally, a letter is sent to the individuals to be interviewed, but this might not be the best strategy. In particular, using the same communication channel that will be used to gather the responses has proven effective, so that, for example, for CAWI protocols a good idea might be to advertise the survey through some popular website. When it is possible to persuade respondents that their involvement is in their own interest or ‘for the greater good’, recruitment seems to be more efficient. Financial incentives are said to result in distorted participation across social groups, resulting in unreliable samples.

Response rates, as previously said, have been one of the main fields of investigation of research into survey methods. They are highly dependent on the target population, particularly for self-administered protocols, whereas better results are usually obtained when an interviewer is involved. Personal interviews held after the respondent has agreed to participate and an appointment has been fixed of course give the best results in terms of response rates, whereas interviews over the phone are more subject to both respondent availability and fatigue problems. One key


strategy to improve response rates for CATI is to call respondents back several times, possibly more than ten times, reminding them to complete the task assigned to them.

Demographic characteristics are generally obtained more easily and more reliably in personal interviews, whereas self-administered protocols generally give less accurate data. On the other hand, social psychologists tell us that human interactions between interviewer and interviewee can influence answers to some ‘socially sensitive’ questions, so that ‘cold protocols’ without such interactions, such as self-administered surveys, could lead to better results. This is more of a concern in questionnaires on attitudes and opinions than in those asking for factual elements or actual behaviours, like standard travel surveys, yet specific items such as income could be problematic. Care must be taken in formulating the questions so as to minimize such possible sources of bias, for example by posing questions on sensitive topics more indirectly or by asking for income brackets rather than exact figures.

Concerning the measurement of trips, self-administered questionnaires have the very basic problem of failing to communicate effectively to respondents what is considered a trip. As a consequence, short trips on foot and return trips to home, for example, are quite systematically underreported, while others are inappropriately merged (e.g. considering movements from workplace to shop and then to home as a single trip). The problem is possibly alleviated with CAWI, where a properly coded instrument can run some online checks on the responses, but it can be very troublesome and is often overlooked in more naive PAPI travel surveys. Of course, automatic checks cannot easily intercept all the likely ways in which respondents can misrepresent their travel activities without inducing unnecessary respondent burden, and interactions with interviewers are usually more effective.
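The kind of online check that a CAWI instrument could run on a reported diary can be sketched as follows. This is only an illustration under stated assumptions, not a procedure taken from any of the surveys discussed: the `Trip` record, the `check_diary` function and the three rules (chain continuity, return home, at least one walking trip) are hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class Trip:
    origin: str        # place label, e.g. "home" (hypothetical coding)
    destination: str
    mode: str          # e.g. "car", "walk"


def check_diary(trips, home="home"):
    """Return a list of prompts for a one-day diary (illustrative rules only)."""
    if not trips:
        return ["No trips reported: please confirm that you stayed at home."]

    prompts = []
    # Rule 1: each trip should start where the previous one ended;
    # a break in the chain suggests a missing or merged trip.
    for i, (prev, nxt) in enumerate(zip(trips, trips[1:]), start=1):
        if prev.destination != nxt.origin:
            prompts.append(
                f"Trip {i + 1} starts at '{nxt.origin}' but trip {i} "
                f"ended at '{prev.destination}': is a trip missing?"
            )
    # Rule 2: a day that starts at home usually ends at home.
    if trips[0].origin == home and trips[-1].destination != home:
        prompts.append("No return trip to home is reported: is this correct?")
    # Rule 3: short walking trips are often forgotten.
    if all(t.mode != "walk" for t in trips):
        prompts.append("No trips on foot reported: did you walk anywhere?")
    return prompts


if __name__ == "__main__":
    diary = [Trip("home", "work", "car"), Trip("shop", "home", "car")]
    for prompt in check_diary(diary):
        print(prompt)
```

Each rule produces a question rather than a rejection, since, as noted above, automatic checks cannot catch every misreport without adding to respondent burden.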
Concerning GPS, it usually works well for detecting physical movements; however, translating these data into trips is problematic. In fact, while trip ends are usually easier to survey with active protocols, GPS traces need to be carefully interpreted from this point of view. First, the researcher has to understand whether a prolonged stop is due to an activity being performed (thus marking the destination of a previous trip and the origin of the following one) or to a change of travel mode (therefore defining two stages of the same trip). Then, the physical location has to be associated with an activity through geocoding (if the relevant datasets and related computational procedures are available) and/or by asking the GPS carrier to fill in a trip diary.

Surveying other trip characteristics gives widely different results according to the method used. Trip purposes are linked to the activities performed at different locations and are therefore easily understood and straightforwardly determined through most protocols, with the notable exception of GPS, as discussed above. The situation is almost the opposite for trip lengths, where GPS gives the maximum precision, whereas the difficulty respondents have in estimating this figure is widely documented in the literature, particularly when public transport is used. The same evaluation pattern also holds for trip duration, even if the precision of responses is usually higher. In both cases, both assisted interviews and CAWI can implement devices such as interactive maps that can help the respondent to give a more reliable


estimation of trip lengths and durations, whereas in PAPI questionnaires one can only rely on the capacity of respondents to estimate travel distances and times. More detailed results concerning the evaluation of GPS versus active data collection methods for several different trip characteristics can be found in two of the workshop papers (Kagerbauer et al., 2013; Kohla & Meschik, 2013).

One central trip attribute that needs to be ascertained is the travel mode, which gives rise to more than one trip stage if more than one mode is used. Self-administered surveys give less satisfactory results in this case, since respondents tend to report only the one mode that seems most important to them and usually forget about trip legs on foot. There is no simple way, online or offline, to check their responses. Interviewers, instead, are instructed to insist on this point in order to record all the modes that have been used for a specific trip. GPS can be quite effective in detecting trip stages, provided that it is possible to automatically detect the travel means being used by postprocessing the traces. This is in fact one of the most active research areas in the field.

As a final note, workshop participants acknowledged that all the pros and cons of the different methods discussed above should be weighed against the related implementation costs. While CAPI outperforms the other methods according to the above table, its costs per household are almost twice those of CATI. On the other hand, CAWI and GPS are still to be considered complementary or secondary survey methods. Most interestingly, it was pointed out that CAWI surveys are often more expensive than expected because of the tendency to underestimate the effort needed to carefully design, implement and test the instrument on the one hand and to postprocess the resulting dataset on the other. It is, however, acknowledged that one of the advantages of such a method is the possibility of more easily achieving economies of scale in the case of larger or repeated surveys.
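For readers who want to query the workshop ratings programmatically, Table 18.1 can be encoded as a small lookup structure. The sketch below is an illustration only: mapping stars to the integers 1 to 3 (and the dash to `None`) is an assumption of this example, the ratings are qualitative workshop judgments rather than measurements, and the function name `best_methods` is hypothetical.

```python
ASPECTS = [
    "Sample reach", "Recruitment", "Response rate",
    "Demographic characteristics", "Trips", "Trip ends", "Trip purpose",
    "Trip length", "Trip duration", "Trip stages and modes",
    "Trip fares/tolls",
]

# Number of stars per aspect, in the order of ASPECTS; None means "not relevant".
RATINGS = {
    "PAPI": [3, 2, 1, 2, 1, 3, 3, 1, 1, 1, 1],
    "CAPI": [3, 3, 3, 3, 2, 3, 3, 2, 2, 3, 3],
    "CATI": [2, 3, 2, 3, 2, 3, 3, 2, 2, 3, 3],
    "CAWI": [1, 2, 1, 2, 2, 3, 3, 2, 2, 2, 3],
    "GPS":  [3, 2, 2, None, 3, 2, 1, 3, 3, 3, None],
}


def best_methods(aspect):
    """Return the methods with the highest star rating for one aspect."""
    idx = ASPECTS.index(aspect)
    rated = {m: stars[idx] for m, stars in RATINGS.items()
             if stars[idx] is not None}
    top = max(rated.values())
    return [m for m, s in rated.items() if s == top]


if __name__ == "__main__":
    for aspect in ASPECTS:
        print(f"{aspect}: {', '.join(best_methods(aspect))}")
```

Such a lookup deliberately ignores implementation costs, which, as discussed above, should be weighed against the ratings before choosing a method.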

References

Diana, M. (2011). Relationships among household communication tools ownership and use, ICT skills and mobility patterns: Evidence from Italy. Poster presented at 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile (14–18 November).
Hubrich, S., & Wittwer, R. (2011). Household or individual as unit of analysis? Consequences of different interview selection strategies concerning travel behaviour, using the example of the survey ‘Mobilität in Städten – SrV’ (Mobility in Cities – SrV). Poster presented at 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile (14–18 November); paper available in the conference proceedings.
Kagerbauer, M., Manz, W., & Zumkeller, D. (2013). Analysis of PAPI, CATI, and CAWI methods for a multiday household travel survey. In J. Zmud, M. Lee-Gosselin, M. Munizaga & J. A. Carrasco (Eds.), Transport survey methods: Best practice for decision making (pp. 289–304). Bingley, UK: Emerald Group Publishing Ltd.
Kohla, B., & Meschik, M. (2013). Comparing trip diaries with GPS tracking: Results of a comprehensive Austrian study. In J. Zmud, M. Lee-Gosselin, M. Munizaga & J. A. Carrasco (Eds.), Transport survey methods: Best practice for decision making (pp. 305–320). Bingley, UK: Emerald Group Publishing Ltd.


Moutou, C., Greves, S., & Puckett, S. (2011). Responses to sustainable transport initiatives: A small business survey. Poster presented at 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile (14–18 November); paper available in the conference proceedings.
Papon, F. (2013). Correcting biographic survey data biases to compare with cross-section travel surveys. In J. Zmud, M. Lee-Gosselin, M. Munizaga & J. A. Carrasco (Eds.), Transport survey methods: Best practice for decision making (pp. 321–335). Bingley, UK: Emerald Group Publishing Ltd.

THEME 4 FACING UP TO SAMPLE ATTRITION IN LONGITUDINAL SURVEYS

Chapter 19

Optimal Sampling Designs for Multi-Day and Multi-Period Panel Surveys

Makoto Chikaraishi, Akimasa Fujiwara, Junyi Zhang and Dirk Zumkeller

Abstract

Purpose: This study proposes an optimal survey design method for multi-day and multi-period panels that maximizes the statistical power of the parameter of interest under the condition that non-linear changes in response to a policy intervention can be expected over time.

Design/methodology/approach: The proposed method addresses the balance among sample size, survey duration for each wave and frequency of observation. Higher-order polynomial changes in the parameter are also addressed, allowing us to calculate optimal sampling designs for non-linear changes in response to a given policy intervention.

Findings: One of the most important findings is that the variation structure of the behaviour of interest strongly influences how surveys should be designed to maximize statistical power, while the type of policy to be evaluated has much less influence. Empirical results obtained using German Mobility Panel data indicate that, as the non-linearity of the changes in response to a policy intervention increases, not only are more data collection waves needed, but longer multi-day periods of behavioural observation per wave are needed as well.

Originality/value: This study extends previous studies on sampling designs for travel diary surveys by dealing with statistical relations between sample size,

Transport Survey Methods: Best Practice for Decision Making. Copyright © 2013 by Emerald Group Publishing Limited. All rights of reproduction in any form reserved. ISBN: 978-1-78190-287-5


Makoto Chikaraishi et al.

survey duration for each wave, and frequency of observation, and provides numerical and empirical results to show how the proposed method works.

Keywords: Multi-day and multi-period panel; sampling designs; German mobility panel; statistical power; non-linear changes

19.1. Introduction

When a travel (or activity) diary survey is designed, we must determine sampling procedures, questionnaire items, financial schemes and so on. Academic research in the transportation field has concentrated more on how to use data (e.g. behavioural modelling) than on how to collect it (e.g. survey design). It may be time to pay greater attention to survey design issues, because, unlike model development, data collection is often time sensitive. For some information (such as information that will only be in short-term memory for a limited time), if we miss the opportunity to collect the data, retrospective surveys will not be able to capture the detailed behavioural data of interest. Recent travel surveys have often been smaller in scale (partly because of financial difficulties), and in such situations we have to carefully consider how to design smaller surveys while minimizing the loss of necessary information. This study approaches such survey design issues from a statistical perspective.

In the near future, many developed countries will face decreasing populations (indeed, countries such as Germany, Italy and Japan already face the problem), which may trigger a number of microscopic and macroscopic changes. In this situation, it may not be appropriate to apply one-day data to demand forecasting, in which longitudinal trends and changes in behaviour are extrapolated from cross-sectional data (Kitamura, 1990). Instead, longitudinal data would be more appropriate for demand forecasting and policy evaluation, especially for cases in which changes are expected over time (see, e.g. Goodwin, 1998). In this context, it is suggested that multi-day and multi-period panel survey data can provide the information we need to represent behavioural variations and changes in model development and policy evaluation (Pendyala & Pas, 2000). The Dutch Mobility Panel, the Puget Sound Transportation Panel and the German Mobility Panel are prominent examples of multi-day and multi-period panel surveys. These survey data have been used in a number of studies and have not only improved our understanding of activity–travel behaviour, but also provided fundamental data that have been used to establish new theoretical foundations.

However, there are still relatively few empirical investigations of multi-day and multi-period panels. The reasons for this may include: (1) conducting such longitudinal surveys is seen as more expensive than conducting cross-sectional surveys; (2) such surveys may require more complicated institutional arrangements; (3) there may be a certain difficulty in designing and maintaining a panel and (4) the advantages and disadvantages of applying such complicated survey data are not clear. Considering these concerns, it would be useful to clarify, for a given budget, how, for instance, parameter accuracy is changed by shifting from cross-sectional surveys to multi-day and multi-period surveys, what kind of

Optimal Sampling Designs for Panel Surveys


trade-off exists between survey cost reduction and parameter accuracy improvement, and which survey design components (sample size, survey duration, number of waves etc.) should be changed to reduce survey cost while minimizing the loss of parameter accuracy. In other words, the development of cost-effective survey designs could be one way to encourage the use of multi-day and multi-period panel surveys.

We should mention here that several transportation studies have focused on developing effective travel diary survey designs. Pas (1986) established the optimal length (in days) for multi-day panel surveys and underscored the substantial benefits of multi-day panel surveys for reducing data collection costs and/or improving the precision of parameter estimates. Kitamura, Yamamoto, and Fujii (2003) focused on the design of multi-period panel surveys in the context of discrete travel behaviours, and concluded that continuous behavioural observations are needed to detect changes in behaviour. This implies that, to identify changes in behaviour, we may have to explicitly distinguish between short-term variability and long-term changes, especially in practical situations (e.g. applying one-day data to forecasting involves longitudinal extrapolation of cross-sectional variability). Because multi-day data contain information on short-term variability and multi-period data contain information on long-term changes, using both multi-day and multi-period panel data could be one of the solutions to this problem.

Based on the above considerations, this study attempts to develop a method for determining the optimal design of multi-day and multi-period panel surveys under the condition that non-linear changes in response to a policy intervention can be expected over time. Specifically, we describe the trade-offs between (1) the observed duration of each wave (i.e. how many consecutive days respondents should report their behaviour for each wave); (2) the interval between successive waves (i.e. how frequently their behaviour is observed) and (3) the sample size, focusing on the statistical power of the parameter estimate. The proposed method is based on existing methods developed in other fields. In particular, the methodological framework for managing longitudinal sampling designs developed by Raudenbush and Xiao-Feng (2001) is fundamental to the current study. However, a straightforward application of this method is hampered by the substantial day-to-day variations in travel behaviour; such substantial fluctuations of objective variables generally do not appear when this method is used in other fields. To handle such fluctuations, we include a multi-level modelling technique in the Raudenbush and Xiao-Feng (2001) methodological framework, which allows us to distinguish between inter-individual and intra-individual variances. This extended method has the same structure as ‘the cluster randomized trials with repeated measures’ described by Spybrook, Raudenbush, Congdon, and Martinez (2011). In this study, we further extend the methodology by introducing a budget constraint. We believe that this is the first work to focus on the optimal design of multi-day and multi-period travel diary panel surveys.

After illustrating the methodological framework for optimal panel survey designs and showing some numerical simulation results, we present an empirical application, using German Mobility Panel data, that focuses on trip generation behaviour. Although the findings of this study are only applicable when the statistical power of a particular parameter is the focus of the


survey, the clarified trade-offs among several survey elements could be a useful guide for policymakers who must make difficult survey design decisions. The next section reviews previous studies focusing on optimal survey designs. In Section 19.3, a method for optimal survey design of multi-day and multi-period panels is described. Then, following numerical simulations based on the proposed method, we show the empirical results based on the German Mobility Panel. In the final section, key conclusions and future tasks are summarized.

19.2. Literature Review

19.2.1. Methods for Panel Survey Design

Although there are some discussions of optimal panel survey design in the transportation field (e.g. Lawton & Pas, 1996; Pendyala & Pas, 2000), there is little methodological research on optimal survey designs for multi-day and multi-period panels. On the other hand, methodological studies of panel survey designs have been published in other fields, including statistics, psychology, and medical science, as well as in the social, biomedical, and educational research fields. One of the important early studies was done by Hansen, Hurwitz, and Madow (1953). They proposed an optimal survey design method for cluster sampling (e.g. cluster randomized trials). Although cluster sampling is known to be inefficient because the data obtained from a given cluster are generally correlated with each other (and thus there is a certain loss of information), Hansen et al. (1953) showed that this inefficiency may be offset by the reduced survey costs associated with collecting data from the same cluster. In fact, the optimal travel diary survey design method proposed by Pas (1986) is a straightforward extension of Hansen et al.’s (1953) method to multi-day panel surveys, in which the cluster is an individual and each observation is made at the person–day level. Such cluster sampling-based methods have been further developed in a multi-level modelling approach, especially by researchers in education and psychology (Berger & Wong, 2009; Hox, 2010; Moerbeek, Van Breukelen, & Berger, 2010; Snijders, 2005). Snijders and Bosker (1993) developed optimal sampling designs for a general two-level linear model, and Raudenbush (1997) presented optimal survey designs for identifying the effect of a policy intervention at the cluster level. Raudenbush (1997) also showed that a covariate can substantially increase the efficiency of cluster sampling. Moerbeek (2006) introduced a cost function to describe trade-offs between using a covariate and increasing sample size. Cohen (1998) developed optimal multi-level survey designs for the estimation of variances of unobserved components, and Cohen (2005) did the same for situations in which intraclass correlation was the primary interest. Moineddin, Matheson, and Glazier (2007) simulated the properties of optimal survey designs for multi-level logistic regression.

The studies reviewed thus far have focused on time-invariant aspects of behaviour. Schlesselman (1973) conducted an important initial study of optimal survey designs for changing behaviour by using multi-period panels. The study examined the proper balance in a longitudinal survey between the frequency of measurements


and study duration. The results showed that a unit increase in the study duration reduced the standard error of a parameter estimate more than did a unit increase in the frequency of measurements. Raudenbush and Xiao-Feng (2001) extended Schlesselman’s approach to include higher-order polynomial effects. Bloch (1986) introduced a cost function for optimal multi-period panel survey designs, and presented optimal survey designs that tried to strike a balance between additional subjects and additional measurements for each subject. Winkens, Schouten, van Breukelen, and Berger (2005) focused on the optimal time points for repeated measurements under various covariance structures and found that the commonly used design, with equally spaced measures, is not optimal under certain conditions. Basagaña and Spiegelman (2009) observed that existing longitudinal research has assumed that exposure (in our context, policy intervention) is time invariant, and they proposed new survey design methods that allow policy interventions to vary with time. For repeated-measurement survey designs, several studies have addressed panel-specific issues, including dropouts and missing data (e.g. Muthen & Curran, 1997; Galbraith, Stat, & Marschner, 2002).
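The cost versus correlation trade-off in cluster sampling mentioned above can be illustrated with the standard textbook result for two-stage sampling with equal cluster sizes. The sketch below is not the exact formulation of Hansen et al. (1953) or Pas (1986); it assumes a variance of the sample mean proportional to [1 + (m − 1)ρ]/(nm), a linear cost of c1 per cluster (here, per recruited person) plus c2 per observation (here, per diary day), and an intraclass correlation ρ strictly between 0 and 1. All function and parameter names are hypothetical.

```python
import math


def design_effect(m, rho):
    """Variance inflation from observing m correlated units per cluster."""
    return 1.0 + (m - 1) * rho


def optimal_cluster_size(cost_per_cluster, cost_per_obs, rho):
    """Observations per cluster minimizing the variance for a fixed budget.

    Textbook result: m* = sqrt((c1 / c2) * (1 - rho) / rho), for 0 < rho < 1.
    """
    return math.sqrt(cost_per_cluster / cost_per_obs * (1.0 - rho) / rho)


def variance_of_mean(budget, m, cost_per_cluster, cost_per_obs, rho,
                     sigma2=1.0):
    """Variance of the sample mean when the whole budget is spent."""
    n_clusters = budget / (cost_per_cluster + m * cost_per_obs)
    return sigma2 * design_effect(m, rho) / (n_clusters * m)


if __name__ == "__main__":
    # Illustrative numbers only: recruiting a person costs 16 units,
    # each extra diary day costs 1 unit, day-to-day correlation is 0.5.
    m_star = optimal_cluster_size(16.0, 1.0, 0.5)
    print("optimal days per person:", m_star)
    for days in (1, 3, 4, 5, 7):
        print(days, variance_of_mean(1000.0, days, 16.0, 1.0, 0.5))
```

The lower the correlation between days and the higher the recruitment cost relative to the cost of an extra day, the more days per person this simple model favours, which is the intuition behind multi-day designs.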

19.2.2. Application to Activity–Travel Diary Surveys

There is also a history of survey design in the transportation field for activity–travel diary surveys (Stopher, 2009). In the early period, the sample sizes for home interview surveys were generally defined as a percentage of the population, and often ranged from 1% to 3% of the population (in Japan, slightly higher percentages are used, with calculations based on the idea of Relative Standard Deviation (JSTE, 2008)). Then, because of increasing survey costs and a better understanding of sampling statistics, sample sizes dropped in the 1970s, and they were no longer calculated as a percentage of the population. Smith (1979) published one of the pioneering works on sample size reduction, claiming that 900–1200 respondents constituted a sufficient sample size. From the 1980s on, multi-day panel and/or multi-period panel surveys have been popular because they allow us to describe dynamic travel behaviour with both short- and long-term variability (Pendyala & Pas, 2000). Smart (1984) and Pas (1986) discussed optimal survey designs for multi-period panels and multi-day panels, respectively. However, to the best of our knowledge, there is no transportation research on optimal travel survey designs for multi-day and multi-period panels, although, as mentioned above, there are plenty of panel survey design studies in other fields. One of the crucial reasons why there is little study of this topic in the transportation field might be that there is relatively little data on changes in travel behaviour. Such knowledge is generally a prerequisite for determining optimal survey designs (e.g. intuitively speaking, we may need more behavioural observations when there are substantial changes in behaviour). In recent years, with the increasing availability of longitudinal data and the development of modelling methods, a number of studies have explored changes in various aspects of behaviour (e.g.
Chikaraishi, Fujiwara, Zhang, Axhausen, & Zumkeller, 2011; Chikaraishi, Zhang, Fujiwara, & Axhausen, 2010; Kitamura, Yamamoto, Susilo, & Axhausen, 2006; Pas, 1987; Pas & Sundar, 1995; Pendyala, 1999). By

354

Makoto Chikaraishi et al.

survey designs for multi-day and multi-period panels, followed by development of a method for the optimal design of survey panels.

19.3. Optimal Survey Design for Panel Surveys

The methodological foundation for the current study was formulated by a series of studies in other fields (Raudenbush, 1997; Raudenbush and Xiao-Feng, 2001; Schlesselman, 1973; Spybrook et al., 2011). Although applying these methods directly to optimal designs for multi-day and multi-period travel surveys may be worthwhile, we extend this research by introducing a cost function: we set up an explicit maximization problem under a given budget constraint.

19.3.1. Basic Concept

In this study, the term ‘survey design’ refers to the design of a multi-day and multi-period survey where (1) the observed duration of each wave is denoted by D (days); (2) the frequency of the survey is denoted by F (waves per year); and (3) the sample size is denoted by N (persons). Note that the frequency F is directly related to the overall survey period G and the total number of waves T. Specifically, F can be defined as (T − 1)/G. For example, when the total number of waves T is 6 and the survey period G is 10 years, the frequency F is equal to 1/2, that is, the survey would be conducted once every two years. Thus, when two of these terms are set, the remaining one is automatically determined. In this study, for simplicity, G is fixed, and thus identifying F is the same as identifying T. Hereinafter, we mainly use T instead of F, but essentially there is no difference. Looking at real examples, the fourth person trip survey in the Tokyo metropolitan region applied {N, T, D} = {883044, 1, 1}, the Mobidrive survey (Axhausen, Zimmermann, Schonfelder, Rindsfuser, & Haupt, 2002) applied {361, 1, 42}, and the German Mobility Panel (Zumkeller, 2009) applied {1800, 17, 7} (this is a rotating panel survey; see Section 19.5.1 for details). Survey designs vary from survey to survey, probably depending on the purpose, and thus determining the purpose is the initial step in survey design. In this study, we set ‘capturing non-linear changes in response to a certain policy intervention’ as the main purpose of the survey. Specifically, we want to maximize statistical power for the parameter that represents the degree to which policy intervention affects the average rate of change, the rate of acceleration, and higher-degree polynomial effects. Thus, the term ‘optimal’ survey design refers to a survey design {N, T, D} that maximizes the statistical power of the response to policy intervention.
The basic concept for the multi-day and multi-period survey design is presented in Figure 19.1. In the discussions on optimal survey designs, two different behavioural aspects are considered: policy response (i.e. the 1st-order moment of behaviour) and behaviour variations (i.e. the 2nd-order moment of behaviour). Although capturing the former aspect is our main interest, we will provide evidence that the latter significantly affects the optimal survey designs.

Figure 19.1: Basic concept of survey designs for multi-day and multi-period panel. (Diagram: response to policy intervention plotted against time; each of the T waves, t = 1, 2, 3, ..., covers a sample of size N over a duration of D days, d = 1, 2, ...; periods are labelled ‘Observed’ or ‘Unobserved’.)

19.3.2. Assumptions

The main assumptions in this study are as follows:

1. The objective variable is continuous.
2. Policy intervention is randomly assigned to N/2 individuals.
3. The survey perfectly follows a random sampling procedure.
4. There are no panel-specific problems, such as panel fatigue or dropouts.
5. There are equidistant intervals between successive waves.
6. There is a hierarchical covariance structure (see Section 19.3.3).
7. There is a time-invariant population.

As mentioned in the previous section, a number of methods could be used to relax these assumptions, such as using a logit-type model (Moineddin et al., 2007) for Assumption 1 or introducing an autoregressive covariance structure (Winkens et al., 2005) for Assumption 6. However, we use these assumptions to make the discussion simple and clear. Future extensions of this method may help to strengthen its practical application to the design of similar surveys.

19.3.3. Model Formula

In this study, the following three-level model is employed:

$$Y_{tdi} = \sum_{p=0}^{P} \alpha_{pdi} c_{pt} + e_{tdi} \tag{19.1}$$

$$\alpha_{pdi} = \beta_{pi} + u_{pdi} \tag{19.2}$$

$$\beta_{pi} = \gamma_{p0} + \gamma_{p1} W_i + v_{pi} \tag{19.3}$$

where $Y_{tdi}$ is a dependent variable observed from individual i (= 1, 2, ..., N), on day d (= 1, 2, ..., D), in wave t (= 1, 2, ..., T). Let $e_{tdi}$, $u_{pdi}$ and $v_{pi}$ be normally distributed, with means 0 and variances $\sigma^2$, $\tau_{\alpha p}$ and $\tau_{\beta p}$, respectively. These random components can be regarded as inter-wave variation (i.e. variation associated with the repeated measures), intra-individual variation and inter-individual variation, respectively. The unknown parameters $\alpha_{pdi}$, $\beta_{pi}$, $\gamma_{p0}$ and $\gamma_{p1}$ have a hierarchical relation: $\alpha_{pdi}$ is the intra-individual-level coefficient, $\beta_{pi}$ is the inter-individual-level coefficient, and $\gamma_{p0}$ is the grand mean, for the pth-order polynomial change parameter $c_{pt}$. The term $\gamma_{p1}$ is the response to policy intervention $W_i$, which is a policy intervention indicator set at 1/2 for those who have experienced a policy intervention, and at −1/2 otherwise. Our main interest here is to find optimal survey designs that maximize the statistical power of parameter $\gamma_{p1}$. Because the optimal design depends on the order of polynomial change p, this study derives three different optimal survey designs (i.e. up to the 3rd order of polynomial change, P = 3). To do this, we adopt the method used in Raudenbush and Xiao-Feng (2001), which employs orthogonal polynomial contrast coefficients to simplify the computation of statistical power. Specifically, we set the orthogonal polynomial contrast coefficients $c_{pt}$ as follows:

$$c_{0t} = 1 \tag{19.4}$$

$$c_{1t} = t - \sum_{t=1}^{T} t / T \tag{19.5}$$

$$c_{2t} = \frac{1}{2}\left(c_{1t}^2 - \sum_{t=1}^{T} c_{1t}^2 / T\right) \tag{19.6}$$

$$c_{3t} = \frac{1}{6}\left(c_{1t}^3 - c_{1t}\,\frac{\sum_{t=1}^{T} c_{1t}^4}{\sum_{t=1}^{T} c_{1t}^2}\right) \tag{19.7}$$
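The contrast coefficients can be checked numerically. The following is a minimal sketch, assuming equally spaced waves numbered 1 to T and the centred sums written in Eqs. (19.5)–(19.7); the function name `orthogonal_contrasts` is hypothetical and introduced only for illustration.

```python
from fractions import Fraction


def orthogonal_contrasts(T):
    """Return (c1, c2, c3) for T >= 2 equally spaced waves, Eqs. (19.5)-(19.7)."""
    waves = range(1, T + 1)
    t_bar = Fraction(sum(waves), T)          # mean wave index
    c1 = [Fraction(t) - t_bar for t in waves]
    m2 = sum(c ** 2 for c in c1)             # sum of squared linear contrasts
    m4 = sum(c ** 4 for c in c1)             # sum of fourth powers
    c2 = [(c ** 2 - m2 / T) / 2 for c in c1]
    c3 = [(c ** 3 - c * m4 / m2) / 6 for c in c1]
    return c1, c2, c3


c1, c2, c3 = orthogonal_contrasts(13)
print([float(c) for c in c1])
print([float(c) for c in c2])
print([float(c) for c in c3])
```

Exact fractions are used so that the orthogonality of the three sets (zero cross-products) can be verified without rounding error. For T = 13 the sketch returns the coefficient sets quoted in Section 19.5.1.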

19.3.4. Statistical Power

As mentioned above, in this study, the statistical power of the response to policy intervention $\gamma_{p1}$ is maximized. We define the null hypothesis as $H_0: \gamma_{p1} = 0$ and the alternative hypothesis as $H_1: \gamma_{p1} \neq 0$. The variance of $\hat{\gamma}_{p1}$ is defined as follows (see Spybrook et al., 2011):

$$\mathrm{Var}(\hat{\gamma}_{p1}) = \frac{4\left[\tau_{\beta p} + (\tau_{\alpha p} + V_p)/D\right]}{N} \tag{19.8}$$

where

$$V_p = \frac{\sigma^2}{\sum_{t=1}^{T} c_{pt}^2} = \frac{\sigma^2 F^{2p} (T - p - 1)!}{K_p (T + p)!} \tag{19.9}$$

$K_p$ is a constant term, where $K_1 = 1/12$, $K_2 = 1/720$ and $K_3 = 1/100{,}800$. Here, because statistical power is the probability that the test will reject the null hypothesis when the null hypothesis is false, we can set the probability as follows:

$$P_{1-\beta} = 1 - \Pr\left[F_0 \geq F(1, N - 2; \lambda)\right] \tag{19.10}$$

$$\lambda = \frac{\gamma_{p1}^2}{\mathrm{Var}(\hat{\gamma}_{p1})} = \frac{N \gamma_{p1}^2}{4\left[\tau_{\beta p} + (\tau_{\alpha p} + V_p)/D\right]} \tag{19.11}$$

where $F_0$ is the critical value of F(1, N − 2), which follows the central F distribution, while F(1, N − 2; λ) follows the non-central F distribution with non-centrality parameter λ. Maximizing Eq. (19.10) under a given budget is our main objective here.
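Eqs. (19.8), (19.9) and (19.11) can be evaluated directly, as in the minimal sketch below. Two points are assumptions rather than part of the method described above: the function names are hypothetical, and `approx_power` replaces the non-central F distribution of Eq. (19.10) with a large-sample normal approximation (reasonable only when N − 2 is large), so its output is indicative rather than exact. The input values are those of the basic simulation settings in Table 19.1.

```python
from math import factorial, sqrt
from statistics import NormalDist

K = {1: 1 / 12, 2: 1 / 720, 3: 1 / 100800}  # constants K_p from Eq. (19.9)


def v_p(p, T, F, sigma2):
    """Inter-wave contribution V_p, closed form of Eq. (19.9)."""
    return sigma2 * F ** (2 * p) * factorial(T - p - 1) / (K[p] * factorial(T + p))


def noncentrality(p, N, T, D, F, sigma2, tau_a, tau_b, gamma):
    """Non-centrality parameter lambda, Eqs. (19.8) and (19.11)."""
    var_gamma = 4 * (tau_b + (tau_a + v_p(p, T, F, sigma2)) / D) / N
    return gamma ** 2 / var_gamma


def approx_power(lam, alpha=0.05):
    """Two-sided power under a normal approximation (an assumption here,
    not the exact non-central F calculation of Eq. (19.10))."""
    nd = NormalDist()
    z = nd.inv_cdf(1 - alpha / 2)
    return nd.cdf(sqrt(lam) - z) + nd.cdf(-sqrt(lam) - z)


lam = noncentrality(p=1, N=200, T=6, D=14, F=1.0,
                    sigma2=20.0, tau_a=5.0, tau_b=5.0, gamma=0.2 * 10 ** 0.5)
print(lam, approx_power(lam))
```

An exact calculation would need the quantile of the central F distribution and the distribution function of the non-central F distribution, which are available in statistical libraries but not in the Python standard library.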

19.3.5. Survey Cost Function

Larger sample sizes, longer durations, and more frequent measurements will increase statistical power, but they also increase the cost of data collection. Although there are many possible cost functions for multi-period and multi-day panel surveys (depending upon the costs associated with each of these parameters), in this study, we used the following cost function:

$$C = C_0 + C_N N + C_D D N + C_T T N \tag{19.12}$$

where C is the total survey cost, $C_0$ is the initial set-up cost of the survey, $C_N$ is the cost of recruiting an individual, $C_D$ is the cost of adding one observed day per individual, and $C_T$ is the cost of adding one observed time point (or wave) per individual. It should be noted that we have set this cost function arbitrarily. For example, the marginal cost of extending the duration of a survey per individual may not be constant for consecutive additions. In future, it may be necessary to identify survey cost functions empirically, for example through a form of meta-analysis.

19.3.6. Maximizing Statistical Power under Budget Constraints

Based on the above-mentioned settings, we set the following maximization problem:

$$\max \; P_{1-\beta}(N, T, D) \tag{19.13}$$

$$\text{s.t.} \quad B \geq C, \quad N > 0, \quad D > 0, \quad T \geq p + 1, \qquad C = C_0 + C_N N + C_D D N + C_T T N \tag{19.14}$$

where B is the total budget that can be used for the survey. The number of waves T should be at least p + 1. This is because, when we try to capture a linear change (i.e. p = 1), at least two time-point observations are needed. Thus, this constraint represents the minimum number of waves required to capture the pth-order polynomial change. The output of this maximization problem is the optimal survey design {N, T, D}. We should note here that another optimization problem, namely minimizing survey cost while achieving a given level of statistical power, can also be developed in a way similar to the maximization problem described in Eqs. (19.13) and (19.14). The choice between these optimization problems might depend on the circumstances. In this study, we deal with the situation in which a fixed survey budget is given.
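The structure of the problem in Eqs. (19.13) and (19.14) can be illustrated with a coarse grid search, sketched below under several simplifying assumptions that are not part of the method above: D and T are restricted to integers, the whole remaining budget is spent on respondents, and the non-centrality parameter λ of Eq. (19.11) is maximized as a proxy for power (this ignores the dependence of the degrees of freedom on N). The function names, the grid limits `max_T` and `max_D`, and the example settings (taken from Tables 19.1 and 19.2) are chosen for illustration only; the output should not be read as a replication of the chapter's results.

```python
from math import factorial

K = {1: 1 / 12, 2: 1 / 720, 3: 1 / 100800}  # constants K_p from Eq. (19.9)


def survey_cost(N, T, D, C0, CN, CD, CT):
    """Total survey cost, Eq. (19.12)."""
    return C0 + CN * N + CD * D * N + CT * T * N


def noncentrality(p, N, T, D, G, sigma2, tau_a, tau_b, gamma):
    """Non-centrality parameter of Eq. (19.11), with F = (T - 1) / G."""
    F = (T - 1) / G
    v = sigma2 * F ** (2 * p) * factorial(T - p - 1) / (K[p] * factorial(T + p))
    return N * gamma ** 2 / (4 * (tau_b + (tau_a + v) / D))


def grid_search(p, B, C0, CN, CD, CT, G, sigma2, tau_a, tau_b, gamma,
                max_T=30, max_D=30):
    """Best integer design (lambda, N, T, D) that fits within budget B."""
    best = None
    for T in range(p + 1, max_T + 1):        # constraint T >= p + 1
        for D in range(1, max_D + 1):
            # Spend the whole remaining budget on respondents.
            N = (B - C0) // (CN + CD * D + CT * T)
            if N < 3:                        # need N - 2 > 0 degrees of freedom
                continue
            lam = noncentrality(p, N, T, D, G, sigma2, tau_a, tau_b, gamma)
            if best is None or lam > best[0]:
                best = (lam, N, T, D)
    return best


settings = dict(B=5_000_000, C0=2_000_000, CN=1000, CD=500, CT=200, G=5,
                sigma2=20.0, tau_a=5.0, tau_b=5.0, gamma=0.2 * 10 ** 0.5)
for p in (1, 2, 3):
    print(p, grid_search(p, **settings))
```

A continuous optimizer over {N, T, D} with the exact power function would be closer to the formulation above; the grid search is used here only because it makes the budget constraint and the lower bound on T explicit.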

19.4. Numerical Simulation

19.4.1. Basic Settings

Before describing the empirical study, we report numerical simulations we conducted to confirm the behaviour of each parameter. The basic parameter settings for the simulations are shown in Table 19.1. Here, the degree of the response to policy intervention is defined based on the following standardized effect size $\delta_p$:

$$\delta_p = \frac{\gamma_{p1}}{\sqrt{\tau_{\beta p} + \tau_{\alpha p}}} \tag{19.15}$$

We set the standardized effect size to 0.2, resulting in $\gamma_{p1} = 0.2 \times 10^{1/2}$. In the following subsections, we change one of the parameters in Table 19.1 at a time and confirm how each parameter affects statistical power, since the estimated parameters may vary depending on which behavioural aspect we focus on (Chikaraishi, Fujiwara, Zhang, & Axhausen, 2009). In Sections 19.4.2 and 19.4.3, we calculate statistical power based on Eqs. (19.10) and (19.11) to confirm the impacts of changes in the parameters, and in Section 19.4.4 the maximization problem shown in Eqs. (19.13) and (19.14) is used to check the impacts of changes in cost parameters. Finally, the impacts of effect size on optimal survey designs are identified in Section 19.4.5.

Table 19.1: Basic parameter settings for numerical simulation.

Item                                        Symbol    Value
Survey designs (objective variables)
  Sample size (person)                      N         200
  Observed duration of each wave (day)      D         14
  Total number of waves                     T         6
  Total survey period (year)                G         5 (i.e. F = 1)
Parameters in the model
  Unobserved inter-wave variation           σ²        20
  Unobserved intra-individual variation     τα        5
  Unobserved inter-individual variation     τβ        5
  Response to policy intervention           γp1       0.2 × 10^(1/2)
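As a worked check of the basic settings, the response parameter follows from Eq. (19.15) and the variance components in Table 19.1, and the survey frequency follows from the definition F = (T − 1)/G in Section 19.3.1:

```latex
\gamma_{p1} = \delta_p \sqrt{\tau_{\beta p} + \tau_{\alpha p}}
            = 0.2 \times \sqrt{5 + 5}
            = 0.2\sqrt{10} \approx 0.632,
\qquad
F = \frac{T - 1}{G} = \frac{6 - 1}{5} = 1 .
```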

19.4.2. Survey Duration, the Number of Waves and Sample Size

To begin, the impacts of each survey design element on statistical power are addressed. Figure 19.2 shows the results by the order of polynomial change. From the results, we can confirm first that increases in survey duration, the number of waves and sample size increase statistical power. We also confirm that the higher the order of polynomial change, the longer the survey duration for each wave, the more waves, and/or the larger the sample size needed to obtain the same statistical power. This implies that, when a survey is conducted during a period when non-linear changes can be expected, richer behavioural observation is needed to obtain a given level of statistical power. In addition, the results show that marginal returns for statistical power basically decrease with increases in N, T and D, whereas a higher order of polynomial change retains higher marginal returns even when these design parameters become larger. This indicates that conducting richer multi-day and multi-period surveys would be worthwhile, especially when non-linear changes are expected.

19.4.3. Unobserved Variations

The impacts of changes in unobserved variations on statistical power are shown in Figure 19.3. The results indicate that higher inter-wave, intra-individual and inter-individual variations consistently reduce statistical power. We also confirmed that, in the case of inter-wave variation, the loss of power is greatest for the highest order of polynomial change (i.e. p = 3). On the other hand, for intra- and inter-individual variations, the larger impact is observed for the lower orders of polynomial change under the current parameter settings. Based on these results, we can say that, with greater intra- and inter-individual variability in behaviour, richer multi-day and multi-period survey designs are needed (i.e. longer survey durations for each wave, more waves, and/or larger sample sizes). In addition, because behavioural variability might differ for different aspects of behaviour, optimal survey design may strongly depend on the specific aspects of behaviour studied, emphasizing the need to clearly define the behaviours of interest before designing the survey.

Figure 19.2: Statistical power versus survey design components for each order of polynomial change. (Panels for p = 1, 2 and 3; vertical axis: power; horizontal axes: observed duration of each wave, number of waves, and sample size.)

Figure 19.3: Statistical power versus unobserved variations for each order of polynomial change. (Panels for p = 1, 2 and 3; vertical axis: power; horizontal axes: inter-wave variation, intra-individual variation, and inter-individual variation.)

19.4.4. Cost Function

To check the impacts of changes in cost parameters, the parameter settings for the cost function shown in Table 19.2 were applied as basic settings. Our interest here is in how power and the maximization results vary with changes in the cost parameters.

Table 19.2: Parameter settings for cost function.

Parameters in the cost function (in Japanese yen)            Symbol    Value
  Total budget                                               B         5,000,000
  Initial set-up survey cost                                 C0        2,000,000
  Cost for recruiting an individual                          CN        1000
  Cost for increasing an observed duration per individual    CD        500
  Cost for increasing an observed time-point per individual  CT        200

The maximization results of statistical power and the optimal survey designs under various cost parameters are shown in Figure 19.4 and Table 19.3, respectively. From the figure, it can be confirmed that, for the higher-order polynomial changes, the marginal changes in power associated with increases in survey costs are much larger for $C_D$ and $C_T$ than for $C_N$. This implies that, relative to the cost of recruiting individuals, statistical power is more sensitive to the costs of increasing survey durations and waves under the current cost function. On the other hand, for the 1st-order polynomial change, the sensitivities are not so different among the different types of cost parameters. Thus, when non-linear changes are expected, the extent to which the costs of additional days and time points can be reduced can be crucial, and depending on the cost structure, the optimal survey design could be quite different. This tendency can also be seen in Table 19.3. For example, when $C_T$ increases from 100 yen to 1000 yen, the optimal survey design shifts from {N, T, D} = {908, 2.0, 4.2} to {473, 2.0, 6.7} for the 1st-order polynomial changes, whereas it shifts from {276, 29.1, 13.9} to {139, 7.9, 25.4} for the 3rd-order polynomial changes. Of course, such discussions are strongly dependent on the cost structure of the survey, which may vary from case to case. Even so, we believe that such theoretical considerations of multi-day and multi-period panel survey data can be a useful guide for those who are designing multi-day and multi-period panel surveys.

Figure 19.4: Maximization results of statistical power under various cost parameters for each order of polynomial change. (Panels for p = 1, 2 and 3; vertical axis: power; horizontal axes: cost $C_N$, cost $C_D$, and cost $C_T$.)

Table 19.3: Optimal survey designs with varying cost parameters.

  Cost parameter            Maximization results (optimal survey designs)
                            p = 1                 p = 2                 p = 3
  CN      CD      CT        N      D      T       N      D      T       N      D      T
  1000    500     100       908    4.2    2.0     576    6.7    8.4     276    13.9   29.1
  1000    500     200       815    4.6    2.0     522    7.2    5.8     222    16.1   22.4
  1000    500     300       740    4.9    2.0     504    7.2    4.5     196    17.8   18.1
  1000    500     400       682    5.2    2.0     468    7.7    4.0     182    19.1   14.8
  1000    500     500       628    5.6    2.0     433    8.2    3.7     171    20.5   12.5
  1000    500     600       587    5.8    2.0     403    8.6    3.5     164    21.6   10.9
  1000    500     700       555    6.0    2.0     379    9.1    3.4     157    22.7   9.7
  1000    500     800       526    6.2    2.0     358    9.4    3.3     151    23.6   8.9
  1000    500     900       498    6.5    2.0     339    9.8    3.3     145    24.5   8.3
  1000    500     1000      473    6.7    2.0     322    10.2   3.2     139    25.4   7.9
  1000    100     200       1279   9.5    2.0     964    14.1   3.5     500    31.2   9.4
  1000    200     200       1067   7.1    2.0     784    10.2   3.9     364    23.4   12.8
  1000    300     200       945    5.9    2.0     677    8.5    4.4     294    19.9   16.2
  1000    400     200       869    5.1    2.0     602    7.5    5.0     252    17.5   19.3
  1000    500     200       815    4.6    2.0     522    7.2    5.8     222    16.1   22.4
  1000    600     200       764    4.2    2.0     489    6.4    6.6     201    14.9   24.9
  1000    700     200       731    3.9    2.0     436    6.2    7.8     186    13.6   28.0
  1000    800     200       698    3.6    2.0     401    5.9    8.7     174    12.8   29.8
  1000    900     200       670    3.4    2.0     375    5.7    9.5     162    12.3   32.1
  1000    1000    200       645    3.2    2.0     359    5.3    10.3    154    11.7   34.2
  200     500     200       1403   3.1    2.0     828    5.1    4.4     255    14.7   21.1
  400     500     200       1171   3.5    2.0     713    5.8    4.6     247    14.9   21.5
  600     500     200       1008   3.9    2.0     642    6.1    5.0     237    15.8   20.9
  800     500     200       898    4.3    2.0     579    6.6    5.3     229    15.7   22.2
  1000    500     200       815    4.6    2.0     522    7.2    5.8     222    16.1   22.4
  1200    500     200       741    4.9    2.0     488    7.4    6.2     214    16.8   22.3
  1400    500     200       677    5.3    2.0     463    7.6    6.4     210    16.5   23.1
  1600    500     200       634    5.5    2.0     426    8.2    6.7     202    17.2   23.1
  1800    500     200       593    5.7    2.0     415    8.1    7.0     200    17.2   23.0
  2000    500     200       557    6.0    2.0     386    8.7    7.3     192    17.9   23.7

19.4.5. Effect Size

The effect size depends on which policy is introduced. For example, regulation-based policies, such as congestion pricing, might have a greater effect size than non-regulation-based measures, such as information provision. To confirm whether the optimal survey design varies according to the type of policy being evaluated, the impacts of effect size on optimal survey designs are shown in Table 19.4. From the table, we can confirm that the optimal survey designs are not strongly affected by effect size, implying that the type of policy evaluated through the survey may not be important for survey designs when there is a budget constraint. In other words, the same survey design could be used for evaluating multiple policies.

Table 19.4: Optimal survey designs with varying effect size.

  Effect    Maximization results (optimal survey designs)
  size δp   p = 1                        p = 2                        p = 3
            N     D     T     Power      N     D     T     Power      N     D     T     Power
  0.01      –     –     –     –          –     –     –     –          –     –     –     –
  0.02      –     –     –     –          –     –     –     –          –     –     –     –
  0.03      –     –     –     –          –     –     –     –          219   15.8  23.9  0.055
  0.04      802   4.7   2.0   0.079      534   7.3   4.9   0.070      218   15.6  24.8  0.059
  0.05      806   4.6   2.0   0.095      548   7.1   4.7   0.082      222   17.2  19.4  0.064
  0.06      802   4.7   2.0   0.116      545   7.1   4.8   0.096      222   17.5  19.0  0.071
  0.07      810   4.6   2.0   0.141      545   7.0   4.9   0.113      223   17.0  19.6  0.078
  0.08      813   4.6   2.0   0.169      545   7.1   4.9   0.133      221   17.2  20.1  0.087
  0.09      813   4.6   2.0   0.202      550   6.9   4.9   0.156      223   17.0  19.9  0.097
  0.10      809   4.6   2.0   0.238      545   7.0   5.0   0.182      220   17.0  20.4  0.109
  0.11      814   4.6   2.0   0.278      544   7.0   5.2   0.211      225   16.5  20.4  0.122
  0.12      815   4.6   2.0   0.321      537   7.1   5.3   0.242      222   16.8  20.6  0.136
  0.13      816   4.6   2.0   0.367      536   7.0   5.4   0.276      220   16.9  21.1  0.151
  0.14      815   4.6   2.0   0.414      533   7.1   5.4   0.312      225   16.2  21.2  0.168
  0.15      815   4.6   2.0   0.463      538   7.0   5.4   0.350      225   16.2  21.3  0.186
  0.16      815   4.6   2.0   0.513      528   7.0   6.0   0.391      224   16.0  22.0  0.206
  0.17      816   4.5   2.0   0.562      533   7.0   5.7   0.431      225   15.9  22.0  0.226
  0.18      816   4.6   2.0   0.611      535   6.9   5.9   0.474      225   16.0  21.8  0.248
  0.19      814   4.6   2.0   0.658      537   6.9   5.6   0.515      220   16.2  22.6  0.271
  0.20      815   4.6   2.0   0.702      522   7.2   5.8   0.556      222   16.1  22.4  0.295

Note: ‘–’ means the optimal solution cannot be identified because the power is too small.

19.5. Empirical Studies

In this section, we present an empirical example of the proposed survey design method described in Section 19.3, focusing on the impacts of policy intervention on trip generation. In Section 19.5.1, the empirical data used in this study (from the German Mobility Panel) are briefly described. In Sections 19.5.2 and 19.5.3, the model estimation and optimization results are explained, respectively.

19.5.1. Empirical Data

Data from the German Mobility Panel (Zumkeller, 2009), a multi-day and multi-period panel survey, are used for the empirical analysis. The German Mobility Panel survey has been conducted since 1994. In this survey, each respondent is asked to report travel behaviour for one continuous week in each of three years. In our empirical analysis we excluded data from those who dropped out of the survey. We obtained 93,303 samples reported by 4443 people from 1996 to 2008 (i.e. for each respondent, 7 (days) × 3 (waves) = 21 days of travel behaviour were reported). Because the total survey period G is 12 years, the orthogonal polynomial contrast coefficients can be calculated as {$c_{1,1}$, ..., $c_{1,t}$, ..., $c_{1,13}$} = {−6, −5, −4, −3, −2, −1, 0, 1, 2, 3, 4, 5, 6}, {$c_{2,1}$, ..., $c_{2,t}$, ..., $c_{2,13}$} = {11, 5.5, 1, −2.5, −5, −6.5, −7, −6.5, −5, −2.5, 1, 5.5, 11}, and {$c_{3,1}$, ..., $c_{3,t}$, ..., $c_{3,13}$} = {−11, 0, 6, 8, 7, 4, 0, −4, −7, −8, −6, 0, 11}. In this empirical analysis, $Y_{tdi}$ is defined as individual i’s total number of trips on day d in wave t, and the policy intervention variable is defined by residential location, that is, downtown or outskirts. Although residential location itself is not a policy variable, it could be assumed that urban development projects such as TOD (Transit Oriented Development) or compact city policies influence respondents’ mobility levels (represented by trip frequencies) to a greater or lesser extent depending upon where they live. Importantly, the primary purpose of this empirical analysis was to identify the variation structure of trip frequencies. As shown in Section 19.4.5, the degree of the response to this policy variable was not important in the optimal survey design. We could give any value to the parameter for the policy intervention variable $\gamma_{p1}$ when deriving the optimal survey designs under a fixed survey budget, although small fluctuations exist.

366

Makoto Chikaraishi et al.

19.5.2. Model Estimation Results

The estimation results for the activity generation model are shown in Table 19.5. They show that location was not a significant factor for any order of polynomial change, whereas statistically significant effects of the constant terms are observed for the 1st- and 3rd-order polynomial changes, implying that there would be some non-linear changes in trip frequencies. For the estimation results of random effects, intra-individual and inter-individual variations become smaller as the polynomial order increases. To examine the properties of unobserved variations, the following decomposition technique was applied:

$$\mathrm{Var}(Y_{tdi} \mid \gamma_{p0}, \gamma_{p1}, W_i) = \underbrace{\sum_{p=0}^{P} c_{pt}^2 \tau_{\alpha p}}_{\text{intra-individual variation}} + \underbrace{\sum_{p=0}^{P} c_{pt}^2 \tau_{\beta p}}_{\text{inter-individual variation}} + \underbrace{\sigma^2}_{\text{inter-wave variation}} \tag{19.16}$$

Table 19.5: Estimation results of activity generation model.

Item                                              Symbol    Parameter    t-Value
Explanatory variables
  c0t   Constant                                  γ00       3.543        136.4
        Living in inner city (D)                  γ01       0.008        0.220
  c1t   Constant                                  γ10       −0.015       −2.010
        Living in inner city (D)                  γ11       0.000 (a)    −0.040
  c2t   Constant                                  γ20       0.005        1.400
        Living in inner city (D)                  γ21       −0.003       −0.510
  c3t   Constant                                  γ30       −0.008       −2.880
        Living in inner city (D)                  γ31       0.001        0.310
Random effects
  c0t   Intra-individual variation                τα        0.548
        Inter-individual variation                τβ        0.927
  c1t   Intra-individual variation                τα        0.006
        Inter-individual variation                τβ        0.019
  c2t   Intra-individual variation                τα        0.000 (b)
        Inter-individual variation                τβ        0.006
  c3t   Intra-individual variation                τα        0.000 (c)
        Inter-individual variation                τβ        0.004
  Inter-wave variation                            σ²        3.120
Initial log likelihood (only constant)                      −208,487
Final log likelihood                                        −198,033
Sample size                                                 93,303

(a) The estimated value is −0.000448.
(b) The estimated value is 0.0000000000846.
(c) The estimated value is rounded to zero in the model estimation.

The calculated ratios of intra-individual variation, inter-individual variation and inter-wave variation to the total variation are 11.9%, 29.8% and 58.3%, respectively. It can be confirmed that, while inter-individual variation is higher than intra-individual variation, the biggest unobserved variation is inter-wave variation, which is the variability associated with repeated measures. This means that the weekly behaviour in wave t differs greatly from that in wave t + 1, implying that a multi-period survey could be quite important, especially when non-linear changes are expected (see discussion in Section 19.4.3).
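The decomposition in Eq. (19.16) can be illustrated with the rounded estimates in Table 19.5 and the contrast coefficients of Section 19.5.1. The sketch below is an illustration under one explicit assumption: because Eq. (19.16) depends on the wave t through $c_{pt}^2$, the squared contrasts are averaged over the 13 waves to obtain a single set of shares. That averaging step, and the function name `variance_shares`, are assumptions introduced here rather than details stated in the text.

```python
def variance_shares(contrasts, tau_a, tau_b, sigma2):
    """Shares of intra-individual, inter-individual and inter-wave variation.

    contrasts[p] holds c_pt for all waves; squared contrasts are averaged
    over waves (an assumption made for this illustration).
    """
    n_waves = len(contrasts[0])
    mean_sq = [sum(c * c for c in row) / n_waves for row in contrasts]
    intra = sum(m * t for m, t in zip(mean_sq, tau_a))
    inter = sum(m * t for m, t in zip(mean_sq, tau_b))
    total = intra + inter + sigma2
    return intra / total, inter / total, sigma2 / total


# Contrast coefficients for 13 waves (Section 19.5.1), p = 0, 1, 2, 3.
contrasts = [
    [1] * 13,
    [-6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6],
    [11, 5.5, 1, -2.5, -5, -6.5, -7, -6.5, -5, -2.5, 1, 5.5, 11],
    [-11, 0, 6, 8, 7, 4, 0, -4, -7, -8, -6, 0, 11],
]
tau_a = [0.548, 0.006, 0.000, 0.000]   # intra-individual, Table 19.5 (rounded)
tau_b = [0.927, 0.019, 0.006, 0.004]   # inter-individual, Table 19.5 (rounded)
sigma2 = 3.120                         # inter-wave, Table 19.5

print(variance_shares(contrasts, tau_a, tau_b, sigma2))
```

With the rounded table values, the three shares come out within about 0.1 percentage points of the reported 11.9%, 29.8% and 58.3%.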

19.5.3. Optimal Survey Designs for Activity Generation

Based on the behavioural variations identified with the trip generation model in the previous subsection, here we attempt to derive optimal survey designs with a budget constraint. Optimal survey designs are identified with the following given parameters: total budget B = 5,000,000 (Japanese yen); initial set-up survey cost $C_0$ = 2,000,000 (Japanese yen); cost of recruiting an individual $C_N$ = 1000 (Japanese yen); cost of increasing an observed duration per individual $C_D$ = 500 (Japanese yen); cost of increasing an observed time point per individual $C_T$ = 200 (Japanese yen); total survey period = 12 (years); and standardized effect size = 0.2. For the remaining parameters (i.e. parameters for unobserved components), the estimation results shown in the previous section are used. Note that, although the estimated effect size could also be used, we used a given value for that parameter because (1) the introduced policy variable (i.e. residential location) was not significant for any order of polynomial change, and (2) the effect size has little effect on optimal survey design, as shown in Section 19.4.5. The optimal survey designs for the 1st-, 2nd- and 3rd-order polynomial changes were identified as {N, T, D} = {813, 2.00, 4.58}, {690, 4.38, 4.94} and {550, 7.78, 5.80}, with statistical power = 0.527, 0.421 and 0.373, respectively. As with the numerical simulation results shown in the previous section, for higher-order polynomial changes, not only are more data collection waves needed, but longer multi-day periods of behavioural observation per wave are needed as well. Intuitively, this could be understood as a need for distinguishing between short-term and long-term behavioural variability.
When complicated non-linear changes in response to the policy variable can be expected, the behavioural differences between different time-point observations can be explained in two ways: (1) fluctuations/dispersions in behaviour, and (2) structural changes in behavioural mechanisms. To distinguish between these two possibilities, more behavioural information in both the near and far term may be needed, that is, making each wave longer (i.e. enrichment of variation information) and increasing the number of waves (i.e. enrichment of change information). On the other hand, a clear distinction between measurement variations and true behavioural changes is very difficult to make, and this may be similar to the ecological fallacy (Robinson, 1950). How close together in time do two observations need to be to have any differences between them regarded as measurement variability rather than actual changes in behaviour? Can the temporal averaging of behaviour be regarded as typical behaviour? This temporal version of the ecological fallacy arises when we attempt to distinguish between variation and change. Although such effects could be minimized when temporal behavioural rhythms (weekly rhythms, yearly rhythms, etc.) are taken into account, exploring this temporal version of the ecological fallacy may be important in future research, especially when substantial non-linear changes are expected. To do this, existing studies dealing with the MAUP (Modifiable Areal Unit Problem), which can be regarded as a geographic version of the ecological fallacy, could be a useful guide (see, e.g. Zhang & Kukadia, 2005).

19.6. Conclusion

In designing a multi-day and multi-period survey, there are trade-offs among sample size, survey duration of each wave, and the frequency of observations. In this study we focused on adjusting these three survey design components to minimize either total survey costs or error in judgment. Specifically, we first developed a survey design method for determining optimal sampling designs for multi-day and multi-period panel surveys with a given budget, in which statistical power for a given parameter was maximized. Non-linear changes were also taken into account by introducing higher-order polynomial changes. To our knowledge, this is the first study to develop a method for optimal sampling design of multi-day and multi-period travel diary surveys with non-linear changes in a given parameter. After developing the survey design method and showing numerical simulation results, we conducted an empirical analysis using data from the German Mobility Panel, which is an excellent, ongoing, multi-day and multi-period survey. In our empirical analysis we identified optimal survey designs for capturing the impacts of policy interventions on trip generation. There are several important findings in the numerical simulations and empirical results. First, we confirmed that when non-linear changes occur during a survey period, (1) much richer behavioural observation is needed to obtain a given level of statistical power, and (2) survey costs for increasing survey durations and time points can strongly affect optimal survey designs, so that it could be difficult to decide on an optimal survey design without taking into account the data collection cost. Second, in designing a multi-day and multi-period survey, the optimal survey designs may not be affected very much by the effect size, implying that the type of policy to be evaluated would not be important in survey design.
Instead, the specific aspect of behaviour being surveyed might have bigger impacts on optimal survey design. More precisely, how the behaviour of interest varies and changes might be more important for the survey design with respect to maximizing statistical power. Therefore, because data collection and behavioural understanding constrain and influence each other (Axhausen, 2008), deepening our understanding of behaviour based on the existing multi-day and multi-period survey data may be an important area of research for the improvement of the transportation planning process. For example, our empirical analysis of trip generation showed that, among inter-individual, intra-individual and inter-wave variation, the largest component was inter-wave variation, implying that weekly behaviour at wave t differs substantially from that at wave t + 1. Such fundamental behavioural understanding could be important for survey design. Finally, this study also highlighted the existence of a temporal version of the ecological fallacy, which may be relatively new in the transportation field. This temporal ecological fallacy may open a new and important future research area, namely, how we distinguish between behavioural variation and behavioural change. This question may be more important when there are non-linear changes in behaviour, which may make the relationship between variations and changes more complicated. Of course, this study has a number of limitations. First, we made assumptions in developing our optimal survey designs. Although it may be difficult to eliminate all assumptions, it would be necessary to compare the results obtained with different sets of assumptions to strengthen our findings or to make modifications to our method. For example, addressing panel-specific issues such as attrition bias in the optimal survey design is an important future task. Secondly, we only set the maximization problem using a fixed budget, but sometimes a given level of statistical power may be much more important than preserving the budget, such as when the policy intervention is costly, as in road investments. In such a case, minimizing survey costs while maintaining statistical power may be more appropriate. Thirdly, it would be worth identifying appropriate cost functions based on the experiences of existing multi-day and multi-period surveys, perhaps through some form of meta-analysis.
Finally, in this study, we identified optimal survey designs for a single parameter and for a single behavioural aspect. We could extend the proposed method to a multi-objective optimization problem that would incorporate multiple parameters with multiple behavioural aspects. Such research could strengthen the practical application of the proposed survey design methods.

Acknowledgement
This work was supported by Grants-in-Aid for Scientific Research (22860042). We would like to thank Kay Axhausen for his generous support.

References
Axhausen, K. W. (2008). Definition of movement and activity for transport modelling. In D. A. Hensher & K. J. Button (Eds.), Handbook of transport modelling (2nd ed., pp. 329–344). Oxford: Elsevier.


Axhausen, K. W., Zimmermann, A., Schonfelder, S., Rindsfuser, G., & Haupt, T. (2002). Observing the rhythms of daily life: A six-week travel diary. Transportation, 29, 95–124. doi: 10.1023/A:1014247822322
Basagaña, X., & Spiegelman, D. (2009). Power and sample size calculations for longitudinal studies comparing rates of change with a time-varying exposure. Statistics in Medicine, 29, 181–192.
Berger, M. P. F., & Wong, W. K. (2009). An introduction to optimal designs for social and biomedical research. Wiley. doi: 10.1002/9780470746912
Bloch, D. A. (1986). Sample size requirements and the cost of a randomized clinical trial with repeated measurements. Statistics in Medicine, 5, 663–667. doi: 10.1002/sim.4780050613
Chikaraishi, M., Fujiwara, A., Zhang, J., & Axhausen, K. W. (2009). Exploring variation properties of departure time choice behavior using multilevel analysis approach. Transportation Research Record, 2134, 10–20. doi: 10.3141/2134-02
Chikaraishi, M., Fujiwara, A., Zhang, J., Axhausen, K. W., & Zumkeller, D. (2011). Changes in variations of travel time expenditure: Some methodological considerations and empirical results from German Mobility Panel. Transportation Research Record, 2230, 121–131. doi: 10.3141/2230-14
Chikaraishi, M., Zhang, J., Fujiwara, A., & Axhausen, K. W. (2010). Exploring variation properties of time use behavior based on a multilevel multiple discrete-continuous extreme value model. Transportation Research Record, 2156, 101–110. doi: 10.3141/2156-12
Cohen, M. P. (1998). Determining sample sizes for surveys with data analyzed by hierarchical linear models. Journal of Official Statistics, 14, 267–275.
Cohen, M. P. (2005). Sample size considerations for multilevel surveys. International Statistical Review, 73, 279–287. doi: 10.1111/j.1751-5823.2005.tb00149.x
Galbraith, S., Stat, M., & Marschner, I. C. (2002). Guidelines for the design of clinical trials with longitudinal outcomes. Controlled Clinical Trials, 23, 257–273. doi: 10.1016/S0197-2456(02)00205-2
Goodwin, P. (1998). The end of equilibrium. In T. Gärling & T. Laitila (Eds.), Theoretical foundations of travel choice modeling (pp. 103–132). Elsevier.
Hansen, M. H., Hurwitz, W. N., & Madow, W. G. (1953). Sample survey methods and theory (Vol. II). New York, NY: Wiley.
Hox, J. J. (2010). Multilevel analysis: Techniques and applications (2nd ed.). New York, NY: Routledge.
Japan Society of Traffic Engineers (JSTE). (2008). Traffic engineering handbook. Japan Society of Traffic Engineers.
Kitamura, R. (1990). Panel analysis in transportation planning: An overview. Transportation Research Part A, 24, 401–415. doi: 10.1016/0191-2607(90)90032-2
Kitamura, R., Yamamoto, T., & Fujii, S. (2003). The effectiveness of panels in detecting changes in discrete travel behavior. Transportation Research Part B, 37, 191–206. doi: 10.1016/S0965-8564(01)00036-2
Kitamura, R., Yamamoto, T., Susilo, Y. O., & Axhausen, K. W. (2006). How routine is a routine? An analysis of the day-to-day variability in prism vertex location. Transportation Research Part A, 40, 259–279. doi: 10.1016/j.tra.2005.07.002
Lawton, T. K., & Pas, E. I. (1996). Resource paper for survey methodologies workshop. Proceedings of conference on household travel surveys: New concepts and research needs (pp. 134–153). Washington, DC: National Academy Press.
Moerbeek, M. (2006). Power and money in cluster randomized trials: When is it worth measuring a covariate? Statistics in Medicine, 25, 2607–2617. doi: 10.1002/sim.2297


Moerbeek, M., Van Breukelen, G. J. P., & Berger, M. P. F. (2010). Optimal designs for multilevel studies. In J. Leeuw & E. Meijer (Eds.), Handbook of multilevel analysis (pp. 177–205). New York: Springer.
Moineddin, R., Matheson, F., & Glazier, R. H. (2007). A simulation study of sample size for multilevel logistic regression models. BMC Medical Research Methodology, 7, 34. Retrieved from http://www.biomedcentral.com/content/pdf/1471-2288-7-34.pdf. doi: 10.1186/1471-2288-7-34
Muthen, B. O., & Curran, P. J. (1997). General longitudinal modeling of individual differences in experimental designs: A latent variable framework for analysis and power estimation. Psychological Methods, 2(4), 371–402. doi: 10.1037/1082-989X.2.4.371
Pas, E. I. (1986). Multiday samples, parameter estimation precision, and data collection costs for least squares regression trip-generation models. Environment and Planning A, 18(1), 73–87. doi: 10.1068/a180073
Pas, E. I. (1987). Intrapersonal variability and model goodness-of-fit. Transportation Research Part A, 21, 431–438. doi: 10.1016/0191-2607(87)90032-X
Pas, E. I., & Sundar, S. (1995). Intrapersonal variability in daily urban travel behavior: Some additional evidence. Transportation, 22(2), 135–150. doi: 10.1007/BF01099436
Pendyala, R. M. (1999). Measuring day-to-day variability in travel behavior using GPS data. Final Report DTFH61-99-P-00266, FHWA, U.S. Department of Transportation, Washington, DC. Retrieved from http://www.fhwa.dot.gov/ohim/gps/index.html
Pendyala, R. M., & Pas, E. I. (2000). Multi-day and multi-period data for travel demand analysis and modeling. TRB Transportation Research Circular, E-C008, Transportation Surveys: Raising the Standard.
Raudenbush, S. W. (1997). Statistical analysis and optimal design for cluster randomized trials. Psychological Methods, 2(2), 173–185.
Raudenbush, S. W., & Xiao-Feng, L. (2001). Effects of study duration, frequency of observation, and sample size on power in studies of group differences in polynomial change. Psychological Methods, 6(4), 387–401. doi: 10.1037/1082-989X.6.4.387
Robinson, W. S. (1950). Ecological correlations and the behavior of individuals. American Sociological Review, 15(3), 351–357. doi: 10.2307/2087176
Schlesselman, J. J. (1973). Planning a longitudinal study: II. Frequency of measurement and study duration. Journal of Chronic Diseases, 26(9), 561–570. doi: 10.1016/0021-9681(73)90061-1
Smart, H. E. (1984). The dynamics of change: Applications of the panel technique to transportation surveys in Tyne and Wear. Traffic Engineering and Control, 25, 595–598.
Smith, M. E. (1979). Design of small-sample home-interview travel surveys. Transportation Research Record, 701, 29–35.
Snijders, T. A. B. (2005). Power and sample size in multilevel modeling. In B. S. Everitt & D. C. Howell (Eds.), Encyclopedia of statistics in behavioral science (Vol. 3, pp. 1570–1573). Chichester, UK: Wiley.
Snijders, T. A. B., & Bosker, R. J. (1993). Standard errors and sample sizes for two-level research. Journal of Educational Statistics, 18(3), 237–259. doi: 10.2307/1165134
Spybrook, J., Raudenbush, S. W., Congdon, R., & Martinez, A. (2011). Optimal design for longitudinal and multilevel research. Documentation for the "Optimal Design" Software, Version 1.76. Retrieved from http://people.cehd.tamu.edu/Bokwok/epsy652/OD/od-manual-20080312-v176.pdf
Stopher, P. (2009). The travel survey toolkit: Where to from here? In P. Bonnel, M. Lee-Gosselin, J. Zmud & J.-L. Madre (Eds.), Transport survey methods: Keeping up with a changing world. Bingley, UK: Emerald Group Publishing Limited.


Winkens, B., Schouten, H. J. A., van Breukelen, G. J. P., & Berger, M. P. F. (2005). Optimal time-points in clinical trials with linearly divergent treatment effects. Statistics in Medicine, 24(24), 3743–3756. doi: 10.1002/sim.2385
Zhang, M., & Kukadia, N. (2005). Metrics of urban form and the modifiable areal unit problem. Transportation Research Record, 1902, 71–79. doi: 10.3141/1902-09
Zumkeller, D. (2009). The dynamics of change: Latest results from the German Mobility Panel. Paper presented at the 12th International Conference on Travel Behaviour Research, Jaipur, India (13–18 December).

Chapter 20

Data Quality and Completeness Issues in Multiday or Panel Surveys
Bastian Chlond, Matthias Wirtz and Dirk Zumkeller

Abstract
Purpose: This paper aims to improve the understanding of how mobility is reported in longitudinal surveys and to develop ideas on how to assess the completeness of the reported mobility.
Methodology/approach: Analyses of data quality and completeness are performed on the multiday and multiperiod data of the German Mobility Panel. Distinctions are made between the reporting behaviours of individuals who reported three times, twice or only once.
Findings: Reporting behaviour differs depending on the number of repetitions. On the one hand, individuals who repeat the survey in a consecutive wave tend to report with greater motivation, endurance and accuracy. On the other hand, participants who have not reported completely and accurately are more likely to drop out. These effects positively influence the quality and completeness, and therefore the reliability, of recorded mobility figures in multiperiod mobility surveys.
Practical implications: The analytical possibilities of combined multiday and multiperiod data for the assessment of data quality are demonstrated. Hints for identifying such survey artefacts are presented.
Keywords: Survey; data quality; attrition; fatigue

Transport Survey Methods: Best Practice for Decision Making
Copyright © 2013 by Emerald Group Publishing Limited
All rights of reproduction in any form reserved
ISBN: 978-1-78190-287-5


20.1. Introduction
A very common question when surveying mobility behaviour is whether interviewees have recalled and reported all trips. This leads to the question of whether we are generally able to assess total mobility volumes, as we cannot expect even highly motivated participants to report their mobility without any errors or fatigue. The question of the completeness of data is relevant for all kinds of surveys. Completeness and the enduring motivation of participants are, however, particularly relevant for longitudinal surveys (multiday, e.g. seven consecutive days) and panel surveys (multiperiod, e.g. with a repetition after 1 year). In both cases the respondent burden, and how the participants handle it, becomes a relevant issue.
In the case of multiday (longitudinal) surveys, usually both a declining completeness and a declining reporting accuracy (fatigue) can be observed; moreover, a share of participants is likely to stop reporting altogether because of lacking or declining motivation (attrition within wave). In-wave fatigue and in-wave attrition have been analysed earlier (e.g. Kitamura & Bovy, 1987; Meurs, Van Wissen, & Visser, 1989) by measuring the level of fatigue on a relative scale, using the decline in mobility figures as an indicator and allowing the calculation of the ‘true’ mobility figures, at least for the aggregate. However, it becomes difficult to do this for subgroups in a population and it is impossible to do so for individuals.
For multiperiod (panel) surveys an additional effect arises: people do not repeat reporting in the following wave. We call this ‘attrition between waves’. This affects the usability of the survey data, as the missing repeaters do not allow intrapersonal analyses. Furthermore, this results in a biased sample which no longer represents the basic population. Generally speaking, a high level of repetition is usually desired.
But it is even more disadvantageous if the reporting quality differs between consecutive waves, as effects of reporting behaviour cannot then be separated from effects caused by factors relevant for the mobility itself. For analyses which explicitly look at individuals, an incomplete or inaccurate report is a disadvantage, as it could be misinterpreted as a change in behaviour. In such cases it is advantageous if such individuals do not repeat in the next wave or if sloppy reports can be identified and removed from the dataset. In view of these factors the question arises whether the drop-out of participants is generally a disadvantage or whether the drop-out of those with incomplete reports benefits the survey data quality. In this paper we place our emphasis explicitly on descriptive analyses of reporting behaviour. Summing up these ideas, the objectives of the paper are as follows:
• To improve the understanding of how mobility is reported in longitudinal surveys
• To illustrate the analytical possibilities of combined multiday and multiperiod data in terms of the assessment of data quality
• To illustrate how an assessment of the completeness of the reported mobility can be performed, showing which kinds of trips are affected and how relevant the non-reported trips are
• To demonstrate the issues of different motivations and reporting behaviours
• To indicate what consequences this information has for the interpretation of the results
This paper will give insights into how mobility is reported when using conventional survey instruments (self-administered paper questionnaires and trip diaries for seven consecutive days). We analyse these aspects using the data of the multiday and multiperiod German Mobility Panel (MOP). We therefore first give some information about the characteristics of the MOP. Afterwards, hypotheses about reporting behaviour are formulated, which are later checked in detail.

20.2. Design and Characteristics of the German Mobility Panel
The MOP is a survey of the everyday mobility behaviour of the German population and has to be considered a multipurpose survey: it is designed both to provide figures about the development of mobility demand on a yearly basis (consideration of the data in a cross-section) and to provide data for in-depth analyses for mobility research (longitudinal consideration of the data). Furthermore, the reported activity programmes of 1 week are used as an input for an agent-based travel demand model. To achieve this, several design instruments have been combined:
• People are asked to fill in a mobility diary for 1 week. This allows for an analysis of mobility behaviour in the course of 1 week (e.g. multimodal behaviour (Kuhnimhof, Chlond, & von der Ruhren, 2006a), variability in behaviour (Lipps, 2001)) and provides activity chains for the modelling of complete weeks.
• The MOP is organized as a rotating panel, i.e. participants are asked to report their mobility behaviour in three consecutive years. This rotation approach is actually a combination of methods. On the one hand, it allows for an up-to-date sample representing the German population in a cross-section for a given year by replacing the departing participants, regardless of the reason for their departure (as scheduled or as an unplanned drop-out). On the other hand, its panel characteristics make it possible to analyse processes. This approach also takes respondent burden into account. After preliminary experiments, reporting the mobility of seven consecutive days in three consecutive years by means of a conventional diary was regarded as the limit of what is reasonable (Zumkeller, Chlond, & Lipps, 1997).
• In addition, the car-owning households (about 78% of all German households) are asked to fill in a form about their car-use behaviour for a period of altogether 8 weeks (distance travelled and fuel consumption), which means an additional burden.
The completeness and enduring quality of the reports have to be regarded as a central issue if the MOP is to serve its purposes. Altogether, the combination of several ‘longitudinal’ elements of surveying mobility behaviour requires a repeated and enduring motivation to report on the part of the participants.
Here the question arises of how motivated the participants initially are to report their mobility behaviour. It has to be kept in mind that the recruitment of participants for the MOP already has some consequences for the sample composition. The recruitment procedure comprises several stages. In each stage the sampled households, or at least the deciding persons in the households, are given the chance to resign or refuse to participate (Chlond & Kuhnimhof, 2003; Kuhnimhof, Chlond, & Zumkeller, 2006b). Therefore, we can assume that the participants are basically motivated. In fact, one central outcome of this selectivity study is that, in spite of the low participation rates, the results of the survey in terms of general mobility figures can be regarded as valid: the motivation to participate at all does not depend on mobility behaviour or travel volumes but usually on a general interest in mobility questions and on a sense of responsibility for providing data for scientific purposes and serving public interests. Beyond that, participants are well informed that they are expected to participate in three consecutive years. They are asked to agree to the storage of their addresses, which can implicitly be regarded as consent to repeated participation.
In order to analyse effects caused by fatigue, it has to be ensured that a decline in trip rates is not caused by the usually lower trip rates on weekends. This is achieved by dividing each year’s sample into seven splits and assigning each split a different weekday on which to start reporting. Appropriate weighting ensures that, for any analysis, every day of the week has the same probability of being the first, second, third etc. reported day. This approach allows for the identification of fatigue effects, at least for aggregate figures.

20.3. Problem Description and Hypotheses
The complex multi-stage recruitment procedure supports the hypothesis that a participant’s motivation will be highest initially (on the first day of the first reporting period): the sampled persons have given their assent and are willing to participate. Thus, we can assume that at least at the very beginning the novice participants report with reasonable completeness and accuracy and are in principle willing to repeat their participation. We can even assume a kind of excess motivation, comparable to that of a sportsperson at the beginning of a contest.
Nevertheless, this high motivation can be expected to cool down as a result of the high respondent burden: some participants will underrate the demands of completing the trip diaries on seven consecutive days. Furthermore, the novelty of the survey will wear off: reporting one’s mobility may be fascinating for 1 or 2 days but not for a whole week with many trips, some of which are identical. As a result, the degree of accuracy and completeness in the reports is likely to decline. Typical mobility figures such as trips per person and day are therefore likely to fall, at least for some of the participants. This will result in two different effects which have to be distinguished:
• A declining motivation to report all trips will result in a decline in trip rates. This can be caused by omitting very short trips which, from the perspective of the participants, are more or less irrelevant, or by combining several short trips into a longer one. Both aspects become relevant in particular if the participants do not fill in their diaries when the trips occur but enter the trips some days later. Although trip rates decline, the level of tripmaking (share of mobile persons per day) and the total distance travelled will not be affected as much.
• A minority of people is expected to stop reporting altogether. These are presumably persons who underestimated the burden of filling out the diaries, or who initially put off filling them out for some days, intending to do it later, but never catch up and lack the motivation to report their mobility retrospectively at the very end of the reporting period. This effect will manifest itself as a decline in the share of tripmaking (share of mobile persons per day) in the aggregate. As a result, the other key mobility figures that depend directly on tripmaking (number of trips per day, kilometres travelled per day) are also affected.
It can furthermore be hypothesized that persons who did not manage to report a complete week are unlikely to repeat the survey in the next year. But we also have to emphasize that the reverse cannot be assumed: respondents who report accurately and completely in one wave might not participate in the following year either. Therefore, the consideration of attrition between waves becomes relevant: the probability that participants show a decline in motivation (or stop reporting altogether) most likely differs between the group of repeating participants and the drop-outs. Table 20.1 compares the key mobility figures of repeating participants and drop-outs between 2 years (example of 2009 and 2010). As expected, drop-outs already report lower mobility figures in their first wave (e.g. a significantly lower trip rate).
Table 20.1: Key mobility figures of repeaters and drop-outs in comparison between years 2009 and 2010 (processed data).
Mobility key figure | Group | Number of persons | Mean | t-test
Tripmaking (%) | Repeaters | 928 | 91.2 | 0.615
Tripmaking (%) | Drop-outs | 399 | 90.5 |
Trip rate (trips per day) | Repeaters | 928 | 3.39 | 5.103**
Trip rate (trips per day) | Drop-outs | 399 | 3.18 |
Distance travelled (kilometres per day) | Repeaters | 928 | 40.6 | 0.068
Distance travelled (kilometres per day) | Drop-outs | 399 | 39.9 |
**significant on the 95% level

The lower levels of the mobility figures become even more evident when the different demographic characteristics of both groups are taken into consideration, as will be shown later. We distinguish the different groups as follows:
• Those who reported once (1×-reporters) and dropped out after the first report.
• Those who reported twice (2×-reporters) and did not report in the third wave.
• Those who reported three times (3×-reporters) and thus completely.
As shown in Figure 20.1, the share of elderly participants is higher in the group of the repeaters. The figure shows the differences between the biased age distributions of participants with different numbers of repetitions and the reference values (age distribution of the German population in 2008). For 1×-reporters the shares of the age groups 10–17, 18–25 and 26–35 are higher than the respective reference values, whereas their shares in the group of the 3×-reporters are lower. For the age classes 36–50, 51–60 and 61–70 the situation is reversed: these age groups are slightly under-represented in the group of the 1×-reporters and over-represented in the group of the 3×-reporters. The share of the age group over 70 is below the reference value in all groups but improves for repeaters. Taking into consideration that the relative weight of the older age classes (mainly the senior citizens) grows with the number of repetitions, we would expect lower mobility figures (assuming the usual experience that senior citizens are less mobile)

Figure 20.1: Differences between the age distribution of participants by the number of repetitions and the reference value (age distribution of the German population in 2008). [Chart; x-axis: age class (10–17, 18–25, 26–35, 36–50, 51–60, 61–70, over 70); y-axis: difference from the reference value, scaled from −15% to 15%; series: 1×-reporters, 2×-reporters, 3×-reporters.]


within the group of the 3×-reporters. Nevertheless, the repeaters show higher mobility demand figures than the (on average younger) drop-outs. The demographic differences in terms of age between the 1×-, 2×- and 3×-reporters are evident. Other differences in socio-economic characteristics are not significant (Table 20.2), except for city size, which shows a slight significance. It can be concluded that older participants are in principle more dutiful than people in the young age classes. This obviously affects both the number of repetitions (reporting periods) and the reported amount of mobility, as will be shown later.
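A comparison of two groups' distributions of the kind summarized in Table 20.2 can be sketched as a Pearson χ² test of homogeneity. The sketch below is an assumption about the general technique only: the chapter does not state how its χ² statistics were computed (for instance, whether weights or missing-value categories entered the calculation), so the value produced here need not match the table. The function name is hypothetical; the counts are the N values for sex from Table 20.2.

```python
def pearson_chi_square(table):
    """Pearson chi-square statistic for a contingency table of counts.

    `table` is a list of rows (one row per group, one column per category).
    Returns the statistic and its degrees of freedom.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if both groups shared the same distribution.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof


if __name__ == "__main__":
    # Rows: 1x-reporters, 3x-reporters; columns: male, female.
    sex_counts = [[308, 315], [575, 632]]
    chi2, dof = pearson_chi_square(sex_counts)
    print(f"chi2 = {chi2:.3f}, dof = {dof}")
```

The statistic would then be compared with the critical value of the χ² distribution for the given degrees of freedom (about 3.84 for one degree of freedom at the 95% level).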

20.4. Analyses
For our analyses we used the raw (unprocessed) data of the cohorts¹ of the MOP starting between 2006 and 2008. We did not concentrate our analysis on the most recent years, as we can only decide ex post, after a cohort has terminated as scheduled, whether a person has repeated or not. According to the design of the MOP, we can distinguish the different person groups as follows:
• Those who reported once (1×-reporters) in the cohorts 2006, 2007 and 2008, in exactly those years (n = 625; this figure is also the number of weeks reported). According to our assumptions, we expect the mean reporting completeness within this group to be lower than for the other groups. Nevertheless, this group will consist both of participants who show relevant fatigue and of participants who report essentially completely.
• Those who reported twice (2×-reporters) in the above-mentioned cohorts, in 2006 and 2007, 2007 and 2008, or 2008 and 2009, respectively (n = 525 persons, 1050 reported weeks).
• Those who reported three times (3×-reporters), participating in 2006, 2007 and 2008; in 2007, 2008 and 2009; or in 2008, 2009 and 2010 (altogether 1207 persons with 3621 reported weeks).
Since we can only rely on reported mobility data, we can only compare mobility figures between different groups in order to assess the data quality. In our analysis we use the mean differences between the groups of 1×-reporters and 3×-reporters as an indicator of reported data quality.
It is necessary to emphasize that both good, complete reporting and less accurate reporting will exist in all the groups distinguished. Nevertheless, average data quality is expected to differ between the groups. And it is necessary to

1. All participants of the MOP who have been initially recruited in the same wave are referred to as a cohort. Since the time span between waves is one year, the cohorts can uniquely be identified by specifying the year of their first participation. The cohort of 2008 is asked to participate in the years 2008, 2009 and 2010.


Table 20.2: Distribution differences between 1×- and 3×-reporters of some sociodemographic characteristics.
Attribute | Value | 1×-reporters: Share (%) | 1×-reporters: N | 3×-reporters: Share (%) | 3×-reporters: N | χ²
Sex | Male | 49 | 308 | 48 | 575 | 0.809
Sex | Female | 51 | 315 | 52 | 632 |
Age | 10–17 | 12 | 74 | 8 | 98 | 221.982***
Age | 18–25 | 12 | 75 | 4 | 53 |
Age | 26–35 | 19 | 120 | 9 | 105 |
Age | 36–50 | 26 | 159 | 29 | 353 |
Age | 51–60 | 12 | 76 | 17 | 204 |
Age | 61–70 | 13 | 80 | 24 | 295 |
Age | >70 | 6 | 39 | 8 | 99 |
Car availability | Yes | 57 | 357 | 60 | 720 | 1.583
Car availability | Yes, but according to prior agreement | 9 | 56 | 10 | 126 |
Car availability | No | 18 | 113 | 17 | 208 |
City size (a) | <20,000 | 23 | 142 | 26 | 314 | 6.703**
City size (a) | 20,000–100,000 | 47 | 292 | 42 | 506 |
City size (a) | >100,000 | 30 | 189 | 32 | 387 |
Level of education | Secondary school | 21 | 133 | 22 | 263 | 0.206
Level of education | General certificate of secondary school | 26 | 162 | 28 | 334 |
Level of education | University entrance diploma, university diploma | 37 | 230 | 40 | 477 |
Frequent traveller card (rail) | Yes | 10 | 65 | 9 | 104 | 2.542
Frequent traveller card (rail) | No | 86 | 535 | 87 | 1055 |
Household gross income (a) | <1,000 € | 30 | 184 | 28 | 343 | 5.869
Household gross income (a) | 1,000–1,999 € | 27 | 168 | 32 | 382 |
Household gross income (a) | 2,000–2,999 € | 7 | 46 | 6 | 76 |
Household gross income (a) | >3,000 € | 33 | 204 | 32 | 384 |
**significant on the 95% level; ***significant on the 99% level.
(a) It has not been taken into account that these variables are household related and therefore the values are not necessarily independent on the individual level.

mention that we can only talk about aggregated numbers: on an individual level we are usually (with few exceptions, which will be shown later) not able to identify ‘low’ and ‘high’ data quality. All the quantitative results are relevant for aggregates.
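The grouping into 1×-, 2×- and 3×-reporters described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' processing code: the function name and the input format are hypothetical, and the rule of counting only consecutive waves from the recruitment year is an assumption (the chapter does not discuss persons who skip a wave and return).

```python
def classify_reporter(cohort_year, reported_years):
    """Classify a person as '1x', '2x' or '3x' reporter.

    cohort_year: year of first participation, which identifies the cohort.
    reported_years: set of years in which the person returned a diary.
    Assumption: only consecutive waves starting in the cohort year count.
    """
    scheduled_waves = (cohort_year, cohort_year + 1, cohort_year + 2)
    n_waves = 0
    for year in scheduled_waves:
        if year not in reported_years:
            break
        n_waves += 1
    if n_waves == 0:
        raise ValueError("person did not report in the cohort year")
    return f"{n_waves}x"


if __name__ == "__main__":
    # Hypothetical persons: (cohort year, years with a returned diary).
    people = {
        "A": (2006, {2006}),
        "B": (2007, {2007, 2008}),
        "C": (2008, {2008, 2009, 2010}),
    }
    for pid, (cohort, years) in people.items():
        print(pid, classify_reporter(cohort, years))
```

Because the label depends on all three scheduled waves, it can only be assigned once the cohort has terminated, which is why the most recent cohorts are excluded from the analysis.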


In the next sections we focus on the analysis and interpretation of the following figures:
• the share of tripmaking;
• the trip rates per day and per mode, to identify recall effects relevant only for certain modes; and
• the distance travelled per day.
As shown above, differences in the age distribution and socio-demographic characteristics between repeaters and drop-outs can of course affect key mobility figures (see Madre, Axhausen, & Brög, 2004). In the following diagrams it therefore has to be taken into consideration that, while the absolute level of the mobility figures is affected by these socio-demographic differences, the decline over the reporting period has to be regarded as an effect of declining motivation and of reporting behaviour.

20.4.1. Tripmaking
Figure 20.2 shows the share of tripmakers by reported day and number of repetitions. The rate of tripmaking is highest on the very first day and nearly identical for all three groups, in spite of the socio-demographic differences between them. Since the share of elderly people is higher in the group of 3×-reporters than in the group of 1×-reporters, a lower level of tripmaking would be expected for the 3×-reporters. But this lower level of around 91% mobile persons is only established after 3 days.

Figure 20.2: Share of tripmakers by day of report and number of repetitions. [Chart; x-axis: reported day (1–7); y-axis: share of mobile persons, scaled from 87% to 95%; series: 1×-reporters, 2×-reporters, 3×-reporters.]


Bastian Chlond et al.

After the first day the level of tripmaking declines, but at different rates and due to different circumstances:
• For those reporting three times, the level stabilizes from the third day on and remains approximately stable after that.
• For those who report only once and, to a lesser extent, for those who report twice, the decline in tripmaking is larger.
• The decline is generally largest during the first 3 days.

We interpret these effects as follows: As mentioned above, we have to assume that, regardless of the earlier or later drop-out of a participant, all participants will usually start their report well motivated. They are willing to report completely and accurately. But after the first 1 or 2 days some participants stop filling in the trip diary promptly, for various reasons. Some will fill in the diaries at the end of the survey period, presumably less completely because of memory gaps or in an attempt to ease the reporting burden. Others, but only a minority, will stop reporting altogether, causing a decline in the rate of tripmaking from one report day to the next.

3x-reporters behave differently, showing a fairly stable level of tripmaking from Day 3 on. Given their high motivation (and the fact that they will also report in the next waves), it is unlikely that these participants stop reporting after the first or second day. Furthermore, if any form of attrition within the waves existed, it would also occur in the following days. It seems much more likely that the high motivation even leads to an excess motivation to report during the first 2 days: The participants are asked to report their mobility, have given their consent to cooperate and will in principle be eager to have something to report.

A minority of participants might be motivated to perform trips they would not have made without the survey and the diary: They assume that they are expected to report trips, even if there is nothing to report at all. We can characterize this as a ‘pre-conditioning of participants’ to the issue of the survey. Such effects (‘survey artefacts’) are well known in social science and social psychology (an overview of artefacts in social psychology is given in Bungard & Abele, 1980). In the case of a mobility survey this effect potentially affects the outcome of the survey, i.e. higher mobility figures are reported at the beginning of each survey. As a result some of the participants leave the house even if they originally did not plan to do so, just in order to have at least one trip to report. This can be a short stroll or a walk to the bottle bank. This assumption is supported by the analysis of the number of trips described later as well as by the total mileage and the modes reported. The results support the hypothesis that a kind of ‘pre-conditioning’ by the survey itself really takes place. People who have consented to participate in a complex survey with a considerable respondent burden, in which they are expected to report in detail, really want to do so and want to present results from


the day they start the report. Such an excess motivation is likely even for the 1x- and 2x-reporters, although in these cases it is difficult to substantiate.

20.4.2. Trip Rates

Regarding the trip rates (number of trips per day) we can observe a decline during the reporting period for all three groups (Figure 20.3(a)). Since the true relation between trip rate and the number of the reported day is not well known, a linear relation has been assumed for the following analyses. The results of the linear regression for each group of reporters are shown in Table 20.3. The decline in the trip rates is stronger for 1x- and 2x-reporters than for those who report three times. To a large extent the decline can be explained by those who stop reporting altogether. Therefore, the decline in tripmaking directly affects the trip

[Figure 20.3 (plot not reproduced): two panels, (a) not corrected by tripmaking and (b) corrected by tripmaking; y-axis: number of trips per person and day (approximately 3.0–3.7); x-axis: reported day (1–7); series: 1x-, 2x- and 3x-reporters.]

Figure 20.3: Trip rates by reported day and number of repetitions (a) not corrected by tripmaking and (b) corrected by tripmaking.

Table 20.3: Linear regression model for trip rates with the number of the reported day as explaining factor for each group of reporters.

Group          Count   Intercept   Reported day   Pr > |t|
1x-reporters     625     3.40        -0.0430       0.0133
2x-reporters    1050     3.42        -0.0366       0.0072
3x-reporters    3621     3.56        -0.0189       0.0091


rates. In Figure 20.3(b) the trip rates are again shown by reported day, but the outcome has to be interpreted against the decline in tripmaking. In both diagrams we can see a ‘trough’ or ‘dint’ in the number of reported trips at about the middle of the 7-day reporting period. At the end of the 7 days, the number of reported trips is above average again and returns to the levels of the first 2 days. The trough is shallower for the 3x-reporters and considerably deeper for those who report only once.

We interpret this trough as follows: Participants are highly motivated at the very beginning of the survey and fill out the trip diary immediately. But the accuracy of reporting and the reporting behaviour are not stable over the complete reporting period. After some days a considerable share of participants postpone filling out the diary to the end of the reporting period. Some trips are then missed when recalling the activities of the last days, and participants might be even more tempted to speed up the procedure by skipping some short trips. The form of the three curves apparently reflects the general motivation:
• Those who report three times in general show very stable reporting behaviour. The trough in trip rates as well as the maximum deviation in trip rates between days is small.
• For those who will drop out later, the trough as well as the maximum deviation in trip rates are much larger.

Nevertheless, it can be seen that the total number of trips is, with the exception of the first day, approximately stable for the 3x-reporters. By contrast, the 1x-reporters show a significant decline in trip rates towards the middle of the reporting period. For the aggregate this means that the decline in trip rates is dominated by the 1x-reporters.
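The fitted lines in Table 20.3 can be evaluated directly to see what the regression implies for each reporting day. The following is a minimal sketch, assuming the model has the form trip rate = intercept + slope × reported day, with days numbered 1 to 7, and taking the slopes as negative in line with the decline described in the text. The names `TABLE_20_3`, `predicted_trip_rate` and `decline_over_week` are illustrative and not part of the original analysis.

```python
# Coefficients from Table 20.3: (intercept, slope per reported day).
# Assumption: slopes are negative, matching the decline described in the text.
TABLE_20_3 = {
    "1x-reporters": (3.40, -0.0430),
    "2x-reporters": (3.42, -0.0366),
    "3x-reporters": (3.56, -0.0189),
}


def predicted_trip_rate(group: str, day: int) -> float:
    """Trips per person and day implied by the linear model for one group."""
    intercept, slope = TABLE_20_3[group]
    return intercept + slope * day


def decline_over_week(group: str) -> float:
    """Modelled difference in trip rate between Day 1 and Day 7."""
    return predicted_trip_rate(group, 1) - predicted_trip_rate(group, 7)


if __name__ == "__main__":
    for group in TABLE_20_3:
        rates = [round(predicted_trip_rate(group, day), 2) for day in range(1, 8)]
        print(group, rates, "decline:", round(decline_over_week(group), 3))
```

Under these assumptions the modelled decline from Day 1 to Day 7 is about 0.26 trips per person and day for the 1x-reporters, against about 0.11 for the 3x-reporters. A straight line cannot reproduce the mid-week trough, so the sketch only illustrates the average trend.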

20.4.3. Trip Rates by Mode of Transport

Walking and cycling (slow modes) are usually used only for short trips, whereas the car as a universal mode is used in all distance classes. Public transport tends to be used for longer distances. In Figure 20.4 the trip rates, measured in trips per person and day, are depicted for the modes by foot, by bicycle, by car and by public transport. The diagrams illustrate which trips are more or less likely to be underreported. The number of trips with non-motorized modes declines for all groups, but the average level and the decline differ between groups: 3x-reporters show high stability, at least after the second day, and the ‘trough’ in reported trips is small. This is clearly different for 1x-reporters, who show a stronger decline. As can be concluded from Figure 20.4, the difference in the total number of reported trips per day between 3x-reporters on the one side and 1x- and 2x-reporters on the other side is mainly caused by an under-reporting of trips by

[Figure 20.4 (plot not reproduced): four panels (by foot, by bicycle, by car, by public transit); y-axes: trips per person and day; x-axes: reported day (1–7); series: 1x-, 2x- and 3x-reporters.]

Figure 20.4: Trip rates by reported day and number of repetitions for different modes of transport.

foot. This holds true even when considering the different socio-demographic composition of the groups. The reported number of trips by bicycle shows high stability for all groups, and the absolute levels are fairly similar across groups. One possible reason might be that persons who cycle are highly motivated to report and do so very accurately. Trips by car are, disregarding the total decline, more severely affected. As car trips are more often organized in trip chains with possibly several short trips, the possibility of merging at least some very short trips into longer ones is higher. This strategy has a significant influence in all three groups, but it is clearly most dominant for the 1x-reporters. Nevertheless, a memory loss can generally be observed, as can be concluded from the ‘trough’ in the middle of the reporting period. Trips by public transport generally show the lowest probability of being forgotten or underreported. The decline for the 1x-reporters is more or less only caused by those who stop reporting altogether. The recall of public transport trips is evidently successful. This shows that public transport trips are either highly regular and will not be forgotten because they are an element of daily routines, or public transport


trips are usually of a length and importance that they can be recalled easily compared to trips by other modes. Beyond that, they might be regarded as relevant for the reported mobility. The question arises which kinds of trips, by length, are affected. The above-mentioned aspects allow for the conclusion that it is mainly the number of short trips that slowly declines. This can, on the one hand, result from participants starting to merge short trips into longer ones, i.e. to combine trip chains of short trips into one trip. This would, however, not affect the level of distance travelled. On the other hand, the decline in the very first days can be caused by the ‘additional’ trips (a result of the excess motivation of participants at the beginning of the survey period, as mentioned above). In this case, mainly trips by slow modes are likely to belong to this category. The effect of the higher rate of tripmaking will be mainly relevant for those person groups who are usually less active than others.

20.4.4. Distance Travelled


The distance travelled by reported day and number of repetitions is shown in Figure 20.5. The result of the related linear regression is shown in Table 20.4. Here again, since the true relation between distance travelled and the number of the reported day is not well known, a linear relation has been assumed for the following analyses. No significant decline in reported distance travelled by reported day can be observed. As mentioned earlier, participants find it hard to recall each and every trip. This applies especially to short trips. Longer trips which take

[Figure 20.5 (plot not reproduced): y-axis: kilometers traveled per person and day (34–48); x-axis: reported day (1–7); series: 1x-, 2x- and 3x-reporters.]

Figure 20.5: Distance travelled by reported day and number of repetitions.


Table 20.4: Linear regression model for distance travelled with the number of reported day as explaining factor for each group of reporters.

Group          Count   Intercept   Reporting day   Pr > |t|
1x-reporters     625     42.54        0.0385        0.9530
2x-reporters    1050     44.23       -0.7092        0.1430
3x-reporters    3621     40.96        0.1673        0.5160

longer times to undertake are much less likely to be overlooked. Therefore, no significant decline in reported distance travelled can be observed. The high variation in distance travelled for all groups is related to the small share of long-distance trips over 100 km. The share of long-distance trips is roughly 1.4% in the MOP. The sample size used for this analysis is not sufficient to obtain a reliable figure for distance travelled on long-distance trips. From the point of view of infrastructure planning, it can be stated that, independent of the level of motivation and therefore of the number of repetitions, the figures for the distance travelled are reported accurately. Even the group of the 1x-reporters reports the same level of distance travelled as the very motivated 3x-reporters. This can very cautiously be interpreted to mean that the drop-outs within a wave are usually persons with a low mileage, who regard their stopping to report as irrelevant.
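The contrast between the two regressions can be made explicit by comparing the reported p-values with a 5% threshold, which corresponds to the 95% level used elsewhere in the chapter. The sketch below only restates the figures printed in Tables 20.3 and 20.4; the negative sign of the trip-rate slopes is an assumption based on the decline described in the text, and the names `TRIP_RATE_SLOPES`, `DISTANCE_SLOPES` and `significant_groups` are illustrative.

```python
# (slope per reported day, Pr > |t|) as printed in Tables 20.3 and 20.4.
# Assumption: the trip-rate slopes are negative, as the text describes a decline.
TRIP_RATE_SLOPES = {  # trips per person and day
    "1x-reporters": (-0.0430, 0.0133),
    "2x-reporters": (-0.0366, 0.0072),
    "3x-reporters": (-0.0189, 0.0091),
}
DISTANCE_SLOPES = {  # kilometres per person and day
    "1x-reporters": (0.0385, 0.9530),
    "2x-reporters": (-0.7092, 0.1430),
    "3x-reporters": (0.1673, 0.5160),
}
ALPHA = 0.05  # 5% threshold, i.e. the 95% level


def significant_groups(slopes, alpha=ALPHA):
    """Return the groups whose slope differs significantly from zero."""
    return [group for group, (_, p_value) in slopes.items() if p_value < alpha]


if __name__ == "__main__":
    print("trip rates:", significant_groups(TRIP_RATE_SLOPES))
    print("distance:  ", significant_groups(DISTANCE_SLOPES))
```

With these inputs all three groups show a significant trend in trip rates, while none does for distance travelled, which matches the reading given in the text.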

20.4.5. Learning Process

Even for the highly motivated group of the 3x-reporters there are slight fatigue effects. Figure 20.6 shows the trip rates of 3x-reporters by reported day for each of the three waves. The level of trip rates is higher for the first wave than for the following waves. The ‘learning process’ apparently takes place during the first wave. The differences in trip rates between the second and third wave are negligible and usually not significant at the 95% level. The learning process affects only the trip rates, through the merging of short trips into longer ones, whereas the distance travelled is not affected. The trough is observable in all waves. And once again an ‘excess of motivation’ at the very beginning of the survey seems likely: The number of trips peaks at the beginning of the first two waves.

20.5. Résumé and Conclusions

The usefulness and multipurpose applicability of multiday and multiperiod data of travel behaviour are obvious, particularly for scientific purposes. Nevertheless, the participation in surveys using conventional survey methods (self-administered

[Figure 20.6 (plot not reproduced): y-axis: number of trips per person and day (3.1–3.8); x-axis: reported day (1–7); series: first wave, second wave, third wave.]

Figure 20.6: Trip rates by reported day and wave for 3x-reporters.

questionnaires and diaries) is very demanding. In spite of the high motivation of the participants, which can be assumed due to the recruitment process, the quality of reports is, at least for a minority of participants, not constant. With the exception of those who stop reporting completely, people are motivated to report their mobility. They have understood the purpose of the survey but they try to ease the burden by omitting short trips or by merging some trips into longer ones. In principle they are willing to complete the survey, but with reduced accuracy. Incomplete data and data of persons who stop reporting cause problems as they distort the survey results. As mentioned above, these effects could be accounted for by weighting the data accordingly. But this corrects only the figures for the whole population; it is not possible to do this for individuals. And it is unclear how to take into account the higher level of tripmaking at the very beginning of the survey periods due to excess motivation. From the results of the analyses shown above, we can conclude that there is a general relation between the completeness of reporting and the probability of remaining in the MOP. This results in a self-selectivity of the MOP, which has to be regarded as positive, as those persons who report incompletely or inaccurately are clearly much more likely to drop out.

20.5.1. Self-Selection in the MOP: Do Drop-Outs Really Hurt?

The experiences of the MOP show a positive self-selection in terms of data quality. Those who report with reduced data quality are usually less likely to participate again in a consecutive wave.


The approach of the rotating panel with basically stable sample sizes and stable rates of repetition between 75% and 85% (see Table 20.5) leads to a reasonable and interpretable data quality, at least in terms of cross-sectional considerations. The errors or biases of the sample relative to the represented population are basically the same for each year due to the effects of this self-selection of the sample. As a refreshment of the rotating panel has to be done anyway (the 3x-reporters are also replaced after three waves), the sample for cross-sectional analyses of 1 year can be kept representative in terms of the socio-demographic characteristics of the represented population. The typical socio-demographic characteristics of the irregular drop-outs can be anticipated; those of the regularly replaced 3x-reporters are known. Refreshment procedures can take this into account. Assuming constant cohort sizes there is a stable mixture of different motivation levels. The share of participants within a 1-year sample who report only once is only about 10% (in 2008, there were 1783 participants and only 761 − 575 = 186 were 1x-reporters). Altogether a reasonable balance between excessively motivated and less motivated participants is achieved.

Usually the attrition between waves and thus the drop-out phenomena are regarded as negative in panel surveys, as some of the recruitment cost and a considerable part of potential information are lost. In addition the ‘drop-outs’ will affect the total statistical error as the overall sample size will be reduced in the next round of the panel. The improved data quality thus affects the representativeness and the statistical validity of the remaining sample. Therefore, the gains in data quality and completeness on the one side and the loss in statistical validity due to a smaller sample size on the other side should be analysed in detail.

Summing up these considerations, it cannot be decided beyond doubt whether drop-outs hurt or not, but at least the drop-out phenomenon should not be regarded as solely negative.

Table 20.5: Cohort sizes (persons) and repetition rates in the German Mobility Panel. Cells show the number of persons (#) with the percentage (%) in parentheses.

         Year of wave
Cohort   2004        2005        2006        2007        2008        2009        2010
2004     748 (100)   575 (77)    401 (70)
2005                 671 (100)   448 (67)    347 (77)
2006                             706 (100)   506 (72)    433 (86)
2007                                         714 (100)   589 (82)    442 (75)
2008                                                     761 (100)   575 (76)    480 (83)
2009                                                                 613 (100)   491 (80)
2010                                                                             797 (100)
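The percentages in Table 20.5 and the share of 1x-reporters quoted in the text can be recomputed from the cohort sizes alone. The following is a minimal sketch, assuming that each percentage is the number of persons in a wave relative to the same cohort's preceding wave, which reproduces the printed values. The names `COHORTS`, `repetition_rates`, `wave_sample` and `one_time_reporters` are illustrative.

```python
# Persons per cohort in successive annual waves, keyed by cohort start year
# (values taken from Table 20.5).
COHORTS = {
    2004: [748, 575, 401],
    2005: [671, 448, 347],
    2006: [706, 506, 433],
    2007: [714, 589, 442],
    2008: [761, 575, 480],
    2009: [613, 491],
    2010: [797],
}


def repetition_rates(sizes):
    """Percentage of each wave's participants who report again in the next wave."""
    return [round(100 * nxt / prev) for prev, nxt in zip(sizes, sizes[1:])]


def wave_sample(year):
    """Total persons reporting in a given year across all active cohorts."""
    return sum(
        sizes[year - start]
        for start, sizes in COHORTS.items()
        if 0 <= year - start < len(sizes)
    )


def one_time_reporters(cohort):
    """Persons of a cohort who reported in the first wave but not in the second."""
    first, second = COHORTS[cohort][:2]
    return first - second


if __name__ == "__main__":
    for start, sizes in COHORTS.items():
        print(start, repetition_rates(sizes))
    share = one_time_reporters(2008) / wave_sample(2008)
    print(f"1x-reporters in 2008: {one_time_reporters(2008)} of {wave_sample(2008)} ({share:.1%})")
```

For 2008 this gives 761 + 589 + 433 = 1783 participants and 186 1x-reporters, i.e. about 10.4% of the sample, in line with the figure of about 10% given in the text. Note that `one_time_reporters` needs the following year's wave and therefore cannot be evaluated for the 2010 cohort.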


20.5.2. Advantages of Continuous Surveys in the Form of a Rotating Panel

Beyond that, some more typical advantages of continuous surveys in the form of a rotating panel become apparent:
• Considering that recruitment costs are a very relevant part of the total costs of a survey, it makes sense to improve cost efficiency through the multiperiod approach, making use of motivated participants as much as possible without endangering the data quality.
• It is possible to observe the data quality every year. Any changes in reporting behaviour and data quality can quickly be identified.
• Since the socio-demographic and socio-economic characteristics of the drop-outs are known, this can be anticipated in the recruitment for any further wave. This allows for the representativeness of the rotating panel, at least in a cross-sectional consideration.

20.5.3. Considering Measures to Improve Data Quality and Completeness

The ‘trough’ in the middle of the seven-day survey period could possibly be avoided by implementing a more frequent reminder process or by using computer-assisted telephone interview (CATI) based approaches in order to avoid memory losses. This would enhance reporting completeness but could also cause additional burden and annoyance for participants; the success of such a measure is difficult to forecast. Some motivated participants could feel that they are under suspicion, which could reduce their probability of participating in another wave of the survey.

Technological approaches could potentially help: Those who participate in the MOP are in principle motivated but are, because of the conventional PAPI-diary approach, confronted with very detailed and demanding fill-in procedures. GPS- or smartphone-based surveys reduce respondent burden and make the results objectively measurable: With on-line recording there would be no need to recall one's mobility, and trips would be reported completely. Nevertheless, the effect of excessively motivated participants who perform additional trips at the very beginning is likely to arise as well; perhaps such an effect would even be aggravated.

Acknowledgements

The study is part of the scientific output of the analyses of the data of the German Mobility Panel (MOP). The MOP survey as well as the analyses are funded by the German Ministry of Transport, Construction and Urban Development. The comments of two reviewers considerably improved this paper.


References

Bungard, W., & Abele, A. (1980). Die “gute” Versuchsperson denkt nicht: Artefakte in der Sozialpsychologie. München: Urban & Schwarzenberg.
Chlond, B., & Kuhnimhof, T. (2003). Rules of non-response and selectivity: Analysing the drop-out in the multistage recruitment process for the German Mobility Panel. Paper presented at the European Transport Conference, Strasbourg, France.
Kitamura, R., & Bovy, P. H. L. (1987). Analysis of attrition biases and trip reporting errors for panel data. Transportation Research, 21A, 287–302.
Kuhnimhof, T., Chlond, B., & von der Ruhren, S. (2006a). Users of transport modes and multimodal travel behavior: Steps toward understanding travelers’ options and choices. Transportation Research Record: Journal of the Transportation Research Board, (1985), 40–48. doi:10.3141/1985-05
Kuhnimhof, T., Chlond, B., & Zumkeller, D. (2006b). Nonresponse, selectivity, and data quality in travel surveys: Experiences from analyzing recruitment for German mobility panel. Transportation Research Record: Journal of the Transportation Research Board, (1972), 29–37. doi:10.3141/1972-06
Lipps, O. (2001). Modellierung der individuellen Verhaltensvariation bei der Verkehrsentstehung. Dissertation. Schriftenreihe des Instituts für Verkehrswesen der Universität Karlsruhe (TH), Heft 58, 2001.
Madre, J.-L., Axhausen, K. W., & Brög, W. (2004). Immobility in travel diary surveys: An overview. Arbeitsbericht Verkehrs- und Raumplanung, 207, IVT, ETH Zürich, Zürich.
Meurs, H., Van Wissen, L., & Visser, J. (1989). Measurement bias in panel data. Transportation, 16, 175–194. doi:10.1007/BF00163114
Zumkeller, D., Chlond, B., & Lipps, O. (1997, May). The German mobility panel: Options, limitations and the complementary use of secondary data. Proceedings of the International Conference on Transport Survey Quality and Innovation: Transport Surveys: Raising the Standard. Grainau, Garmisch-Partenkirchen, Germany.

Chapter 21

WORKSHOP SYNTHESIS: LONGITUDINAL METHODS: OVERCOMING CHALLENGES AND EXPLOITING BENEFITS

Elizabeth Ampt

21.1. Purpose and Introduction

The purpose of the workshop was described as follows:

‘The phrase “longitudinal survey” covers many different types of designs: retrospective, continuous, panels, and multi-day surveys. While the transportation research community widely believes that longitudinal surveys provide much deeper insight into behavioural processes than do cross-sectional surveys, the designs have not been widely implemented. Longitudinal studies track the same people, events, or behaviours over time, which brings particular methodological challenges. Even with these challenges, longitudinal surveys provide unique benefits such as expanded opportunities for dynamic travel behaviour analysis, providing more responsive survey information to support policy and planning information needs, and bringing potential statistical and cost-efficiencies to future survey efforts’.

The workshop was designed to explore the challenges and benefits of different types of longitudinal surveys in the light of recent experience. It was attended by 18 people¹ from six different countries. In addition, the workshop was

1. Participants: Chester Wilmot (USA), Bastian Chlond (Germany), Carolina Alvarado (Chile), Erika Spissu (Italy), Makoto Chikaraishi (Japan), Tony Richardson (Australia) (Rapporteur), Luis Rizzi (Chile), Tim Raimond (Australia), Nicolas Haverkamp (Germany), Uwe Kleinemas (Germany), Liz Ampt (Australia) (Chair), Jane Gould (USA), Juan de Dios Ortúzar (Chile), Jean-Loup Madre (France), Toshiyuki Yamamoto (Japan), Dirk Zumkeller (Germany), Junyi Zhang (Japan), David Richardson (Australia).



provided with four papers for presentation, together with a further six posters that were relevant to the discussions of the workshop.

21.2. Discussion of Context

The workshop began with an open discussion to set the context of the participants’ concept of longitudinal survey methods. The group noted that the common element was: A survey repeated at least twice with the same people, where each time unit is compared in the analysis. At least four different types of longitudinal survey were identified initially: retrospective surveys, continuous surveys, panels and multiday surveys. It was agreed that during the workshop we would keep in mind three key questions:
• What are the methodological issues?
• What are the challenges?
• What are the benefits?

21.3. Presentation of Related Papers

The workshop discussions were initiated by the presentation of four papers that were assigned to the workshop.

The first paper was by Chikaraishi, Fujiwara, Zhang, and Zumkeller (2011) and was presented by Makoto Chikaraishi. The background rationale for this work was the fact that there are new drivers behind setting up surveys: changes in policies, changes in behavioural patterns (e.g. the decline in the percentage of commuting trips) and widespread reductions of available budgets. The types of questions people are asking are, for example, whether it is better to have a 7-day cross-sectional survey with a given sample size (e.g. 600), a panel survey of two time points with a sample size of, say, 300, or a panel of 100 respondents over 14 years of age at four points in time.

The authors’ aim was specifically to develop a method for determining optimal sampling designs for a multiday, multiperiod survey in the presence of nonlinear changes in a certain parameter. After making some tight assumptions, they set out to evaluate the parameters of sample size, survey duration and frequency of observation. Their approach had four elements: (1) a behavioural model, (2) statistical power, (3) a survey cost function and (4) maximising statistical power under given budget constraints. Once a preferred sampling design was found, they applied two validation methods: numerical simulation and an empirical analysis using German Mobility Panel (MOP) data.


They found that they were able to develop an optimal sampling design, but that it was highly dependent on unobserved variation properties of the parameter of interest and that robustness of results depended on having not only more waves, but more days per wave. An important question that was raised in response to this paper was: Are the changes seen at different points in time simply variations or do they in fact represent true change? During the discussion it was recommended that the same test be done on other types of data set, e.g. a rotating panel survey or a continuous survey.

The second paper was by Chlond, Wirtz, and Zumkeller (2011) and was presented by Bastian Chlond. This paper addressed the question of drop-outs in a panel survey and analysed data quality and completeness in the MOP multiday panel survey. This presentation began by outlining the analytical advantages of the relatively complex design of multiday panel surveys, which include being able to
• measure variations in travel behaviour (e.g. multimodality);
• classify individuals by behaviour type;
• identify intrapersonal changes in travel behaviour and the underlying reasons;
• measure the effects of changed frameworks; and
• evaluate the effects of policy measures.

It is also clear that these surveys create a relatively high respondent burden, which can result in a greater drop-out rate than might be expected in a one-off survey. The MOP survey method is to collect travel diary data for seven consecutive days. The method then uses a rotating panel approach in which a household participates in three consecutive waves (one per year). In each wave one third of respondents are rotated out of the panel (in addition to the approximately 20% that drop out of their own volition [i.e. become non-respondents]). The method applied is a self-completed paper-based diary. The analysis focused on differentiating between those who participated for one, two and three of the annual waves before they were rotated out of the panel.

The overall finding was that all participants reported fewer trips (of most types) over time, whether they were participating for one, two or three waves. This suggested the influence of respondent fatigue and was a key point of later discussion. However, for almost all variables investigated, those who dropped out before the third wave under-reported to a higher degree than those who stayed in the panel for the duration. This included mobility rates (made a trip or not), trip rates, and trips by cycling and walking. There were some exceptions, e.g. with public transport. The authors concluded that people who stay in the panel also report in more detail, so removing drop-outs improves the quality (though not bias) of the data. The questions raised included ‘If respondent fatigue is the key to poor reporting at all levels, what are the solutions: technological? multiple survey methods?’ These were considered in more depth in the discussions.

The third paper was presented by Toshiyuki Yamamoto (Yamamoto, 2011). This paper considered the travel feedback programmes (TFP) used in Japan as a method


of changing people’s travel behaviour, in particular to encourage less use of the car. The method used in Japan consists of three steps: 1. a self-completion survey about travel behaviour; 2. provision of individualised information about ways to change travel behaviour; and 3. a follow-up self-completion survey about travel behaviour after the information has been received. The effects of TFP are evaluated by analysing the difference in travel behaviour observed between the first and follow-up questionnaire surveys. Since a significant number of respondents do not complete the follow-up survey, this paper addressed the issue of whether attrition is randomly distributed across the sample (in which case they argued that no bias is introduced) or whether there is an attrition bias in unit non-response. Using data from a TFP conducted at Nagoya, Japan in 2006, the author was able to divide the sample into three groups — those who were in the programme and received individualised information, those who were in the programme and did not receive individualised information and a control group where surveys were carried out before and afterwards, but where there was no intervention. The response rates for the follow-up survey of the three groups were 63.0%, 63.2% and 58.3% respectively (with no significant difference) suggesting in the first instance that there is no attrition bias. The author notes that the key problem in this analysis is that it is only possible to observe travel behaviour in cases where people respond, meaning that it is necessary to have a proxy to deduce an indirect relationship. This was done in two ways. The first way was to assume that those people who responded to the survey after a reminder was sent were closer to non-respondents than those who responded immediately. 
The second method was to use the level of detail respondents had provided in their planned new travel behaviour during the intervention phase and to use this in the analysis of respondents and non-respondents (is a higher level of detail associated with greater change?). The results suggested that there was no difference in travel behaviour between those who responded before and those who responded after the reminder. However, there was a clear relationship between behavioural intention and response, suggesting that those who have a greater intention to change behaviour have a higher response rate to the second survey. In other words, the effects of the TFP are overestimated if unit non-response is not taken into account.

The final paper was presented by Tony Richardson from The Urban Transport Institute in Australia (Richardson, Richardson, & Roddis, 2011). This paper addressed one of the challenges in the conduct of continuous travel surveys over an extended survey period: maintaining control over dataflows associated with various aspects of the survey process. It was noted that these surveys require multidisciplinary skills including psychology, statistics, computer systems management, database management, data

Workshop Synthesis: Longitudinal Methods


analysis, human resource management, project management and logistics and systems integration. The example on which the presentation was based came from the continuous Victorian Integrated Survey of Travel and Activity (VISTA) in Victoria, Australia. It has the following components that require the integrative software:
• A pre-contact letter, announcing the survey
• A self-completion questionnaire
• Letter and questionnaire delivered personally
• Recording answers to two questions if there is a refusal
• A motivational reminder call prior to the travel day
• Personal collection
• A reminder call and letter for initial non-response
• Clarification interviews during data entry.

The various stages requiring integration of different data sources included the following (with those in bold being covered in the presentation):
• Project planning and budgeting
• Sampling
• Mapping
• Preparation of fieldwork materials
• Field administration
• Data entry and geocoding
• Data clarification
• Data editing, imputation and weighting
• Data analysis
• Presentation of results.

The presentation showed the way in which the detailed preparation of fieldwork materials (e.g. questionnaires, delivery and so on) was integrated with an expense tracking system to help the consultant and client understand the progress of the survey at any time. The data entry program made it simple for the data entry team to enter the data (the screen looked similar to the survey forms) and displayed activity profiles of each person completing the form to highlight unusual cases that might need validation by phone. Since the data are made freely available to the public, the analysis needed to present the results in a simple, usable fashion. Some commonly used tables and results are available for different geographical regions (e.g. summary of all trips, mode share based on trips, mode share for work trips, daytime population in a given area and mapped options). Overall the paper underscored the complexity of continuous surveys, the need for dataflow management and the need for ongoing learning as new approaches and technologies become available. In particular it posed the question


and challenge that if more data is made more readily available, it is likely that more funding will be made available in the future.
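The attrition argument from the third paper can be made concrete with a small thought experiment. The sketch below is not taken from the paper: the ten participants, their changes in weekly car trips and their response flags are all hypothetical, and in a real TFP the changes of non-respondents could not be observed at all. It only illustrates that when those who changed most are also the most likely to return the follow-up survey, a respondent-only estimate overstates the programme effect.

```python
from statistics import mean

# Hypothetical participants: (change in weekly car trips, answered follow-up?)
# Negative values mean fewer car trips after the programme.
participants = [
    (-4, True), (-3, True), (-3, True), (-2, True), (-1, True), (0, True),
    (-1, False), (0, False), (0, False), (1, False),
]


def mean_change(records, responders_only):
    """Average change, optionally restricted to follow-up respondents."""
    values = [
        change
        for change, responded in records
        if responded or not responders_only
    ]
    return mean(values)


# What the evaluator can measure: respondents to the follow-up survey only.
observed = mean_change(participants, responders_only=True)

# What actually happened across everyone (unobservable in practice).
true_effect = mean_change(participants, responders_only=False)

print(f"respondent-only estimate: {observed:.2f} trips per week")
print(f"whole-sample change:      {true_effect:.2f} trips per week")
```

With these made-up figures the respondent-only estimate is a reduction of about 2.17 trips per week, against 1.3 for the whole sample.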

21.4. Preliminary Discussion

Although the group had agreed on the definition of longitudinal as ‘a survey repeated at least twice where each time unit is compared in the analysis’, after the presentations consideration was given to defining each of the different types of longitudinal survey that had been presented. These were considered as follows.

21.4.1. Panel Surveys

The basic definition of panel surveys comes from medical research, where they are used to measure the same units on one variable over time. In travel surveys, a panel is always associated with attrition, and the use of a panel method requires the definition of the way in which attrition will be dealt with. One of the most recent methods (discussed in Chlond et al., 2011) is that of the Rotating Panel, where attrition is anticipated and there is deliberate replacement after a defined number of waves. A sub-category of panel surveys was defined as a Pseudo-Panel with the following characteristics: a pseudo-panel is a longitudinal method that is built from a cross-sectional design but includes people with the same demographics in each wave, rather than the exact same people. Each of the elements in each sample is effectively independent.
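The pseudo-panel idea can be sketched in a few lines: independent cross-sectional records are grouped into demographic cells, and each cell's mean is then followed from wave to wave as if it were a panel member. The records, the choice of birth cohort and sex as the cell definition, and the function name are assumptions made purely for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records from two independent cross-sectional waves:
# (wave, birth_cohort, sex, trips_per_day)
records = [
    (1, "1960-69", "F", 3.0), (1, "1960-69", "F", 4.0),
    (1, "1970-79", "M", 2.0), (1, "1970-79", "M", 3.0),
    (2, "1960-69", "F", 3.0), (2, "1960-69", "F", 3.0),
    (2, "1970-79", "M", 3.0), (2, "1970-79", "M", 4.0),
]


def pseudo_panel(rows):
    """Return {(cohort, sex): {wave: mean trips}} for each demographic cell."""
    cells = defaultdict(lambda: defaultdict(list))
    for wave, cohort, sex, trips in rows:
        cells[(cohort, sex)][wave].append(trips)
    return {
        key: {wave: mean(values) for wave, values in sorted(waves.items())}
        for key, waves in cells.items()
    }


panel = pseudo_panel(records)
for cell, series in panel.items():
    print(cell, series)
```

Different people appear in each wave, so only the cell means are tracked; the cell definition has to rest on characteristics that do not change over time, such as year of birth.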

21.4.2. Before and After Surveys

These surveys are possibly the most common longitudinal survey methodology used in the collection of day-to-day travel data. Independent samples are selected at different points in time (usually twice) to measure changes in travel patterns in a given geographic area or target population. They are often referred to as cross-sectional surveys and form the basis of travel data collection in cities and countries internationally using many data collection methods.

21.4.3. Continuous/Ongoing Surveys

These names are interchangeable for a survey where a sample is selected from the same population repeatedly over time (e.g. every year for many years). It was agreed


that for travel surveys the definition is most usually where every day is sampled for at least a 12 month period. Continuous or ongoing surveys can include sampling methods which enable the same units (e.g. households) to be selected for each wave or those which remove units sampled from subsequent waves.
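One simple way to make sure that every day of a 12-month period is sampled is to spread the drawn households evenly over consecutive travel days. The following is a minimal sketch only, assuming a hypothetical list of household identifiers, a hypothetical start date and a plain round-robin rule; an actual survey would normally randomise and stratify this allocation.

```python
from collections import Counter
from datetime import date, timedelta


def assign_travel_days(household_ids, start, n_days=365):
    """Give each household one travel day, cycling through n_days consecutive days."""
    return {
        household: start + timedelta(days=index % n_days)
        for index, household in enumerate(household_ids)
    }


# Hypothetical sample of 730 households, with travel days starting on 1 July 2011.
households = [f"HH{number:04d}" for number in range(730)]
schedule = assign_travel_days(households, date(2011, 7, 1))

# Count how many households fall on each travel day.
households_per_day = Counter(schedule.values())
print(len(households_per_day), "days covered")
```

With 730 households and 365 days, each day receives exactly two households in this sketch.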

21.4.4. Retrospective Surveys

This type of longitudinal survey asks people to supply information about repeated events over time retrospectively. For example, it could include a survey that asks about a person’s long-distance trips to one or more destinations over a 12-month period.

21.5. Focusing on the Future

As mentioned earlier, it was agreed that during the workshop we would keep in mind three questions: 1. What are the methodological issues? 2. What are the challenges? 3. What are the benefits? Our discussions therefore focused on the above four categories and explored the above issues using the rephrased questions:
• When, why and how should each of the methods be chosen?
• What is their key benefit?
• What are the key requirements of choosing that method?
• What are the problems and challenges of that method?
• What do we not yet know about this method?

21.5.1. Panel Surveys

When, why and how? What is their key benefit?

Panel surveys are ideal if the measurement is to reflect habitual behaviour or to observe how travel behaviour styles develop (e.g. the trip linking patterns that might be associated with the introduction of a new public transport service). Because they accumulate data on the same sampling units (usually people) over time, they make it possible to monitor the effect of policies ranging from regulation and pricing to voluntary behaviour change interventions.


Panel surveys also have the benefit of being able to observe the dynamics and processes of change (e.g. the way a decrease in household size as a child leaves home affects the behaviour of the remaining people). Panel surveys also make it possible to model inertia, and observing behaviour over time has sometimes been used as a proxy variable for changes in attitudes. Finally, a key benefit of panel surveys compared with their cross-sectional counterparts is their greater statistical precision.

21.5.1.1. Key requirements of panel surveys

When a panel survey method is selected, there are at least four key requirements:
1. A policy on replacement. This can include replacement by area, demographics or other variables. The Rotating Panel method is a way of minimising the need for replacement.
2. Long-term commitment by commissioning authority and funding source. While this seems a disadvantage, some organisations have found that if it is a recurring cost it can be accepted more easily in the longer term.
3. The funding organisation needs patience, as comparative data is only available after the first wave and it often takes until the second to get data of the required significance.
4. It may be necessary to have multiple financiers of long-term panel surveys to ensure their continuation.

21.5.1.2. Problems and challenges of panel surveys

The main challenge of using the panel survey method is finding ways to change the method without producing artefacts. Because of declining overall response rates, there is a need to apply emerging survey methods, which in turn requires:
1. a parallel survey, i.e. overlapping the conventional and the new survey method in time so that the outcome of both approaches can be thoroughly compared (both the characteristics of survey respondents and their mobility). That is, the conventional approach serves as a control sample for the new one.
2. correction factors, which have to be derived to allow for unbiased time series.
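How a correction factor might be derived from a parallel survey can be sketched as follows. This is an illustration under strong assumptions rather than a recommended procedure: the trip rates are hypothetical, the factor is a single ratio of mean trip rates in the overlap period, and the two samples are taken to be comparable in composition. In practice the factor would probably need to vary by trip type and respondent group.

```python
from statistics import mean


def correction_factor(old_method_rates, new_method_rates):
    """Ratio of mean trip rates (old / new) observed in the overlap period."""
    return mean(old_method_rates) / mean(new_method_rates)


# Hypothetical trips per person per day from the wave run with both methods.
old_overlap = [3.2, 3.4, 3.0, 3.6]
new_overlap = [3.6, 3.8, 3.4, 3.6]

factor = correction_factor(old_overlap, new_overlap)

# Later waves collected with the new method only, rescaled so that they can
# be read on the same footing as the old-method time series.
later_new_waves = [3.6, 3.72]
adjusted = [rate * factor for rate in later_new_waves]

print(f"correction factor: {factor:.3f}")
print("adjusted series:", [round(rate, 2) for rate in adjusted])
```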
These requirements demonstrate that it is particularly difficult to design a methodology for panel surveys which includes a control sample. Control samples are particularly important when the panel is being used to measure changes related to an intervention that is designed to change behaviour in a large portion of the population.

21.5.1.3. What do we not yet know about panel surveys?

One of the key areas that needs more work is the provision of a ‘supply profile’ to match the data that is collected in a panel survey. In other words, what factors exogenous to the respondents are influencing change? Documentation of these would entail a survey of a different kind, an area which has had little attention to date.


Although panel surveys provide rich data about intrahousehold change, further work is needed on understanding what these changes mean for policy and planning, and indeed for the design of future panel surveys. Leading on from this is the need to define travel behaviour more clearly, which might include ‘parameters + outcome + perception of variable’. For example, demographic changes may mean that reported travel behaviour reflects age-specific lifestyles that show up in the intensity and frequency of several behaviour patterns. Moreover, cultural changes and influences of ‘Zeitgeist’ may change the way in which people travel and may give clues to politically desirable travel behaviour changes.

21.5.2. Before and After Surveys

21.5.2.1. When, why and how? What is their key benefit?

As noted earlier, while panel surveys can also be used as a tool for before and after measurement, before and after surveys in this context are those which use a cross-sectional approach at each stage (i.e. independent samples chosen at different points in time). They are primarily used to measure the effects of an intervention of some type. This could range from the introduction or removal of a facility to the impact of an intervention. They are designed to give measurable indicators of the effect of the change.

21.5.2.2. Key requirements of before and after surveys

Before and after surveys almost all require a control group to validate that the changes being measured are not due only to external influences. Workshop members verified this, giving examples of cases where both control and intervention groups had made changes in the ‘wrong’ direction, but the intervention group had changed significantly less, meaning that the intervention had had the required effect. The comparative nature of these surveys means that not only do both before and after surveys need the majority of questions repeated exactly, but that the measurement goals need to be specified very clearly at the outset of the before survey. In addition, planning and funding need to be secured prior to the initial survey.

21.5.2.3. Problems and challenges of before and after surveys

While the ideal requirement is for the same methodology in both surveys, there have been and possibly will continue to be cases where two different methods are used. An example would be when there is a technological improvement in data collection or a cost restriction. This is usually most simply addressed by applying the ‘before’ method to a small sample in the ‘after’ survey so that data from the new method can be weighted for methodology-specific biases.
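The role of the control group described in 21.5.2.2 can be illustrated with a simple difference-in-differences calculation. The technique is named here as one common way of expressing the comparison, not as the method used by workshop participants, and the figures are hypothetical. They mimic the case mentioned above, in which both groups moved in the ‘wrong’ direction but the intervention group moved less.

```python
def difference_in_differences(before_treated, after_treated,
                              before_control, after_control):
    """Change in the intervention group minus change in the control group."""
    return (after_treated - before_treated) - (after_control - before_control)


# Hypothetical mean weekly car trips per person.
# Both groups drive more in the 'after' survey, the control group much more so.
effect = difference_in_differences(
    before_treated=10.0, after_treated=10.5,
    before_control=10.0, after_control=12.0,
)

print(f"estimated intervention effect: {effect:+.1f} car trips per week")
```

Here the intervention group rose by 0.5 trips and the control group by 2.0, so the estimated effect is -1.5 trips per week even though car use increased in both groups.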
Another key challenge is the measurement of change from travel behaviour interventions. There are many considerations, which include the survey method (diary vs. GPS is a key decision), the survey period (are 1 day, 2 days or more necessary?), and the seasonality (particularly since mode use is likely to be affected). And the


independence of the control group is often at odds with the need to have comparable samples between the control and intervention surveys. Finally, attention needs to be focused on the challenge of optimising sample size to get valid results in view of attrition. It is recommended that the travel fraternity looks at lessons from other fields of research and survey design.

21.5.2.4. What do we not yet know about before and after surveys?

Many of the issues arising from panel surveys also apply to before and after surveys. For example, as with panel surveys, the lack of research into the best ways to provide a ‘supply profile’ to match the data collected in before and after surveys is an area for investigation. A set of design guidelines for the selection of control groups is also needed. Finally, more definitive work is needed on the impact of regression to the mean (the tendency of both high and low users, e.g. of cars, to move towards the mean regardless of intervention type) in behaviour change evaluation.
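Regression to the mean can be demonstrated with a small simulation in which no intervention takes place at all. The sketch assumes, purely for illustration, that each person has a stable habitual level of weekly car use and that each survey records this level plus independent week-to-week noise; the distributions, sample size and seed are arbitrary choices.

```python
import random
from statistics import mean


def simulate(n_people=2000, seed=1):
    """Return (before, after) weekly car trips per person, with no intervention."""
    rng = random.Random(seed)
    people = []
    for _ in range(n_people):
        habitual = rng.gauss(20.0, 5.0)           # stable personal level
        before = habitual + rng.gauss(0.0, 5.0)   # 'before' survey week
        after = habitual + rng.gauss(0.0, 5.0)    # 'after' survey week
        people.append((before, after))
    return people


def group_means(people, share=0.25):
    """Mean before/after values for the lowest and highest 'before' users."""
    ranked = sorted(people, key=lambda person: person[0])
    size = int(len(ranked) * share)
    low, high = ranked[:size], ranked[-size:]

    def summarise(group):
        return (mean([b for b, _ in group]), mean([a for _, a in group]))

    return summarise(low), summarise(high)


(low_before, low_after), (high_before, high_after) = group_means(simulate())
print(f"low users:  {low_before:.1f} -> {low_after:.1f}")
print(f"high users: {high_before:.1f} -> {high_after:.1f}")
```

People selected as high users in the ‘before’ survey report fewer trips on average afterwards, and low users report more, although nothing has changed. An evaluation that targets high car users without a control group would be at risk of reading this drift as a programme effect.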

21.5.3. Continuous Surveys

21.5.3.1. When, why and how? What is their key benefit?

In line with panel surveys, continuous surveys are ideal to monitor the effect of policy over time by choosing repeated cross-sectional samples. They have the key benefit that the costs are spread over multiple years and can either be built into ongoing budgets or bid for annually. They are also well suited to analysing seasonal influences. These can include the effects of weather or differences in behaviour during other periods such as holidays, events, weekends and so on. They are ideal for measuring variation at the aggregate level. Continuous data can also be used to test whether it is possible to forecast future trends from early years.

21.5.3.2. Key requirements of continuous surveys

Continuous surveys need a multiyear commitment by commissioning authorities or they run the risk of simply being a one-off survey. This can have serious consequences for planning and modelling if frequencies are too low. Furthermore, it is necessary to decide the size and frequency of each successive wave at the commencement of a continuous survey. Another key requirement is the need for a method of combining data from each year. This is because, although they are ‘continuous’ in the sense that households are usually being sampled every day for multiple years, it is best to draw the sample annually or even more frequently to ensure a sampling frame that reflects recent changes to the population and housing infrastructure. While not necessarily a ‘key requirement’, most participants mentioned the value of publicly releasing travel survey data to ensure ongoing funding.


21.5.3.3. Problems and challenges of continuous surveys

Because continuous surveys typically cover large geographic areas, often including regional as well as urban areas, there can be large changes in supply on an ongoing basis. Because the first year of a continuous survey in a given location can yield many lessons, even with an extensive pilot survey, the reliability of the first year of data collection can be questioned. Another question relates to a decision on the length of the rolling period: is it, in fact, forever, or is it continuous for a period (say 5 years) and then repeated for 5 years at 5-year intervals? As with panel surveys, continuous surveys are faced with the fact that, over time, there may be a need to change the methodology. Here again, the outcome may not necessarily be comparable to the results of previous survey rounds, and again the new and the old methods need to run in parallel for a short time to generate results that can be compared in a time series.

21.5.3.4. What do we not yet know about continuous surveys?

The issue of understanding the changes we are measuring applies to continuous surveys as it does to other longitudinal surveys. Another area discussed in the workshop was the extent to which continuous cross-sectional surveys could be applied in developing countries. Issues such as the difficulty of defining a sampling frame and even greater supply changes over the period were raised. The fact that the data are frequently used for four-step models means that it is worth investigating whether this focus leads to a bias in the data that are collected. Finally, while there was some work done on inter- and intrahousehold variations in travel behaviour, little has been done recently and this was flagged as an area for further research.

21.5.4. Retrospective Surveys

21.5.4.1. When, why and how? What is their key benefit?

The key reason for using retrospective surveys, which ask an individual about travel over a longer period, has been to gather data on long-distance and inter-urban trips. This is because the infrequency of this type of travel generally does not make it amenable to ongoing or before and after methods. They are therefore useful surveys for policy-makers to understand trip patterns. Similarly they can be used as an exploratory tool to understand patterns before pursuing another survey method.

21.5.4.2. Key requirements of retrospective surveys

The key requirement for carrying out these surveys relates to minimising respondent burden. Given that respondents are required to recall information over a longer period of time, the following needs to be considered:
• Survey method: needs a method where records can be searched over time (i.e. self-completion can be easier than methods that need an immediate response such as


phone, face to face or even online). However, this needs to be weighed against the burden of searching for information; an example from France was given in which a self-completion method was replaced by face to face after a pilot. The interviewers were able to provide stimulus to response that the self-completion method did not.
• Information and tips to aid memory: Whichever method is used, the survey design needs to suggest ways for respondents to find data. Many of these ideas can come from a pilot survey that gathers this information in addition to travel data.
• Choosing the sample to include people that have made the ‘target trips’.

21.5.4.3. Problems and challenges of retrospective surveys

The retrospective nature of these surveys means that the most important challenges are the memory effects on the data. In particular, some types of travel are likely to be more affected than others. Another challenge is deciding what variables are reasonable to measure. For example, while someone may be able to recall that they took a trip on a certain date and time (possibly with the aid of a diary or calendar), other aspects of that particular trip may not be as easy to recall, e.g. access mode, duration of trip, and other attributes such as delay times. This gives the researcher the choice of limiting data collection to specific data items, providing assistance with recall or working out a way of determining the reliability of respondents’ estimates (e.g. ‘I must have gone to the airport by taxi because I always do’, compared with ‘I get there in several ways and I can’t remember which I did that time’). As noted above, another decision that can be problematic is that of survey method, choosing between self-completion and interviewer-prompted methods in particular. As in all longitudinal surveys, this highlights the importance of exploratory pilot surveys.

21.5.4.4. What do we not yet know about retrospective surveys?
One of the key research topics for longitudinal retrospective surveys relates to reliability of recall and selective memory over time. One approach would be to carry out some retrospective surveys with and without memory aids in order to estimate their importance over time. Another area of interest relates to how to measure not only what is changing, but why it is changing. Because retrospective surveys require the respondent to reflect over time, these surveys may be the most suitable for some controlled in-depth reflection on the perceived ‘why’ of change.

21.6. Research Directions by 2014

The workshop group felt that the following were the directions of research in the next 3 years, using likely paper topics as a way of categorising them.


21.6.1. Panel Surveys

• Understanding processes and the components of change.
• Managing methodological changes in the state of the art that arise over time (what is fatigue, attrition, technology etc.): designing longitudinal surveys that adapt to these changes.
• Design specifications for a longitudinal ‘supply’ profile (i.e. what do we need to control for?).
• Explaining the benefits of longitudinal surveys.
• From how to why? Integration of descriptive and explanatory survey approaches.
• Understanding non-response in longitudinal surveys.
• Integration of advanced technology into longitudinal methods.
• Better understanding of the motivations of respondents and why they participate in surveys (e.g. characteristics and reasons for ‘over-motivation’, i.e. participation, compared with those who are non-respondents or who drop out). This could occur in a positive direction (reporting more of the behaviour the respondent expects that the researcher wants) or in a negative direction (the reverse).
• Rotating questions to broaden survey range without increasing respondent burden.
• Examining utility functions and choice paradigms over time for panel data.
• How can we observe changes without a control group, e.g. run a repeated cross-section at the same time? in-depth cognitive?
• The role of attitudes over time.
• Deciding on the size and frequency of each wave.
• Identification of artefacts in survey data (e.g. variation as a result of measurement).
• Observing behaviour over time as a proxy variable for attitudes.
• Inter- and intrapersonal travel behaviour variations.

21.6.2. Before and After Surveys

• Analysis of behavioural change vs. observed outcome change.
• Optimisation of the sample size to get valid results in view of attrition: lessons from other fields.
• Design guidelines for the selection of control groups.
• Problem of self-selection in the measurement of behaviour change programmes.
• Impact of regression to the mean in behaviour change evaluation.
• Stated intention as a reflection of actual change and its relationship with attrition.

21.6.3. Continuous Surveys

• The value of publicly released travel survey data: an analysis.
• Using continuous data to test whether it is possible to forecast future trends from early years.


• Reliability of first year data in a continuous travel survey: implications and estimate of the required roll-out period.
• Deciding on the size and frequency of each wave; is looking only at data use for four-step models a bias?
• Inter- and intrapersonal travel behaviour variations.

21.6.4. Retrospective Surveys

• The memory effects on data in retrospective surveys.
• How do you measure not only what, but why behaviour changes? What types of variables are reasonable to measure?
• What type of survey method is best to collect this data?

References

Chikaraishi, M., Fujiwara, A., Zhang, J., & Zumkeller, D. (2011). Optimal sampling designs for multi-day and multi-period panel surveys. Paper presented at the 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile, November 14–18.

Chlond, B., Wirtz, M., & Zumkeller, D. (2011). Do dropouts really hurt? Considerations about data quality and completeness in combined multiday and panel surveys. Paper presented at the 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile, November 14–18.

Richardson, A.J., Richardson, D., & Roddis, S. (2011). Integrative software for dataflows in continuous travel surveys. Paper presented at the 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile, November 14–18.

Yamamoto, T. (2011). Attrition bias in before and after survey for personalized travel planning. Paper presented at the 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile, November 14–18.

THEME 5 UNDERSTANDING THE SOCIAL CONTEXT OF DATA COLLECTION

Chapter 22

Affective Personal Networks versus Daily Contacts: Analyzing Different Name Generators in a Social Activity-Travel Behavior Context

Juan Antonio Carrasco, Cristián Bustos and Beatriz Cid-Aguayo

Abstract

Purpose: In the context of the study of the role of social networks in travel behavior, this chapter adds to that body of knowledge by presenting a new data collection effort, which collects a wide array of information about the social, urban, and temporal context where social activity-travel behavior occurs.

Methodology/approach: The study was developed in Concepción, Chile, involving 240 respondents from four different urban contexts and their personal networks. The analysis concentrates on the challenges and opportunities of different techniques to build personal networks as a way of studying the social dimension of travel behavior. Although most of the current methods to study personal networks rely on emotional closeness, this approach may not be sufficient, since these “elicited” people may not include daily contacts that could be relevant to study social activities. Tackling this issue, the data instrument also collects those daily “revealed” people, on a two-day time use diary and a social activities listing. With this information, the chapter presents a comparative analysis between these “elicited” and “revealed” personal networks.

Findings: Overall, the results illustrate the dependence of the name generator technique on what is observed in terms of social activity-travel behavior, specifically on aspects such as personal network size, average distance, and frequencies of interaction. In addition, the comparison between the different methods to construct the personal networks illustrates how name generators

Transport Survey Methods: Best Practice for Decision Making Copyright © 2013 by Emerald Group Publishing Limited All rights of reproduction in any form reserved ISBN: 978-1-78190-287-5


provide the opportunity to further understand transport related questions, such as the role of income and access to amenities on spatial and temporal patterns of social interactions, and their effect on social capital.

Keywords: Social networks; travel behavior; name generator

22.1. Introduction and Motivation

Building on the activity-based approach (Pas, 1990), and understanding that a relevant part of activities, and their associated travel, have a social motivation, recent research on travel behavior has focused on understanding and modeling the relevance of social networks on travel behavior (Dugundji, Páez, & Arentze, 2008). Since current travel demand analysis still mostly takes an individual approach (Jones, 2009), collecting data about the social context of travel behavior remains a key challenge (Axhausen, 2008). Recent experiences have successfully adapted methods from the social network literature to study the social dimension of travel (Carrasco, Hogan, Wellman, & Miller, 2008a). However, as the field moves forward, not only on understanding but also on modeling these social processes (Arentze & Timmermans, 2008), these data collection techniques need to be further assessed. The most usual method employed in travel behavior research to explicitly study social networks corresponds to a personal or egocentric approach, which concentrates on the social contacts of a specific person, called ego (Carrasco et al., 2008a). The core of a personal network data collection is the name generator technique, which consists of a set of questions used to elicit the respondents’ social contacts. Since personal networks in urban settings can comprise hundreds of people, making an in-depth data collection of social activity-travel patterns impractical, name generators serve to delimit the “boundary” of the social contacts (alters) included in the analysis.
The sociological literature has studied the role of name generators on the assessment of the overall social context of people (Bailey & Marsden, 1999; Marin & Hampton, 2007; Marsden, 2005; McCarty, 2002; McCarty, Killworth, Bernard, Johnsen, & Shelley, 2000) and has debated the appropriateness of some techniques to assess relevant societal questions (McPherson, Smith-Lovin, & Brashears, 2006; Mok, Wellman, & Carrasco, 2010). However, little is known about how well different name generator techniques capture the spatial dimension of personal contacts, and what role income and the urban context play in these social networks; all key aspects to understand travel behavior. Using recently collected data in Concepción, Chile, the objective of this chapter is to study the different spatial and temporal patterns that are captured by four different name generator techniques, and how these techniques help to assess the role of income and the urban context on that behavior. The rest of the chapter is organized as follows. First, the chapter presents an overview of the context where the data were collected as well as a description of the four name generator techniques used in the analysis. Then, these four techniques are


compared, focusing on their different network sizes, as well as on their spatial patterns and frequencies of interaction, assessing how the context where the data were collected mediates these results. Finally, some conclusions and lines of future work are presented.

22.2. Data Collection

22.2.1. The “Communities in Concepción” Study

The data employed in this chapter come from the study “Communities in Concepción” (August 2008–April 2009), which focused on the characteristics of social activity-travel through the analysis of personal networks in different neighborhoods of Concepción, Chile. The city is located 500 km south of Chile’s capital, Santiago, and its Greater Area has a population of around one million people, being the second largest in the country. The city area is around 300 km2 and is served by a relatively good quality radial-based public transport system (buses and shared taxis), with good public coverage and level of service, currently representing 60% of modal share. The other two key transport modes are walking (20% of modal share) and car (15% of modal share), the latter also having a good level of service. Concepción presents diverse income levels, with auto ownership in around 35% of the households. The city has several economic activities, especially manufacturing and services, constituting the second most important economy in the country. In this urban context, social network data were collected in four distinctive neighborhoods in the city. The first two neighborhoods had good access to downtown amenities: Agüita de la Perdiz, composed mainly of medium through low-income households, and La Virgen, composed of medium through high-income households. Spatially, these two neighborhoods are located beside each other, and have good accessibility (around fifteen minutes walking distance) to downtown Concepción, where most of the services and workplaces are located. In that sense, the choice of these two neighborhoods is aimed at understanding the social context of two different income contexts, while controlling for good spatial access to most services and jobs.
In contrast, the other two neighborhoods had less access to downtown amenities than the previous areas: Santa Sabina, composed mainly of medium- to low-income households, and San Sebastián, composed of medium- to high-income households. In the context of Concepción, less access implies that, although downtown is not too far (around 30 minutes by car), walking is not an attractive alternative for most of those who live in these two neighborhoods.

Both low-income neighborhoods, Agüita de la Perdiz and Santa Sabina, were created in the 1950s through land occupations carried out by politically organized “pobladores” who built their own neighborhoods, which gave them good levels of social organization. In its early years, each looked more like a shantytown; however, after sixty years of continuous occupation, and thanks to self-construction and the public effort of urbanization, each has become an average


Juan Antonio Carrasco et al.

lower-middle class neighborhood. La Virgen (the high-income and high-access neighborhood) was built in two stages. First, around the 1950s, two important city companies built houses for their professional staff. Later, during the 1970s and 1980s, liberal professionals and upper-middle-class households built new houses in the area, consolidating the neighborhood as it currently looks. Finally, San Sebastián (the high-income and low-access neighborhood) is a much newer neighborhood, created in the 1990s with new dwellings for middle- to high-income households, and resembles North American gated communities. Note that these four neighborhoods do not have extremely poor or extremely wealthy households, which are uncommon in the city in general. In this way, the four neighborhoods were chosen to study the differential roles of income, on the one hand, and access to downtown amenities, on the other, in social activity-travel behavior.

22.2.2. Name Generator Approaches in Personal Networks

Defining the network’s boundary is a crucial challenge. In particular, eliciting “appropriate” network members is difficult because of the large size of networks and the need to sample network members that are adequate for the phenomenon of interest. In egocentric methods, the most common technique to elicit network members is the name generator, which consists of free-recall questions that elicit alters from an ego’s network (Burt, 1984; Marsden, 2005). Name-generating questions elicit “a fraction of respondents’ social contacts” (Marsden, 2005, p. 12). The key decision is then choosing the specific question(s) that will elicit the network members relevant to the phenomenon of interest, constrained by the available time and the desired level of complexity of the data collection instrument. Also, the number of alters elicited can be limited to a specific number (Marsden, 1987) or unlimited (as here). There is an extensive literature comparing different name generators, discussing aspects such as their influence on network size, the number of “core” and “extended” network members that each elicits, the importance of the instrument’s context, the relevance of the order and wording of questions, and forgetting phenomena (for a further review, see Marsden, 1990, 2005, and the references therein).

In the particular application reported in this chapter, four name generator techniques were applied to 240 people (60 from each neighborhood) in semi-guided interviews. Respondents were chosen by a random, sociodemographic quota-based procedure, which tried to preserve the age, sex, and occupation proportions from the latest Census data. The total number of elicited social contacts ranged between 794 and 4437, depending on the name generator technique. The four name generators presented to the respondents were: emotional closeness (EC), network capital (NC), social activities (SA), and time use (TU).
To keep the exercise reliable and to elicit sufficiently large network sizes, the order in which the name generators were presented was the same for all respondents, similarly

Affective Personal Networks versus Daily Contacts


to other experiences reported in the literature (Marin & Hampton, 2007). It is therefore important to note that these name generators are not entirely independent of each other, but are built in a progressive way. Although this characteristic rules out a randomized experiment to control for ordering effects, this bias is compensated by larger and more diverse personal networks. In this way, although new social contacts may be named in the approaches that follow EC, the key comparison in the analysis concerns the size, spatiality, and frequency of interaction of the social circles captured by the EC approach (the most usual in current travel behavior research) with respect to the other techniques. Each name generator technique is explained in more detail next.

22.2.2.1. Emotional closeness

This name generator consists of eliciting personal networks using the concept of emotional closeness, which has been extensively used in sociology (e.g., Marsden, 2005) and which concentrates on the individual’s affective network, that is, the people the respondent defines as emotionally close. Concretely, respondents named people who lived outside their household and to whom they felt very close or somewhat close. Very close people were defined to the respondent as “people with whom you discuss important matters with, or regularly keep in touch with, or they are for you if you need help.” Somewhat close people were defined as “more than just casual acquaintances, but not very close people.” Similar name generators have been used in recent transport research literature (Carrasco et al., 2008a; Kowald, Frei, Hackney, Illenberger, & Axhausen, 2009; Larsen, Urry, & Axhausen, 2006; van den Berg, Arentze, & Timmermans, 2009).

22.2.2.2. Network capital

There is a long tradition in sociology of name generators based on social capital measures (Lin, Fu, & Hsung, 2001; Marsden, 1990; Van der Gaag & Snijders, 2005); all of them try to elicit personal network members with whom resources can be exchanged, a concept known as network capital (Wellman & Frank, 2001). In the specific case of the dataset studied, the name generator technique consists of asking respondents about people outside their household to whom they have given, and/or from whom they have received, the following network capital resources:

– Advice on important matters;
– Care during illness;
– Help using a computer or the internet;
– A ride to work or shopping;
– A ride in emergency situations;
– Small amounts of money in emergencies;
– Advice on new job opportunities;
– Talk about the day.

In this way, the network capital measure employed in the dataset involves both emotional and material resources, similar to those elicited in the sociological literature previously mentioned.


22.2.2.3. Social activities

This name generator was derived from a list of up to four of the “most relevant” social activities that each participant engaged in during the previous month; apart from Carrasco et al. (2008a), there are no other experiences with this technique in the literature. After listing those activities, the respondents named the social contacts outside their household with whom they engaged in those activities. In this way, this technique generates a personal network that is “revealed” from the social activities that the respondent mentions. In other words, the unit of analysis is each social activity, from which personal contacts emerge. From a temporal viewpoint, the alters elicited with this technique are expected to capture social behavior involving more frequent and spatially closer contacts than the previous two name generators.

22.2.2.4. Time use

The respondents were asked to fill in a retrospective two-day time use diary; one day had to be the closest Saturday or Sunday prior to the interview, and the other had to be the closest weekday (Monday to Friday) prior to the interview. Although retrospective time use surveys have biases in terms of event recall, the face-to-face data collection mode helped to minimize that problem. For each event, the respondents were asked about the start and end time, activity type, location, for whom, and with whom they performed it. Detailed information about each of these contacts was requested, omitting family members who live in the participant’s household. In this way, the list of these contacts can be conceptualized as a name generator that “reveals” the personal network implicit in the two-day time use. A similar name generator has been applied and studied in sociology (Fu, 2005) and, more recently, in travel behavior research (Habib & Carrasco, 2011; van den Berg, Arentze, & Timmermans, 2010), although without any comparison with other name generator approaches.
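As an illustration of how a time use diary can act as a name generator, the following minimal sketch collects the contacts named across diary events and omits household members. The record layout (`DiaryEvent` and its fields) and the sample names are hypothetical and are not taken from the study’s instrument.

```python
from dataclasses import dataclass, field


@dataclass
class DiaryEvent:
    """One diary event; the field names are illustrative assumptions."""
    day: str          # e.g. "weekday" or "weekend"
    start: str        # "HH:MM"
    end: str          # "HH:MM"
    activity: str
    location: str
    with_whom: list[str] = field(default_factory=list)


def diary_alters(events, household_members):
    """Return the contacts named in the diary, excluding household members."""
    household = set(household_members)
    alters = set()
    for event in events:
        alters.update(p for p in event.with_whom if p not in household)
    return alters


# Invented example: a two-day diary for one respondent.
events = [
    DiaryEvent("weekday", "08:00", "17:00", "work", "office", ["Ana", "Luis"]),
    DiaryEvent("weekday", "19:00", "20:00", "dinner", "home", ["Marta"]),
    DiaryEvent("weekend", "12:00", "15:00", "lunch", "friend's home", ["Ana", "Pedro"]),
]
alters = diary_alters(events, household_members=["Marta"])
print(sorted(alters))
```

The size of the returned set is the TU network size for that respondent under these assumptions; a contact who appears in several events is counted once.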

22.3. Comparisons between Name Generator Techniques

In this section, the data elicited from the four name generator techniques are compared on three key dimensions: network size, spatiality, and frequency of interaction.

22.3.1. Personal Network Size

Table 22.1 shows key descriptive statistics of the network sizes for the four neighborhoods studied; Table 22.2 shows the pairwise Mann–Whitney nonparametric tests between neighborhoods for each name generator technique; and Table 22.3 shows the pairwise Mann–Whitney nonparametric tests between name generator techniques for each neighborhood. Nonparametric specifications are more appropriate for network size variables, considering that they have long tails


Table 22.1: Network sizes by name generator and by neighborhood.

                    Agüita de la Perdiz  Barrio la Virgen  Santa Sabina  San Sebastián
                    (low income,         (high income,     (low income,  (high income,
                    high access)         high access)      low access)   low access)

Emotional closeness
  Min.              4                    4                 2             5
  Max.              44                   66                42            62
  Average           18.5                 19.9              15.6          19.6
  Deviation         8.1                  11.3              8.2           11.3
  Percentile 25     13                   13                9             12.25
  Percentile 50     16                   17                13            15
  Percentile 75     23.75                24                22            25.75

Network capital
  Min.              0                    2                 2             2
  Max.              32                   67                42            55
  Average           9.6                  14.0              9.8           16.0
  Deviation         6.7                  10.2              7.3           10.0
  Percentile 25     5.25                 7                 4             10
  Percentile 50     8                    11                8             14
  Percentile 75     12                   17                12.5          17.75

Social activities
  Min.              1                    0                 0             0
  Max.              21                   23                16            21
  Average           6.6                  6.3               5.5           5.3
  Deviation         4.2                  4.4               3.2           5.2
  Percentile 25     4                    3                 3             2.25
  Percentile 50     6                    5                 5             4
  Percentile 75     8                    8                 7             6

Time use
  Min.              0                    0                 0             0
  Max.              15                   16                8             16
  Average           3.6                  4.5               2.3           3.0
  Deviation         3.4                  3.8               2.2           3.4
  Percentile 25     1                    2                 0.25          1
  Percentile 50     3                    3                 2             2
  Percentile 75     5                    6                 4             3.75
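The descriptive statistics reported in Table 22.1 can be reproduced for any list of per-respondent network sizes with a short routine such as the sketch below. The sample data are invented, and the choices of the sample standard deviation and of linearly interpolated ("inclusive") percentiles are assumptions; the chapter does not state which definitions were used.

```python
import statistics


def describe_sizes(sizes):
    """Summarize network sizes with the row labels used in Table 22.1."""
    # Quartiles by linear interpolation between observed values (assumed method).
    p25, p50, p75 = statistics.quantiles(sizes, n=4, method="inclusive")
    return {
        "min": min(sizes),
        "max": max(sizes),
        "average": statistics.mean(sizes),
        "deviation": statistics.stdev(sizes),  # sample standard deviation (assumed)
        "p25": p25,
        "p50": p50,
        "p75": p75,
    }


# Invented network sizes for seven hypothetical respondents.
sizes = [4, 9, 13, 16, 18, 23, 44]
summary = describe_sizes(sizes)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

With long upper tails, as in this invented sample, the average sits above the median, which is one reason to report percentiles alongside the mean.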


Table 22.2: Mann–Whitney nonparametric tests on network sizes between pairs of neighborhoods (z scores of the differences).

Neighborhood/name generator                             Emotional closeness  Network capital  Social activities  Time use

Low income and high access–high income and high access  0.34*                3.00             0.34*              1.48*
Low income and high access–low income and low access    2.24                 0.19*            1.25*              1.97
Low income and high access–high income and low access   0.05*                4.39             2.64               1.23*
High income and high access–low income and low access   2.21                 3.10             0.79*              3.55
High income and high access–high income and low access  0.44*                1.52*            2.14               2.77
Low income and low access–high income and low access    2.10                 4.34             1.51*              0.83*

*Statistically significant similarity at a 95% level.

Table 22.3: Mann–Whitney nonparametric tests on network sizes between pairs of name generators (z scores of the differences).

        Agüita de la Perdiz  Barrio la Virgen  Santa Sabina  San Sebastián
        (low income,         (high income,     (low income,  (high income,
        high access)         high access)      low access)   low access)

EC–NC   8.10                 8.14              7.74          8.10
EC–SA   6.33                 3.83              4.56          2.14
EC–TU   9.07                 8.80              9.13          9.00
SA–NC   2.89                 5.78              3.80          7.25
SA–TU   4.39                 2.54              5.76          3.51
NC–TU   6.10                 7.18              7.67          8.54

in the upper values. More details about these statistical tests can be found in Dineen and Blakesley (1973) and Sheskin (2007).

Comparing the different neighborhoods, in the case of EC the average number of alters per network is very similar, with the exception of Santa Sabina, the low-income and low-accessibility neighborhood, whose values are statistically significantly lower, judging from the nonparametric tests in Table 22.2. In the case of NC sizes, both high-income neighborhoods have larger networks than their low-income counterparts. In fact, sizes are statistically equal only for the pairs of


neighborhoods with the same income. In this way, EC and NC actually capture different effects of income on social contacts. Theoretically, this result shows that network capital is not just a consequence of the density and intensity of emotional closeness, but also requires actual material resources to circulate in the network. Therefore, in low-income contexts, networks of mutual cooperation cannot be maintained in the absence of material resources to exchange. In that way, social resources would be another form of capital that enriches those who already possess other (material or monetary) forms of capital.

Although the patterns are not as strong as in the previous cases, in the case of SA and EC, Santa Sabina (the low-income and low-accessibility neighborhood) again shows statistically significantly smaller networks than its counterparts. A possible hypothesis for this result is that social activities require either spatial closeness (door-to-door interaction, as in the case of both high-access neighborhoods) or enough money to afford mobility in the city (as in the case of both high-income neighborhoods). Finally, the TU name generator highlights the differences between neighborhoods in terms of access more than in terms of income levels. A possible hypothesis for that result is that, at least with respect to network sizes, TU reflects certain local spatial and temporal constraints that are not captured by the main social activities inventory.

This specificity between neighborhoods does not occur from the viewpoint of the name generator technique. As expected, there are substantial differences between the four name generators. EC and NC capture larger networks, suggesting that they elicit longer term contacts (and longer term social behavior in space and time) than SA and TU. In fact, in each of the four neighborhoods, the average personal network sizes are statistically different depending on the name generator used; ordered from largest to smallest, these are EC, NC, SA, and TU. The result is expected since EC captures somewhat more general network capital than NC, and SA captures a broader time span than TU. In addition, the greater number of personal contacts for emotional closeness and social capital underlines that these social exchanges occur not only with local contacts but also with others at greater distances.
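To make the pairwise comparisons concrete, the sketch below computes a Mann–Whitney z score for two samples of network sizes using the normal approximation with a tie correction and no continuity correction. This is one common textbook formulation and is an assumption here; the chapter does not specify the exact variant or software used, and the sample data are invented.

```python
import math


def mann_whitney_z(x, y):
    """z score of the Mann-Whitney U statistic for sample x against sample y.

    Uses midranks for ties and the tie-corrected normal approximation.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool both samples, remembering which sample each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        t = j - i                      # size of this group of tied values
        midrank = (i + 1 + j) / 2      # average of ranks i+1 .. j
        in_x = sum(1 for k in range(i, j) if pooled[k][1] == 0)
        rank_sum_x += midrank * in_x
        tie_term += t ** 3 - t
        i = j

    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    return (u - mean_u) / math.sqrt(var_u)


# Invented network sizes for two hypothetical groups of respondents.
group_a = [4, 9, 13, 16, 18, 23, 44]
group_b = [2, 3, 5, 6, 8, 8, 12]
z = mann_whitney_z(group_a, group_b)
print(f"z = {z:.2f}; differs at the 95% level: {abs(z) >= 1.96}")
```

Under this formulation, an absolute z score below 1.96 corresponds to the starred entries in the tables, that is, pairs whose sizes are not statistically different at the 95% level.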

22.3.2. Personal Network Spatial Patterns

In all four name generators, the instrument collected the home address (closest main intersection) of the respondent’s social contacts; this information was geo-coded, and the linear ego-alter distances were calculated from it. Although not all social activities occur at home locations, home locations have been used in the recent literature as a good proxy to understand the spatiality of social activities (e.g., Axhausen, 2008; Carrasco, Miller, & Wellman, 2008b; van den Berg et al., 2009). In fact, results from this dataset (not shown in this chapter) also suggest that over 80% of the social activities occur at the home of one of the social contacts. Thus, despite the potential


biases, and in order to make the analysis as clear as possible, we compare the ego-alter home distances between the different name generators and neighborhoods.

22.3.2.1. Comparisons between neighborhoods

There are striking differences when comparing these values between the different neighborhoods, both in terms of averages and the overall probability distribution, as shown in Table 22.4. In fact, nonparametric tests show that higher income neighborhoods have longer ego-alter distances than their lower income counterparts (Table 22.5), a result that

Table 22.4: Ego-alter distances (km) by neighborhood.

                    Agüita de la Perdiz  Barrio la Virgen  Santa Sabina  San Sebastián
                    (low income,         (high income,     (low income,  (high income,
                    high access)         high access)      low access)   low access)

Min.                0.02                 0.09              0.05          0.20
Max.                1397.85              6304.34           453.39        2522.76
Average             34.69                319.43            48.45         200.37
Deviation           125.63               846.73            90.43         369.99
Percentile 25       1.09                 5.38              1.67          6.81
Percentile 50       4.16                 58.34             7.64          89.81
Percentile 75       9.58                 215.39            41.91         206.89

Table 22.5: Mann–Whitney nonparametric tests on ego-alter distances (km) between pairs of neighborhoods (z scores of the differences).

Neighborhood/name generator                             Emotional closeness  Network capital  Social activities  Time use

Low income and high access–high income and high access  6.16                 6.70             4.35               3.84
Low income and high access–low income and low access    1.68*                2.44             2.20               1.06*
Low income and high access–high income and low access   5.80                 7.19             5.58               3.73
High income and high access–low income and low access   5.47                 5.14             2.11               3.18
High income and high access–high income and low access  0.92*                0.04*            1.82*              0.20*
Low income and low access–high income and low access    4.73                 5.49             3.65               3.43

*Statistically significant similarity at a 95% level.


holds for all name generators, and that is expected considering the cost of communications and travel. In addition, the comparison between the two high-income neighborhoods shows no differences for any of the name generator techniques, suggesting that the spatial spread of personal networks in these two neighborhoods follows a similar pattern for each type of social contact, regardless of their difference in access to downtown amenities. In this way, higher income people are able to “compress” space (Harvey, 1990), overcoming, at least in part, the friction of distance for their different social contacts. In the case of the low-income neighborhoods, although their ego-alter distance patterns are also similar, there are statistical differences for NC and SA (see Table 22.5). This latter difference between the two low-income neighborhoods suggests that spatial closeness to downtown amenities could somehow level up part of the low-income condition in network capital acquisition.

22.3.2.2. Comparisons between name generators

From the viewpoint of the name generator techniques, average ego-alter distances have similar patterns to network sizes (Table 22.6). In fact, EC and NC show longer average distances than SA and TU. Interestingly, SA presents shorter mean ego-alter distances than TU, suggesting that, on average, core social activities tend to be more local than daily activities such as work, study, and others. At the same time, however, the ego-alter distances of SA do not show a statistical difference with respect to TU in any neighborhood, despite the difference in their averages (Table 22.7). Behaviorally, it is also interesting to note that, for high-income respondents, social and daily activities are more local than social capital, underlining that network resources are not constrained by the neighborhood but depend on other social relationships. In contrast, low-income neighborhoods present similar local patterns not only for SA and TU but also for NC, showing their lack of spatial (and thus social) diversity in reaching social resources compared with their high-income counterparts. This result is especially relevant in the case of Agüita de la Perdiz (the low-income and high-access neighborhood), where only EC presents statistically

Table 22.6: Ego-alter distances (km) by name generator technique.

                Emotional closeness  Network capital  Social activities  Time use

Min.            0.04                 0.03             0.04               0.02
Max.            6304.34              5517.11          2522.76            5992.07
Average         236.51               194.12           70.02              93.78
Deviation       639.56               483.01           256.66             468.47
Percentile 25   6.66                 4.28             2.34               1.31
Percentile 50   67.69                38.29            5.68               5.13
Percentile 75   186.20               187.86           45.50              15.94


Table 22.7: Mann–Whitney nonparametric tests on ego-alter distances between pairs of name generator techniques (z scores of the differences).

        Agüita de la Perdiz  Barrio la Virgen  Santa Sabina  San Sebastián
        (low income,         (high income,     (low income,  (high income,
        high access)         high access)      low access)   low access)

EC–NC   3.62                 5.81              3.23          3.41
EC–SA   2.29                 1.14*             1.46*         0.01*
EC–TU   3.50                 5.46              4.23          4.34
SA–NC   1.21*                4.86              1.70*         3.44
SA–TU   0.31*                0.15*             1.41*         1.49*
NC–TU   1.50*                4.53              2.95          4.43

*Statistically significant similarity at a 95% level.

longer distances with respect to the other name generators, suggesting strong door-to-door patterns.
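As a rough illustration of how linear ego-alter distances can be obtained from geo-coded home locations, the sketch below computes the great-circle distance between two latitude/longitude points with the haversine formula. Whether the study used great-circle or projected planar distances is not stated, so this choice is an assumption, and the coordinates shown are approximate city-center values used only as an example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Approximate coordinates (assumed for illustration only).
ego_home = (-36.827, -73.050)    # central Concepción
alter_home = (-33.449, -70.669)  # central Santiago
distance = great_circle_km(*ego_home, *alter_home)
print(f"ego-alter distance: {distance:.1f} km")
```

Applying the function to every ego-alter pair gives the distance distribution that the percentiles and Mann–Whitney comparisons in this section summarize.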

22.3.3. Frequencies of Interactions on Personal Networks

The instrument collected the frequency of interaction between the respondents and each of their social contacts generated by the four approaches, differentiating by the following modes: face-to-face (including casual or at work), socializing, by telephone, and by email. Considering the diversity of name generators studied, the comparative analysis concentrates on face-to-face interaction. Most of the tendencies in terms of differences between the name generators and neighborhoods can also be observed for the other modes, and thus those results are omitted from the chapter. For an example of further analysis of the interrelations between the different modes of contact, the reader is referred to Carrasco (2011).

Figure 22.1 presents the tendencies for face-to-face frequencies of interaction, measured on a four-level ordinal scale: (1) less than once a year, (2) between once a year and once a month, (3) between once a month and once a week, and (4) once a week or more frequently. Since these values were collected using ordinal scales, nonparametric statistical tests are not meaningful. The comparison between the neighborhoods shows very similar trends among them. With respect to the name generator techniques, and as expected, EC presents trends in frequencies of interaction similar to those of NC, both being able to incorporate longer time horizons of social interaction than SA and TU. Complementing that finding, the frequency of face-to-face interaction is higher in TU than in SA, reflecting that the latter is capable of capturing longer time horizons of social interaction than TU.

[Figure 22.1 comprises four panels, one per neighborhood: Agüita de La Perdiz (low income, high access); Barrio Universitario (high income, high access); Santa Sabina (low income, low access); and San Sebastián (high income, low access). In each panel the horizontal axis lists the name generators (EC, SA, NC, TU) and the vertical axis runs from 0% to 100%, divided into four frequency categories: less than once a year; between once a month and once a year; between once a week and once a month; once a week or more frequently.]

Figure 22.1: Face-to-face interaction by neighborhood and name generator technique.
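The percentage breakdown plotted in Figure 22.1 can be derived from the ordinal codes recorded for each contact, as in the minimal sketch below. The code-to-label mapping follows the four-level scale described in the text; the function name and the sample codes are invented for illustration.

```python
from collections import Counter

# Ordinal face-to-face frequency scale described in Section 22.3.3.
CATEGORIES = {
    1: "less than once a year",
    2: "between once a year and once a month",
    3: "between once a month and once a week",
    4: "once a week or more frequently",
}


def frequency_shares(codes):
    """Percentage of contacts in each frequency category."""
    if not codes:
        raise ValueError("at least one contact is required")
    counts = Counter(codes)
    total = len(codes)
    return {label: 100.0 * counts.get(code, 0) / total
            for code, label in CATEGORIES.items()}


# Invented codes for the contacts elicited by two name generators.
codes_by_generator = {
    "EC": [1, 2, 2, 3, 4],
    "TU": [3, 4, 4, 4],
}
shares = {gen: frequency_shares(codes) for gen, codes in codes_by_generator.items()}
for gen, dist in shares.items():
    print(gen, {label: round(pct, 1) for label, pct in dist.items()})
```

Each generator’s shares sum to 100%, so they can be compared across generators and neighborhoods regardless of how many contacts each technique elicits.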

22.4. Conclusions

Using personal network data, this chapter has reviewed how four name generators capture the participants’ social context in relation to their spatial and temporal patterns of social interaction. A special focus was placed on how these techniques help to understand the role of income and access to amenities in those spatial and temporal patterns.

The analysis shows that there are similarities between the EC and NC techniques on the one hand, and between SA and TU on the other, which reflect longer and shorter term social interaction patterns in time and space, respectively. However, this similarity is heavily mediated by the context studied; in this case, the economic conditions of the neighborhoods. In other words, the different name generator techniques serve to highlight different aspects of the sociospatial and temporal patterns, but only after considering the overall context of the study (in this case, the different neighborhoods).

Two examples of the usefulness of name generators for assessing the social dimension of activity travel are the analyses of spatial and network capital patterns in the different neighborhood contexts of the study area. In the case of spatiality, the results suggest that being close to amenities and other social contacts acts as a compensatory effect with respect to income. In fact, the number of contacts and the social activity patterns in the low-income/good-access neighborhood are similar to those of the high-income neighborhoods and different from (greater than) those of the low-income/low-access neighborhood. In the case of social capital, the data collected suggest that certain dimensions of network capital are heavily correlated with economic capital.


In other words, the circulation of cooperation needs the existence of goods that can be exchanged, and (physical) access to them.

In general, the results from the chapter give an important warning about how much the understanding of the social context of travel behavior depends on the name generator technique employed. Aspects such as personal network size, average distance, and frequencies of interaction heavily depend on the name generator used. In addition, the data presented also suggest that the assessment of the role of income and access to amenities in the spatial and temporal patterns of social interactions depends on the name generator used. Note that the dependence of the results on name generators is not necessarily a problem; it can also be understood as an opportunity, given their capability to capture different spatial and temporal scopes, depending on the travel behavior-related question to be assessed. For example, comparing network sizes and spatial distances between the four name generators provided the opportunity to understand the differential role of income and access in network capital, giving a complementary view of the current discussions about the relationship between transport and social capital (Lucas, 2009).

Although further refinements are needed in the techniques used to collect and analyze these data, three suggestions arise from the results:

(1) Using more than one name generator when broader personal networks in time and space are required for the research, which could combine shorter term relationships (SA and TU) with longer term relationships (EC and NC). This recommendation is in line with previous arguments from sociology (e.g., Marin & Hampton, 2007) about the need for multiple name generators in order to obtain appropriate network sizes and other metrics, now extended with the added evidence from this chapter regarding the spatial and temporal characteristics of personal networks.

(2) Differentiating the alters that are elicited by each technique, considering that each name generator captures different components of the respondent’s overall social contacts. In this way, these techniques give the researcher the ability to understand the intertwined role of space, time, and the characteristics of the respondents, their social contacts, and other contextual aspects, such as income and urban amenities, as in the example from this chapter.

(3) Assessing the appropriateness of the elicited networks with respect to the specific research questions that motivated the study. In fact, relevant background theory (e.g., Wellman, 2001), as well as the empirical findings from this chapter, underline that personal networks are composed of several social groups, with different characteristics in aspects such as spatiality, temporal interaction patterns, emotional and monetary exchange, and types of relationships. Choosing an appropriate name generator therefore directly influences the activity-travel characteristics that can be elicited from personal networks, an aspect that is crucial not only for properly understanding the social dimension of mobility, but also when using these empirical data for modeling purposes (e.g., Kowald, Arentze, & Axhausen, 2012).
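Suggestions (1) and (2) can be illustrated with a small bookkeeping sketch that records which name generators elicited each alter and counts how many new alters each generator adds when applied in the fixed order EC, NC, SA, TU used in this study. The data structure and the sample names are hypothetical.

```python
ORDER = ["EC", "NC", "SA", "TU"]  # presentation order reported in the chapter


def tag_alters(alters_by_generator):
    """Map each alter to the generators that elicited them, in presentation order."""
    tags = {}
    for generator in ORDER:
        for alter in alters_by_generator.get(generator, ()):
            tags.setdefault(alter, []).append(generator)
    return tags


def new_alters_per_generator(alters_by_generator):
    """Count alters first elicited by each generator, in presentation order."""
    seen = set()
    new_counts = {}
    for generator in ORDER:
        alters = set(alters_by_generator.get(generator, ()))
        new_counts[generator] = len(alters - seen)
        seen |= alters
    return new_counts


# Invented alters for one hypothetical respondent.
alters_by_generator = {
    "EC": {"Ana", "Luis", "Pedro"},
    "NC": {"Ana", "Rosa"},
    "SA": {"Luis", "Rosa"},
    "TU": {"Ana", "Carla"},
}
tags = tag_alters(alters_by_generator)
new_counts = new_alters_per_generator(alters_by_generator)
print(tags["Ana"])
print(new_counts)
```

Keeping the generator tags with each alter lets the same contact list be analyzed either as one combined network or separately by technique, depending on the research question.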


Further research needs to continue assessing the influence of name generators on the characteristics of the personal networks that arise from them, ideally differentiating by other contextual aspects, such as income and urban environment, as in the example from this chapter. Overall, although these techniques have caveats and need to be used with care, they constitute a very useful tool to empirically study people’s social context and its role in activity and travel patterns.

Acknowledgment

This study was supported by the Chilean Fund for Sciences and Technology (CONICYT), project Basal FB 0816.

References

Arentze, T., & Timmermans, H. (2008). Social networks, social interactions, and activity-travel behavior: A framework for microsimulation. Environment and Planning B, 35(6), 1012–1027. doi:10.1068/b3319t
Axhausen, K. W. (2008). Social networks, mobility biographies, and travel: Survey challenges. Environment and Planning B, 35(6), 981–996. doi:10.1068/b3316t
Bailey, S., & Marsden, P. V. (1999). Interpretation and interview context: Examining the general social survey name generator using cognitive methods. Social Networks, 21(3), 287–309. doi:10.1016/S0378-8733(99)00013-1
Burt, R. S. (1984). Network items and the general social survey. Social Networks, 6(4), 293–339. doi:10.1016/0378-8733(84)90007-8
Carrasco, J. A. (2011). Personal network maintenance, face to face interaction, and distance: Studying the role of ICT availability and use. Transportation Research Record: Journal of the Transportation Research Board, 2231, 120–128. doi:10.3141/2231-15
Carrasco, J. A., Hogan, B., Wellman, B., & Miller, E. J. (2008a). Collecting social network data to study social activity-travel behavior: An egocentred approach. Environment and Planning B, 35(6), 961–980. doi:10.1068/b3317t
Carrasco, J. A., Miller, E. J., & Wellman, B. (2008b). How far and with whom do people socialize? Empirical evidence about distance between social network members. Transportation Research Record: Journal of the Transportation Research Board, 2076, 114–122. doi:10.3141/b2076-13
Dineen, L. C., & Blakesley, B. C. (1973). Algorithm AS 62: Generator for the sampling distribution of the Mann–Whitney U statistic. Applied Statistics, 22, 269–273. doi:10.2307/2346934
Dugundji, E., Páez, A., & Arentze, T. (2008). Social networks, choices, mobility, and travel. Environment and Planning B, 35(6), 956–960. doi:10.1068/b3506ged
Fu, Y.-C. (2005). Measuring personal networks with daily contacts: A single-item survey question and the contact diary. Social Networks, 27(3), 169–186. doi:10.1016/j.socnet.2005.01.008
Habib, K. N., & Carrasco, J. A. (2011). Investigating the role of social networks in start time and duration of activities: A trivariate simultaneous econometric model. Transportation Research Record: Journal of the Transportation Research Board, 2230, 1–8. doi:10.3141/2230-01
Harvey, D. (1990). The condition of postmodernity: An enquiry into the origins of cultural change. Cambridge, MA: Blackwell.
Jones, P. (2009). The role of an evolving paradigm in shaping international transport research and policy agendas over the last 50 years. Paper presented at the 12th International Conference on Travel Behavior Research, Jaipur, India (13–18 December).
Kowald, M., Arentze, T., & Axhausen, K. W. (2012). Population’s leisure network, descriptive statistics and a model-based analysis of leisure-contact selection. Eidgenössische Technische Hochschule Zürich, IVT, Institute for Transport Planning and Systems.
Kowald, M., Frei, A., Hackney, J., Illenberger, J., & Axhausen, K. W. (2009). The influence of social contacts on leisure travel: A snowball sample of personal networks. Paper presented at the 12th International Conference on Travel Behavior Research, Jaipur, India.
Larsen, J., Urry, J., & Axhausen, K. W. (2006). Mobilities, networks, geographies. Aldershot, UK: Ashgate Publishing Limited.
Lin, N., Fu, Y.-C., & Hsung, R.-M. (2001). The position generator: Measurement techniques for investigations of social capital. In N. Lin, K. Cook, R. Burt, G. Farkas & K. Lang (Eds.), Social capital: Theory and research. New York, NY: Aldine de Gruyter.
Lucas, K. (2009). Making the links between transport and social exclusion in the UK. Paper presented at the Third Workshop Frontiers in Transportation, Niagara on the Lake, Canada (24–26 June).
Marin, A., & Hampton, K. (2007). Simplifying the personal network name generator. Field Methods, 19(2), 163–193. doi:10.1177/1525822X06298588
Marsden, P. V. (1987). Core discussion networks of Americans. American Sociological Review, 52, 122–131. doi:10.2307/2095397
Marsden, P. V. (1990). Network data and measurement. Annual Review of Sociology, 16, 435–463. doi:10.1146/annurev.so.16.080190.002251
Marsden, P. V. (2005). Recent developments in network measurement. In P. Carrington, J. Scott & S. Wasserman (Eds.), Models and methods in social network analysis (pp. 8–30). New York, NY: Cambridge University Press. doi:10.1017/CBO9780511811395.002
McCarty, C. (2002). Structure in personal networks. Journal of Social Structure, 3(1). Retrieved from http://www.cmu.edu/joss/content/articles/volume3/McCarty.html
McCarty, C., Killworth, P. D., Bernard, H. R., Johnsen, E. C., & Shelley, G. A. (2000). Comparing two methods for estimating network size. Human Organization, 60, 28–39.
McPherson, M., Smith-Lovin, L., & Brashears, M. E. (2006). Social isolation in America: Changes in core discussion networks over two decades. American Sociological Review, 71, 353–375. doi:10.1177/000312240607100301
Mok, D., Wellman, B., & Carrasco, J. (2010). Does distance matter in the age of the internet? Urban Studies, 47(13), 2747–2783. doi:10.1177/0042098010377363
Pas, E. (1990). Is travel demand analysis and modelling in the doldrums? Paper presented at the Developments in dynamic and activity-based approaches to travel analysis, Aldershot, UK.
Sheskin, D. J. (2007). Handbook of parametric and nonparametric statistical procedures (4th ed.). New York: Chapman & Hall.
van den Berg, P., Arentze, T. A., & Timmermans, H. J. P. (2009). Size and composition of ego-centered social networks and their effect on geographic distance and contact frequency. Transportation Research Record: Journal of the Transportation Research Board, 2135, 1–9. doi:10.3141/2135-01
van den Berg, P., Arentze, T. A., & Timmermans, H. J. P. (2010). Factors influencing the planning of social activities: Empirical analysis of social interaction diary data. Transportation Research Record: Journal of the Transportation Research Board, 2157, 63–70. doi:10.3141/2157-08
Van der Gaag, M. P. J., & Snijders, T. A. B. (2005). The resource generator: Social capital quantification with concrete items. Social Networks, 27(1), 1–27. doi:10.1016/j.socnet.2004.10.001
Wellman, B. (2001). Physical place and cyberplace: The rise of personalized networking. International Journal of Urban and Regional Research, 25(2), 227–252. doi:10.1111/1468-2427.00309
Wellman, B., & Frank, K. A. (2001). Network capital in a multilevel world: Getting support from personal communities. In N. Lin, R. S. Burt & K. Cook (Eds.), Social capital: Theory and research (pp. 233–268). Hawthorne, NY: Aldine De Gruyter.

Chapter 23

Qualitative Methods in Transport Research: The ‘Action Research’ Approach

Karen Lucas

Abstract

Purpose: This paper explores the potential of ‘action research’ as a transport survey method, with particular emphasis on critically assessing its utility in the resolution of major transport policy challenges, such as the mitigation of climate change and environmental impacts, transport-related social exclusion and intergenerational equity issues. Although not particularly novel within the social sciences, it is an approach that has been largely overlooked within the field of transport studies to date.

Methodology/approach: The paper presents practical examples of where action research has been used to elicit information about people’s travel experiences and behaviours and discusses how it achieves different outcomes from other qualitative transport survey methods. It identifies appropriate contexts for action research and explores the skills and techniques needed to overcome some of the main criticisms of the method. It then evaluates some of the critical challenges of applying an action research approach and identifies potential ways of overcoming these. Finally, it discusses the key challenges for the analysis, presentation and dissemination of action research ‘data’ and potential ways of overcoming these.

Findings: Action research has a long history within the social sciences, dating back to practical problems in wartime situations in Europe and the United States. It can be applied at the level of individuals, small groups and/or ‘communities’ and organisations, with the expressed aim of bringing together research enquiry and future policy or planned actions (ibid). It provides a useful additional survey technique for policy-makers wishing to understand the detailed process of travel behaviours and barriers to travel at the individual level.

Originality/value of the paper: The action research method is specifically useful for supporting and actively encouraging behaviour change as an integral part of the research process. It has only recently emerged within the literature as a transport survey method. It can be a particularly useful method for developing more collaborative data collection methods with research participants, and can thus enable us to identify their underlying motivations, intentions, perceptions and negotiations, as well as the micro-level impacts of smaller scale transport initiatives.

Keywords: Qualitative research; action research; transport planning; behaviours; experiences; perceptions

Transport Survey Methods: Best Practice for Decision Making
Copyright © 2013 by Emerald Group Publishing Limited
All rights of reproduction in any form reserved
ISBN: 978-1-78190-287-5

23.1. Introduction

Qualitative research methods are increasingly recognised as valuable for understanding the underlying motivations behind people’s travel behaviours and for teasing out their more hidden attitudes and perceptions (Grosvenor, 1998). Qualitative research refers to a wide variety of different fieldwork approaches, including interviews, role play, focus groups and other dynamic and deliberative methods that can be conducted with either transport decision-makers and other professional stakeholders or the end users and recipients of transport interventions.

The specific aim of this paper is to explore whether ‘action research’ as a specific approach can be used to complement this wider suite of qualitative methods in instances where the desire is to record the change processes that occur from small-scale transport projects, particularly those that aim to directly influence the behaviours of individual travellers. This can be seen as a particularly important issue in light of the growing academic and policy recognition of the need to more actively engage citizens in the environmental and social consequences of their travel behaviour decisions in response to ‘wicked problems’ such as climate change and social exclusion. The complexity of these questions inherently requires the development of more innovative and interpretive data collection methods than have been previously witnessed within the field of transport studies, but which may be borrowed and adapted from other areas of the behavioural sciences.

I begin by outlining a few of the key principles of action research and describe how it has been generally applied within social science research. I then identify why it might be appropriate as a method within the field of transport studies and offer some practical examples of projects that have successfully adapted the approach for this purpose. I then explore some general criticisms concerning the validity of the action research approach and how these might affect the outcomes of successful enquiry in the area of transport studies. I conclude by offering some solutions for overcoming such methodological challenges through further potentially complementary avenues of research and analysis.


23.2. Background: What is Action Research?

Broadly speaking, action research describes a hugely diverse set of methodological practices, and so there is no short answer to, or necessarily general consensus about, the question of what exactly constitutes an action research approach. Reason and Bradbury’s handbook of action research (Reason & Bradbury, 2001, p. 1) identifies it as ‘a participatory and democratic process concerned with developing practical knowing in the pursuit of worthwhile human purposes, grounded in a participatory worldview’. It has its roots in ‘political activism’ and so generally seeks ‘transformative and emancipatory goals’, which are based on the ‘lived experience’ of the researchers and research subjects who collaboratively participate in its activities.

Fundamentally, it aims to bring together action and ongoing reflection through theory and practice. The primary role of the academic is to engage the necessary ‘actors’ in order to facilitate a process of learning and reflection in relation to a set of practical challenges, in particular those which are considered to be complex and ‘messy’. Good examples of this would be the delivery of sustainable development, social equity or community well-being. Touraine, who is often cited (alongside Castells) as one of the key originators of the social action research method, argues that it is the ethical responsibility of researchers who engage in social research not only to recommend changes for the improved welfare of their research participants and others, but to actively engage in

… an intensive and in-depth process during which sociologists lead the actors from a struggle they must carry on themselves to an analysis of their own action. (Touraine, Dubet, & Wieviorka, 1982, p. 280)

In fact, an action research approach has been applied to a wide range of policy-relevant contexts (including political protest, education and health promotion, environmental behaviours, local regeneration and community development). This has most usually been with the specific aim of encouraging the active engagement of previously excluded, marginalised or disempowered populations [such as children (e.g. Porter, Hampshire, Abane, Robson, & Munthali, 2010), people with disabilities (e.g. Danieli & Woodhams, 2005) and minority ethnic or faith groups (e.g. Harris, Hutchinson, & Cairns, 2005)]. The engagement and participation of these hitherto overlooked and marginalised communities within transport planning and research is generally seen as important to the successful (re)development and (re)design of new policy delivery systems, processes and/or stakeholder practices (Reason & Bradbury, 2001). At the organisational level, the action research method has mostly been used to interact with employers and employees about changes to their current behavioural and/or institutional practices (e.g. Khisty & Arslan, 2005) and/or with policy-makers and their target audiences (e.g. Seyfang & Smith, 2007).

The family of methods that have most usually been adopted by action researchers embrace some core features, but can vary greatly according to the nature of the research enquiry and the skills and needs of its participants (this will be explored in greater depth in later sections of the paper). Some studies are researcher-led or practitioner-led, while others are generated by ‘activist’ communities themselves. Whoever leads, however, there should be an emphasis on partnership, collaboration and empowerment through participation (Todhunter, 2001). The spectrum of involvement will generally differ according to the pre-existing capacities of the actors involved, as well as the specific circumstances of the project.

Cornwall (2001) identifies six main stages of involvement, from ‘co-option’ at the lowest level of engagement (where communities are only token representatives but have no real power or input as to the design of the research process), through ‘compliance’ (where tasks are given to participants but outsiders decide the agenda and direct the process) and ‘consultation’ (where communities are asked their opinions but outsiders decide the appropriate course of action), to ‘cooperation’, ‘co-learning’ and ‘collective action’ (where there is gradually greater involvement of communities along the spectrum and eventually communities enact their own agendas). In practice, many projects start out at the bottom end of this spectrum and move towards the top end of it over time, as community skills and capacities increase, and/or their tolerances for being the mere subjects of research enquiry decrease.

In this way, action research can be described as highly ‘path dependent’ in that what happens at any one stage of the research process is determined by the earlier choices and experiences of the research participants themselves, as well as by other actors outside their direct field of influence. Nevertheless, it does have at its core an iterative methodological pathway.
Its core steps are identified as involving (i) opening opportunities for dialogue between the various actors; (ii) experimenting with different cycles of action and reflection, and congruence (checking if what is claimed has actually happened); (iii) reframing the issues in the light of new social learning; (iv) seeking ways of acting through inner and outer arcs of attention; (v) developing dialogue and participation skills; (vi) developing design and facilitation skills; (vii) validating processes.

As there is a strong emphasis on experiential and social learning, there is also a tendency towards the use of narrative and discursive methods, such as open forum discussions, focus groups, citizens’ juries and learning histories. One key challenge for such approaches, as I shall explore more fully later in the paper, is (a) how to practically capture these rich but often fragmented sources of ‘data’ and (b) how to critically analyse and evaluate them once you have done so.

Similar to most qualitative methods, an action research approach can, of course, be combined with quantitative data collection methods at any point in the overall study process. In this way, it can be used to inform the design of questionnaires or as an explanatory tool for interpreting the outcomes of statistical survey analysis. Perhaps a more unique application, however, is in helping those who are tasked with the delivery and monitoring of travel behaviour intervention projects, such as travel awareness programmes and cycling and walking promotion projects, to better understand the dynamic processes that are involved in shaping and reshaping everyday travel habits.

In the following sections of the paper, I identify three example studies that have employed an action research approach and in which I was myself involved as a researcher. I do this both to offer the reader a flavour of the type of methods and tools that were used and to draw attention to some of the methodological challenges that arose and were sometimes addressed (and sometimes not) as part of these research processes. I then use these examples as the basis for a critical evaluation of the potential of action research as an applied methodological approach for the transformation of travel behaviours. I first discuss why such an approach might be desirable for the more effective delivery of transport policy and transport systems planning, not only in the United Kingdom but also internationally, and, perhaps more importantly for this specific conference, what implications the implementation of an action research approach has for the collection and analysis of travel behaviour data.

23.3. Why Action Research for Transport?

There is little argument that transport delivery worldwide is in something of a state of crisis. This is despite considerable innovation in the ways in which we now plan, deliver and manage our transport systems. There is also widespread evidence that the way in which most people currently choose to travel and how our goods and services are delivered is environmentally unsustainable, socially unjust and economically nonviable over the longer term. Both are compelling reasons for transformative, rather than incremental, changes in these systems and processes; and (arguably) neither governments nor the market are delivering these changes rapidly enough to avert the simultaneous crises of global economic meltdown, climate change, peak oil and the ensuing civil unrest that will in all likelihood follow should our transport systems fail us in the future.

There is an argument, therefore, for thinking and acting differently at every level of our individual and collective travel behaviours. While action research with communities and businesses cannot hope to deliver the scale of changes that would be necessary to resolve these crises, it may be a way to promote technological innovation and social learning about what needs to be done. It may also identify new and more politically acceptable pathways for change. Egmose (2011) identifies a lack of public trust in scientific, technological and policy solutions for more sustainable lifestyles, which cannot be simply explained by a public knowledge deficit. He suggests that this is because many of the ‘solutions’ that are on offer fundamentally interfere with the perceived life-world needs of most ordinary citizens, i.e. their need to secure a reasonable quality of life for themselves and their children in an increasingly uncertain and unstable world.

It is this impasse that action research might most usefully seek to address, through a collaborative process of identifying problems and solution pathways between scientists, policy-makers and citizens themselves through ‘grounded democratic deliberations’ (Egmose, 2011, p. 28). The question remains as to how to capture and robustly evaluate the impact of these very micro and often ephemeral local ‘action’ initiatives on the travel preferences and choices and longer-term social norms for travel demand among the population at large.


23.4. Three Short Case Studies of Action Research

In this section of the paper I describe three example case studies of action research transport initiatives in the United Kingdom. They have been chosen less because they are ‘best practices’ of an action research approach (although they might demonstrate this in some respects) and rather because they are projects in which I have been directly involved as a researcher. The advantage of this is that I am able to reflect more effectively on their merits and shortcomings than I would otherwise be able to from merely scrutinising the reports of similar projects in which I have not been directly involved. I personally find that one of the major challenges we face as action researchers is how to effectively evaluate and communicate many of the more subjective aspects of such projects to the outside world, a point that I will pick up on and elaborate further in a subsequent section of this paper.

All three projects were externally identified and funded by actors outside the community that was the ‘subject’ of the enquiry and thus fell short of the aim of ‘collective action’ within the action research philosophy. In this respect, at their inception at least, all three studies reside somewhere between ‘co-option’ and ‘cooperation’ in their design within Cornwall’s spectrum of participation (Cornwall, 2001). Nevertheless, a strong element of co-learning between the researcher and research subjects was an identified aim within the research design as integral to the methodology, as well as a planned opportunity for local agenda setting and collective responses at the output stages of each project.

The first example is the Citizen’s Science for Sustainability (SuScit) project (www.suscit.org.uk), which took place in the London Borough of Islington between September 2006 and July 2009. It brought together researchers, policy-makers and members of the local communities to identify local priorities for sustainable urban living on the Mayville Estate, a recognised area of economic deprivation, social exclusion and environmental degradation.

The second example is that of an EU-funded project, OPTIMUM2 (Optimal Planning through Implementation of Mobility Management) (http://connectedcities.eu/downloads/conferences/london_optimum2.pdf), which centred on a local business community, rather than a resident community, in London. It aimed to set up and facilitate a travel plan group to identify practical initiatives to encourage local employees to cycle or walk to their place of work as part of the wider redevelopment business plan for London’s South Bank.

The third example, and one I have often used in my previous publications (e.g. Lucas, 2004), is the Braunstone Bus project, which evolved from a Department for Transport funded research study to identify ways to address social exclusion through improved local transport planning. The project was one of six case studies across different deprived urban communities and focused on the transport and accessibility needs of low-income populations living on the Braunstone Estate in Leicester. The study involved researchers, officers from the local transport authority and the regeneration partnership, which included resident representatives from the local community.


23.4.1. The SuScit Project

The project was funded by the UK’s Engineering and Physical Sciences Research Council to identify new research ideas to support community-based initiatives for sustainable urban living. It sought to actively promote a process of mutual learning between scientists, policy-makers and lay citizens about how to formulate ‘a community-led research agenda for urban sustainability research’. In stage one of the project, local residents were engaged in a 6-week filming project to explore and share their experiences of life in their local area. In stage two, their films were shared with researchers and policy-makers in a series of workshops to engage them in active three-way dialogue, which sought to propose ways to address the needs and concerns that were raised. These discussions were supported by analysis of local datasets and other background research to provide the participants with the information they required to explore these issues in the full knowledge of state-of-the-art technical innovations and policy and planning frameworks.

Transport (among other issues such as housing supply and access to community space) became a key focus for debate within the research process. Young people in particular noted that they felt unable to participate as much as they would like in employment, education and social activities due to their inability to travel. It was clear from analysis of the local travel datasets that were available that the suppressed travel they experienced, as well as its negative consequences in terms of their social exclusion, was largely unrecorded. This was for a variety of reasons, but mostly because:

a. the geographical data was not sufficiently refined to differentiate between the travel behaviours of people living on the Mildmay Estate and the rest of the (fairly affluent) Islington population as a whole, and so masked their much lower levels of travel activity; and
b. GIS-based accessibility analysis of the public transport system serving the area did not provide information about its level of connectivity with the places local residents wished to get to, the times they needed to travel, the cost of getting there and other barriers to travel such as fear of crime, which was high among both the older and younger resident population.

Perhaps more fundamental to debates about sustainability was the fact that many participants did not wish to travel far and preferred to act locally, but were prevented from doing so due to a loss of local facilities and the difficulties of walking or cycling due to high levels of heavy traffic on local roads. Although not an intended outcome of the research process, one interesting issue that emerged and was partially addressed through the shared workshops was the huge deficit in local knowledge about the activities and facilities that were available to young people locally, as well as some of the concessionary travel passes that were being offered to people on low incomes by Transport for London.

Another issue that was raised in terms of transport was the huge complexity of the public transport system in London and the difficulties of navigating it, especially for people with low literacy levels or physical mobility and mental disabilities. Journey planners were seen as useless to people if they didn’t already understand the system or know exactly where they were going or have access to ICT. In research terms, the need to develop better low-end technologies was identified as a key challenge; better communication of how to reach key services was seen as an important priority for the providers of services.

23.4.2. The OPTIMUM2: Better Bankside Travel Planning Group

OPTIMUM2 was a European Union funded project that had as its primary aim the improvement of the accessibility of busy locations in urban areas, with a focus on the three case study areas of South East London, the City of Edinburgh and the Noord-Holland Province. The Better Bankside elements of the project focused on the redevelopment of London’s South Bank, which involved 272 businesses located in the area paying a compulsory Business Improvement District (BID) Levy of £570,000 per annum to contribute to its physical uplift. Initially travel and transport were not a feature of the BID campaign, but they were identified as a key problem for the area. The project aimed to increase and improve travel options for all those working in, living in and visiting Bankside and to act as a forum for the exchange of ideas on existing travel solutions and the development of workplace travel planning tools.

The Better Bankside Travel Planning Group (BBTPG) was established to identify a number of collaborative projects and to seek funding from Transport for London, the London Borough of Southwark and other bodies to deliver these joint ventures. The main focus of the initiative was on improvements to the walking and cycling environment and on public transport links. The BBTPG achieved regular attendance of 6–12 businesses at 6-weekly meetings, secured ‘in kind’ support for projects (e.g. design work) and was the first travel plan group in London to develop its own Master Travel Plan.
The group was able to initiate a number of new projects including two area-wide travel surveys, a cycle parking audit and subsequently improved cycle parking facilities, a pool bike scheme for Southwark businesses, a series of health travel lunchtime events and walks and an interactive map and web-based travel site. It was also a winner at the Transport for London Sustainable Transport Awards 2007 for ‘Innovation in Promoting Travel Plans to Business’. A follow-up survey of local businesses conducted in May–July 2006 received responses from over 100 businesses (from both larger and smaller employers), with a total of 626 individual responses from employees. It reported a 10% increase in the share of people walking or cycling to work. However, there were some issues regarding the robustness of this finding because of the limited comparability of the survey sample and survey design between the before and after waves of the monitoring.

23.4.3. The Braunstone Bus Project

Unlike the previous two examples, the devising, design and delivery of this project came about through the direct collaborative efforts of researchers working with communities and policy-makers. It evolved from a prior project in the area, which was funded by the Department for Transport in an attempt to improve the social inclusion of local communities in the transport planning process (Lucas, Solomon, & Wofinden, 2002). This earlier study identified that people living on the Braunstone Estate in the urban periphery of Leicester City felt that they had been systematically excluded by the public transport system in their area and thus were unable to reach key destinations such as the new retail park, the two hospitals serving the city and other colleges and schools proximal to their locale.

A follow-up action research project was then used to work with representatives of the local community and the local regeneration partnership to identify a set of public transport routes and operating schedules to link residents to these identified activities. It facilitated a series of exchanges with the local transport authority and two of the larger public transport operators to secure the ‘pump-prime’ funding from national government, via a bidding challenge, to tender these services and to train local residents to operate them as a social enterprise. A process of continuous data collection, funded by the regeneration partnership and undertaken under the supervision of the research participants, was then used to monitor the performance of the service and to feed back suggested improvements to the operators. The information that was gathered was also used to evaluate the service in terms of its contribution to social inclusion outcomes, including local job uptake, reduced school truancy and college attendance. The research was able to demonstrate that patronage had increased beyond the usual levels of uptake that are generally expected from new or improved transport services and demonstrated significant results in terms of indices of social inclusion (Lucas, Tyler, & Christodolou, 2009).
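
The comparability concern noted for the Better Bankside monitoring in Section 23.4.2 applies to before-and-after evidence more generally. As a minimal sketch, and not the method used in any of the three projects, the Python below computes the change in the share of respondents walking or cycling between two survey waves, together with an approximate 95% confidence interval for a difference of two independent proportions. The function name `share_change` and all of the counts are hypothetical and chosen only for illustration; the sketch also assumes that both waves are simple random samples.

```python
import math


def share_change(before_yes, before_n, after_yes, after_n, z=1.96):
    """Return the change in a mode share between two survey waves.

    Gives (difference, (lower, upper)), where the interval is the usual
    normal-approximation interval for two independent proportions.
    Assumes both waves are simple random samples (an assumption made
    here purely for illustration).
    """
    p_before = before_yes / before_n
    p_after = after_yes / after_n
    diff = p_after - p_before
    std_err = math.sqrt(
        p_before * (1 - p_before) / before_n
        + p_after * (1 - p_after) / after_n
    )
    return diff, (diff - z * std_err, diff + z * std_err)


# Hypothetical counts of respondents who walk or cycle to work.
large_waves = share_change(before_yes=150, before_n=500, after_yes=250, after_n=625)
small_waves = share_change(before_yes=15, before_n=50, after_yes=25, after_n=62)

for label, (diff, (low, high)) in [("large", large_waves), ("small", small_waves)]:
    print(f"{label}: change = {diff:+.3f}, 95% CI = ({low:+.3f}, {high:+.3f})")
```

With these invented counts, a shift of about ten percentage points is clearly distinguishable from zero in the larger waves but not in the smaller ones. An interval of this kind reflects sampling error only. It cannot correct for differences in who was sampled or how the questions were asked between waves, which is why consistent sample frames and survey designs matter for the robustness of such findings.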

23.5. The Strengths and Challenges of Action Research: Creating the Reflective Research Practitioner

One of the often-cited key strengths of action research is that it produces outcomes that are useful both for the participants (in that it encourages and supports further courses of local action) and for the researcher (in that its findings are more grounded and robust). Involving communities in the analysis and interpretation of research findings can improve the quality and accuracy of the findings and foster community learning, as communities are more likely to ‘own’ findings if they have helped to interpret them. It can also enable a more reflexive approach for researchers, who can consider how the context and engagement of specific participants might influence some of the data generated. As well as understanding why and how certain transport behaviours are constructed, it is possible for both the participant and the researcher to interactively explore what might change a given behaviour and the dynamic and social consequences of that change. The focus is far more on understanding detailed processes at the individual level than on gross behavioural outcomes, but understanding these processes can nevertheless lead to the development of more effective survey and policy instruments and grounded interpretations of modelled outcomes.


One of the key issues with transport behaviour change programmes in the United Kingdom has been the very localised and micro-nature of such projects and the difficulties, therefore, of capturing their impact on people’s travel behaviours in a meaningful way. While national travel surveys do a very good job of capturing and communicating changes in travel behaviours at the aggregate national level (as is their intended function), they are much less able to demonstrate this at lower levels of geographical specification, and especially at the very micro-level of travel activity. Although some local transport authorities will run their own supplementary surveys to understand local patterns, many communities are left to undertake their own evaluation studies. Action research can help in this respect because local people can be directly engaged in the survey design and data collection process and their local knowledge is often invaluable for identifying a suitable sample population and in the contextual interpretation of results. Local people can be trained to undertake interviews themselves and in some areas have set up their own local survey enterprise companies, often with far greater demonstrable success at securing survey respondents than traditional transport consultancies.

Arguably an even greater contribution of the action research approach is that it encourages the development of the ‘reflective research practitioner’ (Schön, 1983). The researcher is encouraged to enter the field of study without a preconceived notion of what s/he will find and to constantly develop ideas from their field observations and interactions with the community of interest. This is designed to encourage research practitioners to think creatively ‘outside the box’, to critically reflect on their research practices and thus to develop more innovative and productive ways to engage and reengage with the everyday experiences of their participants.
While this might seem an irrelevance to some aspects of travel behaviour research, a learning-in-action approach can be invaluable where the researcher is unacquainted with the social and cultural practices of their community of interest. For example, it is often very difficult for a highly educated research professional to appreciate the barriers to travel or challenges of behaviour change of a car-reliant single mother who lives on a peripheral housing estate several miles from her work, her child’s school and the local shopping centre. Undertaking a programme of action research with her (perhaps getting her to record her own travel experiences and emotional responses to those experiences on a video as she travels and then presenting the recording to local policy-makers) can not only help to empower the participant but also enhance the impact of the research message. On the down side, one commonly expressed critical view is that it is inappropriate for researchers to actively direct the research process and its outcomes in this way and that they should be more impartial if they are to ‘objectively’ report their results. Such critics (e.g., see Danieli & Woodhams, 2005) point to the problems of ‘insider’ bias, inconsistency, contradiction, selectivity and non-replication, as well as the failure of researchers to recognise the influence of important power differences between themselves as the research enquirer and the subjects of their research enquiry. However, it is possible to control for most of these side-effects within a carefully considered study design and tightly monitored programme. For example, Smith, Bratini, Chambers, Jensen, and Romero (2010, p. 423) argue that while researchers might get placed in the ‘expert role’, they must be open about

Qualitative Methods in Transport Research: The ‘Action Research’ Approach

what they bring and how they are perceived, and ‘must approach the Participatory Action Research endeavour as people with knowledge to share who are also sincere learners, and whose knowledge is not automatically privileged over others’. This is reinforced by others (e.g. Stoecker, 2009) who urge academics not to sell their skills and insights short, but to help document and share the processes of the groups they are working with to enable further learning. It is the responsibility of the researcher to make the multiplicity of their roles clear (e.g. Charles, 2011; Rogers, Convery, Simmons, & Weatherall, 2012), a useful tool being a memorandum of agreement which outlines the roles and responsibilities that the researcher brings and what is expected in return from the research participants. It is also possible to refute the accusations of lack of objectivity and reduced analytical rigour that are often levelled at qualitative research with the now well-worn adage that all research includes the in-built biases of its architects but that qualitative research is more open and explicit about these subjectivities. Action research should be no exception to this general rule. Participative data collection exercises should use tape and video recordings, fieldwork diaries, meeting notes and other detailed record-keeping tools. At the analytical stage of the research, it is important to move beyond simple descriptions of what has been observed and recorded to explore deeper underlying trends within the data. This can be approached using an analytical framework that has been developed from existing theories within the literature as a baseline, first grouping the collected observations according to key emergent themes and then cross-examining these themes (the dependent variables) against core attributes of the research participants, such as age, gender and income (the independent variables), in much the same way as a statistical analysis is undertaken.
Where sub-themes and narratives can be seen to emerge, they may be correlated with key episodes, places or events, thus building up a rich contextual explanation of the experiences or behaviours of different groups of participants in the study. The picture or narrative emerges as the researcher stands back from his/her own involvement in the ‘action’ elements of the research and allows the data itself to ‘speak’. However rigorous the analytical process, it remains a truism that the type of data that is produced through the application of an action research approach is difficult to incorporate in any meaningful way within mathematical transport models, although it may help to inform their conceptual design. As Grosvenor (1998) has previously noted, this difficulty may lead the policy-maker to enquire how its results can be usefully employed within their high-level and strategic decision processes and may ultimately lead to the undervaluing of research outcomes. On the other hand, it may be seen as an opportunity for the action researcher to work on projects in close collaboration with quantitative data analysts and modellers to assist in the explanation of some of their more ‘black box’ outputs.
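The thematic cross-examination step described above can be illustrated with a minimal sketch. It assumes that field material has already been coded by hand into (participant, theme) pairs; the participant identifiers, theme labels and attribute names below are hypothetical and are not taken from any study reported in this chapter.

```python
from collections import Counter

# Hypothetical coded observations: one (participant_id, theme) pair per coded excerpt.
OBSERVATIONS = [
    ("p1", "cost"), ("p1", "safety"), ("p2", "cost"),
    ("p3", "reliability"), ("p3", "cost"), ("p4", "safety"),
]

# Hypothetical participant attributes (the 'independent variables').
PARTICIPANTS = {
    "p1": {"gender": "F", "age_band": "25-44"},
    "p2": {"gender": "M", "age_band": "45-64"},
    "p3": {"gender": "F", "age_band": "45-64"},
    "p4": {"gender": "M", "age_band": "25-44"},
}


def cross_tabulate(observations, participants, attribute):
    """Count coded excerpts for each (theme, attribute value) pair."""
    counts = Counter()
    for participant_id, theme in observations:
        value = participants[participant_id][attribute]
        counts[(theme, value)] += 1
    return counts


def print_table(counts):
    """Print themes as rows and attribute values as columns."""
    themes = sorted({theme for theme, _ in counts})
    values = sorted({value for _, value in counts})
    print("theme".ljust(14) + "".join(v.ljust(8) for v in values))
    for theme in themes:
        row = "".join(str(counts[(theme, v)]).ljust(8) for v in values)
        print(theme.ljust(14) + row)


if __name__ == "__main__":
    print_table(cross_tabulate(OBSERVATIONS, PARTICIPANTS, "gender"))
```

A table of this kind only indicates where themes cluster among groups of participants; the interpretation of why they cluster still rests on the recordings, diaries and notes from which the codes were drawn.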

23.6. Conclusion

The qualitative method of action research has been widely applied in the social sciences to engage local citizens in the transformation of their attitudes, behaviours,

patterns of activity and social norms. This paper has explored whether it is a methodology which could usefully be adopted by transport studies, particularly within the context of the travel behaviour change agenda. The paper identifies that monitoring the outcomes of micro-level transport behaviour change programmes often presents a significant challenge for more traditional travel survey methods. In these instances, action research can be particularly useful for developing more collaborative data collection methods with research participants, thus enabling us to capture their underlying motivations, intentions, perceptions and negotiations, as well as the micro-level impacts of smaller scale transport initiatives. Iterative cycles of planning, action and reflection, and the sharing of emergent findings with research participants and their wider stakeholder networks, can be helpful for organisational learning, for aiding reflexivity and testing assumptions and biases, and for helping us to collaboratively develop new resources and strengthen the existing skills and capacities of both the researcher and community participants. These outputs can provide theorists and practitioners with a rich contextual understanding of how travel behaviours are constructed by individuals, which can in turn serve to improve our theoretical framings, data collection tools, models and policy instruments. While it may be difficult to incorporate less tangible factors within mathematical models, these understandings can help transport decision-makers considerably in the interpretation of modelled outcomes. Action research can also be a useful tool for empowering communities to participate in transport decision-making, infrastructure design and transport planning processes. This could help to make schemes more sensitive and reactive to local needs and concerns and plans more transparent and publicly accountable.
It can also be used in conflict resolution between actors with different interests, offering the potential to explore the circumstances of conflicting views and helping to identify pathways to consensus.

References

Charles, L. (2011). Animating community supported agriculture in North East England: Striving for a ‘caring practice’. Journal of Rural Studies, 27, 362–371. http://dx.doi.org/10.1016/j.jrurstud.2011.06.001
Cornwall, A. (2001). Towards participatory practice: Participatory rural appraisal and the participatory process. In K. de Koning & M. Martin (Eds.), Participatory research in health: Issues and experiences. London: Zed Books.
Danieli, A., & Woodhams, C. (2005). Emancipatory research methodology and disability: A critique. International Journal of Social Research Methodology: Theory and Practice, 8(4), 281–296.
Egmose, J. (2011). Towards science for democratic sustainable development: Social learning through upstream engagement. PhD thesis, Roskilde University, Denmark.
Grosvenor, T. (1998). Qualitative research in the transport sector. Retrieved from http://onlinepubs.trb.org/onlinepubs/circulars/ec008/workshop_k.pdf
Harris, M., Hutchinson, R., & Cairns, B. (2005). Community-wide planning for faith-based service provision: Practical, policy and conceptual challenges. Non-profit and Voluntary Sector Quarterly, 34(1), 89–109. http://dx.doi.org/10.1177/0899764004269305
Khisty, C. J., & Arslan, T. (2005). Possibilities of steering the transportation planning process in the face of bounded rationality and unbounded uncertainty. Transportation Research Part C: Emerging Technologies, 13, 77–92. http://dx.doi.org/10.1016/j.trc.2005.04.003
Lucas, K. (Ed.). (2004). Running on empty: Transport, social exclusion and environmental justice. Bristol, UK: The Policy Press.
Lucas, K., Solomon, J., & Wofinden, D. (2002). Factoring social exclusion into local transport planning. Report to Department for Transport Mobility and Inclusion Unit.
Lucas, K., Tyler, S., & Christodolou, G. (2009). Assessing the ‘value’ of new transport initiatives in deprived neighbourhoods in the UK. Transport Policy, 16(3), 115–122.
Porter, G., Hampshire, K., Abane, A., Robson, E., & Munthali, A. (2010). Moving young lives: Mobility, immobility and inter-generational tensions in urban Africa. Geoforum, 41, 796–804. http://dx.doi.org/10.1016/j.geoforum.2010.05.001
Reason, P., & Bradbury, H. (2001). Participative inquiry and practice. London, UK: Sage.
Rogers, J. C., Convery, I., Simmons, E., & Weatherall, A. (2012). What does a friendly outsider do? Critical reflection on finding a role as an action researcher with communities developing renewable energy projects. Educational Action Research, 20(2), 201–218. doi:10.1080/09650792.2012.676286
Schon, D. A. (1983). The reflective practitioner: How professionals think in action. London: Temple Smith.
Seyfang, G., & Smith, A. (2007). Grassroots innovations for sustainable development: Towards a new research and policy agenda. Environmental Politics, 16(4), 584–603. http://dx.doi.org/10.1080/09644010701419121
Smith, L., Bratini, L., Chambers, D. A., Jensen, R. V., & Romero, L. (2010). Between idealism and reality: Meeting the challenges of participatory action research. Action Research, 8, 407–425. http://dx.doi.org/10.1177/1476750310366043
Stoecker, R. (2009). Are we talking the walk of community-based research? Action Research, 7(4), 385–404. http://dx.doi.org/10.1177/1476750309340944
Todhunter, C. (2001). Undertaking action research: Negotiating the road ahead, Social Research Update. Guildford: University of Surrey. Retrieved from http://sru.soc.surrey.ac.uk/SRU34.html
Touraine, A., Dubet, F., & Wieviorka, M. (1982). Une intervention sociologique avec Solidarnosc. Sociologie du travail, 24(3), 279–292. In J. Hamel (1997). Sociology, common sense, and qualitative methodology: The Position of Pierre Bourdieu and Alain Touraine. Canadian Journal of Sociology, 22, 95–112.

Chapter 24

WORKSHOP SYNTHESIS: COLLECTING QUALITATIVE AND QUANTITATIVE DATA ON THE SOCIAL CONTEXT OF TRAVEL BEHAVIOUR

Kelly J. Clifton

24.1. Purpose and Introduction

Recent interest in understanding the social context of travel behaviour has been framed by an array of transport-related questions. While neglected in the past, these questions are associated with emerging policy concerns, such as inter-generational and transport-related social exclusion, as well as ways to promote pro-environment travel behaviour. In this context, key research questions include the direct role of interpersonal interactions in transport-related decisions, such as leisure travel, residential location and auto ownership, as well as the relevance of social influence, cohesion and trust to travel decisions, such as mode choice. The scope of these questions requires innovative data collection methods that could incorporate and adapt a diversity of methods from the social sciences, both qualitative and quantitative. This workshop reviewed recent practical experiences and explored opportunities for applying methods from the social sciences and other related fields as we seek to capture the inherent complexity of the role of the social context in travel behaviour. The inclusion of this workshop in this conference represents a continuation of a theme at the ISCTSC, and more broadly in the transportation research arena, to better understand the behavioural processes and social circumstances that produce activity and travel outcomes. The reasons for this need are many, but essentially the transportation community recognizes that in order to understand more about transportation-related behaviours, we need to know more about the underlying mechanisms that shape them. To this end, this workshop focused on the influence of social context on our travel choices and how we may better incorporate and represent social context in research. The workshop participants represented a variety

Transport Survey Methods: Best Practice for Decision Making Copyright © 2013 by Emerald Group Publishing Limited All rights of reproduction in any form reserved ISBN: 978-1-78190-287-5

of countries and disciplines, bringing a wealth of experience and context to this discussion.1

24.2. Workshop Papers

The workshop began with presentations of the three workshop papers that were commissioned to help frame these issues and initiate our discussion. They can be found in their entirety following this workshop report:
• ‘Affective personal networks versus daily contacts: Analysing different name generators in an activity-travel behaviour context’ by Juan Antonio Carrasco, Cristian Bustos and Beatriz Cid-Aguayo;
• ‘Surveying data on connected personal networks’ by Matthias Kowald and Kay W. Axhausen;
• ‘Qualitative methods in transport research: Is ‘action research’ a methodology too far?’ by Karen Lucas.

The first two of these papers dealt with social networks, specifically focusing on approaches to systematically collecting information about personal networks through surveys. The first (Carrasco, Bustos, & Cid-Aguayo, 2011) examined four methods for generating information on personal networks and compared them in terms of the size of the network, the spatial characteristics and the frequency of interactions. The second paper (Kowald & Axhausen, 2011) presented methods to gather information about personal social networks and also to understand more of the larger global structure in which these personal networks reside. Both of these papers aim to understand social context in terms of individual relationships and their characteristics, including the nature of these relations (friends, family, professional, etc.), their strength, interconnectedness, and temporal and spatial extent. The discussion of these two papers by the workshop participants that followed these presentations acknowledged a need to understand the relationships between social networks and physical transportation networks and systems. The papers make a contribution in their presentation of how these networks can be captured in travel survey methods and represented in our research.
The last paper (Lucas, 2011) changed course to introduce the method of action research, a qualitative approach to research engagement that aims for behavioural change and requires close interactions and involvement between the researcher and the community or individuals of interest. Here, the focus is less on a systematic travel survey method and more on ways in which a researcher can engage with a community

1. Participants: Bianca Alves (Brazil); Sebastian Astroza (Chile); Juan Antonio Carrasco (Chile); Hernán Carvajal Cortés (France); Kelly Clifton (USA) (Chair); Yanina Girogis (Argentina); Susan Handy (USA) (Rapporteur); Borris Jäggi (Switzerland); Matthias Kowald (Germany); Scott Le Vine (UK); Karen Lucas (UK); Herrie Schalekamp (South Africa); Elaine Schneider de Carvalho (Brazil) and Marc Weiner (USA).

Workshop Synthesis: The Social Context of Travel Behavior

to gain a better grasp of the context of transport and other social problems and how various solutions may be crafted in response. The paper discussed the contributions of this line of inquiry to understanding context and offered a critique of when and how such methods are most appropriate and offer gains over traditional survey methods. While these three papers differed in their aims, methodological approaches and concerns, all three were interested in placing the social context of travel at the forefront of their inquiry. Given the theme of the conference, the papers provided some grounding from which to launch a larger discussion about what social context means and how we can better understand it through our research and information gathering methods. What follows is a brief synopsis of those workshop discussions, highlighting the key points and challenges that arose.

24.3. What is Social Context? Definitions and Bounds

A significant portion of the ensuing discussion was devoted to putting more definition to the term social context — the inter-related set of conditions in which someone, something or some event exists or occurs. It includes the sociodemographic characteristics of individuals and households that have been traditionally included in studies of travel behaviour. Social context, however, extends beyond gender, race, ethnicity, class, income and country of origin to include all of the people, institutions, culture, beliefs, meaning, values, influences, relationships and processes that inform our choices and motivate our actions. Social context should not, however, be considered a collection of static features residing in the background of our lives; rather, it represents those dynamic processes and conditions that are an integral part of our behaviour. The socio-ecological model (Bronfenbrenner, 1979) has been used to help explain the many influences on a variety of behaviours and offers a conceptual framework that may be useful in thinking about social context and how it shapes travel behaviours. The socio-ecological model explains context as a series of nested systems of influences and is illustrated in Figure 24.1. Here, the various realms of behavioural influence include individual level characteristics, including the obvious sociodemographic dimensions but also psychological factors, such as knowledge, beliefs, identity, and perceptions of oneself and others. These influences extend beyond the individual to the next level to include interpersonal relationships, such as social roles, friendships and other social networks, power and influence. Social capital and supports are the resources that one has access to through one’s social networks. Social context also has an institutional component.
It is not just individuals and their households but also firms, governments and other organizations, both formal and informal, that shape our opportunities, activities, and schedules. The communities with which we associate shape social norms and values and reflect our culture and collective identity. At the broadest level, there is a policy context that exerts influence over our cities and neighbourhoods, impacting the provision of infrastructure and services, the distribution and redistribution of wealth and resources, and the ability to access and engage in the world around us. The various levels of social context represented in this diagram have both spatial and temporal dimensions and there is interaction between them. While this model helps to conceptualize this complex and dynamic system, it does not fully capture the reciprocal relationships and interactions between levels, nor the spatial and temporal aspects. Nonetheless, it helps to convey the importance of social context and places it within the realm of individual decision processes, choices and behaviours.

Figure 24.1: Socio-ecological model of influences of context on behaviour.

24.4. Why Do We Care about Social Context?

The field of transportation is moving away from merely aiming to predict behavioural outcomes. Now, we desire to understand more about the behavioural process, including the motivations for and influences on decisions. The applications of this knowledge are many. We need a better grasp of social context in order to fully represent the decision and behavioural processes in our activity and transportation models (Pendyala & Bricka, 2006), which have become increasingly sophisticated in their ability to capture complexity. But knowledge about the role of social context in activities and travel outcomes and, conversely, the influence of activities and travel on social context can contribute to more than improvements to modelling and analytical tools. There is growing awareness that transportation policies should focus on changing individual level behaviours to promote those actions that have more positive social outcomes (Ampt, 2003). These desired behaviours include travelling by more

sustainable modes of transportation, consuming less fossil fuel, adopting technologies that increase the efficiency of travel, and making long-term choices, such as where to live, with travel costs in mind. The adoption of new technologies and policies that promote more environmentally conscious choices (Shaheen, 2004) and the process of adaptation to change and social learning (Arentze & Timmermans, 2004) hold increasing interest. Coincident with the social network is a flow of information and influence (Sunitiyoso, Avineri, & Chatterjee, 2010), which is critically linked to the choices made about activities and travel. In this arena, the transportation field has much to learn from the long-standing success of the medical and public health fields with respect to smoking cessation programs, addiction management, and healthy eating and exercise (Sallis, Owen, & Fotheringham, 2000). The provision of transportation services and infrastructure has many equity and social justice issues to tackle as the social issues of race, gender and poverty persist (Lucas, 2004). Transportation research has infrequently examined the issues of immobility, instead focusing overwhelmingly on its counterpart — mobility (Madre, Axhausen, & Brog, 2007). Social context reveals much about the resources available to individuals and households, variations in strategies and choices across groups and locations, and the consequences for the transport disadvantaged, as qualitative studies have shown (Clifton, 2004). But social context can play an additional role in helping to guide education and empowerment programs aimed at helping these disenfranchised groups to better advocate for themselves and access political and governmental institutions on their own behalf (Schlossberg & Brehm, 2009). Increasingly, the field of transportation planning is moving away from a singular forecast of the future. Instead, the acknowledgement that uncertainty and risk surround any future-oriented effort has led to the embrace of scenario planning (Bartholomew, 2007). Here, a range of potential futures is considered and analysis focuses on the sensitivity of transportation and other outcomes to these various contexts. In these cases, the act of envisioning, imagining or speculating about what the future may hold is in some ways thinking about how the conditions of context, social and otherwise, will change.

24.5. How Can We Get Better Information about Social Context? Methods, Opportunities and New Technologies

Because of the complexity surrounding social context, its various aspects are difficult to measure, account for and observe. The challenge in acquiring more information about social context is that it largely (or entirely) encompasses the non-physical realm. Aspects such as beliefs, perceptions, values, motivations and desires are difficult (if not impossible) to observe directly, although we can and often do make assumptions based upon our observations. It is challenging to gauge these psychological and social aspects through traditional lines of questioning, such as surveys, particularly when the researcher knows little about the population of interest. This reaffirms

qualitative methods as an important and independent mode of inquiry to understand social context. This call for more qualitative research in transport is not new (Clifton & Handy, 2003; Grosvenor, 2000). There is increasing recognition that qualitative methods are a necessary approach to more fully understand the context of travel and its implications, and as a result the integration of qualitative methods among the approaches that transportation researchers use is growing. This increased acceptance is welcome; however, with the more widespread use of qualitative methods, particularly interviews and focus groups, there is some concern that uninformed and novice applications of qualitative methods may not be approached with the same rigour and standards as quantitative approaches. The group expressed concern that there is often a fundamental misunderstanding about the nature of qualitative investigations and the degree of expertise needed for these undertakings. With this, the challenge of properly educating and training the transportation researcher in the design and application of qualitative methods arises (Chen, 2011; Clifton, 2011). The continued separation of disciplines provides little interaction between those engaged in quantitative approaches and those well versed in qualitative methods. But there is much these silos can learn from each other, and the state of the art in survey methods is moving towards mixed and hybrid methods of data collection and analysis (Creswell, 2009; Deutsch & Goulias, 2012). The field is embracing a much closer integration of qualitative and quantitative approaches, and the use of studies combining these approaches will likely accelerate in the near future. As the two workshop resource papers dealing with social networks demonstrated, new survey methodologies can be used to collect quantitative and qualitative information about the extent and qualities of social relationships, and this information can be represented quantitatively or graphically. Social media also provide a new opportunity to understand more about an individual’s social network, self-representation, community and quite possibly activities and travel. The use of social networking media, such as Facebook and Twitter, is changing rapidly with widespread adoption, and the potential applications for travel surveys are emerging (Augustine, 2012). These media, combined with smartphone technologies, still have many challenges to overcome before they can be fully employed in travel surveys. Yet the potential exists to better integrate the information gleaned from these sources into our data collection approaches to better represent social context.

24.6. Conclusions

The ongoing workshop discussion raised more questions about the definition of social context, our motivations for including it in our transport research and the best approaches to collect information about it. Below are a few key points concluded at the end of our session:
• Methodological approaches should be guided by the research objectives;
• The way we ask questions matters and our research should be sensitive to context, even if it is not central to the research objectives;
• Social context information can be included in studies using traditional quantitative methods;
• Mixed method approaches offer the most potential; and
• We have more work to do to define what is needed to measure social context and how to go about it.

The struggles of this group reflect the ongoing dialogue in the field of travel behaviour and elsewhere concerning issues of interdisciplinary engagement, mixed methods of inquiry, social equity and social-psychological factors. At the end, the group concluded that social context helps us to tell more complete, simple and intuitive stories about how transportation interacts with our daily life.

References

Ampt, E. (2003). Voluntary household travel behaviour change: Theory and practice. Paper presented at the 10th International Conference on Travel Behavior Research, Lucerne.
Arentze, T. A., & Timmermans, H. J. P. (2004). A learning-based transportation oriented simulation system. Transportation Research Part B: Methodological, 38(7), 613–633. doi:10.1016/j.trb.2002.10.001
Augustine, C. (2012, January 22–26). Using and analyzing data from Twitter, Facebook, Text Messaging, and other sources. Paper presented at the workshop Incorporating Social Media into Transportation Surveys at 91st Annual Meeting of the Transportation Research Board, Washington, DC.
Bartholomew, K. (2007). Land use-transportation scenario planning: Promise and reality. Transportation, 34(4), 397–412. doi:10.1007/s11116-006-9108-2
Bronfenbrenner, U. (1979). The ecology of human development. Cambridge, MA: Harvard University Press.
Carrasco, J. A., Bustos, C., & Cid-Aguayo, B. (2011, November 14–18). Affective personal networks versus daily contacts: Analysing different name generators in an activity-travel behaviour context. Paper presented at 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile.
Chen, C. (2011, January 23–27). Which method? Which question! Paper presented at the workshop Qualitative Research Methods in Transportation: New Approach to New Challenges at the 90th Annual Meeting of the Transportation Research Board, Washington, DC.
Clifton, K. J. (2004). Mobility strategies and food shopping for low-income families: A case study. Journal of Planning Education and Research, 23(4), 402–413. doi:10.1177/0739456X04264919
Clifton, K. J. (2011, January 23–27). Principles of qualitative research. Paper presented at the workshop Qualitative Research Methods in Transportation: New Approach to New Challenges at the 90th Annual Meeting of the Transportation Research Board, Washington, DC.
Clifton, K. J., & Handy, S. L. (2003). Qualitative methods in travel behaviour research. In P. R. Stopher & P. Jones (Eds.), Transport survey quality and innovation (Chapter 16). Bingley, UK: Emerald Group Publishing Ltd.
Creswell, J. W. (2009). Research design: Qualitative, quantitative, and mixed methods approaches (3rd ed.). Thousand Oaks, CA: Sage Publications.
Deutsch, K. E., & Goulias, K. G. (2012, January 22–26). Understanding place using mixed-method approach. Paper presented at the 91st Annual Meeting of the Transportation Research Board, Washington, DC.
Grosvenor, T. (2000). Qualitative research in the transport sector. Resource paper for the Workshop on Qualitative/Quantitative Methods, proceedings of an International Conference on Transport Survey Quality and Innovation, Grainau, Germany, May 24–30, 1997.
Kowald, M., & Axhausen, K. W. (2011, November 14–18). Surveying data on connected personal networks. Paper presented at 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile.
Lucas, K. (2004). Running on empty: Transport social exclusion and environmental justice. Bristol, UK: Policy Press.
Lucas, K. (2011, November 14–18). Qualitative methods in transport research: Is ‘action research’ a methodology too far. Paper presented at 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile.
Madre, J. L., Axhausen, K. W., & Brog, W. (2007). Immobility in travel surveys. Transportation, 34, 107–128. doi:10.1007/s11116-006-9105-5
Pendyala, R. M., & Bricka, S. (2006). Defining and collecting behavioral process data for travel analysis: Challenges and issues (invited). In P. R. Stopher & C. Stecher (Eds.), Travel survey methods: Quality and future directions (pp. 511–530). Oxford, UK: Elsevier.
Sallis, J. F., Owen, N., & Fotheringham, M. J. (2000). Behavioral epidemiology: A systematic framework to classify phases of research on health promotion and disease prevention. Annals of Behavioral Medicine, 22(4), 294–298. doi:10.1007/BF02895665
Schlossberg, M. A., & Brehm, C. (2009). Participatory GIS and active transportation: Collecting data and creating change. Transportation Research Record: Journal of the Transportation Research Board, 2105, 83–91.
Shaheen, S. A. (2004). Dynamics in behavioral adaptation to a transportation innovation: A case study of Carlink — A smart carsharing system. PhD Thesis Report, Institute of Transportation Studies, UC Davis (UCD). Retrieved from http://escholarship.org/uc/item/87n6958h
Sunitiyoso, Y., Avineri, E., & Chatterjee, K. (2010). Complexity and travel behaviour: A multi-agent simulation for investigating the influence of social aspects on travellers’ compliance with a demand management measure. In E. Silva & G. de Roo (Eds.), A planners encounter with complexity: New directions in planning theory (pp. 209–226). Aldershot: Ashgate.

PART III FOCUS ON NEW METHODS AND DATA SOURCES: THEMES 6 TO 8

THEME 6 NEW CHALLENGES IN DEALING WITH TIME: ENVIRONMENTAL PEAKS AND PLANNING HORIZONS

Chapter 25

Empirically Constrained Efficiency in a Strategic-Tactical Stated Choice Survey of the Usage Patterns of Emerging Carsharing Services

Scott Le Vine, Aruna Sivakumar, Martin Lee-Gosselin and John Polak

Abstract

Purpose: The principal hypothesis of this program of research is that people’s choices of which resources to own are a function of expected travel needs.

Methodology/approach: This chapter reports recent research using a stated-choice survey design that is innovative in two respects. First, respondents are asked to consider two types of choice that have different time horizons but are thought to be linked in a strategic-tactical structure. The two types of choice are (a) purchasing ‘mobility resources’, which include commitments such as car ownership and subscription to carsharing services, and (b) choosing a mode of transport for a particular instance of travel. The second methodological innovation is that respondents indicate their choices in the context of giving advice to a demographically similar ‘avatar’. The development of a technique for ‘empirically constrained’ efficient design is discussed, as is its application to this survey. The objective is to provide survey designs with a high degree of statistical efficiency whilst maintaining plausibility in the combination of attribute levels. Field data from an empirical application (n = 72) were collected and analysed.

Findings: The proposed method for efficient design proved successful. The main substantive findings from the empirical application are presented, along with detailed results relating to how different demographic classes of respondents engaged with the instrument. For instance, living with one’s partner and living with no children at home were associated with high scores on a scale of similarity between the experimental choice context and one’s real-world mobility choices.

Research limitations/implications: The proposed techniques appear promising, though the empirical results must be viewed as indicative only due to the size and coverage of the field data sample.

Keywords: Efficient design; stated preference; multi-horizon choice

Transport Survey Methods: Best Practice for Decision Making
Copyright © 2013 by Emerald Group Publishing Limited
All rights of reproduction in any form reserved
ISBN: 978-1-78190-287-5

25.1. Introduction

This chapter presents an innovative stated-choice survey that looks at two linked choices: a generalization of car ownership to include other types of ‘mobility resources’ on the one hand, and the use of transport modes on the other. It is argued that these two choices exhibit a particular form of multi-dimensionality, where the former can be thought of as a strategic level of choice-making and the latter a tactical one.

The motivation for this research is to understand how people choose whether or not to use carsharing services. It is hypothesized that people choose which mobility resources to own on the basis of how those resources would perform for their expected travel needs. This is the second in a series of articles; the first (Le Vine et al., 2011) discussed qualitative research using Gaming–Simulation techniques that explored how people view the prospect of using a carsharing service, which led to the major design decisions of the Advanced Vehicular/Activity/Travel And Resource (AVATAR) survey.

This chapter has two objectives. The first is to present in detail the development and application of a hybrid method of efficient survey design (which we term ‘empirically constrained efficiency’) that balances between the principles of statistical optimality and plausibility in the combination of attribute levels that are presented to respondents. The second is to present the main results from the field data collected with the AVATAR survey, in particular the insights they yield into the use of this and similar complex survey methods.

The rest of this chapter is structured as follows. Section 25.2 presents an overview of the theory of design optimality as it relates to stated-choice survey methods. Section 25.3 describes the AVATAR survey design, and Section 25.4 describes the design decisions regarding empirically constrained efficiency. Section 25.5 presents results from the empirical application, and Section 25.6 concludes this chapter.

25.2. Background on Design Optimality

Optimal stated-choice (SC) survey designs (as opposed to orthogonal designs) attempt to maximize the amount of useful information that can be obtained from the relatively expensive collection of SC data. In essence, the researcher attempts to structure the trade-offs that characterize each choice situation such that the respondent’s choices yield as much statistically relevant information as possible. By way of contrast, the inclusion of dominant (and hence dominated) alternatives, or other poorly structured alternatives, within an SC choice set provides little information regarding the choice-maker’s preferences.

Using an efficient design allows the researcher either to maximize the efficiency of the parameter estimates from a given sample size, or to identify the predicted minimum required sample size (PMRSS) needed to achieve a target level of efficiency in the estimates. The achieved level of efficiency will, however, depend on how well the priors represent the ‘true’ parameter values, which are not knowable a priori. The term ‘efficient’ is more general than ‘optimal’ in the case of survey design, as in many cases the researcher does not exhaustively sample all possible designs.

In preparing an efficient design, the researcher must start with a set of fully specified utility functions that are believed to describe the behaviour under study and an informed guess as to the value of each of the parameters (the priors). In many applications the priors take the form of point values, though they can equally be specified as distributions with some given parameters if desired (Bliemer, Rose, & Hess, 2008). Other aspects of the design that the researcher must specify include the set of different levels for each design attribute, the choice set of alternatives amongst which respondents choose, and the number of replications per respondent. Once the researcher has used this information to prepare a candidate design (i.e. a full set of attribute levels for each replication of the SC task), that design maps uniquely onto an asymptotic parameter variance–co-variance (AVC) matrix (Rose and Bliemer, 2005).

The researcher then proceeds to randomly generate a large number of designs, each of which is a combination of independently drawn attribute levels for each alternative in each replication, and to compare the relative statistical efficiency of the candidates. Various metrics exist to quantify the relative ‘efficiency’ of a candidate design as encapsulated in the AVC matrix. Minimizing the determinant of the AVC matrix of the parameter estimates (D-efficiency) appears to be the most widely used efficiency criterion, though the literature discusses others, including (see Huber & Zwerina, 1996; Ibanez, Toner, & Daly, 2007; Kessels, Goos, & Vandebroek, 2006):

• Minimizing the trace of the AVC matrix (A-efficiency);
• Minimizing the largest variance, i.e. the largest value on the diagonal of the AVC matrix (S-efficiency);
• Obtaining rough ‘utility balance’ between the alternatives (B-efficiency).

Several properties of the matrix determinant appear to have contributed to the wide use of D-efficiency as an efficiency criterion. It accounts for the co-variances of the AVC matrix (the off-diagonal elements), unlike A-efficiency, which takes account of only the diagonal elements. B-efficient designs, which present respondents with choice sets of several nearly equally attractive alternatives (i.e. where the choice probabilities based on the priors are roughly equal), do not in general maximize the identifiability of parameters; in the case of binary choice, identifiability is greatest at choice probabilities of roughly 70%/30% rather than near the 50%/50% point representing perfect (binary) utility balance (Kanninen, 2002).
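The mapping from a candidate design to its AVC matrix, and from there to the scalar criteria listed above, can be sketched as follows. This is a minimal illustration that assumes a plain multinomial logit model with generic parameters; the function names, the four-replication binary design and the prior values are all hypothetical and are not taken from the AVATAR specification.

```python
import numpy as np

def mnl_probabilities(X, beta):
    """Logit choice probabilities for one choice set (rows of X are alternatives)."""
    utility = X @ beta
    expu = np.exp(utility - utility.max())  # subtract the max for numerical stability
    return expu / expu.sum()

def avc_matrix(design, beta, n_respondents=1):
    """AVC matrix implied by a design and priors: inverse of the Fisher information."""
    info = np.zeros((len(beta), len(beta)))
    for X in design:  # one (alternatives x parameters) array per choice set
        p = mnl_probabilities(X, beta)
        info += X.T @ (np.diag(p) - np.outer(p, p)) @ X
    return np.linalg.inv(n_respondents * info)

def efficiency_measures(avc, beta):
    """Scalar summaries of an AVC matrix under the assumed priors."""
    k = len(beta)
    std_err = np.sqrt(np.diag(avc))
    return {
        "d_error": np.linalg.det(avc) ** (1.0 / k),   # D-efficiency criterion
        "a_error": np.trace(avc) / k,                 # A-efficiency criterion
        "largest_variance": np.diag(avc).max(),       # largest diagonal element
        "smallest_abs_t": np.min(np.abs(beta) / std_err),
    }

# Hypothetical priors for (travel time in minutes, cost in pounds)
priors = np.array([-0.05, -0.30])

# Hypothetical design: 4 choice sets x 2 alternatives x 2 attributes
design = np.array([
    [[10, 4.0], [30, 1.0]],
    [[20, 2.0], [45, 1.0]],
    [[15, 4.0], [20, 2.0]],
    [[60, 1.0], [30, 4.0]],
])

avc_one = avc_matrix(design, priors)          # a single archetypal respondent
measures = efficiency_measures(avc_one, priors)
print(measures)
```

Because the information matrix is additive over respondents, the AVC matrix for N identical respondents is simply the single-respondent matrix divided by N, which is what makes a predicted minimum sample size computable from the design alone.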

25.3. Overview of AVATAR Survey

The AVATAR survey was designed to provide empirical data to test a proposed form of discrete choice model that is strategic-tactical in structure. The specification operationalizes the hypothesis that people make ‘strategic’ choices relating to mobility in part on the basis of how those strategic choices would constrain or enable their travel in expected future ‘tactical’ situations. The ultimate intent was to develop techniques to predict the size of the market for, and the impacts of, carsharing. Further details of the model specification can be found in Le Vine (2011).

Carsharing organizations (known as ‘car clubs’ in British English) offer a type of short-term car rental where subscribers are able to reserve and access any of a fleet of cars that are parked in distributed parking spaces, rather than at airports and storefronts as with traditional car rental.

The choice task facing a respondent in the AVATAR survey is to indicate which method of transport to use to access each of a set of activities and which, if any, mobility resources to acquire. A traditional stated-choice survey designed around the choice of travel mode for a single archetypal journey was considered, but rejected due to the nature of the carsharing context, particularly potential users’ perceptions of the radical re-structuring of costs associated with car use. In addition, carsharing presents complex qualitative differences vis-à-vis personal car ownership. Use of a carsharing vehicle generally requires a higher degree of advance planning, including timing and duration of the use episode, and at peak times subscribers may find that nearby cars are fully booked.

In the example shown in Figure 25.1, the respondent has chosen to use public transport to access her work and to drive a personal car to the rest of her activities. She has also indicated that she would purchase both a car and a public transport season ticket. Had she not chosen to purchase a car, the options to drive to/from each of the activities would be shown as unavailable; purchasing a season ticket allows her to avoid the pay-per-use costs of public transport.

Figure 25.1: Screen capture of the stated-choice game board.

As can be seen in Figure 25.1, the respondent is asked to declare ‘I would do this if I were Jane [Joe]’ to end a replication of the SC game rather than a more straightforward statement of intent such as ‘I would do this’. The respondent is told that they are advising an avatar (a virtual character to whom the respondent is ‘introduced’ in the early stages of the survey) rather than taking the typical dual role of the respondent as both the choice-making agent and the person who would face (were the situation not hypothetical) the consequences of their stated choices. It was decided to employ the avatar device in order to allow respondents to take part in the survey with only a single interview of reasonable duration; alternative methods involving pivoting the SC design around respondents’ individual multi-day activity-travel diaries were considered but rejected as the level of effort per respondent was inconsistent with the desired sample size and available resources. Readers are referred to Le Vine et al. (2011) for a detailed discussion of the design decisions associated with the avatar methodology.

The generic guidance, in instances of ‘choice-making-by-proxy’, to maximize the congruence with ‘choices-for-oneself’ is to design the proxy (avatar) to be as vivid and concrete as possible to the respondent. Each respondent’s avatar was specified to have the same observable sociodemographics as him/her amongst the following characteristics:

• gender (two categories),
• age band (three),
• domiciling with/without one’s partner (two),
• presence of children in household (two),
• employment status (two: employed and not employed),
• location (two: Inner and Outer London).

Thus each respondent (and their avatar) is classed into one of 96 sociodemographic categories.

Each interview consisted of four SC replications, presented to the respondent as different neighbourhoods to which their avatar is considering moving. The game board for the first two replications was as shown in Figure 25.1. A round-trip carsharing service was available in all rounds of an interview; after the second replication the respondent was introduced to a prospective ‘one-way’ carsharing service (cf. www.car2go.com; www.drive-now.com), which was added as an option in the latter two replications.
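The count of 96 classes follows directly from the six characteristics listed above (2 × 3 × 2 × 2 × 2 × 2). A small enumeration makes this concrete; the attribute keys and the age-band labels below are placeholders chosen for illustration, as the chapter does not give the actual band boundaries.

```python
from itertools import product

# Category counts follow the chapter; the labels themselves are hypothetical.
attributes = {
    "gender": ["female", "male"],
    "age_band": ["band_1", "band_2", "band_3"],
    "lives_with_partner": [True, False],
    "children_in_household": [True, False],
    "employed": [True, False],
    "location": ["Inner London", "Outer London"],
}

# One dictionary per sociodemographic class (the Cartesian product of all levels)
classes = [dict(zip(attributes, combo)) for combo in product(*attributes.values())]
print(len(classes))
```

Each element of `classes` corresponds to one avatar profile, and hence to one of the class-specific survey designs discussed later in the chapter.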

Due to the complexity of the stated-choice task, each respondent performed a practice round of the game before the main SC exercise. An interviewer was present at each interview; in the directed practice the interviewer verified (using a checklist) that the respondent understood the various functions of the game board. If the interviewer was not satisfied, the respondent was guided to repeat the practice round.

In addition to the respondent being introduced to an avatar that is somewhat similar to them demographically, the five activities presented to them are the five most frequently undertaken activities amongst people in their demographic category, as observed in Great Britain’s 2004/2005 National Travel Survey (NTS) (Abeywardana, Christophersen, & Tipping, 2006). This alignment of the avatar and his/her activities with the respondent’s demographics is analogous to traditional ‘pivoting’ in SC surveys, with the difference that this design is less personalized; it is tailored to archetypal observed behaviour of the respondent’s demographic class rather than (as in a mode choice context) to recent real-world behaviour of the respondent. The generic activity descriptions from the NTS (e.g. personal business, recreation) were mapped onto descriptions of specific activities for presentation to respondents (e.g. visit hairdresser, visit leisure centre).

The travel to access each activity was explicitly presented as two-journey tours to/from each out-of-home activity, rather than more complex multi-stop tours, in the interests of minimizing the complexity of a design which of necessity asked the respondent to consider many distinct pieces of information simultaneously. Displayed itinerary times for performing each of these journeys by walking, cycling, car travel and public transport were generated from online travel planning services.

Most activities were presented as occurring once per week, though work or school activities were presented as occurring once per weekday (five times per week). The decision to specify five activities, and to present this relatively small set of activities as the avatar’s ‘typical week’, was made on the basis of a balance between the high information load placed on the respondent (which would suggest presenting a small number of activities) and the desire to specify a profile of distinct activities that represents a person’s overall travel wants/needs as plausibly as possible (which would tend to argue for presenting a large number of activities).

25.4. Application of Empirically Constrained Efficiency

As the number of survey respondents would be limited to c. 75, it was decided to implement a form of design optimality in an attempt to maximize the precision of parameter estimates obtained from the survey’s output dataset. However, the nature of the AVATAR survey presented two main challenges to the application of textbook design-efficiency principles.

First, the AVATAR survey was designed to jointly estimate parameters with a revealed-choice dataset (Britain’s NTS), in which respondents recorded a week of their travels as well as the set of mobility resources they own. The revealed-choice dataset provides support for estimating a number of parameters, but contains no information on subscriptions to or usage of carsharing services; the AVATAR dataset’s role in the joint estimation process is to provide estimates of parameters relating to carsharing. In other words, the data arising from the AVATAR survey were used to identify some parameters, rather than the full set of parameters.

The second challenge relates to the substantive characteristics of the choice context. As respondents were told that they would be considering different neighbourhoods in London for each stated-choice replication, we sought to maximize the authenticity of the choice experiment by ensuring some degree of plausibility in the combinations of attribute levels. For instance, a respondent may view a 10-minute bus ride to work as a plausible attribute level, and likewise a 90-minute car journey to work, but in combination they may not be seen as plausible options for performing the same journey.

The design variables in the AVATAR survey were the travel time (in minutes) and journey costs (in British pounds) of accessing each activity. Additional variables were included in the choice experiment (e.g. the fixed costs of owning a car were stated to be £4000/year), though it was decided not to vary this and other attribute levels across respondents or replications.

A sample AVC matrix from this study with arbitrarily chosen element values is shown in Table 25.1. We denote A = {A1, A2, A3} as the subset of parameters which are in principle identifiable from the NTS portion of the combined NTS/AVATAR estimation dataset, and B = {B1, B2, B3, B4} as the subset of parameters which are only identifiable from the AVATAR part of the dataset.
The elements potentially relevant to the design of the AVATAR survey are the black (bottom right, B by B) and grey (top right, A by B, and bottom left, B by A) sections of the matrix in Table 25.1. Elements in the black sub-space are the (co)variance elements of parameter subset B, such as those related to carsharing subscription and usage. Elements in the grey space are the co-variances between parameters in subsets A and B; large co-variances in this space appear undesirable based on the same arguments as those for using D- or S-efficient criteria rather than A-efficiency.

Table 25.1: Sample of an asymptotic variance/co-variance matrix, containing dummy element values.

         A1      A2      A3      B1      B2      B3      B4
A1    26.86   22.11    1.25    3.53   13.79    4.35    3.50
A2    22.11   23.37    1.57    3.35   10.68    7.73    4.17
A3     1.25    1.57    0.27    0.28    0.80    0.67    0.86
B1     3.53    3.35    0.28    0.56    1.86    0.90    0.99
B2    13.79   10.68    0.80    1.86   11.66    3.50    4.94
B3     4.35    7.73    0.67    0.90    3.50    6.46    3.06
B4     3.50    4.17    0.86    0.99    4.94    3.06    7.20

In considering the merits of the various metrics for characterizing design efficiency, it was noted that S-efficient designs, as opposed to D-efficient ones, tend to have (a priori predicted) variances for parameters which correspond to t-values in a relatively narrow range just meeting the criterion specified by the researcher (e.g. 1.96). A D-efficient design, as it is based on a matrix-wide ‘global’ measure, may well have some elements in its AVC matrix which correspond to t-values much larger than the minimum threshold defined as ‘acceptable’; this is inefficient if the objective is parameter identifiability as a binary yes/no measure and further increases in t-values beyond this threshold are not considered to be of value (see footnotes 1 and 2).

If the objective is to be able to identify all parameters to at least some minimum degree of confidence, an optimal design would provide roughly equal information (i.e. t-value) on each of the parameters, which is a general property of S-efficient designs. The PMRSS consistent with this objective can then be determined by scaling all elements in the AVC matrix up or down (through adjusting the sample size) until the smallest (of the roughly equal) t-values takes a value just larger than the pre-determined threshold for identifiability.

Thus, it was decided to employ an S-efficiency-based metric, where the efficiency of each candidate design was defined to be the minimum of the t-values corresponding to the diagonal values in the black matrix sub-space (there being no elements within the grey space which are on the diagonal of the AVC matrix). S-efficiency was chosen, as the primary objective in this instance was to maximize the efficiency of all parameters of interest in binary yes/no terms for a given sample size.
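The partition of Table 25.1 into its ‘black’ and ‘grey’ sections can be illustrated with the dummy values from the table itself. The sketch below only slices the matrix and reads off the largest variance within subset B; it does not compute t-values, since these would require parameter priors that the table does not provide. Variable names are illustrative.

```python
import numpy as np

labels = ["A1", "A2", "A3", "B1", "B2", "B3", "B4"]

# Dummy AVC values copied from Table 25.1
avc = np.array([
    [26.86, 22.11, 1.25, 3.53, 13.79, 4.35, 3.50],
    [22.11, 23.37, 1.57, 3.35, 10.68, 7.73, 4.17],
    [ 1.25,  1.57, 0.27, 0.28,  0.80, 0.67, 0.86],
    [ 3.53,  3.35, 0.28, 0.56,  1.86, 0.90, 0.99],
    [13.79, 10.68, 0.80, 1.86, 11.66, 3.50, 4.94],
    [ 4.35,  7.73, 0.67, 0.90,  3.50, 6.46, 3.06],
    [ 3.50,  4.17, 0.86, 0.99,  4.94, 3.06, 7.20],
])

is_b = np.array([name.startswith("B") for name in labels])
b_labels = [name for name in labels if name.startswith("B")]

black = avc[np.ix_(is_b, is_b)]    # (co)variances within subset B
grey = avc[np.ix_(~is_b, is_b)]    # co-variances between subsets A and B

# Largest variance among the B parameters, and which parameter it belongs to
largest_b_variance = black.diagonal().max()
worst_b = b_labels[int(black.diagonal().argmax())]

# Location of the largest absolute element in the whole matrix
row, col = np.unravel_index(np.abs(avc).argmax(), avc.shape)
print(worst_b, largest_b_variance, (row, col))
```

For these dummy values the largest variance in the black section belongs to B2, and the largest absolute element of the full matrix sits on the diagonal, consistent with the structural property invoked in footnote 1.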
1. It is noted that, as with A-efficiency, S-efficiency does not explicitly account for off-diagonal elements of the AVC matrix, whilst D-efficiency does. A structural property of an AVC matrix, however, is that the largest element (in absolute value) must lie on the diagonal; neglecting the off-diagonal elements is therefore warranted, since the S-efficiency criterion is to minimize the largest value on the diagonal and hence in the entire AVC matrix.

2. Though the possibility exists that two candidate designs may yield identical S-efficiency measures by virtue of their minimum t-values being equal, intuitively it would seem possible to select one or the other on a systematic basis. For instance, consider two designs, one where the t-values corresponding to the diagonal elements of the AVC matrix are {1,2,3} and the other where they are {1,2,2}. Intuitively one would select the first design as being more efficient, because it is precisely equal to the second in the identifiability of the least-identifiable parameter, and performs better in identifying the remaining parameters (the ‘non-least-identifiable’ parameters).

The process of generating the efficient AVATAR survey design was facilitated by modifying a spreadsheet for preparing standard D-efficient survey designs (Rose and Bliemer, 2010). The complete model specification included 17 parameters; the subset the efficient design process focused on consisted of 3 parameters: 2 mode-specific travel time parameters (1 for each of 2 types of carsharing service models) and 1 cost parameter. (One of the two types of carsharing service models did not have a separate cost parameter as by design that service model had costs that were perfectly correlated with travel time, e.g. X pence per minute.) Two alternative-specific constants, one for each of the carsharing service models, were also estimated, but not included in the efficient-design process as there were no a priori expectations for their values.

The empirical distributions of journey itinerary length by mode from the NTS data were binned into a small number of levels (5, 10, 15, 20, 30, 45, 60, 90 and 120+ minutes). Judgement was applied to truncate the upper tails of the distributions. Figure 25.2 shows the bivariate distribution of itinerary times for commuting by car and public transport (in the interests of clarity cycling and walking are not shown here) for residents of Inner and Outer London (on the left- and right-hand sides, respectively). The centroids of the distributions are located at 32 minutes/47 minutes (car/public transport) for Inner Londoners and 34/57 minutes for Outer Londoners.

The preceding discussion describes the generation of travel times for candidate designs. Corresponding journey costs for car and public transport travel were drawn from discrete distributions generated by the analysts. The distributions of costs (separate distributions were generated for each activity purpose) were allowed to vary within fairly narrow bands, in an effort to balance between ensuring the plausibility of the time–cost combinations and allowing sufficient variation for the S-efficiency algorithm to work within.

Twenty-five candidate designs (each drawn randomly from the empirical distribution just described) were evaluated for each demographic class. It was decided to perform relatively few draws in the interests of ensuring that the selected designs for each class would be likely to be drawn from within the well-covered portions of the distribution spaces, and thus likely to be perceived as plausible by respondents. Taking a larger number of draws could have led to the designs tending to be drawn from the sparser portions of the distribution spaces, gaining some additional efficiency at the expense of plausibility.
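The draw-and-select procedure described above can be sketched in a few lines. Everything numerical here is an assumption made for illustration: the joint shares of car and public transport times stand in for the binned NTS distribution, the cost levels and priors are invented, and a two-alternative logit model with two generic parameters replaces the chapter's 17-parameter specification.

```python
import numpy as np

rng = np.random.default_rng(2011)

# Hypothetical joint shares of (car minutes, public transport minutes) pairs,
# standing in for the binned empirical distribution of Figure 25.2.
pairs = np.array([[15, 30], [20, 30], [30, 45], [30, 60], [45, 60], [60, 90]], dtype=float)
shares = np.array([0.10, 0.15, 0.30, 0.20, 0.15, 0.10])
cost_levels = np.array([1.0, 2.0, 3.0])  # hypothetical narrow cost band (pounds)
priors = np.array([-0.04, -0.40])        # hypothetical time and cost priors

def draw_candidate(n_sets=8):
    """Draw time pairs jointly (preserving plausible combinations), costs independently."""
    idx = rng.choice(len(pairs), size=n_sets, p=shares)
    times = pairs[idx]                                   # shape (sets, 2 alternatives)
    costs = rng.choice(cost_levels, size=(n_sets, 2))
    return np.stack([times, costs], axis=-1)             # (sets, alternatives, attributes)

def min_abs_t(design, beta):
    """Smallest |t|-value implied by the design for a single respondent."""
    info = np.zeros((len(beta), len(beta)))
    for X in design:
        v = X @ beta
        p = np.exp(v - v.max())
        p /= p.sum()
        info += X.T @ (np.diag(p) - np.outer(p, p)) @ X
    if np.linalg.matrix_rank(info) < len(beta):
        return 0.0  # parameters not identifiable from this candidate
    std_err = np.sqrt(np.diag(np.linalg.inv(info)))
    return float(np.min(np.abs(beta) / std_err))

# Deliberately few draws, so the winner tends to come from well-covered regions
candidates = [draw_candidate() for _ in range(25)]
scores = [min_abs_t(c, priors) for c in candidates]
best = candidates[int(np.argmax(scores))]
print(max(scores))
```

Drawing the time pairs from the joint distribution, rather than each mode's time independently, is what keeps implausible combinations (such as a very short bus ride paired with a very long car journey) out of the candidate set; the small number of draws limits how far the search can drift into sparse regions.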

Figure 25.2: Example (for commuting journeys) of the bivariate empirical distribution of itinerary durations, for residents of Inner London (left) and Outer London (right). (Values lower than 1% suppressed.)

As the design was tailored to each respondent’s demographic characteristics, it was not possible to estimate the PMRSS with certainty unless the precise demographic composition of the sample could be known a priori (and even then the predicted minimum sample size would be accurate only to the degree that the parameter priors were correctly guessed). Separate PMRSS values were calculated for each of the 96 demographic classes, as a distinct efficient design was identified (and then taken forward for the field data collection) for each of these classes. The PMRSS for each class was calculated in the usual way, by calculating the AVC matrix for a single archetypal respondent, determining the least-identifiable parameter within the relevant subspace of the matrix, and then calculating the critical sample size (the PMRSS) at which this least-identifiable parameter would be statistically distinguishable from zero (i.e. a t-statistic of 1.96).

Figure 25.3 shows how the PMRSS is distributed across the demographic classes. The arithmetic average of the PMRSS across all 96 classes was predicted to be 204, though with wide variance across the different demographic classes (standard deviation of 281). For comparison purposes a second curve is included in Figure 25.3, showing a similar distribution from a randomly drawn (rather than efficient) design. It can be seen that the gains in statistical efficiency of the efficient design increase with PMRSS; in other words, large calculated PMRSS values from the efficient design are associated with very large values from the randomly drawn design.
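The PMRSS calculation described above relies on the fact that variances shrink in proportion to 1/N, so t-values grow with the square root of N. A minimal sketch follows; the function name, the AVC values and the priors are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math
import numpy as np

def pmrss(avc_single, beta, subset, t_crit=1.96):
    """Smallest N at which every parameter in `subset` reaches |t| >= t_crit.

    avc_single is the AVC matrix for one archetypal respondent; with N
    respondents each variance is divided by N, so t_N = t_1 * sqrt(N).
    """
    variances = np.diag(avc_single)[subset]
    t_single = np.abs(beta[subset]) / np.sqrt(variances)
    worst = int(np.argmin(t_single))                 # least-identifiable parameter
    n_required = math.ceil((t_crit / t_single[worst]) ** 2)
    return n_required, subset[worst]

# Hypothetical single-respondent AVC matrix (diagonal for simplicity) and priors
avc_single = np.diag([4.0, 25.0, 1.0])
beta = np.array([0.5, 1.0, 0.1])

# Only parameters 1 and 2 are of interest (the analogue of subset B)
n_required, least_identifiable = pmrss(avc_single, beta, subset=[1, 2])
print(n_required, least_identifiable)  # 385 2
```

In this example the single-respondent t-values for the two parameters of interest are 0.2 and 0.1, so the second governs the result: (1.96 / 0.1)² is roughly 384.2, which rounds up to 385 respondents.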

Figure 25.3: Distribution of estimated minimum-required sample sizes for the demographic classes, for the selected efficient design and a randomly drawn design.

Field data collection took place in Spring 2011. The sample consisted of 72 driving-license holders living in Greater London, 54% of them female and 71% employed. Thirty-one per cent were carsharing subscribers; 35% owned a car. Respondents were recruited by a specialized market research firm; the sample was stratified by car access type (car owner, carsharing subscriber, or neither), employment status (employed/not employed), age (<35, 35+) and place of residence (Inner or Outer London). Thirty-one of the 96 demographic classes were represented by at least one of the 72 respondents. The designs for the remaining 65 classes were therefore not used in the field data collection, though it was necessary to identify a design for each of them as the composition of the sample was not known a priori.

During post-processing the sample was re-weighted to adjust for the over-representation of carsharers and non-car-owning people relative to the sample frame of Londoners with driving licenses. Statistical representativeness was deemed infeasible with available resources; the sample was instead intended to be as diverse as practical. Available resources were the binding constraint on the number of respondents; given that the average PMRSS was over 200 respondents, it was unlikely that all parameters would be statistically significant at the p < 0.05 level (which was indeed found to be the case in model estimation). The substantive findings (cf. Le Vine, 2011) as they pertain to the carsharing market should thus be viewed as indicative in nature.

25.5. Empirical Results

The presentation of results begins with the highlights of the patterns in respondents’ choices in the SC game, which are found in Tables 25.2 and 25.3. As this chapter is principally methodological, the remainder of the results pertain to how respondents interacted with the survey instrument.

Table 25.2: Summary of mobility resource choices.

                                                  Percent of replications
Purchase a personal car                                               30%
Purchase a public transport season ticket                             35%
Purchase a bicycle                                                    51%
Purchase a carsharing subscription                                     4%
Purchase a ‘one-way carsharing’ subscription                          27%

NB: Total does not sum to 100% due to respondents choosing multiple resources in some replications.

Table 25.3: Summary of transport mode choices.

                                                      Percent of journeys
Driving a personal car                                                17%
Public transport                                                      27%
Riding a bicycle                                                      27%
Walking                                                               14%
Taking a taxi or minicab                                               7%
Driving a carsharing car                                               1%
Driving a ‘one-way carsharing’ car                                     8%

The most striking of these results was the large proportion of choices to ride a bicycle. Though this was addressed in the piloting and was found to be lower after revisions to the instrument (Le Vine et al., 2011), the modal share of cycling remained implausibly high. Cycling choices were found to be correlated with being young, not being employed, being male, not living with a partner, not living with children and short interview duration. Cycling itineraries in the survey design generally offered relatively fast journey times, which were paired with no out-of-pocket cost and a relatively low cost of acquisition. The negative correlation with interview duration may indicate that a substantial portion of respondents followed a choice-making process of minimizing journey time and cost, and thus avoided high-cognition consideration of the qualitative aspects of cycling.

Table 25.4 presents the correlations between respondents’ choices within each replication of the SC game. High correlation can be seen between each of the mobility resources and the modes of travel that they facilitate, with the closest relationship between purchasing a car and driving it. Purchasing a car was negatively correlated with purchasing all other resources, though this was only found to be statistically significant in the case of purchasing a bicycle. Purchasing a public transport season ticket was found to correlate positively with carsharing usage, though negatively with subscription to a one-way carsharing service. Using a one-way carsharing service was found to correlate negatively with the number of walking journeys.

Table 25.5 shows results from three linear regression models using each of the interview metrics as the dependent variables and respondent demographics as the explanatory variables. Following a specification search, all parameters found to have a p-value of 0.15 or less are included in the results shown below. Respondent demographics were generally found to be poor predictors of the length of an interview and level of mouse activity. Being older was associated with longer interview durations, and higher education levels were associated with shorter interviews.
Women were found to have performed fewer mouse clicks to seek information during the SC experiment than men did, though this result is only significant at the p ¼ 0.10 level. Table 25.6 presents similar regression results, where the dependent variables are responses to three post-game questions (PGQs) that explore different aspects of respondent engagement with the SC experiment. PGQ #1 asked respondents how similar or different the SC experiment was to how they think about their own mobility; it was intended to investigate how well the SC

Table 25.4: Correlation matrix of stated choices within replications of the SC game.
Variables (rows and columns): # Car purchases; # Car driving journeys; # Public transport season tickets; # Public transport journeys; # Bicycle purchases; # Cycling journeys; # Carsharing subscriptions; # Carsharing journeys; # ‘One-way carsharing’ subscriptions; # ‘One-way carsharing’ journeys; # Walk journeys; # Taxi-minicab journeys.
NB: Absolute values larger than 0.11/0.10 are significant at the p<0.05/p<0.10 level.

Table 25.5: Correlation of respondent demographics with interview metrics.
Dependent variables: Interview duration (minutes); Number of mouse clicks; Number of information-seeking mouse clicks. Reported for each: parameter estimate, significance level and r2.
Respondent demographics (rows): Constant; Female; Age; Living with partner; # Children in household; Owns ‘personal’ car; Household owns car(s); Carsharing subscriber; Employed; Level of education (a); Family income (b); Living in Outer London (c).
(a) Level of education was coded on an ordinal scale with the following values (based on the British education system): 0 = GCSE, 1 = A level, 2 = Diploma, 3 = Degree, 4 = Masters/PhD.
(b) Income was coded on the following ordinal scale: 0 = up to £25K/year, 1 = £25K–£50K/year, 3 = £50K+/year.
(c) Respondents lived in either Inner London or Outer London.


Table 25.6: Correlation of respondent demographics with responses to post-game questions (PGQs).
Dependent variables: PGQ #1; PGQ #2; PGQ #3. Reported for each: parameter estimate, significance level and r2.
Respondent demographics (rows): Constant; Female; Age; Living with partner; # Children in household; Owns ‘personal’ car; Household owns car(s); Carsharing subscriber; Employed; Level of education; Family income; Living in Outer London.
NB: PGQs are answered on a 7-point Likert scale, with 1 = ‘Very different’ and 7 = ‘Very similar’.
PGQ #1: How similar or different was this game to how you think about getting around?
PGQ #2: Would you say Jane’s/Joe’s routine of activities is similar or different to yours?
PGQ #3: You just gave some advice to Jane/Joe. As you thought through Jane’s/Joe’s choices, how close was your thinking to how you make choices for yourself?


game’s strategic/tactical choice context aligned with respondents’ thinking. Being male seemed to lead to a higher degree of similarity with one’s own thinking about personal travel, though this was only significant at the 0.12 level. Living with one’s partner and living without children at home were also associated with similarity between the SC game and respondents’ views of their travel-choice context. PGQ #2 then asked respondents about the degree of similarity between their avatar’s set of five activities and their own routine. This question explored how well the small set of somewhat-specific activities was seen to align with the likely more complex set of activities of the respondent’s own life. Living with children at home was found to lead to lower reported similarity in this aspect, but the opposite was found for living with one’s partner. None of the other demographic variables were found to be significant. Of further note, responses to PGQ #2 had the lowest average score of the three PGQs reported here (the averages for PGQ #s 1, 2 and 3 were 4.8, 4.1 and 5.5, respectively). The third PGQ inquired about how similar the respondent’s choice-making process (relating to their advice to their avatar) was to how they make choices for themselves. This question was intended to assess how well or poorly the ‘avatar’ device served in its intended role (as a proxy for each respondent’s personal choice-making) and how this may have varied amongst different demographic groups. The fact that the average response was 5.5 on a scale from one to seven was somewhat reassuring, though a number of demographic descriptors were found to have statistically significant effects on answers to PGQ #3. Women and respondents living with their partner reported greater differences than men and respondents not living with their partner, respectively, and being in employment and having higher family income were associated with a higher degree of similarity between respondents’ own choice-making and their choice-making on behalf of the avatar.

25.6. Conclusion

This article is the second in a set relating to the design of a strategic-tactical survey in which respondents are asked to give advice to a demographically similar avatar rather than, as is typical of SC experiments, to make choices solely for themselves. The first (Le Vine et al., 2011) described the theory and empirical data underpinning the two major survey design decisions (the strategic/tactical structure and the avatar device), and results from field testing of the survey instrument. This chapter discusses the development of a technique termed ‘empirically constrained efficiency’, in which the principles of design optimality are applied using draws from empirical distributions of the design variables to capture some degree of real-world correlation patterns, rather than using independent draws from uniform distributions of the levels of each design variable. This technique proved successful, and it is shown that, despite the use of the empirically constrained distributions, this method identified a design that provided large gains in statistical efficiency versus a randomly generated design.
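The idea of searching for an efficient design among candidates drawn from an empirical distribution can be illustrated with a minimal sketch. It is not the authors' procedure: the attribute pool is invented for illustration, the names `EMPIRICAL_POOL`, `d_error`, `draw_design` and `search` are hypothetical, and a linear-model D-error is swapped in for the choice-model efficiency criterion used in the chapter.

```python
import random

# Hypothetical empirical pool of (travel_time_min, cost_gbp) pairs. The values
# are invented for illustration and are correlated, as real attributes tend to be.
EMPIRICAL_POOL = [
    (10, 1.5), (15, 2.0), (20, 2.4), (25, 3.5), (30, 3.0),
    (35, 4.5), (40, 4.0), (45, 5.5), (50, 5.0), (60, 6.5),
]

def d_error(design):
    """Linear-model D-error for K = 2 mean-centred attributes:
    det((X'X)^-1) ** (1/K). This is a stand-in criterion (assumption)."""
    n = len(design)
    mean_t = sum(t for t, _ in design) / n
    mean_c = sum(c for _, c in design) / n
    stt = sum((t - mean_t) ** 2 for t, _ in design)
    scc = sum((c - mean_c) ** 2 for _, c in design)
    stc = sum((t - mean_t) * (c - mean_c) for t, c in design)
    det = stt * scc - stc ** 2          # determinant of X'X
    if det <= 1e-9:
        return float("inf")             # singular information matrix
    return (1.0 / det) ** 0.5           # det of the inverse is 1/det; K = 2

def draw_design(rng, n_rows=6):
    # Rows are sampled jointly, so the time-cost correlation in the pool is kept.
    return [rng.choice(EMPIRICAL_POOL) for _ in range(n_rows)]

def search(n_candidates=500, seed=1):
    """Return (best design, its D-error, D-error of the first random design)."""
    rng = random.Random(seed)
    candidates = [draw_design(rng) for _ in range(n_candidates)]
    errors = [d_error(d) for d in candidates]
    best = min(range(n_candidates), key=errors.__getitem__)
    return candidates[best], errors[best], errors[0]

best_design, best_err, random_err = search()
print(best_err <= random_err)
```

Because every candidate row comes from the pool, implausible attribute combinations never reach respondents; the search only trades efficiency among plausible designs.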


We present here a number of headline results from the empirical application of this survey. Though model specifications and subsequent statistical support for the hypothesized strategic-tactical relationship are not the focus of this chapter, it is shown here that those interviewed chose to own mobility resources and use methods of transport in ways both broadly plausible and in keeping with the underlying hypothesis; the one exception to plausibility being a rate of cycling choices beyond a reasonable level. The authors anticipate presenting the substantive results in greater detail in forthcoming articles focused on these issues. How respondents interacted with the survey’s structure and the instrument package are then examined. Being an existing subscriber to a carsharing service or owning a car were not found to have any significant effects on the interview metrics or responses to the PGQs. Living with one’s partner was found to be associated with reporting similarity between the nature of the choice context and how one thinks about their own mobility, though the opposite effect was found for living with children at home. Men were found to have performed more information-seeking mouse activity during the interview than women, and to have indicated that the use of the avatar device was a better simulation of how they consider and make real-world mobility choices than women indicated. Being employed and having higher family income were associated with less reported discrepancy in self-avatar choice-making processes. These findings have implications for the application of stated-choice experiments intended to simulate people’s real-world mobility choices with multiple time horizons. Addressing issues of differential interaction with survey instruments of this class is an important item on the agenda for research into complex survey design.

Acknowledgement

The authors wish to thank the RAC Foundation for partial sponsorship of this research, though responsibility for any errors remains with the authors.

References

Abeywardana, V., Christophersen, O., & Tipping, S. (2006). National Travel Survey 2005 technical report. Prepared for the UK Department for Transport by the National Centre for Social Research.
Bliemer, M. C. J., Rose, J. M., & Hess, S. (2008). Approximation of Bayesian efficiency in experimental choice designs. Journal of Choice Modelling, 1(1), 98–127.
Huber, J., & Zwerina, K. (1996). The importance of utility balance in efficient choice designs. Journal of Marketing Research, 33(3), 307–317. doi: 10.2307/3152127
Ibanez, J. N., Toner, J., & Daly, A. (2007). Optimality and efficiency requirements for the design of stated choice experiments. European Transport Conference (ETC), Leiden, Netherlands, 17–19 Oct. 2007.
Kanninen, B. (2002). Optimal design for multinomial choice experiments. Journal of Marketing Research, 39, 214–227.
Kessels, R., Goos, P., & Vandebroek, M. (2006). A comparison of criteria to design efficient choice experiments. Journal of Marketing Research, 43(3), 409–419. doi: 10.1509/jmkr.43.3.409
Le Vine, S. (2011). Strategies for personal mobility: A study of consumer acceptance of subscription drive-it-yourself car services. Doctoral dissertation. Imperial College London, London.
Le Vine, S., Lee-Gosselin, M. E. H., Sivakumar, A., & Polak, J. (2011). Design of a strategic-tactical stated-choice survey methodology using a constructed AVATAR. Transportation Research Record, 246, 55–63.
Rose, J. M., & Bliemer, M. C. J. (2005). Sample optimality in the design of stated choice experiments. Working Paper ITLS-WP-05-13. Institute of Transport and Logistics Studies, University of Sydney, Sydney.
Rose, J. M., & Bliemer, M. C. J. (2010). Generating stated choice experimental designs. Accompanying Microsoft Excel worksheet. Presented at the 89th Annual Meeting of the Transportation Research Board.

Chapter 26

WORKSHOP SYNTHESIS: METHODS FOR CAPTURING MULTI-HORIZON CHOICES

Chandra Bhat and Matthew Roorda

26.1. Introduction and Purpose

Transportation researchers are being confronted by new questions about decisions that span multiple time horizons. These include long-term strategic commitments such as residence location or mobility tools (e.g. vehicle ownership, public transport season tickets, or subscriptions to shared vehicle services); tactical short-term daily choices such as alternative destinations, travel timing, route, mode or accompanying persons; and en route choices such as spontaneous activity stops or re-routing. In the longer time frame of a year or several years, households may change in composition, acquire a vehicle, move to another house, or have a member join or depart from the labour force or change jobs. Travel surveys sometimes include the history of choices over long time frames, but the interdependence between choices over such long time frames is poorly understood and rarely addressed. This workshop explored the complexities of measuring and analysing these multi-horizon choices and their interdependencies.

26.2. Summary of Resource and Contributed Papers

Two papers were presented in this workshop (Le Vine, Sivakumar, Lee-Gosselin, & Polak, 2011; Heinen & Maat, 2011). Eric Miller (University of Toronto) was invited to provide additional insights on the topic.


Participants: Margarita Amaya (Chile), Chandra Bhat (USA) (Chair), Mark Bradley (USA), Juan Antonio Carrasco (Chile), Kenneth Casavant (USA), Carlos Florez (Colombia), Susan Handy (USA), Eva Heinen (Netherlands), Scott Le Vine (UK), Kees Maat (Netherlands), Jean-Loup Madre (France), Eric J. Miller (Canada), Luis Rizzi (Chile), Matthew Roorda (Canada), Veronique Van Acker (Belgium).

Transport Survey Methods: Best Practice for Decision Making Copyright © 2013 by Emerald Group Publishing Limited All rights of reproduction in any form reserved ISBN: 978-1-78190-287-5


The first paper, presented by Scott Le Vine, was entitled ‘Empirically Constrained Efficiency in a Strategic-Tactical Stated Choice Survey of the Usage of Patterns of Emerging Carsharing Services’ (in this volume). This paper reported recent research using a stated-choice survey design that is innovative in two respects. First, respondents were asked to consider two types of choice having different time horizons but which are thought to be linked in a strategic-tactical structure. The two types of choices are (a) purchasing mobility resources, which are defined to include commitments such as car ownership and subscription to carsharing services, and (b) choosing a mode of transport for a particular instance of travel. The principal hypothesis of this program of research was that people’s choices of which resources to own are a function of expected travel needs. The second methodological innovation was that respondents were asked to indicate their choices in the context of giving advice to a demographically similar avatar rather than choosing on their own behalf. The development of a technique for empirically constrained efficient design was discussed, as was its application to this survey. The objective of this method was to provide survey designs with a high degree of statistical efficiency whilst maintaining plausibility in the combination of attribute levels presented to respondents in a multi-horizon stated-choice experiment. The main substantive findings from the empirical application were presented, along with detailed results relating to how different demographic classes of respondents engaged with the instrument. Men and those respondents living with no children at home, for instance, indicated that the experiment was a better simulation of the nature of their real-world mobility choices than women and those living with children indicated. The second paper, presented by Eva Heinen, was entitled, ‘Are Longitudinal Data Unavoidable? 
Measuring Variation in Bicycle Mode Choice’. This paper described mode alternation in the Netherlands and compared data from a longitudinal survey with a single-moment survey focusing on bicycle commuting to evaluate the reliability of the latter. Travel data are usually collected at a single moment in time. The use of single-moment survey data to investigate variable behaviour raises questions about the reliability of such data. Repeated measures, resulting in longitudinal data, may be the solution, as in this case individuals only have to report one day at a time, which is relatively easy. Collecting longitudinal panel data, however, is more expensive and takes more time. Analyses show that mode alternation occurs frequently, especially with respect to car and bicycle commuters. Single-moment surveys seem unable to measure this accurately. Many respondents inaccurately report being a part-time cyclist when they always cycle to work or, conversely, claim to be full-time cyclists when they vary between modes. Moreover, many do not report their cycling frequency accurately in a single-moment survey. Unfortunately, it is difficult to determine the characteristics that predict the probability of an accurate report. In addition, the degree of accuracy is not readily linked to individual characteristics. Thus, it seems that the error in single-moment surveys cannot be easily corrected. Nevertheless, this paper reveals that even repetitive behaviour is subject to much variation. Surveys should include specific questions to increase insight into this phenomenon, and transport models should account for mode variation. For this it is essential to collect and analyse longitudinal data.
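The comparison between a single-moment self-report and a longitudinal diary can be sketched as follows. The record layout, the category labels and the sample data are assumptions made for illustration; they are not taken from the Heinen and Maat study.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Commuter:
    reported_type: str         # single-moment answer: "full-time" or "part-time"
    observed_modes: List[str]  # one commute mode per diary day (hypothetical)

def observed_type(modes):
    """Classify from diary days: 'full-time' only if every commute was by bicycle."""
    if all(m == "bicycle" for m in modes):
        return "full-time"
    if any(m == "bicycle" for m in modes):
        return "part-time"
    return "non-cyclist"

def misreport_share(commuters):
    """Share of respondents whose single-moment answer disagrees with the diary."""
    mismatches = sum(
        1 for c in commuters if c.reported_type != observed_type(c.observed_modes)
    )
    return mismatches / len(commuters)

sample = [
    Commuter("part-time", ["bicycle", "bicycle", "bicycle"]),  # always cycles
    Commuter("full-time", ["bicycle", "car", "bicycle"]),      # varies between modes
    Commuter("full-time", ["bicycle", "bicycle", "bicycle"]),  # consistent report
    Commuter("part-time", ["car", "bicycle", "car"]),          # consistent report
]
print(misreport_share(sample))  # 0.5
```

A check of this kind is only possible when repeated observations exist for the same person, which is the practical argument for longitudinal data made in the paper.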


The remarks by Eric Miller helped to elaborate the concept of multi-dimension choice by identifying that long-term decisions typically involve resource acquisition and change, while short-term decisions typically involve choices about how resources are used. Decisions about resource acquisition and change are influenced by experiences and needs associated with short-term activities and travel, which are influenced by experiences on the transportation networks. Miller also pointed out that even long-term decisions are made in ‘the now’, and are potentially triggered by events that occur in the short-term (e.g. a household may choose to change their residence as a result of the birth of a baby). These trigger events occur within a context of household ‘stress’, which is the difference between the current state and some ‘ideal’ state. Miller pointed out that the challenge of multi-horizon data collection is to identify triggers and contributors to household stress, to address the dynamic connections between these elements of household decision-making, and to feed models that can represent them. Panels and retrospective surveys are the standard methods for longitudinal data collection, though both have advantages and disadvantages. Panels are expensive and time consuming and involve challenges of attrition and respondent fatigue; while retrospective surveys are cheap and easy, but are subject to the recollection abilities of respondents. Regardless of the longitudinal survey method, it is also important to gather information about transportation/land use supply (e.g. how did the road networks and land use change over the observation period).
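Miller's notion of household ‘stress’ as the gap between the current and an ‘ideal’ state, with events triggering a decision, could be sketched as below. The sum-of-absolute-gaps measure, the threshold and the attribute names are all assumptions for illustration; the remarks summarized above do not specify a functional form.

```python
def household_stress(current, ideal):
    """Sum of absolute gaps between the current and the 'ideal' state.
    A deliberately naive measure (assumption): attributes are unweighted."""
    return sum(abs(ideal[k] - current[k]) for k in ideal)

def triggers_review(current, ideal, threshold):
    """A long-term decision is reconsidered once stress exceeds the threshold."""
    return household_stress(current, ideal) > threshold

current = {"bedrooms": 2, "cars": 1}
ideal_before_birth = {"bedrooms": 2, "cars": 1}
ideal_after_birth = {"bedrooms": 3, "cars": 1}   # a birth shifts the ideal state

print(triggers_review(current, ideal_before_birth, threshold=0.5))  # False
print(triggers_review(current, ideal_after_birth, threshold=0.5))   # True
```

In this reading, a survey would need to record both the trigger event and the elements of the current and desired states, not only the resulting move.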

26.3. The Context

Two fundamental issues were identified by workshop participants, which became the basis for discussions in the workshop.

• The labels ‘short-term’ and ‘long-term’ are not descriptive and behaviourally helpful; there is a need to consider several dimensions that characterize multi-horizon choices.
• The transportation field has focused on modelling outcomes of choice processes, without giving adequate attention to the process by which choices are made and the context within which choices are made. Indeed, the data collected and the methods used to understand choice processes and for modelling choice decisions need to be revisited, since many considerations are ignored in current approaches to data collection and analysis.

26.3.1. Redefining Multi-Horizon Choices

The first contribution of the workshop was to recognize the inadequacy of the terms ‘long-term’ and ‘short-term’ to describe the complexities of multi-horizon choices. The words short-term and long-term imply two things that are not necessarily true about the behaviour we are interested in. First, they imply that time is the only important attribute in these decisions. Second, they imply that the time periods are discrete and binary; that is, decisions can be nicely classified as belonging to either the long-term category or short-term category. The workshop participants developed a set of attributes that more completely describe the nature of multi-horizon choices than this overly simplistic time frame-based conception. Workshop participants proposed five attributes of decisions which are shown in Figure 26.1. For each attribute, a decision may fall somewhere on a continuum. For example, the purchase of a house is a life changing, long-term decision about a major resource, with high transaction costs (legal and agent fees, plus the effort to search for the house), and likely involving other household members. A metro pass falls in the middle of the continuum, since a pass is a mobility resource purchased monthly as an individual, with low transaction costs.

Figure 26.1: Characterization of choices. Instead of a single binary attribute (long term or short term), five choice attributes on a continuum: resource change to resource use; high transaction cost to low transaction cost; life changing to day to day; collective to individual; long term to short term.
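One way to make the five continua concrete is to score each decision on every attribute rather than labelling it long-term or short-term. The sketch below is illustrative only: the class name `ChoiceProfile`, the 0 to 1 scale and the example scores are assumptions, not values proposed by the workshop.

```python
from dataclasses import dataclass, fields

@dataclass(frozen=True)
class ChoiceProfile:
    """Position of a decision on the five continua of Figure 26.1.
    Scores run from 0.0 to 1.0; the numeric scale is an assumption."""
    resource_change: float   # 1 = resource change, 0 = resource use
    transaction_cost: float  # 1 = high, 0 = low
    life_changing: float     # 1 = life changing, 0 = day to day
    collective: float        # 1 = collective, 0 = individual
    long_term: float         # 1 = long term, 0 = short term

    def __post_init__(self):
        for f in fields(self):
            value = getattr(self, f.name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{f.name} must lie in [0, 1], got {value}")

# Illustrative scores only, loosely following the examples in the text.
house_purchase = ChoiceProfile(1.0, 1.0, 1.0, 0.9, 1.0)
metro_pass = ChoiceProfile(0.6, 0.1, 0.3, 0.0, 0.5)
todays_route = ChoiceProfile(0.0, 0.0, 0.0, 0.0, 0.0)

print(house_purchase.long_term > metro_pass.long_term > todays_route.long_term)  # True
```

Scoring each attribute separately lets a metro pass sit mid-way on time horizon while remaining individual and low-cost, which a binary label cannot express.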

26.3.2. Outcomes, Processes and Context

The second theme addressed in the workshop involved the definition of, and relationships between, the three concepts of context, process and outcomes of multi-horizon choices. While these are important elements of any decision, the distinction is particularly important for capturing multi-horizon choices because the processes and outcomes differ by context. Ignoring the contextual information immediately leads to a loss in heterogeneity in behaviour attributable purely to context, which then also gets incorrectly manifested as heterogeneity in processes and outcomes. The result could be inappropriate information for policy analysis and for model forecasting. After lengthy discussion, workshop participants agreed that Figure 26.2 captures the relationship between the three concepts of context, process and outcomes.

Figure 26.2: Relationship between decision context, process and outcomes. [Diagram elements: Context, Process, Outcomes]

Decision outcomes are the actions made by a decision-maker that generally can be observed or self-reported. Decision outcomes can be viewed as either the transactions themselves or as the state resulting from those transactions. For the longer horizon ‘resource change’ decisions, transactions include residential moves, job changes, vehicle transactions, and changes to the economic or social structure of the family such as marriages and divorces. The corresponding states would be the location of the household, the jobs held by household members, the vehicle fleet, and the household structure. Most data collection efforts focus on obtaining information about decision outcomes, simply because transactions and states are tangible and can usually be easily observed or self-reported by respondents.

Processes are the methods by which decision-makers arrive at decisions that lead to outcomes. Processes are less tangible (both to researchers and to respondents), less easy to observe or self-report, and are in general less well-understood in the travel behaviour community. Processes have seldom been the subject of data collection efforts. The following observations were made by workshop participants regarding multi-horizon choice processes:

• Utility maximization is clearly a simplification of choice processes.
• Some decisions are forced upon individuals and are not based on a conscious choice process (e.g. you can lose your job, or be evicted from your home, or your vehicle can die).
• Different people make decisions differently; further, decision processes, attitudes and perceptions change for a person over time. In addition, constraints and opportunities are important parts of the context of choice processes, and more information on constraints and opportunities needs to be collected. Related issues of how individuals perceive and filter information, and how much search effort they put in, also need attention.
• The choice set of alternatives for any choice process is a dynamically evolving one and not a static one; similarly, seemingly instantaneous choice decisions (outcomes) themselves are based on aspirations, household lifestyle objectives, and cumulative experiences (including learning and history), indicating a need to examine threshold-based process formulations and consider instantaneous contextual factors that trigger decision-making. The prevailing approach of modelling dynamic choice processes as instantaneous processes needs serious re-visiting.
• Lifecycle and lifestyle changes are closely inter-related; when there is a lifecycle change (such as the birth of a child or a child reaching 16 years of age), there tend to be lifestyle changes (such as a change of residence or a change in vehicle ownership). The ‘bundling’ of these major choices has received little attention. Further, personal indicators in the form of attitudes and other ‘soft’ qualitative preferences can be important determinants of choice decisions, and need attention in future research.
• It is possible that exceptional situations drive major decisions (e.g. the once a month family trip to a lake house may dictate the choice of a minivan within the vehicle fleet).

Building on the observations above, Figure 26.2 shows a two-way relationship between decision processes and outcomes. While the arrow from process to outcome is obvious (the decision process leads to an outcome, by definition), the return arrow reflects that decision-makers gain experience (learn) from the outcomes of previous decisions. These two-way interactions occur between choices of the same type (e.g. the decision process to replace a car is influenced by your experience with your current car and previous car transactions) or of different types and at different scales (the process of activity scheduling is influenced by the outcome of employment choices).

To summarize, multi-horizon choices are made within a context that changes over time. Representation of context is crucial in multi-horizon decisions because many choices are highly constrained. Some of the critical dimensions of the context identified in the workshop include economic, social, information, time, and space constraints and considerations. The context of decision-making develops as an interaction between the larger environment (built environment, regional economy, culture, technology) and the state of the individual decision-maker (their own economic and physical resources, social network). Constraints and opportunities arise as a result of this interaction. It is within this context that processes and outcomes then interact.
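The distinction drawn in this section between outcomes as transactions and outcomes as the states that result from them can be sketched as a small data structure. The type names, the set of transaction kinds and the example history are hypothetical and are meant only to show how a recorded sequence of transactions yields a state trajectory.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple

@dataclass(frozen=True)
class HouseholdState:
    home_zone: str
    jobs: Tuple[str, ...]
    vehicles: int

@dataclass(frozen=True)
class Transaction:
    year: int
    kind: str        # "move", "job_start", "job_end", "vehicle_buy", "vehicle_sell"
    value: str = ""

def apply_transaction(state, tx):
    """Return the state that results from one transaction."""
    if tx.kind == "move":
        return replace(state, home_zone=tx.value)
    if tx.kind == "job_start":
        return replace(state, jobs=state.jobs + (tx.value,))
    if tx.kind == "job_end":
        return replace(state, jobs=tuple(j for j in state.jobs if j != tx.value))
    if tx.kind == "vehicle_buy":
        return replace(state, vehicles=state.vehicles + 1)
    if tx.kind == "vehicle_sell":
        return replace(state, vehicles=max(0, state.vehicles - 1))
    raise ValueError(f"unknown transaction kind: {tx.kind}")

def trajectory(initial, transactions):
    """Return [(year, state)] after each transaction, in chronological order."""
    result = []
    state = initial
    for tx in sorted(transactions, key=lambda t: t.year):
        state = apply_transaction(state, tx)
        result.append((tx.year, state))
    return result

start = HouseholdState(home_zone="inner", jobs=("teacher",), vehicles=0)
history = [Transaction(2012, "vehicle_buy"), Transaction(2010, "move", "outer")]
for year, state in trajectory(start, history):
    print(year, state)
```

A retrospective survey would typically collect the transactions, while a panel observes the states at each wave; either representation can be derived from the other only if the record is complete.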

26.4. Recommendations

Towards the end of the workshop, the discussion refocused on developing recommendations for future research and associated data collection to support a better understanding/representation of multi-horizon choices. First some observations were made about the current state of data collection in this area:

(a) Multi-horizon choices generally require longitudinal data collection.
(b) The two methods that have traditionally been used for multi-horizon data collection have been panel surveys and retrospective surveys.
(c) Data collection has focused on measuring a trajectory of choice outcomes, without always collecting all of the important elements of context. Also, choice processes have not been adequately observed in surveys.
(d) Surveys have either focused on long-term ‘resource change’ decisions, or short-term ‘resource use’ decisions, but only rarely have simultaneously collected information on both.
(e) Developments in travel survey methods to better understand multi-dimensional choices will likely require an iterative approach, since we don’t yet know what data we need to model the process, and we won’t know until we have data to analyse.

The workshop participants identified two promising directions for future research.

(1) Engage in a comprehensive longitudinal survey including (a) important linked ‘multi-horizon’ outcomes such as home location changes, job location changes, car transactions, transit pass purchases and travel patterns/travel lifestyle choices, (b) traditional variables/indicators such as household and personal economics and demography, (c) contextual information on built environment, housing supply, transportation networks and regional economy (not at one instance but over the time frame of the longitudinal survey), and additional ‘process related’ information (on social networks, perceptions, learning, experience, thresholds and aspirations) as we learn what we need.
(2) Engage in behavioural (qualitative and quantitative) research into the choice contexts and processes, as discussed in Section 26.3.2. Such fundamental behavioural research would help us understand how to better conduct large scale surveys to support better models. For instance, the research would help inform (a) How, why and to what extent are the many dimensions linked?, (b) What is ‘the story’ behind the trajectory of changes?, (c) How can we change behaviour in positive ways?, and (d) What theories of cognitive processes apply?

A variety of potential survey methodological approaches are available to collect data for the research efforts identified above. Participants did not reach any conclusions as to which are the best methods, but promising techniques that should be considered include:

• Panel surveys
• Retrospective surveys
• Retrospective surveys followed by a panel
• Prospective surveys
• Before/after studies
• Real-world trials and experiments
• Mixed methods
• Stated preference
• Methods that utilize technology

References

Heinen, E., & Maat, K. (2011, November 14–18). Are longitudinal data unavoidable? Measuring variation in bicycle mode choice. Paper presented at 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile.
Le Vine, S., Sivakumar, A., Lee-Gosselin, M., & Polak, J. (2011, November 14–18). Empirically constrained efficiency in a strategic-tactical stated choice survey of the usage of patterns of emerging carsharing services. Paper presented at 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile.

Chapter 27

Survey Data to Model Time-of-Day Choice: Methodology and Findings

Julián Arellana, Juan de Dios Ortúzar and Luis Ignacio Rizzi

Abstract

Purpose – Departure time choice not only depends on the desire to carry out activities at certain times and places; it is a complex decision making process influenced by travel conditions, congestion levels, activity schedules, and external trip factors. To estimate departure time choice models that capture these factors appropriately, a complex data collection procedure is required, one that yields detailed input data from different sources and at different time periods. The main aim of this chapter is to describe and discuss the survey methodology we used in a time-of-day choice project, involving the collection of revealed preference (RP) and stated preference (SP) data to estimate hybrid discrete departure time choice models incorporating latent variables. Preliminary model results are also presented as an example.

Methodology/approach – Data was obtained from 405 workers at different private and public institutions located in the centre of Santiago, Chile. The survey process had three different stages and used various collection methods (e-mail, web-page, and personal interviews at the workplace) in order to satisfy efficiency, reliability and cost criteria. The RP component survey design was based on the last origin-destination survey implemented in Santiago (i.e. a travel diary filled under an activity recall framework). Relevant level-of-service measures at different time periods were obtained from GPS data measured from instrumented vehicles in the public and private transport networks. An SP-off-RP optimal design considering dependence among attribute levels was also developed. Finally, several 1–7 Likert scale questions were included to incorporate the latent variables.

Findings – The survey methodology described in this chapter represents a successful experience in terms of collecting high quality data, from different sources, with the aim of estimating appropriate time-of-day choice models. The data collection process was carried out in different stages, by means of web pages, email, and personal interviews. The data was further enriched with level-of-service attributes measured at different times of the day with unusual precision. Preliminary results reported in this chapter show that data obtained through this methodology are appropriate to model time-of-day choices.

Originality/value of chapter – The novelty of the survey methodology described in this chapter is the collection of data of a different nature for time-of-day choice modelling through the integration of different collection techniques. Acquisition of very precise information about preferred departure/arrival times, level of service at different times of the day, detailed information about flexibility in schedules, employment information and attitudes towards departure times, should allow practitioners to estimate hybrid time-of-day choice models incorporating latent variables.

Keywords: Time-of-day choice (TOD); revealed preference (RP); stated preference (SP); GPS; survey; travel

27.1. Introduction

Understanding time-of-day (TOD) choice is crucial when studying people's behaviour in today's congested networks and evaluating the effectiveness of transport policies designed to deal with this problem (Bhat & Steed, 2002). Reductions in congestion can be achieved by spreading departure times into the 'shoulder' or off-peak periods, or by achieving a significant shift from private to public transport. Empirical evidence suggests that modifications in departure time are a more frequent response strategy for avoiding congestion or the charges associated with travel demand management (TDM) policies than changes in transport mode (Bianchi, Jara-Díaz, & Ortúzar, 1998; Hendrickson & Plank, 1984; Hess, Daly, Rohr, & Hyman, 2007a). Notwithstanding, such shifts in departure time still rank below route changes (Ortúzar & Willumsen, 2011). Thus, although TOD choice is one of the main determinants of the temporal and spatial distribution of demand, it has historically received less attention than mode or route choice.

One of the biggest barriers to TOD model development and implementation is usually the type of input data required. To estimate TOD models that properly capture the factors influencing the choice, it is necessary to develop complex data collection procedures that yield detailed input data from different sources and at different time periods. TOD choice not only depends on the desire to carry out activities at certain times and places; it is a complex decision process influenced by travel conditions, congestion levels, activity schedules, and external trip factors. Given this complexity, we argue that a possible way to include the different choice factors is to estimate hybrid choice models combining revealed preference (RP) data, stated preference (SP) data and attitudinal questions.

The main focus of this chapter is the description and discussion of the survey methodology we designed to obtain high quality data for TOD modelling considering different factors affecting the choice. The survey procedure was developed in three stages for a sample of 405 individuals who work in the centre of Santiago, Chile. Various collection methods were used (e-mail, web page, and personal interviews at the workplace). Preliminary model results are also reported, but only as an example, as they do not take into account the full complexity of the behavioural processes involved.

The remainder of the chapter is organised as follows. We first present a brief review of relevant literature regarding departure time choice modelling, looking separately at formulation features and types of data reported in previous applications. This is followed by a detailed description of our survey work and by the presentation of some preliminary model results. Finally, some conclusions and directions for further research are given.

27.2. Literature Review

27.2.1. Departure Time Choice Models

The Scheduling Model (SM) developed by Small (1982) is the most widely used model formulation in this area. The popularity of SM is due mainly to the inclusion of schedule delay (SD) terms, which are motivated by the earlier work of Vickrey (1969) and represent the amount of time people arrive late or early at their destinations in comparison with their desired arrival times. SM can successfully represent trade-offs between travel time and schedule delay terms, and can be written as in equations (27.1) to (27.5):

$$V_i = \beta_{TT}\,TT_i + \beta_{SDE}\,SDE_i + \beta_{SDL}\,SDL_i + \delta_L\,d_L \tag{27.1}$$

$$SDE_i = \max\{-SD_i,\ 0\} \tag{27.2}$$

$$SDL_i = \max\{0,\ SD_i\} \tag{27.3}$$

$$d_L = \begin{cases} 1 & \text{if } SDL_i > 0 \\ 0 & \text{if } SDL_i = 0 \end{cases} \tag{27.4}$$

$$SD_i = \text{Observed arrival time} - \text{Preferred arrival time} \tag{27.5}$$


where the subscript $i$ refers to alternatives (given by time periods), $TT_i$ indicates the travel time when departing in period $i$, $SDE_i$ and $SDL_i$ represent SD for arriving early or late, respectively, in period $i$, and $d_L$ is a late-arrival dummy whose coefficient $\delta_L$ acts as a penalty for arriving late at the destination (independent of the actual amount of lateness). Although Small's original formulation does not include the cost of travel, most applied work underpinned by this formulation does.

Some important changes to this formulation have been the inclusion of travel time variability (Small, Noland, & Koskenoja, 1995) and daily activity participation time (de Jong et al., 2003; Ettema, Ashiru, & Polak, 2004; Hess, Polak, Daly, & Hyman, 2007b). Travel time variability is inherent to any transport network and could have an impact on the departure time choice behaviour of risk-averse travellers who want to avoid it. Daily activity participation time is relevant as well because it influences trip making, the order of activity participation and trip departure time choice.

Departure time choices are not determined only by the attributes discussed above; models should also consider employment characteristics, individual socio-economic characteristics, schedule preferences, and possibly information from other choices which may interact with time-of-day choice (e.g. route and mode choices), among others. Several complementary attributes have been used in TOD choice model studies, but there is no consensus about which ones perform better or how to include key attributes such as flexibility in work schedules. More insight into the factors influencing departure time choices is needed to define which attributes, and in which functional form, could better represent the TOD choice process.
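As a minimal sketch of how equations (27.1) to (27.5) fit together, the following Python function evaluates the systematic utility of one departure period. The coefficient values, the class and function names, and the three example alternatives are hypothetical and chosen only for illustration; they are not estimates from this study. Cost is omitted, as in Small's original formulation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SchedulingCoefficients:
    """Illustrative (hypothetical) coefficients, not estimates from the chapter."""
    beta_tt: float = -0.05   # per minute of travel time
    beta_sde: float = -0.03  # per minute of early schedule delay
    beta_sdl: float = -0.08  # per minute of late schedule delay
    delta_l: float = -0.50   # fixed penalty for any late arrival


def scheduling_utility(travel_time, arrival_time, preferred_arrival_time,
                       coef=SchedulingCoefficients()):
    """Systematic utility V_i of one departure period, equations (27.1)-(27.5).

    All times are in minutes (arrival times as minutes after midnight).
    """
    sd = arrival_time - preferred_arrival_time   # (27.5)
    sde = max(-sd, 0.0)                          # (27.2): minutes early
    sdl = max(0.0, sd)                           # (27.3): minutes late
    d_l = 1.0 if sdl > 0 else 0.0                # (27.4): late-arrival dummy
    return (coef.beta_tt * travel_time + coef.beta_sde * sde
            + coef.beta_sdl * sdl + coef.delta_l * d_l)   # (27.1)


# Preferred arrival at 08:30 (510 min); three hypothetical departure periods.
alternatives = {
    "early": scheduling_utility(travel_time=25, arrival_time=495,
                                preferred_arrival_time=510),
    "on_time": scheduling_utility(travel_time=40, arrival_time=510,
                                  preferred_arrival_time=510),
    "late": scheduling_utility(travel_time=30, arrival_time=520,
                               preferred_arrival_time=510),
}
best = max(alternatives, key=alternatives.get)
```

With these illustrative values the early alternative has the highest utility (about -1.70, against -2.00 and -2.80): the shorter off-peak travel time more than compensates for arriving 15 minutes early, while the late alternative pays both the per-minute lateness term and the fixed late penalty.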

27.2.2. Survey Data For Departure Time Models

In recent years, most studies concerned with TOD choice have made use of SP data and have been based on estimating SM using fairly simple model structures (e.g. multinomial logit, MNL). SP data are more popular in departure time modelling work than RP data, because the latter are very difficult to obtain (Hess et al., 2007b; Tseng, Koster, Peer, Knockaert, & Verhoef, 2011) and require a rigorous and highly expensive data collection procedure, while also being affected by significant problems with inter-coefficient correlations. Even though mixing RP and SP data for estimating richer models has a long tradition in transport studies, this is not the case for TOD choice studies. To our knowledge, only Börjesson (2008) and Tseng et al. (2011) have used an explicit mixed RP/SP approach for departure time modelling (see Table 27.1).

Another issue that remains unexplored in TOD models is the incorporation of attitudes and perceptions of individuals. The inclusion of latent variables (LV) capturing attitudes and perceptions has gained popularity in recent years, but again this has not been the case for TOD choice models. It is not clear which non-observed factors could influence the choice process, and further research is needed in this area. Mixing data and embedding LV in TOD models could be expected to increase their explanatory power.


Table 27.1: Survey data and model type used in some previous TOD studies.

| Survey data | Discrete model type | Authors |
|---|---|---|
| RP | MNL | Small (1982), Hendrickson and Plank (1984) |
| RP | MNL, OGEV | Steed and Bhat (2000) |
| RP | MNL, NL, OGEV, ML | Bhat (1998) |
| SP | MNL | Ettema et al. (2004) |
| SP | NL | Polak and Jones (1994) |
| SP | ML | Bajwa, Bekhor, Kuwahara, and Chung (2008) |
| SP | MNL, NL, ML | de Jong et al. (2003), Hess et al. (2007a, 2007b) |
| SP | OP | Bianchi et al. (1998) |
| RP/SP | ML | Börjesson (2008), Tseng et al. (2011) |

Note: NL, nested logit; OGEV, ordered generalised extreme value; ML, mixed logit; OP, ordinal probit.

27.3. Survey Design

The survey methodology described below was developed as part of a TOD choice project in Santiago, the capital and most important city of Chile. Its population is approximately 6 million, living in an area of approximately 15,400 km². According to the 2001 Origin-Destination Survey (DICTUC, 2003), about 16.3 million journeys took place in Santiago every working day, most of them being radial (i.e. into the CBD in the morning and out again in the evening). The increased congestion and the forthcoming consideration of TDM strategies in Santiago motivated the development of our project to study departure time decisions in the context of transport project appraisal. Due to Chile's fast economic growth in the last 20 years, car ownership and motorised trip rates have increased substantially, causing congestion in the city at certain hours and locations. This has led to repeated consideration of TDM strategies by local authorities.

The instrument traditionally used by Santiago's local authorities both to plan and to evaluate changes to the city's transport system has been the strategic transport model ESTRAUS (Fernández & De Cea, 1990). This is a highly sophisticated model which even contains a departure time module based on entropy maximisation principles (De Cea, Fernández, Dekock, & Soto, 2005). Although ESTRAUS is recalibrated periodically using new mobility data, the departure time module has not been calibrated, and the survey data available for calibration do not include the information needed to derive activity participation and schedule delay measures (e.g. preferred arrival/departure times, detailed employment information, programmed schedules).


As TOD choice is influenced by multiple and diverse factors, it was decided to collect different data sources (RP, SP and LV) using various collection methods (e-mail, web, and personal interviews) following efficiency, reliability and cost criteria. Surveys were conducted at different employment centres in Santiago, on an initial sample of approximately 600 workers. Due to the amount of information sought, it was decided to organise the data collection programme in three different stages, the first two for the RP component and the last one for collecting attitudinal questions and an SP exercise based on the previous RP responses.

27.3.1. Preliminaries

We decided to conduct a preliminary survey to deepen our knowledge of the attributes influencing TOD choice and to aid the design of the RP, SP and LV survey instruments. Approximately 250 workers who did not participate in the subsequent stages of the project were surveyed in this preliminary survey (Arellana, 2012; Arellana et al., 2012b). Its results showed that TOD choice appeared to be influenced mainly by congestion, schedules, travel conditions, and external factors. Congestion could be represented by attributes such as travel time variability, travel time, and information about congestion in the network. Schedules could be defined by starting/ending times of activities, number of working hours and planned activities at different places, among others. Travel conditions could be related to attributes such as travel cost, comfort, trip security, and mode of transport to complete the journey. External factors could be associated with weather conditions, travelling alone, and constraints imposed by other activities or circumstances.

Another important finding of this preliminary survey was that the importance of the above factors differs by type of trip. Schedules seem to be more important for work trips than for non-work trips. Work trips usually have tight arrival and departure time constraints imposed by employers (i.e. official starting/ending work times), while non-work trips tend to be more flexible, as they can be moved easily through the day or even transferred from one day to another.

The relevant factors found in the preliminary survey were included in the final survey forms through the different attributes within the final RP, SP and LV components. To ensure relevance and to improve the clarity of the survey questions, we first asked for expert opinion. We then conducted three focus groups to test understanding of the survey forms for the SP component, to define the way of applying the survey in the field, and to select the most attractive incentives for respondents. Finally, a pilot survey was applied as a last step prior to final implementation to determine non-response rates, identify problematic questions and improve the survey design.

27.3.2. Overview Of Design Features For The Whole Survey

Due to the budget limitations inherent to any academic study, we decided to narrow the scope of the study to examine the departure time of work trips and to use the individual, instead of the household, as the survey sampling unit. We decided to focus on work trips because they represent nearly 70% of the total weekday trips in the morning peak, according to Santiago's 2006 Origin-Destination survey (DICTUC, 2006). On the other hand, although the household is the most common sampling unit in transport surveys, it is more expensive and more difficult to collect data from households than to gather individual responses.

As the main aim of the survey was to obtain robust data at minimum cost, the survey was designed following efficiency, reliability and cost criteria. A three-step collection process was used to minimise the respondent burden associated with the large amount of information required. Additional advantages of collecting data in different steps are the possibility of validating data in the interim, correcting item non-response from earlier stages, and customising parts of the survey, if desired. Drawbacks of implementing a three-step survey are that some respondent desertion is expected to happen between stages and that higher costs and times are associated with having more visits to each respondent.

The first stage of the survey was a CAPI interview focused on collecting socio-economic and employment data, factors influencing scheduling decisions, and information about the schedule of planned activities for the second stage travel day. The survey's second step involved filling in a web-based travel diary following an activity recall framework (Ampt & Ortúzar, 2004). Finally, the third stage involved another CAPI to collect responses to an SP-off-RP experiment along with an attitudinal questionnaire (both focused on work-based trips).

In order to minimise the total cost of the survey, it was decided to collect the first and third stages directly at the workplaces and to use a web-based self-completion form for the second stage. Personal interviews have the advantage of better response rates (Ortúzar, 2006). Surveying directly at employment centres where the respondent's superiors have previously introduced the interviewers and supported the study has shown very high response rates and time collection efficiency in previous surveys in Chile (Cantillo, Heydecker, & Ortúzar, 2006; Caussade, Ortúzar, Rizzi, & Hensher, 2005; Hojman, Ortúzar, & Rizzi, 2005). On the other hand, the decision to use self-completion forms for the survey's second stage was an outcome of one of the focus groups held prior to the final survey implementation.

The success of the survey implementation process involving personal interviewing was clearly a function of the workplaces selected and of interviewer training. Respondents were selected randomly from previously chosen workplaces where the authors and/or the main supporting institution of this project, Pontificia Universidad Católica de Chile, had a direct relation with at least one member of the managing staff. Even though work destinations were fixed a priori, the origins of morning work trips were spread all over the city (see Figure 27.1). Because of higher response rates during previous studies, and as an outcome of the focus groups, it was also decided to use female students as interviewers. They were specially trained for two days, until we confirmed their full understanding of the survey.


Figure 27.1: Location of respondents' homes and workplaces.

27.3.3. Sample Size Calculation

We approached this subject using the formula of Smith (1979) for the required sample size $n$, as indicated in equation (27.6).

$$n = \frac{S^2\,Z_\alpha^2}{d^2} \tag{27.6}$$

where $S$ refers to the standard deviation of the variable under study, $d$ to the absolute accuracy level desired, and $Z_\alpha$ is the standard Normal value for a confidence level $\alpha$. Our objective was to obtain an indicative value of the sample size required to replicate the distribution pattern of working trips every 15 min on a working day as reported in Santiago's 2006 Origin-Destination survey (DICTUC, 2006); in that survey the critical 15 min period carried 13.75% of the total work trips in the morning peak (i.e. between 7:00 am and 9:15 am). Then, $S$ can be determined as indicated in equation (27.7).

$$S = \sqrt{p(1 - p)} = \sqrt{0.1375\,(1 - 0.1375)} = 0.344 \tag{27.7}$$
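Equations (27.6) and (27.7) combine into a short calculation. The sketch below is a minimal illustration using the accuracy level (0.02) and confidence level (90%) adopted in this study. The function name is hypothetical, and reading $Z_\alpha$ as the one-sided 0.90 quantile of the standard Normal (about 1.28) is an assumption inferred from the arithmetic, since that reading reproduces the sample size of 487 reported in this section; the chapter does not state which quantile was used.

```python
import math
from statistics import NormalDist


def required_sample_size(p, d, confidence):
    """Sample size from equation (27.6), n = S^2 * Z^2 / d^2, rounded up.

    p          -- share of trips in the critical period (gives S via 27.7)
    d          -- absolute accuracy level
    confidence -- probability used to look up Z as a one-sided Normal quantile
                  (an assumption; see the accompanying text)
    """
    s = math.sqrt(p * (1.0 - p))              # equation (27.7)
    z = NormalDist().inv_cdf(confidence)      # about 1.28 for confidence=0.90
    return math.ceil(s ** 2 * z ** 2 / d ** 2)


n = required_sample_size(p=0.1375, d=0.02, confidence=0.90)
```

Under the one-sided reading, `n` evaluates to 487. Using the two-sided 90% value instead (about 1.645) would give a sample of roughly 800, so the choice of quantile matters for anyone reusing the formula.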

Hence, assuming an absolute accuracy level of 0.02 and a 90% confidence level, the sample size should be 487 individuals according to equation (27.6). However, as some unit non-response among different survey stages is expected (Stopher, Wilmot, Stecher, & Alsnih, 2006), we decided to collect a larger sample at the first stage. The total sample collected in the first stage was therefore 600 individuals (i.e. approximately 20% more than the calculated sample size, based on non-response rates obtained from the pilot survey). To minimise unit non-response, some maintenance methodologies were used. Initial contact with participants was made through the management staff at each workplace that supported the study. Our experience has shown that respondents clearly become more involved when their superiors recommend that they participate in the study.

27.3.4. Strategies To Mitigate Respondents' Attrition Through Successive Survey Stages

To keep respondents interested during the survey implementation, reminder e-mails and letters of gratitude were sent to each participant before and after each stage of the survey. In addition, we tried to maintain the same interviewer for the same respondent at different stages of the survey. Finally, as previous experience with large-scale surveys in Chile had shown that respondents prefer raffles as an incentive mechanism, all respondents were included in several monetary sweepstakes carried out after completion of the last two survey stages. The number of sweepstakes at every stage was five, and each offered a cash prize of Ch$ 100,000 (i.e. about US$ 200). Although the transport survey literature does not provide information about the effectiveness of sweepstakes as incentives for all respondents (Stopher et al., 2006), participation in these monetary raffles was also the preferred incentive among the different ones tested in the focus groups (the alternatives tested were: cash incentives for each participant, five raffles of two return air tickets between Santiago and Buenos Aires, and laptops and other electronic devices).

27.3.5. Data Collection Programme

Our three-stage survey data collection process followed an established formal procedure, summarised in Table 27.2, containing several scheduled personal/virtual contacts and the aid of different types of reminders to minimise respondent desertion. Note that the first two contacts were made to managing staff at the employment centres where the surveys were collected. After step 3, contacts and visits were scheduled directly with respondents. In addition, very precise level-of-service (LOS) variables were simultaneously measured during a specific week when the RP component data collection was taking place. As mentioned before, the survey had four information sources: RP data collected between steps 3 and 5, SP-off-RP games collected at step 7, attitudinal/LV data collected during steps 3 and 7, and LOS variable measurements.

Table 27.2: Schedule of contacts and reminders for data collection.

| Step | Content | Contact type | Dates |
|---|---|---|---|
| 1 | Workplace recruitment letter | E-mail | November 2010 |
| 1a | Reminder 1: workplace recruitment letter | E-mail | 1 week after step 1 |
| 1b | Reminder 2: workplace recruitment call | Telephone | 1 week after step 2 |
| 2 | Sample selection and first visit appointment | E-mail | 1 week before the visit |
| 3 | Visit 1 (RP part 1): motivational interviewer speech and memory jogger delivery | CAPI | December–February (day before travel day) |
| 4 | Reminder to use the memory jogger and thanks for completing RP part 1 | E-mail | December–February (travel day) |
| 5 | RP part 2 delivery and motivation to complete this part | E-mail | December–February (day after travel day) |
| 5a | Reminder 1: complete RP part 2 | E-mail | 1 week after travel day, if necessary |
| 5b | Reminder 2: complete RP part 2 | E-mail | 2 weeks after travel day, if necessary |
| 5c | Reminder 3: complete RP part 2 | E-mail | 3 weeks after travel day, if necessary |
| 5d | Reminder 4: complete RP part 2 | E-mail | 4 weeks after travel day, if necessary |
| 6 | Incentive winners announcement, thanks for completing part 2 and information about the beginning of visit 2 | E-mail | May |
| 7 | Visit 2 (SP-off-RP part + attitudinal questionnaire) | CAPI | June |
| 8 | Final incentive winners announcement, thanks for completing the whole survey | E-mail | July |
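The part of the contact schedule in Table 27.2 that depends on each respondent's travel day (steps 4 to 5d) can be sketched as a small scheduling routine. This is only an illustration of the protocol, not the tool used in the study: the function name, the step labels as strings, and the example dates are hypothetical.

```python
from datetime import date, timedelta


def contact_plan(travel_day, response_date=None):
    """E-mail contacts around the travel day (steps 4, 5 and 5a-5d).

    Weekly reminders are only issued "if necessary", i.e. while the RP
    part 2 diary has not been received (response_date is None or later
    than the reminder date).
    """
    plan = [
        ("4", travel_day),                      # memory jogger reminder
        ("5", travel_day + timedelta(days=1)),  # RP part 2 delivery
    ]
    for k, suffix in enumerate("abcd", start=1):
        reminder_day = travel_day + timedelta(weeks=k)
        if response_date is not None and response_date <= reminder_day:
            break  # diary already received, no further reminders needed
        plan.append((f"5{suffix}", reminder_day))
    return plan


# Hypothetical respondent with travel day on 12 January 2011.
no_response = contact_plan(date(2011, 1, 12))
answered = contact_plan(date(2011, 1, 12), response_date=date(2011, 1, 22))
```

In this example a respondent who never answers receives all four reminders, the last one four weeks after the travel day, whereas one who answers ten days after the travel day receives only the first reminder.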


27.3.6. RP Component

The survey collection protocol began with a brief introduction of the project and the interviewer. In this introduction the interviewer had to explain the purpose of the study, who was conducting it and the terms of participation, and ask the respondent to sign a consent agreement. Then the RP survey was presented. The RP survey instrument was based on the 2001 Origin-Destination survey for Santiago (Ampt & Ortúzar, 2004). Because of the need to collect a very large amount of detailed information, the RP data collection was divided into two parts. The first was a CAPI interview at the workplace, conducted by specially trained interviewers using an online form implemented in SurveyMonkey. The first stage questionnaire had the following sections:

a) Personal information: As the survey was designed to be anonymous and the data gathered were confidential, in this section a unique identification number (ID) was assigned to each respondent and a contact e-mail requested. In addition, typical socio-economic data were collected, such as age, gender, educational level, and possession of a driving licence.

b) Household information: This section requested other common household-related information, such as home address, borough, household size, number and types of motorised/non-motorised vehicles in the household, and number of household members with a driving licence.

c) Employment information: Variables considered in this section were number of working hours and number of times per week that the respondent worked at the same place, possibility of working from home, whether the respondent had to work a fixed or variable number of hours per day, and information about additional services provided by the employer such as free parking, a company vehicle, fuel subsidy, or free bus transportation to and from work.

d) Factors influencing TOD choice: This involved two rating questionnaires about the influence of different attributes on TOD choice, focusing on work and non-work trips during a weekday, using a 1 to 7 Likert scale. These questionnaires were the same as those applied in the preliminary stage, and their use is not reported in this chapter. Answers to these questionnaires were used mainly for segmentation purposes in a qualitative work-in-progress study of departure time preferences (Arellana et al., 2012b).

e) Plan of activities for the travel day: In order to capture the preferred arrival and departure times during the day, respondents were asked to state the mode of transport used and the starting/ending times of all activities planned in advance for the day when the survey diary was to be completed.

At the end of the first stage of the survey, the interviewers explained the second stage collection method, assigned an RP diary completion day, and gave respondents a memory jogger to help them with the second part of the data collection process. This involved filling in a web-page self-completion travel diary following an activity recall framework (Ampt & Ortúzar, 2004); respondents had to record their journeys in the context of the activities undertaken during the selected travel day, rather than simply reporting the trips they made (Ortúzar, 2006). This framework is used to facilitate the task of recording all trips made during the day; authors have reported more accurate travel measurement data using this method (Stopher, 1998).

The second stage of the RP component of the survey began with an e-mail sent to each respondent at the beginning of the assigned travel day, to remind them to fill in their memory joggers. Memory joggers allow respondents to record the activities undertaken during the assigned travel day in a simple way; they were designed for easy self-completion and to be of pocket size for carrying convenience. At the end of the day or during the next day, a web-page self-completion form implemented in SurveyMonkey and requesting full details of all their trips was sent to each respondent. To facilitate the completion of this form, respondents were asked to use the memory joggers to help recall all the journeys made during the travel day. The form requested stage-based trip data, i.e. all movements on a public street (e.g. separating the walk to the bus from travelling on the bus), to ensure detailed mode/TOD data analysis (Ortúzar, 2006). Information about all travel modes, including non-motorised trips, was requested. Variables captured for each trip at this stage were: departure/arrival times, travel times, waiting times, access times, costs, routes and modes of transport used, origin, destination, and trip purpose, among others. In addition, information about official work starting/ending times and schedule flexibility was collected. Specifically, the latter was captured through questions about penalties and incentives for arriving at times different from the official work starting/ending times, and by the amount of time by which respondents were willing to change their usual departure times for work and non-work trips.

During a four-week period, reminders both to fill in the memory joggers and to complete the second RP part were sent to individuals each week if responses had not been received. Reminders recommended filling in both forms within one day, to avoid the recall problems associated with reporting detailed records of journeys after a few days (Yáñez, Mansilla, & Ortúzar, 2010).

27.3.7. Measuring Level-Of-Service Variables

To enrich the RP observations and to generate some of the SP attributes, highly precise LOS measures at different time periods were obtained from GPS measurements using instrumented vehicles in the public and private transport networks. In-vehicle GPS technology makes it possible to use vehicles as probes for estimating network LOS performance measures more accurately (Storey & Holtom, 2003). Using accurate LOS data in travel demand modelling is not the usual practice (Daly & Ortúzar, 1990).

Public transport operators (i.e. Metro S.A. and Transantiago S.A.) provided us with GPS data for all public transport vehicles travelling in the city during the selected week in January when the RP component data collection process was taking place. LOS variables for private transport networks were measured using GPS-instrumented cars travelling on most of the routes reported by our respondents at different times of the day. As GPS data were available every 30 seconds and Santiago's transport network is highly dense, handling these rich data was challenging. To avoid labour-intensive and time-consuming procedures for LOS estimation, a novel methodology to obtain LOS measures at either fixed or variable space-time aggregations was developed (Arellana et al., 2012c). An appealing feature of our procedure is that all estimation stages use freely available software such as Google Maps™, Octave and MySQL.

To assign relevant LOS variables to each individual, a virtual network with the different modes and routes available at different times of the day was created for every respondent. The origins and destinations of trips were first geocoded within a geographical information system (GIS) and then exported to TransCAD (www.caliper.com), where the network was created and LOS attributes assigned to every respondent. The LOS measures assigned to the virtual network of each individual were: cost, travel times between public transport stops, car travel time by road segment, travel time variability, access time, and waiting times for every stop and each period of the day.
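The idea of turning probe observations into LOS measures at a fixed time aggregation can be sketched as follows. This is not the procedure of Arellana et al. (2012c), which is not described in this chapter; it is a simplified illustration that assumes segment traversal times have already been derived from the GPS traces. The function name, the record layout and the 15-minute period are assumptions.

```python
from collections import defaultdict
from statistics import mean, pstdev


def aggregate_los(observations, period_minutes=15):
    """Group segment traversals by (segment, time period) and summarise them.

    observations: iterable of (segment_id, entry_minute, travel_time_minutes),
    where entry_minute is minutes after midnight when the probe vehicle
    entered the segment.
    """
    buckets = defaultdict(list)
    for segment_id, entry_minute, travel_time in observations:
        period_start = int(entry_minute // period_minutes) * period_minutes
        buckets[(segment_id, period_start)].append(travel_time)
    return {
        key: {
            "n": len(times),
            "mean_tt": mean(times),   # mean travel time in the period
            "sd_tt": pstdev(times),   # simple travel time variability measure
        }
        for key, times in buckets.items()
    }


# Hypothetical traversals of segments "A" and "B" in the morning peak.
observations = [
    ("A", 7 * 60 + 32, 4.0),
    ("A", 7 * 60 + 41, 6.0),
    ("A", 8 * 60 + 2, 9.0),
    ("B", 7 * 60 + 35, 3.0),
]
los = aggregate_los(observations)
```

Each key of `los` pairs a segment with the start of its period in minutes after midnight, so `("A", 450)` summarises the two traversals of segment A that began between 07:30 and 07:45. A variable space-time aggregation would replace the fixed `period_minutes` with boundaries chosen from the data.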

27.3.8. SP Component The SP component of the project had the aim of incorporating the effect of two demand management policies that are not currently in use in Chile (road pricing and flexible entry hours to work). Only people travelling by motorised transport modes and not transferring among public and private transport modes were considered to answer this survey stage (359 respondents). The first screen on this CAPI was only informative, and its aim was to provide some context information indicating the reasons for considering TDM policies in Santiago, which strategies would be evaluated (i.e. work hour flexibility and the implementation of congestion charging), and which behaviour was expected to be studied (i.e. re-timing of activities and/or mode switching behaviour). As the RP and SP data collection stages were not too close in time, some important features of each trip reported in the RP component were listed to avoid possible forgetfulness of specific travel situations which were the base of the SP games. After presenting this preliminary information, two SP questionnaires were presented sequentially to each respondent. The first one contained five choice situations while the second had eight. The first questionnaire was focused on trips to work in the AM peak and the second considered the complete work tour comprising its outbound and return legs. The design procedure used to obtain such questionnaires at this stage was fairly challenging. There is no consensus regarding the design generation process for TOD choice SP experiments. Indeed, commonly used design procedures in past studies have not used efficiency criteria, which is a standard feature in current SP applications. Just recently, Koster and Tseng (2010) developed an experimental design


procedure including efficiency criteria in the design generation to address one of the most important difficulties when generating scheduling model (SM) based choice experiments, that is, when the model attributes are not the same attributes shown to the respondents (e.g. schedule delay terms are not directly presented to respondents; rather, they are obtained from the difference between presented arrival times and preferred arrival times, which are not presented). However, to achieve realistic SM based choice experiments, the design procedure must also deal with: (i) the potential dependency among different attribute levels of the same alternative, and (ii) the fact that customised choice situations based on specific characteristics of each respondent's journeys can make the experiment more realistic but can also make it difficult to use a unique efficient design for the whole sample. Dependency, where attribute levels of alternative j are generated from the attribute levels of a reference alternative i, can be accommodated in pivot designs such as those discussed by Rose, Bliemer, Hensher, and Collins (2008). However, additional complications arise when an attribute level within alternative j depends on another attribute level of the same alternative j, which in turn is also part of the design. This last kind of dependency is the one usually present in TOD models because travel times and costs depend on departure times. Not accounting for such dependency in a survey can give rise to unrealistic choice situations, as can a failure to align the scenarios presented with actual perceived alternatives in terms of realistic attribute combinations from the respondent’s perspective. While pivoting around current values can help in this context, customised levels must be carefully reviewed before applying the survey to avoid presenting infeasible or irrelevant trade-offs to respondents. 
Sometimes certain variation levels do not work well for the entire sample, because the differences postulated between reported and new alternatives are too large or too small. For these reasons, some additional constraints are needed to give more realism to the choice situations and avoid presenting useless trade-offs. To address interdependence among attribute levels and to cope with the other abovementioned difficulties in these designs, a Bayesian efficient SP-off-RP step design was developed (Arellana et al., 2012a). A necessary condition for developing these designs is to have detailed information about the reference point schedule of each respondent; this was obtained at the RP component of the survey. Such a customised design was implemented in Excel. Example SP choice screens are shown in Figure 27.2. Each questionnaire contained five columns: The first gives information about attributes, the second, third and fourth represent the timing of alternatives (i.e., travelling at early/current/late time), and the rightmost column offers the possibility of travelling by a different mode, at around the same time as the originally reported trip. Public transport was input as the alternative mode for private transport users; if available, car was the primary alternative to public transport users; if not, users were offered a new shared-taxi service. To minimise the impacts of inertia or of reading left-to-right effects, the position of the re-timing alternatives was randomised. Each respondent was asked to make one choice within the four available alternatives in the first questionnaire, and two choices in the second one (for the


outbound and return legs of the tour). Unless respondents decided to change mode, they had the possibility of choosing different alternatives for the outbound and return trips, generating a 10-alternative choice set as illustrated in Figure 27.3.

Figure 27.2: Illustrative SP choice screens for both questionnaires (original was in Spanish and costs were in Chilean pesos; Ch$500 = 1 US$).
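The structure of the 10-alternative tour choice set can be sketched in a few lines of Python. This is only an illustration of the combinatorics described in the text, not the survey's implementation (the design itself was built in Excel); the function name and labels are hypothetical, and the sketch assumes that every outbound timing can be paired with every return timing while a mode change applies to the whole tour.

```python
from itertools import product

# Timing options offered for each leg of the work tour (labels follow the text).
TIMINGS = ["Current", "Early", "Late"]


def tour_choice_set():
    """Return the tour alternatives as (outbound, return) pairs.

    Assumption: each outbound timing combines with each return timing,
    and changing mode is a single alternative covering both legs.
    """
    alternatives = list(product(TIMINGS, repeat=2))  # 3 x 3 re-timing combinations
    alternatives.append(("Change", "Change"))        # single mode-change alternative
    return alternatives


if __name__ == "__main__":
    for number, (outbound, ret) in enumerate(tour_choice_set(), start=1):
        print(f"Alternative {number}: outbound={outbound}, return={ret}")
```

Enumerating the set this way makes it easy to check that nine re-timing combinations plus one mode change give the ten alternatives used in the tour game.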

27.3.9. Attitudinal Questionnaire

To allow for the potential embedding of latent variables (LV) in TOD choice modelling, we included a section with eight attitudinal questions plus a personal

Figure 27.3: Available alternatives in tour models.

Outbound trip | Return trip | Alternative
Current | Current | Alternative 1
Current | Early | Alternative 2
Current | Late | Alternative 3
Early | Current | Alternative 4
Early | Early | Alternative 5
Early | Late | Alternative 6
Late | Current | Alternative 7
Late | Early | Alternative 8
Late | Late | Alternative 9
Change | Change | Alternative 10

income question at the end of the SP questionnaire. This attitudinal questionnaire was also applied to people who did not face the SP questionnaire; its purpose was to collect attitudinal indicators to form the LV. Hence, we asked people to rate the eight statements shown in Table 27.3 on a 1–7 scale, where 1 meant strong disagreement and 7 meant strong agreement with the statement. Table 27.3 summarizes the statistical results of this questionnaire. Respondents were more worried about experiencing variable travel times (statement 1) than about meeting schedules. Furthermore, respondents appear more willing to pay to travel at their current times than to change their amount of working hours in order to travel at less congested times.

27.4. Results

27.4.1. Implementation Results

A total sample of 405 respondents completed the whole survey. In the pilot survey relatively high non-response (NR) rates were observed, about 20% in each of the first two stages. In the final survey, NR rates for the first and second stages were lower than in the pilot, but the third-stage NR was higher. Table 27.4 shows the total number of respondents at each stage of the survey.


Table 27.3: Subjective attitudes by users (N = 405).

Statement | Mean | SD
1. I really dislike not being able to predict the length of my journey. | 6.7 | 0.7
2. It is very important for me to be on time at work. | 5.4 | 1.7
3. It is very important for me to leave work on time. | 5.7 | 2.2
4. I care more about meeting schedules in the morning than in the evening. | 5.3 | 2.1
5. I am willing to travel earlier or later than normal if these decisions can reduce my travel time. | 5.2 | 1.9
6. I would dislike to pay more to travel at my preferred time. | 4.4 | 2.1
7. I am happy to vary the amount of work in different days in order to travel during less congested hours. | 4.8 | 2.1
8. I am willing to pay more for travelling in more comfortable conditions and/or during less congested hours when my travel times are reduced. | 5.4 | 1.8

Table 27.4: Number of survey respondents per stage.

Stage | Description | Collection method | Projected respondents | Collected respondents | NR (%)
Pilot survey
1 | RP part 1 | CAPI | 50 | 40 | 20
2 | RP part 2 | Web self-response | 40 | 32 | 20
3 | SP + attitudinal | CAPI | 32 | 30 | 6
Final survey
1 | RP part 1 | CAPI | 600 | 598 | 0
2 | RP part 2 | Web self-response | 598 | 493 | 18
3a | SP + attitudinal | CAPI | 359 | 278 | 23
3b | Only attitudinal | CAPI | 134 | 97 | 28
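The NR percentages in Table 27.4 follow directly from the projected and collected counts at each stage. The short sketch below recomputes them for the final survey; the function and variable names are hypothetical, and rounding to whole percentages is assumed to match the table.

```python
def non_response_rate(projected, collected):
    """Percentage of projected respondents who did not respond at a stage."""
    if projected <= 0:
        raise ValueError("projected must be positive")
    return 100.0 * (projected - collected) / projected


# (stage, projected, collected) for the final survey, as reported in Table 27.4.
FINAL_SURVEY = [("1", 600, 598), ("2", 598, 493), ("3a", 359, 278), ("3b", 134, 97)]


def rounded_rates(stages):
    """Map each stage label to its NR rate rounded to a whole percentage."""
    return {stage: round(non_response_rate(p, c)) for stage, p, c in stages}


if __name__ == "__main__":
    for stage, rate in rounded_rates(FINAL_SURVEY).items():
        print(f"Stage {stage}: NR = {rate}%")
```

Note that each stage's projected count is the number of respondents retained from the previous stage, so the rates are stage-specific rather than cumulative.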

The higher non-response rate obtained at the third stage of the final survey could have been caused by reorganization at some workers’ locations or by staff changes in two of the seven employment centres where the survey took place. Because respondent selection was random within the selected employment centres, some participants had temporary contracts; thus, some people had moved to other employment after the reorganization and their new location could not be found. In addition, NR in the final survey was expected to be greater because most workers within the pilot sample were from our university and had greater commitment to the study.

27.4.2. Socioeconomic Characteristics

A summary analysis of the data reveals that 53% of the respondents were male and that ages ranged from 19 to 82 years, with 22% of respondents younger than 30 and around 22% older than 50. About 76% of the sample held a driver's licence, but only about 32% of the total regularly used a car to travel to work. About 83% of respondents reported having some degree of flexibility in their work schedules (i.e. starting and ending work times) and only some 25% were paid for working extra time.

27.4.3. Departure Time Trends

Respondents’ stated departure times to work during the morning peak follow a similar trend to those observed in the 2006 Santiago Origin-Destination survey data (see Figure 27.4). Differences in the percentages of observed trips in later periods could be explained by a possible bias towards a higher share of white-collar workers (i.e. administrative staff and professionals) in our survey. Possibly, in the 2006 Santiago Origin-Destination data, non-skilled workers (i.e. maintenance and general duties workers) were a more important part of the sample, which is why peaks are found earlier in the 2006 data. White-collar workers usually have higher schedule flexibility and can arrive later at their jobs without incurring penalties.

Figure 27.4: Percentage of trips to work in the morning peak.


27.4.4. Discrete Choice Model Results

Preliminary estimation results based only on our SP data (i.e. 308 individuals) were obtained using the generic formulations (27.8) and (27.9) for the trip and tour games, respectively:

$$V_i^{\mathrm{trip}} = ASC_i + \beta_{TT}\,TT_i + \beta_{Time\_diff}\,Time\_diff_i + \beta_{C}\,cost_i + \beta_{SDE}\,SDE_i + \beta_{SDL}\,SDL_i \tag{27.8}$$

$$V_i^{\mathrm{tour}} = ASC_i^{\mathrm{outbound}} + ASC_i^{\mathrm{return}} + \beta_{TT}\,TT_i + \beta_{Time\_diff}\,Time\_diff_i + \beta_{C}\,cost_i + \beta_{SDE}^{\mathrm{outbound}}\,SDE_i^{\mathrm{outbound}} + \beta_{SDL}^{\mathrm{outbound}}\,SDL_i^{\mathrm{outbound}} + \beta_{PTD}\,PTD_i + \beta_{PTI}\,PTI_i \tag{27.9}$$

where Time_diff is defined as the difference between the ‘worst’ travel time presented in each choice scenario and the usual travel time, divided by the latter in order to normalise it. All remaining variables were defined above, with subscript i representing the four alternatives in the trip data and the ten in the tour data. To link trips before and after work in the tour game’s alternatives, activity participation penalties (i.e. PTD_i and PTI_i) were introduced as proposed by de Jong et al. (2003) and Hess et al. (2007b). Separate constants for the outbound and return legs across the tour game’s alternatives were used to capture general preferences for departing at specific times or by specific modes in both legs. The attributes TT, Time_diff, and Cost refer to both legs in the tour game’s alternatives, while SDE and SDL are outbound-specific (return-specific values cannot be included in a model which also has activity time values). From the above generic formulations, joint multinomial logit (MNL) and mixed logit (ML) models allowing for scale differences between the two games were coded and estimated in Ox (Doornik, 2007). Most estimated coefficients had the expected sign and were significant at the 95% confidence level, except for the lower comfort level coefficient in the MNL model and some interactions in the ML model. Interactions in the ML model became less important (i.e. less significant) as observed heterogeneity was captured by the random components, which seem to represent heterogeneity in the sample better (see Table 27.5). A likelihood ratio test between these models confirms that the ML model should be preferred over the MNL model (LR = 3021.04, p < 0.001). In general terms, people prefer arriving earlier rather than later at their workplace, and are more worried about meeting schedules in the morning (i.e. the outbound trips’ constants are more negative). The ML model estimates indicate that if attributes are kept equal among alternatives, people are more likely to change their departure time than to travel by a different mode, in line with previous findings by de Jong et al. (2003) and Hess et al. (2007a, 2007b). On the other hand, the MNL model shows a different result regarding departure time and mode change sensitivities. In particular, the sensitivity to early departures is lower than that to changing mode but higher than that for late departures, maybe because the constants are capturing some of the heterogeneity not accounted for when a simple model structure is used.
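The trip utility and the Time_diff attribute can be illustrated with a minimal Python sketch. It assumes the linear-in-parameters form of the trip formulation and the standard multinomial logit probability formula; the function names are hypothetical and the coefficient values are placeholders chosen for illustration, not the estimates reported in Table 27.5.

```python
import math


def time_diff(worst_tt, usual_tt):
    """Normalised travel time difference: (worst - usual) / usual."""
    if usual_tt <= 0:
        raise ValueError("usual travel time must be positive")
    return (worst_tt - usual_tt) / usual_tt


def trip_utility(asc, beta, x):
    """Systematic utility: constant plus the sum of coefficient * attribute.

    `beta` and `x` are dicts keyed by attribute name
    (TT, Time_diff, cost, SDE, SDL).
    """
    return asc + sum(beta[name] * x[name] for name in beta)


def mnl_probabilities(utilities):
    """Multinomial logit choice probabilities for a list of utilities."""
    v_max = max(utilities)  # subtract the maximum for numerical stability
    exp_v = [math.exp(v - v_max) for v in utilities]
    total = sum(exp_v)
    return [e / total for e in exp_v]


# Placeholder coefficients for illustration only (not the published estimates).
BETA = {"TT": -0.01, "Time_diff": -0.8, "cost": -0.001, "SDE": -0.01, "SDL": -0.015}

if __name__ == "__main__":
    current = {"TT": 40, "Time_diff": time_diff(55, 40), "cost": 600, "SDE": 0, "SDL": 0}
    early = {"TT": 30, "Time_diff": time_diff(35, 30), "cost": 600, "SDE": 20, "SDL": 0}
    utilities = [trip_utility(0.0, BETA, current), trip_utility(-0.4, BETA, early)]
    print(mnl_probabilities(utilities))
```

The tour formulation would extend `trip_utility` with the two leg-specific constants and the PTD and PTI penalties; the joint estimation with scale differences and random parameters described in the text is beyond this sketch.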


Table 27.5: Estimation results for joint models.

Parameter | MNL estimate | MNL t-stat | MMNL estimate | MMNL t-stat
Alternative specific constant (ASC_i)
Change mode | -0.823 | -15.70 | -2.549 | -16.73
Current departure time - outbound | – | – | – | –
Current departure time - return | – | – | – | –
Early departure time - trip | -0.407 | -4.71 | -0.960 | -4.77
Early departure time - outbound | -0.6717 | -7.45 | -0.600 | -4.06
Early departure time - return | -0.4882 | -5.84 | -0.799 | -6.75
Late departure time - trip | -0.625 | -5.80 | -1.444 | -5.99
Late departure time - outbound | -1.580 | -13.60 | -1.722 | -9.17
Late departure time - return | -1.114 | -11.93 | -1.549 | -11.65
Attributes
Log-Cost (β_C) | -0.689 | -11.15 | -1.554 | -8.89
Travel time (β_TT) | -0.011 | -5.40 | -0.048 | -7.93
Schedule delay early in minutes (β_SDE) | -0.012 | -8.14 | -0.056 | -12.57
Schedule delay late in minutes (β_SDL) | -0.013 | -7.60 | -0.082 | -12.08
Difference in travel times (β_Time_diff) | -0.787 | -4.44 | -1.669 | -5.70
Decreased work time penalty (β_PTD) | -0.012 | -7.93 | -0.035 | -8.10
Increased work time penalty (β_PTI) | -0.007 | -5.52 | -0.013 | -5.83
Comfort - Standing (β_COM3) | -0.038 | -0.69 | -0.256 | -2.31
Comfort - Standing in crowded conditions and sometimes have to wait for next vehicle (β_COM5) | -0.113 | -2.09 | -0.585 | -5.30
Interactions
TT - Penalties for arriving/departing at different times | 0.014 | 4.04 | 0.065 | 6.72
SDE - Postgraduate studies | -0.013 | -3.66 | -0.018 | -1.73
SDL - Postgraduate studies | -0.023 | -3.78 | -0.037 | -2.11
SDE - Morning arrival incentive | -0.008 | -3.64 | -0.021 | -3.93
SDL - Afternoon departure penalty | -0.005 | -2.82 | 0.009 | 1.40
PTD - Possibility of working from home | 0.024 | 5.23 | 0.020 | 2.39
PTI - Possibility of working from home | 0.012 | 3.97 | 0.007 | 1.12
Standard deviation
Log-Cost (S_C) | – | – | 2.916 | 13.93
Travel time (S_TT) | – | – | 0.120 | 18.17
Schedule delay early (S_SDE) | – | – | -0.049 | -13.75
Schedule delay late (S_SDL) | – | – | -0.063 | -13.62
Decreased work time penalty (S_PTD) | – | – | 0.040 | 9.73
Increased work time penalty (S_PTI) | – | – | -0.020 | -13.83
Scale factor
Scale factor-over trip data | 1.464 | 14.12 | 0.672 | 16.71
Estimation summary
Number of parameters | 25 | | 31 |
Number of observations | 4068 | | 4068 |
Number of individuals | 308 | | 308 |
Number of normal draws | 1 | | 100 |
Final log-likelihood | -6167.87 | | -4657.35 |
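The likelihood ratio test between the two joint models can be reproduced from the final log-likelihoods and parameter counts in Table 27.5. The sketch below is an illustration rather than the authors' procedure: it uses only the Python standard library, the function names are hypothetical, and the chi-squared tail probability relies on a closed form that holds only for an even number of degrees of freedom (here 31 - 25 = 6).

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-squared variable with even df.

    Closed form: exp(-x/2) * sum over k < df/2 of (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k  # builds (x/2)^k / k! incrementally
        total += term
    return math.exp(-half) * total


def likelihood_ratio_test(ll_restricted, ll_unrestricted, k_restricted, k_unrestricted):
    """Return the LR statistic, its degrees of freedom and the p-value."""
    lr = 2.0 * (ll_unrestricted - ll_restricted)
    df = k_unrestricted - k_restricted
    return lr, df, chi2_sf_even_df(lr, df)


if __name__ == "__main__":
    # MNL (restricted) versus MMNL (unrestricted), values from Table 27.5.
    lr, df, p = likelihood_ratio_test(-6167.87, -4657.35, 25, 31)
    print(f"LR = {lr:.2f}, df = {df}, p = {p:.3g}")
```

With these inputs the statistic is twice the log-likelihood difference of 1510.52, and the p-value is numerically indistinguishable from zero, so the mixed logit specification is clearly preferred.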

Another interesting result is related to the scale factor of the joint estimation, which is lower in the ML model. As the reported scale factor values are equal to the ratio of the inverse of the trip/tour games’ error variances, a possible explanation for the difference could be the change in the ML error structure due to the inclusion of the standard deviations of the random parameters. Incorporating new terms in the systematic utility function decreases the magnitude of the Gumbel error terms, causing a lower scale factor in the ML model and higher mean parameter estimates (see the discussion in Sillano & Ortúzar, 2005).

27.4.5. Latent Variables

A preliminary factor analysis indicated that a two-factor structure could be extracted from the data. The two LV were labelled Departure time change (DTChange), related to respondents’ willingness to change their departure times, and On time (Ontime), related to attitudes toward meeting schedules or arriving/departing at official times to/from work. Based on these factor analysis results, a Multiple Indicators Multiple Causes (MIMIC) model was proposed, adding some explanatory variables (e.g. work constraints and socioeconomic variables) that could explain each latent variable (see Figure 27.5). Latent variables are observed through six indicators (i.e. statements 1, 2, 3, 6, 7 and 8 in Table 27.3), each with an associated measurement error denoted as ei. To fully specify the MIMIC model, two disturbance terms (i.e. N1 and N2) are included. It is hypothesised that educational level (postgr), wage rate (w_min) and the presence of penalties for arriving/departing at times different from the established ones (penal) could influence the respondent’s propensity to be at work on time. Gender (sex), possibility of working from home (work_home), and a dummy variable


Figure 27.5: Preliminary MIMIC model structure.

indicating if the respondent is also a student (student_work), were proposed to explain the willingness to change departure times. A sample of 405 individuals was used to estimate this model using maximum likelihood techniques. Here, workers who also study, females, and people without postgraduate studies tend to have lower income. Respondents not penalised for arriving or departing at different times to/from work also usually have higher incomes in the sample. The proposed MIMIC model has satisfactory fit indices. Although we have more than 400 cases, the model’s Chi-squared statistic is non-significant, indicating that it can consistently reproduce the sample covariance matrix. The proposed model also meets the rule of thumb that the ratio of Chi-squared to degrees of freedom should be less than two. Furthermore, the goodness-of-fit index (GFI) is close to one and the root mean squared error of approximation (RMSEA) is less than 0.05, suggesting a good model fit (see Table 27.6). All estimated parameter signs are in line with expectations, and most of the estimated parameters are significant at the 95% confidence level, except for work_home, postgr and travel time variability valuation (val_reliab). It should be noted that these coefficients have their expected signs and are significant at the 90% level. Results show that women tend to be more engaged with work but have more favourable attitudes to changing their departure time during the day, probably due to home or family related commitments. People who can work from home and workers who are also studying also have favourable attitudes towards departure time changes, being more likely to accept congestion charges and more prepared to vary their amount of work, changing their trip timing behaviour to travel in less congested hours.

Table 27.6: MIMIC model results.

Regression weights
Dependent variable | Cause | Estimate | SD | t-stat
DTChange | ← student_work | 0.396 | 0.201 | 1.973
DTChange | ← sex | 0.366 | 0.137 | 2.665
DTChange | ← work_home | -0.607 | 0.354 | -1.714
Ontime | ← w_min | -0.002 | 0.001 | -2.319
Ontime | ← postgr | -0.222 | 0.125 | -1.771
Ontime | ← penal | -0.075 | 0.034 | -2.181
wtp_com | ← DTChange | 0.977 | 0.258 | 3.781
work_time_change | ← DTChange | 1.087 | 0.291 | 3.741
wtp_DT | ← DTChange | 1.000 | Fixed | –
ontime_am | ← Ontime | 1.000 | Fixed | –
ontime_pm | ← Ontime | 2.011 | 0.666 | 3.017
val_reliab | ← Ontime | 0.715 | 0.425 | 1.680

Variances
Variable | Estimate | SD | t-stat
w_min | 1,353.510 | 94.496 | 14.323
student_work | 0.106 | 0.007 | 14.213
Postgr | 0.069 | 0.005 | 14.213
Sex | 0.249 | 0.018 | 14.213
work_home | 0.033 | 0.002 | 14.213
Penal | 0.857 | 0.060 | 14.213
ZN2 | 0.675 | 0.237 | 2.843
ZN1 | 0.147 | 0.057 | 2.561
se1 | 0.390 | 0.062 | 6.271
se2 | 2.154 | 0.273 | 7.899
se5 | 3.586 | 0.327 | 10.969
se6 | 2.473 | 0.297 | 8.317
se7 | 2.690 | 0.281 | 9.560
se3 | 4.809 | 0.345 | 13.926

Covariances
Variables | Estimate | SD | t-stat
w_min ↔ student_work | -1.687 | 0.559 | -3.019
w_min ↔ postgr | 3.015 | 0.487 | 6.189
w_min ↔ sex | -3.274 | 0.867 | -3.777
w_min ↔ penal | 3.969 | 1.575 | 2.519

Model fit
Number of estimated parameters | 28
Degrees of freedom (df) | 50
χ² | 65.166
p(χ²) | 0.073
χ²/df | 1.303
GFI | 0.973
RMSEA | 0.027
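Two of the fit indices in Table 27.6 can be recomputed from the Chi-squared statistic, its degrees of freedom and the sample size. The sketch below is a hedged illustration: the function names are hypothetical, and the RMSEA expression is the common point estimate sqrt(max(χ² - df, 0) / (df (N - 1))); whether the estimation software used N or N - 1 in the denominator is an assumption, although both give the same value to three decimals here.

```python
import math


def chi2_df_ratio(chi2, df):
    """Ratio of the Chi-squared statistic to its degrees of freedom."""
    if df <= 0:
        raise ValueError("df must be positive")
    return chi2 / df


def rmsea(chi2, df, n):
    """Point estimate of the root mean squared error of approximation.

    Assumes the common form sqrt(max(chi2 - df, 0) / (df * (n - 1))).
    """
    if df <= 0 or n <= 1:
        raise ValueError("df must be positive and n greater than one")
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


if __name__ == "__main__":
    # Values reported for the MIMIC model: chi2 = 65.166, df = 50, N = 405.
    print(round(chi2_df_ratio(65.166, 50), 3))
    print(round(rmsea(65.166, 50, 405), 3))
```

Both values agree with the table: the ratio is below the rule-of-thumb threshold of two and the RMSEA is below 0.05.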


On the other hand, high-income people with postgraduate studies tend to be less worried about arriving on time at work in the morning or leaving at the official ending time in the afternoon. In addition, respondents working in companies that penalise their employees for arriving or leaving at times other than those established tend to give more importance to meeting schedules (i.e. being on time to perform their activities). Also, people who give more importance to arriving at or leaving their workplace at established times are more worried about travel time variability. Uncertainty in travel times can affect willingness to arrive on time at destinations, and respondents would find it distressing not to be able to predict how long their journeys will take. Finally, the variances of ZN2 and ZN1 are significantly different from zero, suggesting that the definition of the formative constructs Ontime and DTChange could be improved by considering more relevant explanatory causes. The lower the variance of the disturbance term, the clearer the construct definition (Diamantopoulos, Riefler, & Roth, 2008).

27.5. Conclusions and Future Work

A survey methodology to collect high-quality data for TOD modelling, considering different factors affecting this choice process, has been presented. It is a three-stage process which involves collecting revealed preference (RP), stated preference (SP) and attitudinal data. Various collection methods (email, web page, and personal interviews) were used and their specific features discussed. The data collection process comprised several scheduled personal/virtual contacts and the use of different types of reminders, and included strategies to maximise response rates and minimise respondent burden (e.g. monetary incentives, letters of gratitude, and the use of the same interviewer team). Even though the data collection process was challenging and time consuming, the total cost of implementing this high-quality survey was modest: each RP response cost only US$ 21.14, whilst each SP response cost roughly US$ 9.66. The novelty of the survey methodology described in this chapter is the collection of different data types for TOD modelling and the acquisition of very precise information about preferred departure/arrival times through the use of an activities plan for the travel diary questionnaire. In addition, gathering detailed information about flexibility in schedules, employment information and attitudes towards departure times should allow estimation of hybrid TOD choice models incorporating latent variables. Every part of the survey required a methodological effort, and at least the following features can be highlighted:

• The RP component was a successful experience in applying a travel diary using a two-step process that first involved a CAPI and then a web-based self-completion form.

• An efficient methodology to assign LOS variables at different times of the day for every individual, to enrich the RP observations and to help generate the SP attributes.
• The development of a design procedure to obtain realistic SP choice experiments for TOD modelling that allowed inclusion of dependency between attribute levels in both trip and tour contexts.
• Relatively high non-response rates were obtained, suggesting that more effective strategies could be developed to maximise survey response.

Even though the sample size of the survey is small in comparison with, for example, the 2006 Santiago Origin-Destination survey, observed respondents’ departure times to work show similar patterns, and preliminary model results are promising in terms of the possibility of a large-scale application in the future. The high-quality data gathered in the survey allow good estimation of both simple and advanced hybrid departure time models, but a larger sample size would always be desirable. Preliminary results show that people prefer to travel at non-usual times rather than change their mode of transport. Attitudes towards changes in departure times and being on time at the workplace are important in the TOD decision-making process and could be influenced by educational level, income, gender, and work conditions. In addition, respondents showed a strong dislike for uncertainty in travel times and for having to wait for an additional vehicle when they cannot board the current one due to crowded travel conditions. Finally, departure time decisions seem to be more influenced by the desire to meet schedules than by the possibility of experiencing higher travel times. Further research is needed to extend our preliminary results to account for the full complexity of the behavioural processes. This will probably require the use of more advanced models and of different data types such as the ones collected. 
We are now developing a joint RP-SP-attitudinal model in line with preliminary findings reported in this chapter.

References

Ampt, E., & Ortúzar, J. de D. (2004). On best practice in continuous large-scale mobility surveys. Transport Reviews, 24, 337–363.
Arellana, J. (2012). Modelos de Elección de Hora de Inicio de Viajes. PhD Thesis, Department of Transport Engineering and Logistics, Pontificia Universidad Católica de Chile, Santiago.
Arellana, J., Daly, A. J., Hess, S., Ortúzar, J. de D., & Rizzi, L. I. (2012a). Developing an advanced departure time choice model for transport planning in emerging economies. 91st Annual Meeting Transportation Research Board (TRB) Conference 2012. Washington, USA.
Arellana, J., Olarú, D., Ortúzar, J. de D., & Rizzi, L. I. (2012b). All travellers care about the same when deciding the time of travel? A multivariate analysis of departure time preferences. Working Paper, Department of Transport Engineering and Logistics, Pontificia Universidad Católica de Chile, Santiago.


Arellana, J., Ortúzar, J. de D., Rizzi, L. I., & Zuñiga, F. (2012c). Obtaining public transport level-of-service measures using in-vehicle GPS data and freely available GIS web-based tools. Conference on Advanced Systems for Public Transport-CASPT 12. Santiago, Chile.
Bajwa, S., Bekhor, S., Kuwahara, M., & Chung, E. (2008). Discrete choice modelling of combined mode and departure time. Transportmetrica, 4, 155–177. doi:10.1080/18128600808685681
Bhat, C. R. (1998). Accommodating flexible substitution patterns in multi-dimensional choice modelling: formulation and application to travel mode and departure time choice. Transportation Research Part B: Methodological, 32, 455–466. doi:10.1016/S0191-2615(98)00011-3
Bhat, C. R., & Steed, J. L. (2002). A continuous-time model of departure time choice for urban shopping trips. Transportation Research, 36B, 207–224. doi:10.1016/S0191-2615(00)00047-3
Bianchi, R., Jara-Díaz, S. R., & Ortúzar, J. de D. (1998). Modelling new pricing strategies for the Santiago Metro. Transport Policy, 5, 223–232. doi:10.1016/S0967-070X(98)00025-0
Börjesson, M. (2008). Joint RP-SP data in a mixed logit analysis of trip timing decisions. Transportation Research, 44E, 1025–1038.
Cantillo, V., Heydecker, B. G., & Ortúzar, J. de D. (2006). A discrete choice model incorporating thresholds for perception in attribute values. Transportation Research Part B: Methodological, 40, 807–825. doi:10.1016/j.trb.2005.11.002
Caussade, S., Ortúzar, J. de D., Rizzi, L. I., & Hensher, D. A. (2005). Assessing the influence of design dimensions on stated choice experiment estimates. Transportation Research Part B: Methodological, 39, 621–640. doi:10.1016/j.trb.2004.07.006
Daly, A. J., & Ortúzar, J. de D. (1990). Forecasting and data aggregation: Theory and practice. Traffic Engineering and Control, 31, 632–643.
De Cea, J., Fernández, J. E., Dekock, V., & Soto, A. (2005). Solving network equilibrium problems on multimodal urban transportation networks with multiple user classes. Transport Reviews, 25, 293–317. doi:10.1080/0144164042000335805
de Jong, G., Daly, A. J., Pieters, M., Vellay, C., Bradley, M., & Hofman, F. (2003). A model for time of day and mode choice using error components logit. Transportation Research, 39E, 245–268.
Diamantopoulos, A., Riefler, P., & Roth, K. P. (2008). Advancing formative measurement models. Journal of Business Research, 61, 1203–1218. doi:10.1016/j.jbusres.2008.01.009
DICTUC. (2003). Encuesta de movilidad 2001 de Santiago. Santiago: Departamento de Ingeniería de Transporte, Pontificia Universidad Católica de Chile, Santiago.
DICTUC. (2006). Encuesta de movilidad 2006 de Santiago. Santiago: Departamento de Ingeniería de Transporte, Pontificia Universidad Católica de Chile, Santiago.
Doornik, J. A. (2007). Object-oriented matrix programming using Ox (3rd ed). London: Timberlake Consultants Press and Oxford.
Ettema, D., Ashiru, O., & Polak, J. (2004). Modelling timing and duration of activities and trips in response to road-pricing policies. Transportation Research Record, 1894, 1–10. doi:10.3141/1894-01
Fernández, J. E., & De Cea, J. (1990). An application of equilibrium modelling to urban transport planning in developing countries: The case of Santiago de Chile. In H. E. Bradley (Ed.), Operational Research ’90. Oxford: Pergamon.
Hendrickson, C., & Plank, E. (1984). The flexibility of departure times for work trips. Transportation Research, 18A, 25–36.
Hess, S., Daly, A. J., Rohr, C., & Hyman, G. (2007a). On the development of time period and mode choice models for use in large scale modelling forecasting systems. Transportation Research, 41A, 802–826.


Hess, S., Polak, J., Daly, A. J., & Hyman, G. (2007b). Flexible substitution patterns in models of mode and time of day choice: New evidence from the UK and the Netherlands. Transportation, 34, 213–238. doi:10.1007/s11116-006-0011-7
Hojman, P., Ortúzar, J. de D., & Rizzi, L. I. (2005). On the joint valuation of averting fatal and severe injuries in highway accidents. Journal of Safety Research, 36, 377–386. doi:10.1016/j.jsr.2005.07.003
Koster, P., & Tseng, Y. Y. (2010). Stated choice experimental designs for scheduling models. In S. Hess & A. Daly (Eds.), Choice Modelling: The State-of-the-Art and the State-of-Practice. Bingley, UK: Emerald.
Ortúzar, J. de D. (2006). Travel survey methods in Latin America. In P. Stopher & C. Stecher (Eds.), Travel Survey Methods: Quality and Future Directions. Oxford: Elsevier.
Ortúzar, J. de D., & Willumsen, L. G. (2011). Modelling Transport. Chichester: John Wiley and Sons.
Polak, J., & Jones, P. (1994). A tour-based model of journey scheduling under road pricing. 73rd Annual Meeting of the Transportation Research Board. Washington, DC.
Rose, J. M., Bliemer, M. C. J., Hensher, D. A., & Collins, A. T. (2008). Designing efficient stated choice experiments in the presence of reference alternatives. Transportation Research, 42B, 395–406. doi:10.1016/j.trb.2007.09.002
Sillano, M., & Ortúzar, J. de D. (2005). Willingness-to-pay estimation with mixed logit models: Some new evidence. Environment and Planning A, 37, 525–550. doi:10.1068/a36137
Small, K. A. (1982). The scheduling of consumer activities: Work trips. The American Economic Review, 72, 467–479.
Small, K. A., Noland, R. B., & Koskenoja, P. (1995). Socio-economic attributes and impacts of travel reliability: a stated preference approach. Research Reports, California Partners for Advanced Transit and Highways (PATH), Institute of Transportation Studies, UC Berkeley.
Smith, M. E. (1979). Design of small sample home interview travel surveys. Transportation Research Record, 701, 29–35.
Steed, J. L., & Bhat, C. R. (2000). On modelling departure time choice for home-based social/recreational and shopping trips. Transportation Research Record, 1706, 152–159. doi:10.3141/1706-18
Stopher, P. (1998). Household travel surveys: New perspectives and old problems. In P. Stopher & P. Jones (Eds.), Transport Survey Quality and Innovation. Oxford: Pergamon Press.
Stopher, P., Wilmot, C., Stecher, C., & Alsnih, R. (2006). Household travel surveys: Proposed standards and guidelines. In P. Stopher & C. Stecher (Eds.), Travel Survey Methods: Quality and Future Directions. Oxford: Elsevier.
Storey, B., & Holtom, R. (2003). The use of historic GPS data in transport and traffic monitoring. Traffic Engineering and Control, 44, 376–379.
Tseng, Y. Y., Koster, P., Peer, S., Knockaert, J., & Verhoef, E. (2011). Discrete choice analysis for trip timing decisions of morning commuters: estimations from joint SP/RP-GPS data. International Choice Modelling Conference. Leeds.
Vickrey, W. (1969). Congestion theory and transport investment. The American Economic Review, 59, 251–260.
Yáñez, M. F., Mansilla, P., & Ortúzar, J. de D. (2010). The Santiago Panel: Measuring the effects of implementing Transantiago. Transportation, 37, 125–149. doi:10.1007/s11116-009-9223-y

Chapter 28

Collection of Time-Dependent Data Using Audio-Visual Stated Choice

Chester Wilmot and Ravindra Gudishala

Abstract

Purpose: A new method of collecting hurricane evacuation data using time-dependent stated choice is developed and evaluated in this study.

Methodology/approach: Hypothetical storms are presented in a video in a sequence of scenarios showing prevailing conditions at discrete points in time as each storm approaches land. Respondents are exposed to nine hypothetical storms representing a range of hurricane characteristics. One of the hypothetical storms is secretly the same as an actual storm the respondents experienced in the past and for which they are required to report their behaviour in a revealed preference survey.

Findings: Stated and actual behaviour were compared and general agreement was found between what people say they would do and what they did. The revealed preference (RP) data was supplemented with time-dependent data from official sources, and hurricane evacuation demand models were estimated on this enhanced RP data, as well as on a combination of the enhanced RP and time-dependent stated choice (SC) data. When the models were applied to a data set different from the ones on which they were calibrated, the combined time-dependent RP/SC model performed slightly better than the enhanced RP model. Detailed accounting revealed that time-dependent SC data is 25 percent more expensive to collect than enhanced RP data, although some of this cost may be due to the first-time collection of this type of data.

Keywords: Dynamic; stated choice; time-dependent; evacuation; survey

Transport Survey Methods: Best Practice for Decision Making. Copyright © 2013 by Emerald Group Publishing Limited. All rights of reproduction in any form reserved. ISBN: 978-1-78190-287-5


28.1. Introduction

Transportation planning has traditionally been conducted using cross-sectional data. One consequence of this practice has been that relatively little attention has been given to the dynamic aspects of travel in data collection. Travel diaries do provide more time-related information than the original travel surveys, and panel surveys provide intermittent snapshots of changing conditions and behaviour over relatively long intervals of time, but little attention has been given to the dynamics of travel behaviour within an individual analysis period. This matters where travel conditions and travel behaviour change dramatically within the period of analysis, resulting in highly non-uniform traffic flow in the transportation network. One area where this occurs is emergency evacuation, and particularly hurricane evacuation, where the evacuation period is long and traffic conditions vary considerably within that period.

One approach to addressing the lack of dynamic information in past hurricane evacuation surveys has been to supplement existing data sets with dynamic information from external sources (Fu, Wilmot, & Baker, 2006). For example, dynamic storm characteristics can be downloaded from archives kept at the National Hurricane Center, emergency management decisions can be obtained from public records, interviews and newspaper articles, and network conditions can be gleaned from traffic counts and other official information sources. However, the data still relates to a single storm, and a storm must first occur before data can be collected.

An alternative approach is to employ stated choice, where a survey can be launched at any time and multiple storms with a range of characteristics can be considered. Several researchers have used stated choice in hurricane evacuation data collection in the past (Baker, 1995; Kang, Lindell, & Prater, 2007; Whitehead, 2005), but most have used text to describe the scenarios. Pictures and diagrams have been used to describe stated choice scenarios of bush fires in Australia (Alsnih, Rose, & Stopher, 2005), and an audio-visual simulation program called Stormview has recently been developed to determine how coastal residents make decisions in response to a hurricane threat (Meyer, 2012). However, to the authors' knowledge, no one has used audio and visually enhanced scenarios in stated choice surveys to collect hurricane evacuation travel data.

In the research presented in this paper, a method of depicting the changing conditions of an approaching hurricane audio-visually and recording the dynamic response of the public to these conditions is explained and tested. Because moment-by-moment portrayal of a storm would be impractical due to the time it would require in presentation and response, each storm is presented in this study as a sequence of several time-dependent snapshots of changing conditions as the storm approaches. The method is evaluated in three ways. First, one of the storm scenarios in the stated choice survey is secretly made the same as an actual storm for which revealed preference data is also collected, and actual and stated behaviour for this storm are compared. Second, the goodness-of-fit of evacuation demand models estimated on the revealed preference data with added dynamic data from external sources is compared with that of a model estimated on combined revealed preference and stated choice data. Last, the detailed cost of collecting revealed preference data on hurricane evacuation behaviour, supplemented with dynamic information from external sources, is compared with that of stated choice.

28.2. Survey Design

A survey was conducted by the authors in New Orleans, Louisiana and the surrounding area in which 300 households completed both a traditional mail-out, mail-back revealed preference survey of evacuation behaviour during hurricane Gustav, and a time-dependent, audio-visual stated choice survey of nine hypothetical storms. The revealed preference questionnaire included questions on the dynamic behaviour of the responding household, such as if and when the household evacuated, where they evacuated to, and when they arrived. Other information included the form of transport used, the number of vehicles employed in the evacuation process, whether boats or trailers were towed, whether persons in the household required assistance in evacuating, and whether the job of anyone in the household required that they remain in the area during an evacuation.

Supplemental information on the dynamic characteristics of hurricane Gustav (e.g. category, projected path and location) was obtained from the archives of the National Hurricane Center for each six-hour advisory over the entire duration of the storm. Information on management decisions (e.g. timing and nature of evacuation notices, initiation and termination of contraflow) was obtained from newspaper archives and Wikipedia. Flooding conditions were estimated from surge estimates from the National Hurricane Center, and ground elevations were obtained from USGS.

The stated choice survey involved randomly selecting nine past storms that made landfall in the vicinity of New Orleans and represented a variation in category, path and the time of day at which landfall occurred. Each surveyed household was asked to estimate their response to three storms randomly selected from among the nine. Each storm was displayed audio-visually in a video through which respondents navigated using controls on a DVD player or computer. Each storm was presented in four 'snapshot' scenarios of conditions prevailing at approximately 72, 48, 24 and 12 hours before landfall. In each scenario, the storm was presented in terms of its current position, predicted path, current category, estimated time to landfall and whether an evacuation order had been issued. Households were instructed to respond in terms of their own personal conditions (e.g. their home location, household composition, health and mobility of household members, vehicle ownership, income, work commitments and presence of pets).

Scenarios were presented chronologically. If a household chose not to evacuate in a particular time period, the decision to not evacuate was entered on the video screen and the program automatically progressed to the scenario of the next time period. If, on the other hand, a household chose to evacuate in the first or any subsequent time period, the decision to evacuate was entered on the video screen, information regarding the evacuation (e.g. time of departure, mode, destination, route) was entered in a booklet accompanying the video, and the program then automatically progressed to the next storm. If a household progressed through all time intervals without evacuating, the decision to not evacuate was recorded in the booklet and the next storm was automatically introduced. An image of a typical video screen is shown in Figure 28.1. It shows the first storm scenario for storm 1. Audio described the scenario and included within the oral script the storm information written on the screen.

Figure 28.1: Storm scenarios on video.

Households were recruited using a purchased list of addresses and telephone numbers of randomly selected households in 10 parishes in and surrounding the New Orleans metropolitan area. The survey was initiated by an advance letter introducing the survey and explaining that a telephone call would follow in which they would learn more about the survey and would be invited to participate. A total of 3498 such letters were sent out and 180 were returned due to incorrect addresses. Further information on the sample disposition is shown in Table 28.1. Recruitment was conducted by telephone by the Public Policy Research Lab at LSU, a unit within the university that conducts a variety of surveys for internal and external clients. Only households who experienced hurricane Gustav and owned video players were recruited. The instrument design for both the revealed preference and stated choice surveys was tested in two focus groups and a pilot study involving 50 respondents. The pilot study resulted in the decision to include a $20 incentive for a completed survey.

Table 28.1: Sample disposition.

Disposition code  Description              Records  Eligibility
1                 Hard refusal             311      Eligibility unknown
2                 No eligible respondent   30       Ineligible
3                 Business                 31       Ineligible
4                 Busy                     62       Eligibility unknown
5                 No answer                1571     Eligibility unknown
6                 Callback later           60       Eligibility unknown
7                 Disconnected             410      Ineligible
8                 Fax                      64       Ineligible
9                 Soft refusal             120      Eligibility unknown
10                Partially complete       2        Eligible
11                Language barrier         11       Ineligible
12                Not qualified            12       Ineligible
13                Don't have a DVD player  59       Ineligible
20                Complete                 706      Eligible
21                Never call               44       Ineligible
22                Not attempted            5        Eligibility unknown

The full survey was conducted between September and December 2009. A total of 666 households were recruited, and survey material consisting of a cover letter, a DVD, a revealed preference self-administered questionnaire, a stated choice response booklet and a prepaid return envelope was mailed to them. A telephone helpline was made available 24 hours a day, 7 days a week for the duration of the survey. It was a cell phone purchased especially for this purpose and carried by one of the authors; a total of 23 calls seeking information were received during the entire survey. Three reminder calls were used to increase response. A total of 331 responses were received, of which 288 had complete information on all variables. The estimated response rate using the CASRO method was 12 percent. The sample had to be weighted for two reasons: a truly random sample of households could not be achieved, and coverage error was introduced by requiring that respondents had experienced hurricane Gustav and owned a video player. Expansion and weighting factors were estimated using iterative proportional fitting based on household size, vehicle ownership and ethnicity obtained from the American Community Survey for 2009.
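The weighting step can be illustrated with a minimal sketch of iterative proportional fitting (raking) over respondent records. It is not the authors' procedure: the function name `rake_weights`, the category labels and the control totals below are all hypothetical, and only two of the three control dimensions named above are shown to keep the example short.

```python
from collections import defaultdict


def rake_weights(records, targets, max_iter=100, tol=1e-8):
    """Iterative proportional fitting (raking) of household weights.

    records: list of dicts giving each household's category on every
             control dimension, e.g. {"size": "1-2", "vehicles": "0"}.
    targets: dict mapping dimension -> {category: population total}.
    Returns one weight per record.
    """
    weights = [1.0] * len(records)
    for _ in range(max_iter):
        max_change = 0.0
        for dim, dim_targets in targets.items():
            # Current weighted total in each category of this dimension.
            totals = defaultdict(float)
            for rec, w in zip(records, weights):
                totals[rec[dim]] += w
            # Rescale so this dimension's margins match the control totals.
            for i, rec in enumerate(records):
                factor = dim_targets[rec[dim]] / totals[rec[dim]]
                weights[i] *= factor
                max_change = max(max_change, abs(factor - 1.0))
        if max_change < tol:
            break  # every margin already matched within tolerance
    return weights


# Hypothetical respondents and control totals (not the survey's values).
records = [
    {"size": "1-2", "vehicles": "0"},
    {"size": "1-2", "vehicles": "1+"},
    {"size": "1-2", "vehicles": "1+"},
    {"size": "3+", "vehicles": "0"},
    {"size": "3+", "vehicles": "1+"},
    {"size": "3+", "vehicles": "1+"},
]
targets = {
    "size": {"1-2": 600.0, "3+": 400.0},
    "vehicles": {"0": 150.0, "1+": 850.0},
}
weights = rake_weights(records, targets)
```

Each pass rescales the weights one dimension at a time until all weighted margins agree with the control totals; a third dimension such as ethnicity would simply be one more entry in `targets`.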

28.3. Survey Results

The weighted socio-economic characteristics of the sample were an average household size of 2.5 persons, an average household income of $45,000 per annum, 15 percent of households owning no private vehicle, and 30 percent of households with children under the age of 17. Key weighted and unweighted demographics of the sample are shown in Table 28.2.


Table 28.2: Unweighted and weighted key demographics of the sample.

Demographic                                              Unweighted  Weighted
Average household size                                   2.10        2.40
Average household income ($ per annum)                   59,000      45,000
Average number of vehicles owned by a household          1.95        1.64
Average number of members younger than 17 per household  0.55        0.48

[Figure 28.2 plot: cumulative number of evacuating households (1000's) against date and time for hypothetical Gustav and real Gustav, with the voluntary order, mandatory order and Gustav landfall marked.]

Figure 28.2: Stated and actual evacuation behaviour for hurricane Gustav.

The 7th hypothetical storm presented to the respondents had the same characteristics as hurricane Gustav, but the similarity was not disclosed to the respondents in order to get an unbiased comparison of actual and stated behaviour. Hurricane Gustav was briefly a category 4 but it weakened to a category 3 and finally to a 2 by the time it made landfall south-east of New Orleans at approximately 9.30 a.m. on 1 September 2008. The dynamic evacuation behaviour from the revealed preference and stated choice surveys of households who completed both surveys on hurricane Gustav is shown in Figure 28.2. Overall, the timing of evacuation response is similar between stated and actual behaviour, but the total number of evacuees is clearly different. Figure 28.2 shows that approximately 127,000 households actually evacuated for hurricane Gustav while only approximately 94,000 said they would in the stated choice survey. Among the survey respondents, 68 percent were consistent in their actual and stated behaviour, 24 percent said they would not evacuate when they did, and 8 percent said they would evacuate when they didn't. The actual evacuation not only resulted in more evacuations but also in a more rapid rate of evacuation, as evidenced by the slope of the curve. The difference may be due to the fact that hurricane Gustav was the first major hurricane after hurricane Katrina three years earlier, and therefore could have resulted in greater actual evacuation. On the other hand, the survey was conducted a year after hurricane Gustav, and residents may have been reassured by the minimal damage caused by hurricane Gustav and so reported less responsive behaviour in the stated choice portion of the survey.

Noting the date and time on the abscissa of Figure 28.2, the impact of time of day on evacuation rate can be discerned in both the hypothetical and actual responses in that the evacuation rate is higher during the day and lower at night. The diurnal effect is clearer in the actual responses (i.e. real Gustav) as expected, but it is heartening to see that it is also reflected in the stated responses to hypothetical Gustav. The impact of time of day on evacuation rates is investigated further later.

With respect to evacuation mode, 99 percent of respondents reported using a private vehicle to evacuate from real Gustav, even though 15 percent of the households did not own a vehicle. It appears that the majority of carless households either borrowed or rented a private vehicle or rode with others. In the stated choice survey, 98 percent of the households exposed to hypothetical Gustav stated they would use a private vehicle in one form or another. The unusually high percentage of respondents who reported using a private vehicle to evacuate from hurricane Gustav may be partly due to the sample consisting of households with a DVD player, thus tending to exclude the poor and the elderly who are more likely to use transit, but it may also be due to the path of Gustav, which skirted the southern rural parts of the New Orleans area where private vehicle ownership is high irrespective of economic status or age.

The comparison of destination type for real and hypothetical hurricane Gustav is shown in Figure 28.3. The results are similar, even though the percentage choosing a hotel or motel is higher than normal and the percentage choosing a friend or relative is lower than is typical (Baker, 1995). Less than 15 percentage points separate any of the destination types shown in Figure 28.3.

[Figure 28.3 plot: percent of evacuating households by destination type (hotel/motel, public shelter, friend/relative, other) for hypothetical Gustav and real Gustav.]

Figure 28.3: Stated and actual destination choice for hurricane Gustav.

Past studies have shown that time of day affects time of departure during evacuation; all else being equal, people prefer to evacuate in daylight and generally prefer the morning to the afternoon (Fu et al., 2006). This tendency is not always apparent in reported evacuation behaviour because several other factors, such as storm proximity, storm intensity and evacuation orders, confound the ability to detect the influence of time of day alone. However, in a single storm these factors can be noted and considered as the background or context in which the impact of time of day is evaluated. Taking hurricane Gustav as an example, it can be described as a storm that weakened from a brief reign as a category 4 more than 2 days from landfall to a category 2 when it made landfall on the morning of 1 September 2008. A voluntary evacuation order was issued on the morning of 30 August 2008, and a mandatory evacuation order 24 hours later on the morning of 31 August 2008. The observed time of departure by time of day is shown in Figure 28.4. As expected, the majority of the evacuation occurred during the day, although daytime travel was reinforced in this case by the timing of the evacuation orders and the time at which landfall occurred. What is interesting is that the stated choice data also portrays the majority of the evacuation occurring during the day, but in the morning rather than the afternoon. It is possible that stated choice decision makers underestimate the time needed to prepare for evacuation.

[Figure 28.4 plot: percent of evacuating households by time of departure (12 am to 6 am, 6 am to 12 pm, 12 pm to 6 pm, 6 pm to 12 am) for hypothetical Gustav and real Gustav.]

Figure 28.4: Stated and actual time of departure for hurricane Gustav.


The extent to which the time-dependent audio-visual scenarios were able to convey the conditions they represented, and respondents were able to react realistically, can be assessed by observing whether varying conditions in the scenarios generated plausible responses in stated behaviour. To test this, stated behaviour in response to a strong and a weak storm among the nine hypothetical storms was compared. Table 28.3 compares the characteristics of hypothetical storm 1 (a strong storm) with those of hypothetical storm 9 (a weak storm), while Figure 28.5 shows the stated response to each.

Table 28.3: Characteristics of a weak and strong hypothetical storm.

Storm #  Characteristic    Scenario 1  Scenario 2  Scenario 3  Scenario 4
1        Category          4           4           4           3
         Order             None        Voluntary   Mandatory   Mandatory
         Time of day       10.15 a.m.  6.15 a.m.   12.15 a.m.  2.15 p.m.
         Time to landfall  70 hrs      50 hrs      32 hrs      18 hrs
         Day of week       Wed         Thu         Thu         Fri
9        Category          5           3           2           1
         Order             None        Voluntary   Voluntary   Voluntary
         Time of day       6.30 a.m.   10.30 a.m.  6.30 a.m.   10.30 p.m.
         Time to landfall  75 hrs      47 hrs      27 hrs      11 hrs
         Day of week       Saturday    Sunday      Monday      Monday

[Figure 28.5 plot: cumulative number of evacuating households (1000's) against time to expected landfall for storm 1 and storm 9.]

Figure 28.5: Evacuation behaviour for a weak and strong hypothetical storm.


Figure 28.5 shows that in stated response a stronger storm generates more evacuees than a weaker storm (as expected), the evacuation rate is greater with the strong storm, and the evacuation rate in both storms displays the regular daily fluctuation observed in actual evacuation. By coincidence, time of day in relation to time to landfall is virtually identical for storms 1 and 9, so daytime (i.e. 6.00 a.m. to 6.00 p.m.) occurs approximately between 74–62, 50–38 and 26–14 hours to landfall in Figure 28.5. For the most part, this is when reported evacuation rates were the highest, and night-times (86–74, 62–50, 38–26 and 14–2 hours to landfall) were when they were the lowest.
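The mapping from hours to landfall to time of day can be checked with a small sketch. The function names are hypothetical, and the 8.00 a.m. landfall hour is an assumption chosen here because it reproduces the daytime windows quoted above; it is not a value stated in the survey.

```python
def clock_hour(hours_to_landfall, landfall_hour=8.0):
    """Clock hour (0 to 24) at a given number of hours before landfall.

    landfall_hour is an assumed local landfall time (8.0 means 8.00 a.m.).
    """
    return (landfall_hour - hours_to_landfall) % 24


def is_daytime(hours_to_landfall, landfall_hour=8.0):
    """True if the moment falls between 6.00 a.m. and 6.00 p.m."""
    return 6 <= clock_hour(hours_to_landfall, landfall_hour) < 18


# 74 hours out is 6.00 a.m. and 62 hours out is 6.00 p.m. under this assumption.
window_start = clock_hour(74)
window_end = clock_hour(62)
```

Under this assumption, any time to landfall inside 74–62, 50–38 or 26–14 hours is classified as daytime, and the intervening intervals as night.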

28.4. Model Estimation

Revealed preference data was arranged in 6-hourly intervals for a period of 132 hours before landfall of hurricane Gustav. This resulted in a line in the data set for each household at 6-hour intervals up to the time the household evacuated. If the household did not evacuate, then 132/6, or 22, lines of data appeared in the data set for that household. Data items that were time dependent, such as distance to the storm or storm category, varied by line, while fixed data such as household size or vehicle ownership remained static for a household. A total of 288 households were used to estimate the RP model. Stated choice data was also arranged so that for each responding household a separate line of data appeared for each scenario up to the time period in which the household evacuated, or if the household did not evacuate, then four lines of data appeared in the data set for that household, one for each scenario presented on each storm.

A time-dependent sequential logit model was estimated on the data (Fu et al., 2006). The model assumes that respondents assess the utility of evacuating or not evacuating at each time interval based on the characteristics of the storm, their own prevailing conditions, and evacuation orders in force at that moment. The choice of whether to evacuate in each time period is given by a binary logit model of the following form:

\[
P_{nt} = \frac{1}{1 + e^{a + b x_{nt}}} \tag{28.1}
\]

where $P_{nt}$ is the momentary probability that household $n$ will evacuate at time $t$, $x_{nt}$ is the vector of storm, household and evacuation order conditions experienced by household $n$ at time $t$, and $a$ and $b$ are parameters. The probability of evacuation shown in equation (28.1) is the momentary probability of evacuation based on conditions in that time period. However, a household can only face an evacuation decision at time $t$ if it did not evacuate in all previous time periods. Thus, assuming independence among the probabilities estimated by equation (28.1), the conditional probability of household $n$ evacuating in time period $t$ is:

\[
P'_{nt} = (1 - P_{n1})(1 - P_{n2}) \cdots (1 - P_{n,t-1}) \, P_{nt} \tag{28.2}
\]
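The data arrangement and equation (28.2) can be sketched as follows. This is an illustration only: the function names and the example probabilities are hypothetical, and the momentary probabilities are taken as given inputs rather than computed from an estimated model.

```python
def person_period_rows(household_id, evacuated_in, n_periods):
    """One data line per period in which the household still faced a decision.

    evacuated_in: 1-based period of evacuation, or None if the household
                  never evacuated (it then contributes n_periods lines).
    """
    last = evacuated_in if evacuated_in is not None else n_periods
    return [
        {"household": household_id, "period": t, "evacuate": int(t == evacuated_in)}
        for t in range(1, last + 1)
    ]


def sequential_probabilities(momentary):
    """Apply equation (28.2) to a list of momentary probabilities P_nt.

    Returns the probability of evacuating in each period and the
    probability of not evacuating in any period.
    """
    probs = []
    still_home = 1.0  # probability of not having evacuated so far
    for p in momentary:
        probs.append(still_home * p)
        still_home *= 1.0 - p
    return probs, still_home


# Hypothetical momentary probabilities for four scenarios.
probs, stay = sequential_probabilities([0.1, 0.2, 0.5, 0.5])
rows_evacuee = person_period_rows("h1", evacuated_in=3, n_periods=22)
rows_stayer = person_period_rows("h2", evacuated_in=None, n_periods=22)
```

The period probabilities and the probability of never evacuating sum to one, which is a convenient check on any implementation of equation (28.2).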


Table 28.4: Model estimation results.

Variable                                    RP model               Joint RP and SC model
                                            Estimate  t-statistic  Estimate  t-statistic
Hurricane category (1–5)                    0.47      6.6          0.35      6.9
Evacuation order (in effect = 1, none = 0)  0.66      3.0          1.06      6.8
Early morning (12.00–6.00 a.m.)             1.23      4.2          1.14      4.5
Morning (6.00 a.m.–12.00 p.m.)              1.92      6.6          1.55      6.3
Afternoon (12.00–6.00 p.m.)                 0.83      2.7          1.17      4.9
Distance from residence to storm (miles)    760.15    4.2          555.61    4.4
Flooding (yes = 1, no = 0)                  0.91      2.4          0.60      4.1
Constant RP                                 -5.91     18.0         -5.67     19.2
Constant SC                                                        -6.16     10.9
λSC                                                                0.83      7.2
Number of observations                      4774                   7355
Number of cases                             288                    1136
Log likelihood at zero                      -3309                  -5098
Log likelihood at market shares             -1225                  -3062
Log likelihood at convergence               -722                   -1845
Likelihood ratio index                      0.41                   0.39

A high probability of evacuating in an earlier period decreases the probability of evacuating in a later period, and low probabilities of evacuation in early periods permit high probabilities to occur in later periods. Equation (28.1) is used to estimate the parameters of the model and equation (28.2) is used to estimate the probabilities of evacuation of each household in each time period.

Following detailed testing of many specifications, the following time-dependent variables were selected for inclusion in the model: hurricane category, presence of an evacuation order, time of day, flooding potential of the home, and time-dependent distance between the home and the eye of the storm. Evacuation order, time of day, and flooding potential of the home featured as dummy variables. Specifically, the variable 'order' in Table 28.4 attained a value of 1 if either a voluntary or mandatory evacuation order was in effect, and zero at other times. Time of day was divided into four six-hour periods starting at midnight. Dummy variable 'early morning' attained the value of 1 if the time of observation was between 12.00 a.m. and 6.00 a.m., and zero if not; dummy variable 'morning' attained a value of 1 if the time was between 6.00 a.m. and 12.00 p.m.; and dummy variable 'afternoon' attained a value of 1 for the period 12.00 p.m. to 6.00 p.m. The base period was between 6.00 p.m. and 12.00 a.m., when dummy variables 'early morning', 'morning' and 'afternoon' were all zero. Homes most vulnerable to flooding due to storm surge were given a value of 1 on the dummy variable 'flooding', and zero otherwise. Hurricane category was retained as a numeric variable even though it was recognized that it did not strictly possess ratio scale properties. Time-dependent distance to the storm was included as a lognormal transformation of distance because distance was considered to have a non-linear impact on the evacuation decision. Specifically, the authors postulate that a change in distance when the storm is either distant or very close has less effect than when the storm is at a distance where the evacuation decision is critical.

Models were estimated separately on the RP data, the SC data, and then on the combined RP and SC data. The same model specification was used in each model to facilitate comparison. Because SC data on its own can produce model estimates that are unrealistic for prediction purposes, it is common to use combined RP/SC data to capitalize on the strengths each data set brings to the estimation. Specifically, SC data provides great variability in variable values, thus facilitating accurate parameter estimates, while RP data allows the parameter values to be scaled to produce realistic results.

The results of the model estimated on RP data alone and of the model estimated on the joint RP and SC data are shown in Table 28.4. The parameter values in each model have the correct sign and are all significant at the 95 percent confidence level. Both models indicate that the least attractive time of day to evacuate is in the evening (6.00 p.m. to 12.00 a.m.), and the most attractive time of day is in the morning (6.00 a.m. to 12.00 p.m.). Distance was transformed into the ordinate value of the lognormal probability density function, so the positive sign of the parameter indicates that as the ordinate value increases the probability of evacuation increases. The joint model has a different alternative specific constant for the RP and SC data, and it contains a scaling factor, λSC, that is applied to the SC observations when estimating probabilities from SC data. Both models produced high market-share-based likelihood ratio indices, showing almost identical goodness of fit. The parameter estimates are similar between the models, although they appear to be slightly smaller in the joint model and yet have higher significance levels. This is likely due to the higher number of observations in the joint model.

28.5. Model Application

To test the models on data different from that on which they were estimated, both models were applied to hurricane Georges, which struck New Orleans in 1998. RP data collected by the University of New Orleans was used (Howell, 1998) and supplemented with time-dependent data in the same way as with hurricane Gustav. Time-dependent stated choice data was not available in this or other data sets, so the stated choice constant and scaling factor could not be tested in this application. The RP data was prepared in the required format and both models shown in Table 28.4 were used to estimate timed evacuation behaviour. However, alternative specific constants (ASCs) in the models were adjusted to ensure the model predicted the correct total number of evacuations. This was accomplished in the same manner in which ASCs are adjusted to match sample market shares with population market shares in a regular logit model (Ben-Akiva and Lerman, 1985), but with an adjustment in the process because the estimation and application data environments are different in this case. In this application, new alternative specific constants are estimated in an iterative process using the following expression:

\[
C^{k} = C^{k-1} + \ln\left(\frac{E}{\hat{E}^{k-1}}\right), \qquad k = 1, 2, \ldots, n
\]

where $C^{k}$ is the ASC for hurricane Georges data on iteration $k$ ($C^{0}$ is the ASC from Table 28.4), $E$ is the total number of evacuees in hurricane Georges data and $\hat{E}^{k-1}$ is the estimated number of evacuees in hurricane Georges data using ASC $C^{k-1}$. The iterative process was terminated when $\hat{E}^{k-1}$ approached $E$. This resulted in the ASC in the RP model changing from -5.91 to -6.76, and the RP ASC in the joint RP/SC model changing from -5.67 to -6.46. Using the modified ASCs, the models predicted the timed evacuation behaviour on hurricane Georges as shown in Figure 28.6.

[Figure 28.6 plot: number of evacuations against time to expected landfall (in hours) for the observed data, the RP model prediction and the RP/SP prediction.]

Figure 28.6: Comparison of predictions from the RP and RP/SP models.

The RP and joint RP/SP models predict evacuation behaviour on hurricane Georges similarly, but their predictions deviate from the observed evacuation pattern, particularly during the peak evacuation period. The relatively high sustained evacuation in the observed data over approximately 36 hours during the peak evacuation period is unusual but was probably caused by the fact that hurricane Georges retained its strength right up to landfall and its projected path was right over New Orleans. A chi-square test of the similarity of each predicted evacuation distribution with the observed distribution resulted in both being rejected at the 95 percent level of significance. The root-mean-square error between the observed and estimated timed evacuations was 19.2 evacuations per 6-hour time period for the RP model and 16.5 for the joint RP/SP model.

Another comparison between the conventional RP method of data collection and the new time-dependent stated choice method was of the cost of using each method. Detailed accounting was conducted during the execution of both surveys to establish the cost of conducting each survey. This involved careful timekeeping of the activities associated with either the RP or the SC data collection process. Time or cost that involved both surveys, such as preparing survey packages for mailing and the mailing costs, was divided equally. The result showed that the time-dependent stated choice method cost approximately 25 percent more than the conventional RP method. However, some of these costs may be due to more time being spent on the design of the SC survey because this was its first application.

Analysis of survey respondents' feedback showed that respondents not only enjoyed participating in the survey but also had positive experiences in filling out the stated choice questionnaire with the aid of visually presented hypothetical scenarios. In addition, survey respondents expressed willingness to participate in future visual surveys if approached.
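The iterative adjustment of the alternative specific constant described earlier in this section can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the household utilities and the observed total are hypothetical, and the logit is written so that a larger constant raises the evacuation probability, P = 1/(1 + exp(-(C + v))), which is the convention under which the update moves the prediction towards the observed total.

```python
import math


def expected_evacuees(asc, utilities):
    """Expected number of evacuating households for a given constant.

    utilities: one list per household holding the non-constant part of the
               utility of evacuating in each time period (hypothetical).
    """
    total = 0.0
    for periods in utilities:
        stay = 1.0  # probability the household has not evacuated yet
        for v in periods:
            p = 1.0 / (1.0 + math.exp(-(asc + v)))  # assumed sign convention
            stay *= 1.0 - p
        total += 1.0 - stay
    return total


def adjust_asc(asc0, utilities, observed, tol=1e-6, max_iter=200):
    """Iterate C_k = C_(k-1) + ln(E / E_hat_(k-1)) until E_hat is close to E."""
    asc = asc0
    for _ in range(max_iter):
        estimated = expected_evacuees(asc, utilities)
        if abs(estimated - observed) < tol:
            break
        asc += math.log(observed / estimated)
    return asc


# Hypothetical application data: 100 households, four periods each.
utilities = [[0.5, 1.0, 2.0, 2.5]] * 60 + [[0.0, 0.5, 1.0, 1.5]] * 40
asc0 = -5.91       # starting constant, as for the RP model in Table 28.4
observed = 20.0    # hypothetical observed number of evacuating households
adjusted = adjust_asc(asc0, utilities, observed)
```

Because the starting constant under-predicts the hypothetical total in this example, the adjusted constant ends up larger than the starting value; with application data that over-predicts, the same update lowers it.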

28.6. Conclusion

The objective of this study was to test a new method of collecting data on hurricane evacuation behaviour for use in evacuation modelling. In particular, the new method involved collecting time-dependent data using stated choice and comparing it with revealed preference data collected at the same time from the same respondents. Observed and stated behaviour were compared, and the two methods produced similar, but not identical, results. In particular, while the total number of evacuees differed, evacuation timing, destination type and mode of evacuation were similar. Respondents reacted to the survey in a generally positive manner and some even expressed their enjoyment in completing the DVD-based survey. Comments suggested that the respondent burden of the SC-based survey was less than we anticipated. Overall, the stated choice method appeared to produce behaviour that was plausible and logical, displaying a credible response to the conditions portrayed in the time-dependent scenarios. It is therefore concluded that the audio-visual method of time-dependent stated choice data collection employed in this study holds promise as a data collection technique, at least among hurricane evacuees in areas where respondents are familiar with storms of this nature. However, the results of this study are from one application, and more studies are required to confirm the value of the time-dependent audio-visual stated choice method of data collection.

The second method of assessing the new time-dependent method of data collection was to observe whether it was able to produce an evacuation demand model that performed better than one estimated on RP data with supplemental time-dependent data. It was found that a model estimated on RP data with added time-dependent data displayed a goodness of fit similar to that of a model estimated on combined RP and SC data. However, when the models were applied to data other than those on which they were estimated, the combined RP/SC model produced a root-mean-square error that was 14 percent lower than that produced by the RP model alone. The predicted distribution of time-dependent evacuation was significantly different from the observed evacuation in the non-estimation data set for both models, although the predictions from the combined RP/SC model more closely resembled the observed distribution than those from the RP model alone.

The third method of assessing the new time-dependent method of data collection was to compare its cost with that of more conventional methods of collecting dynamic evacuation behaviour. A detailed cost assessment revealed that the new method cost approximately 25 percent more than the regular method of RP data supplemented with dynamic information from external sources. However, at least part of this cost can be ascribed to the fact that this was the first time the new method was applied, and it consequently required more time and effort in designing the process than would be the case in later applications. In the end, though, if SC data are to be used in combination with RP data in establishing a model, then the new method will always cost more than an RP data collection exercise on its own because additional information has to be collected.
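The cost comparison rests on a simple allocation rule described in the chapter: effort specific to one survey is charged to that survey, while effort shared by both (such as preparing and mailing survey packages) is split equally. A minimal sketch of that rule follows. The function name and the staff-hour figures are hypothetical, chosen only so that the resulting premium matches the reported order of magnitude; they are not the study's accounts.

```python
def allocate_costs(rp_only, sc_only, shared):
    """Charge survey-specific cost to each survey and split shared cost equally.

    Returns (RP total, SC total, SC premium relative to RP).
    """
    rp_total = rp_only + shared / 2
    sc_total = sc_only + shared / 2
    premium = (sc_total - rp_total) / rp_total
    return rp_total, sc_total, premium


# Hypothetical staff hours (illustrative only).
rp_total, sc_total, premium = allocate_costs(
    rp_only=300.0, sc_only=400.0, shared=200.0
)
print(f"RP: {rp_total:.0f} h, SC: {sc_total:.0f} h, SC premium: {premium:.0%}")
```

Under this rule the size of the shared component affects the relative premium: the larger the shared cost, the smaller the percentage difference between the two methods for a given absolute difference.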
A combined RP/SC model is recommended over a model estimated on either RP or SC data alone because the two data collection techniques each bring their own strength to the estimation process and their strengths complement each other. SC data allows a wide variation in variable values to enter the estimation process thereby providing the means to accurately determine the impact of these variables on evacuation behaviour; RP data allows the predictions from the model to be rooted in reality thereby making them more realistic. It is concluded that the method of audio-visual, time-dependent stated choice data collection described in this paper is capable of producing credible data on hurricane evacuation behaviour. The process can be improved by providing more realistic audio-visual scenarios, increasing the number of scenarios in each storm, and possibly incorporating video game technology to enhance animation and allow interactive participation of respondents.

Acknowledgement

The research on which this paper is based was funded by the Louisiana Transportation Research Center.



Chapter 29

WORKSHOP SYNTHESIS: SURVEY METHODS TO INFORM POLICY MAKERS ON ENERGY, ENVIRONMENT, CLIMATE AND NATURAL DISASTERS

Gerd Sammer and Juan de Dios Ortúzar

29.1. Introduction and Scope

Specialised survey and analysis tools are needed to study the interaction of the transport system and the environment, especially regarding energy demand, atmospheric emissions and impacts on ecosystems, as well as its disruption by extreme natural events such as earthquakes, cyclones and floods. These issues were explored and discussed at a special workshop at the recent Puyehue conference.1 In particular, we discussed the data collection methods and variables required for modelling the environmental performance of transport systems and the potential to improve it through changes in technology, infrastructure or regulations. Of particular interest were the dynamics of user behaviour when facing unexpected events. The topic is broad, so it was not possible to cover all issues in the same detail.

1. Participants: Ricardo Alvarez Daziano (USA), Julián Arellana (Chile), David Pérez Barbosa (Colombia), Floridea di Ciommo (Spain), Akimasa Fujiwara (Japan), Nicolas Haverkamp (Germany), Boris Jäggi (Switzerland), Tuuli Jarvi (Finland), Anders Jensen (Denmark), Karen Lucas (UK), Malesela Makgeta (South Africa), Claudine Moutou (Australia), Juan de Dios Ortúzar (Chile) (Rapporteur), Gerd Sammer (Austria) (Chair), Oscar Sánchez (Mexico), Françoise Potier (France), Herrie Schalekamp (South Africa), Katlego Setshogoe (South Africa), Chester Wilmot (USA).

Transport Survey Methods: Best Practice for Decision Making Copyright © 2013 by Emerald Group Publishing Limited All rights of reproduction in any form reserved ISBN: 978-1-78190-287-5

29.2. State-of-the-Art Applications

Wilmot and Gudishala (2011) discuss the collection of time-dependent data using audio-visual stated choice (SC) methods for giving advance warning of disasters,
such as hurricanes and wildfires. A video with a sequence of scenarios related to hypothetical storms shows the prevailing conditions, at discrete points in time, as each storm approaches land. Although respondents are not told this, one of the hypothetical storms is the same as an actual storm which they experienced in the past, and for which they were required to report their behaviour in a revealed preference (RP) survey. The RP data were later supplemented with time-dependent data from official sources, and hurricane evacuation demand models were estimated on the basis of these enhanced RP data, as well as on a combination of these and the time-dependent SC data. Stated and actual behaviour were compared, showing that the former was similar, though not identical, to real behaviour. Also, when the estimated models were applied to a different data set, the combined time-dependent RP/SC model performed slightly better than the model for enhanced RP data alone. Detailed cost accounting revealed that the time-dependent SC data were 25% more expensive to collect than the enhanced RP data, although some of this extra cost could be due to the fact that the former were collected for the first time.

Daziano and Chiew (2011) addressed the topic of data needs to forecast consumer reactions to sustainable energy sources in personal transport. Since standard vehicles rely on fossil fuels, electric/hybrid vehicles have been proposed to promote sustainable transport. The chapter discusses the data required to estimate a general demand model for vehicle purchases at an individual level. The list of potential data items considers the particularities of low-emission vehicles with emphasis on the trade-offs between their cost-reliability and their environmental benefits as well as the potential to evaluate welfare-improving policies related to the adoption of energy efficient technologies.

Arellana, Ortúzar, and Rizzi (2011) dealt with the issue of data to model time-of-day choice. This choice involves a complex decision-making process influenced by travel conditions, congestion levels, activity schedules and external trip factors. The proposed survey process had three stages and used various collection methods (e-mail, web page and personal interviews at the workplace) in order to satisfy efficiency, reliability and cost criteria. The first was an RP component based on a travel diary following an activity-based framework; relevant level-of-service values at different time periods were obtained from GPS data gathered from instrumented vehicles in public and private transport networks. The second was an SC-off-RP optimal design which, by considering dependence among attribute levels, yielded realistic choice situations for time-of-day modelling in both trip and tour contexts. In the third, several questions using a 1–7 Likert scale were included to incorporate latent variables. Non-response rates were comparatively high, suggesting that more effective strategies should be developed to maximise survey response.

Three of six posters related to discussions at the workshop proved of interest. Jäggi, Dobler, and Axhausen (2011) presented a survey combining an SC experiment and a priority evaluator. They looked at ways in which people could invest in energy efficiency in general, and considered the differences between investments in private homes and in private transport. The exercise consisted of a paper-and-pencil questionnaire with the SC experiment followed by an internet-based priority evaluator. Both components
were personalised to present respondents with meaningful choice sets. In the SC experiments, respondents were asked to choose between four alternative responses to increases in fuel prices: insulating the house, buying a heat pump, buying a new and more efficient car, or selling the car and switching to public transport. In the priority evaluator, respondents interactively used an internet application to optimise their CO2 output, choosing between long-term investments and short-term measures. As the same participants were questioned in two different ways, results could be directly checked for consistency. Data from the priority evaluator not only replicated the narrow experiment space of long-term investment choices but also expanded it to short-term choices (i.e. reducing the kilometres driven per year or omitting flights). Although a fairly complex choice set was used, most respondents managed to complete their tasks.

Moutou, Greaves, and Puckett (2011) analysed the methods used during a pilot survey of small business owners and managers in Sydney designed to assess responses to various sustainable transport initiatives. The survey was conducted as a hybrid computer-assisted personal interview with the help of an internet-enabled Apple iPad, partly to improve interest in the survey and respondents’ willingness to participate. The chapter reports on the effectiveness of various recruitment approaches, the characteristics of those who responded and those who did not, together with the main reasons for non-response.

Finally, Tuuli (2011) considered the issue of combining different methods for measuring road traffic volumes (traffic counts, odometer readings, total fuel consumption, etc.) by car type and age in Finland, a key input for a model to calculate emissions. She concluded that detailed information by car type and on the characteristics of the respective owner is required.

The diversity of the above discussions shows that this is a very broad subject of interest indeed. Although we only managed to cover certain aspects, chosen more or less at random, they demonstrate that there are new questions which cannot be answered sufficiently well with currently existing survey methods.

29.3. Important Subjects for Future Transport Policy Decisions

It is worth mentioning that, in general, experts and decision makers responsible for transport policy do not necessarily share the same opinions regarding the topics and contents required to cope with traffic problems in a sustainable way. When trying to implement solutions, experts frequently wonder to what degree political decision makers are really interested in having reliable data on which to base traffic-related decisions; after all, such data could significantly limit their freedom in decision making. Many traffic-related decisions are guided by the interests of influential lobbying groups and are difficult to influence by factual arguments. On the other hand, consistent research results have gone a long way towards producing a fairly broad agreement among experts that technical solutions, as well as changes in behaviour and an internalisation of the external costs of transport, are essential to cope with future transport problems.


Notwithstanding, those who are responsible for taking traffic-related decisions usually look only at possible technical solutions. Other types of measures are unpopular and currently meet with little approval from the population and business representatives. Future traffic-related challenges identified in the workshop discussions can be summarised as follows:

• Need to guarantee an environmentally sound future for traffic in the absence of fossil fuels, with reliance on a secure and sustainable energy supply; answers to these questions must be found before a serious supply shortage of fossil energy takes place.
• Need to devise strategies for avoiding traffic-related climate changes and to adapt to expected climate changes; this includes the development and evaluation of environmentally friendly technologies for modern transport (e.g. battery electric cars), but also of sound evacuation programmes in case of natural disasters.
• Need to guarantee the provision of a (yet to be defined) basic accessibility for all mankind; the transport-related objectives of more developed countries differ considerably from those of less developed nations. Highly developed countries focus mainly on maintaining/guaranteeing current mobility levels, whilst, at least in the short term, developing countries struggle to achieve a minimum level of accessibility and mobility to improve their living conditions. These different objectives may lead to different policy measures, and different data needs and survey methods.
• Need for a significant increase in the acceptance of necessary but unpopular transport-related decisions; a key problem here is the internalisation of the external costs of transport. To achieve this, certain key policy measures are required, such as the introduction of area-wide road pricing or an environmental tax on fossil energy (Hössinger et al., 2011). These measures, required to achieve pre-specified targets, must be financially feasible, gain sufficient approval and distribute the burden fairly among various population groups and income classes.

29.4. Identification of Open Questions, Needs and New Solutions

When dealing with the above topics during the workshop, it became clear that different groups of experts (i.e. social scientists vs. traffic engineers and transport scientists) held vastly different opinions. For example, the social scientists suggested starting with the discussion and identification of the data and variables that are normally lacking when providing decision makers with quality information. The traffic engineers/transport scientists, however, suggested that an identification of the problems was a prerequisite for the identification of potentially missing variables and data. Such differences in opinion appeared several times during discussions regarding the development of suitable survey methods to face the above challenges and to
establish methodological priorities. Social scientists were mainly in favour of developing and applying new and innovative survey methods (e.g. observation of ‘real life’ experiments), while transport scientists preferred an evolutionary development of already existing survey methods (i.e. RP and SC approaches). The reasons for these differences could be traced back to different methodological approaches, but they could also be attributable to personal attitudes that may not necessarily be representative of the respective disciplines. A compromise was found which led to one obvious conclusion: to optimise travel behaviour surveys, closer cooperation is needed among the different interested disciplines to integrate and reconcile sometimes vastly different viewpoints.

29.5. Data Needs to Deliver Valid Information for Transport Policy

Once objectives became clear, a hotly contested discussion allowed us to identify several types of data and variables required for transport policy.

29.5.1. Interactions between Regular Travel Behaviour and Environmental Effects

Travel behaviour data include all traditional and currently used influential variables necessary for modelling and estimating future travel behaviour. However, some influential variables have hardly been considered in transport surveys and models so far, such as attitudes towards the environment and user knowledge about cause-effect interdependences between the transport system and its environmental impacts. We need to investigate how much these variables influence travel behaviour as represented by the following decisions (Sammer & Hössinger, 2011), which may go beyond those considered in traditional applications:

• Choices regarding place of residence and/or place of work, which impact traffic and incorporate aspects relevant to the environment (Galilea & Ortúzar, 2005; Sillano & Ortúzar, 2005);
• Choices regarding ownership/availability of transport modes (i.e. cars with different modes of drive, bicycles, etc.);
• Decisions regarding physical mobility (trip generation), destination choice, and mode, route and time-of-day choice; and
• Reflections on traffic-related decisions made in the past to create the basis for new decisions (Yañez, Mansilla, & Ortúzar, 2010).

The reflection on past travel behaviour can be achieved with the help of behaviour change programmes (Ampt, Knight, & Adelaide, 2008; Brög & Erl, 2008), but in future there might be automated comparisons of behaviour with the aid of traffic telematics systems currently under development.


29.5.2. Under-Coverage of Non-Motorised Modes of Transport in Traditional Travel Behaviour Surveys

Non-motorised trips (pedestrians, cyclists) are underestimated by up to 30% in traditional surveys as far as the length of daily trips is concerned (Kohla & Meschik, 2011), as such trips are easily forgotten or because users do not consider them important. This leads to a significant underestimation in comparison with motorised traffic; this also means that decision makers and experts seldom consider such trips in their planning strategies. But non-motorised traffic is the most environmentally friendly and sustainable mode of transport. Special survey methods need to be developed to overcome this problem. The application of GPS technology is one method currently being developed.

29.5.3. Travel Behaviour Data Regarding Use of New Transport Technology and Interaction Data about Its Environmental Impact

Little is known about the reaction of transport users to new developments in transport technology. Among these are electric, hybrid and hydrogen-driven vehicles and the use of synthetic fuels and biofuels; but this item also covers new forms of organisation, such as door-to-door services offered as a ‘one-stop-shop’ for intermodal transport chains, which could compete against motorised private transport (Stark, Link, Raich, & Sammer, 2011).

29.5.4. Travel Behaviour in Reaction to Environmental Regulatory Measures for Modes Which Use Fossil Energy

One such measure might be the establishment of low-emission zones where vehicles not meeting clearly defined standards regarding their environmental effects are prohibited.

29.5.5. Vulnerability of Transport System to Natural Disasters and Catastrophes Caused by Humans

The vulnerability of the transport system to unexpected developments and the effectiveness of suitable measures to avoid or reduce their negative impacts are of concern. Such events might be either natural disasters (tsunamis, avalanches, earthquakes, hurricanes and the like) or catastrophes caused by humans, such as terror attacks and nuclear accidents. The effects can be short or long-term and involve avoidance, adaptation and evacuation measures. This requires the provision of databases for emergency planning which concern decision makers, emergency services and the population as a whole. It is crucial to provide the right information at the right moment for certain target groups who need it to take sensible decisions
about when to inform the population and what kind of information to provide (e.g. alerts, recommendations and compulsory evacuation).

29.5.6. Data about Attitudes and Acceptance by Users and the Population as a Whole

Many key transport-related measures are not popular since they involve restrictions and burdens for both transport users and the population as a whole. Analyses show (Sammer, Hössinger, Mensik, & Voigt, 2002) that it is possible to convince some affected parties of the necessity and usefulness of measures if reliable and factual information is provided. Specific survey methods are required to adapt the information to the respective needs of different target groups.

29.5.7. Data about Attitudes and Acceptance to Increase the Awareness of Decision Makers

Although most traffic experts take it as a fact that the current transport system is not developing in a sustainable way, the majority of traffic-related decision makers do not share this opinion (Sammer et al., 2002). This is partly due to a lack of information among decision makers but also due to their attitudes: they tend to consider short-term economic targets as more important than sustainable solutions. To change this situation, data for policy makers need to be processed in different ways. Board games, as used in strategic military planning, and also focus groups might be interesting potential solutions.

29.6. Methodological Requirements for Survey Methods to Inform Political Decision Makers

As mentioned above, experts in fields such as sociology and transport engineering do not necessarily agree about the kind of research which is lacking, the most suitable methodological approaches, and the priority regarding development of innovative survey methods. The following text includes some of the suggestions discussed.

29.6.1. Evolutionary Development of RP and SP Surveys

Using RP and SP survey methods to attempt to explain travel behaviour has two weaknesses. First, if the analyst is not careful, the design can imply combinations which are implausible to respondents, leading to biased answers. Second, the information which respondents already have about travel behaviour alternatives is an important but generally neglected influential factor in RP and SP surveys. This is of particular importance for the issues considered in this chapter. The inclusion of latent and psychological variables in the analysis of RP and SP survey data helps to significantly improve the explanatory quality of travel behaviour
models (Arellana et al., 2011). Latent variables allow covering the influence of attitudes and awareness indirectly and can also be used for the creation of behaviourally homogeneous groups.

29.6.2. The Collection of Time-Dynamic Variables for SP Methods

Collecting time-dynamic variables over short time periods, and modelling with them, is a particular challenge. This is especially important for modelling decision-making processes related to travel behaviour in the case of natural disasters. This is true for the population as a whole but also for decision makers and their request to start evacuations. Since it is quite time-consuming and tiring for the target individuals to take part in SP surveys for several time periods, audio-visual or computer-animated survey instruments can be very useful in this context (Wilmot & Gudishala, 2011).

29.6.3. Survey Methods to Investigate Attitudes and the Raising of Awareness

Over long periods people’s attitudes and awareness may change dynamically (Sammer, 1999) and such changes have a significant impact on travel behaviour. Specialised surveys to measure this may involve decision makers from various political parties, traffic experts, stakeholders and journalists. Collecting and analysing time series data is important but only represents knowledge at specific points in time. As far as the subjects covered in this chapter are concerned, panel surveys offer far better information (Yañez et al., 2010). People participating in a panel are asked questions at regular intervals. To counter panel attrition and to increase response rates it has proved useful to replace some members at regular intervals with newly recruited target people (Zumkeller, Madre, Chlond, & Armoogum, 2006). A panel makes it possible to consider the change and variance in behaviour of each person and to ask for the reasons for such behavioural changes from one round to the next. Notwithstanding, panel participation can lead to panel conditioning and thus to biased results. The nature of the bias is worth studying, especially in attitudinal surveys. The analysis of panel data using the tools of repeated cross-section data is useful, but not the only possibility.
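The idea of replacing some panel members at regular intervals can be sketched as a simple rotating panel. The function below is an illustrative assumption, not the procedure of any cited study: the replacement fraction, the respondent identifiers and the random selection of leavers are all hypothetical choices made for the example.

```python
import random


def rotate_panel(panel, recruit_pool, replace_fraction, rng):
    """Return the next wave: a fraction of members leaves, fresh recruits join.

    Recruits are taken (and removed) from the end of recruit_pool.
    """
    n_replace = int(len(panel) * replace_fraction)
    leaving = set(rng.sample(panel, n_replace))
    staying = [member for member in panel if member not in leaving]
    recruits = [recruit_pool.pop() for _ in range(n_replace)]
    return staying + recruits


rng = random.Random(0)            # fixed seed so the sketch is repeatable
panel = list(range(20))           # hypothetical respondent identifiers
pool = list(range(100, 130))      # hypothetical reserve of new recruits

waves = [panel]
for _ in range(3):
    waves.append(rotate_panel(waves[-1], pool, 0.25, rng))

# Members present in consecutive waves can be compared person by person.
overlap = len(set(waves[0]) & set(waves[1]))
print(f"panel size per wave: {[len(w) for w in waves]}, overlap wave 1-2: {overlap}")
```

With a quarter of the panel replaced each round, three quarters of respondents remain available for person-level comparison between consecutive waves, which is the property that distinguishes a panel from repeated cross-sections.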

29.6.4. Qualitative Survey Methods

Qualitative survey methods are particularly suited for first-time explorations of any subject. Examples include the purchase and use of car technologies that have not been widely experienced because they are not yet available, or the design of measures to implement the internalisation of external mobility costs. It is also important to consider variables related to unreliable information when making travel-related decisions. Qualitative survey methods have as an objective the investigation of cause-effect interdependences and the development of hypotheses
which are then further investigated with the help of quantitative methods; a good example is the need to investigate the factors influencing the purchase and use of battery-electric cars (Stark et al., 2011). Various types of qualitative survey methods can be used in our field, for example narrative interviews and group discussions. Quantitative and qualitative surveys complement each other very well. This complementarity should lead to their integration in order to achieve synergies in overall data quality.

29.6.5. Controlled Experiments and Application Tests

These are tried and tested methods in psychology and sociology, designed to test interdependencies of measures and framework conditions, and to analyse their effects. The experiments differ regarding their closeness to real-life behaviour as they can be designed as ordinary or blind field tests (i.e. experiments which are run with participants who lack information about the experiment). If participants are informed, this may lead to biases similar to those in laboratory tests. Experiments that work with volunteers are naturally not fully representative. Controlled experiments can put participants into virtual markets and guide them with the help of some software tool. The difference from SP methods is that participants do not interact with an interviewer but with other people acting in the market, in a way similar to board game techniques. In traffic engineering there are some examples of controlled experiments:

• In the 1990s, 30 km/h was introduced as the legal speed limit for secondary roads in Graz (Sammer, 1997). As the measure was controversial, it was agreed to evaluate it after one year of operation before making the final decision. The evaluation also included a mode and route choice RP survey, a survey about attitudes to the speed limit over time, and observations of the interaction of car drivers, cyclists and pedestrians. Results helped explain an increase in acceptance of the measure (from 44% to 80%) in one year. So, controlled experiments combined with suitable RP data can help to raise awareness and to guide efforts to increase the acceptance of controversial measures.
• As part of the European research project Icaro, car sharing projects were initiated as controlled experiments in seven European towns with different basic conditions and accompanied by various types of measures. The effects upon behaviour and the environment were evaluated in a uniform way, using a standardised before-and-after survey as a basis.

For controlled experiments, before-and-after surveys of travel behaviour and attitudes are generally conducted, frequently with the help of a panel but also as interactive in-depth surveys. There are some methodological challenges involved (i.e. panel attrition or lack of representativeness in small-scale experiments). Controlled experiments can be useful if their implementation costs are low, as little money is then wasted. They permit a quick test of controversial projects in real life. The
approach can also be particularly useful to gather information about the use of new technologies such as battery-electric cars.

29.6.6. Use of Innovative Survey Techniques (GPS Tracking, iPad, Smartphone, Internet Surveys, Collection of Mobile Radio Data)

The use of new technologies offers a number of promising advantages for travel behaviour surveys. For example, it improves the recording of trips which are frequently not covered by traditional surveys (i.e. short car trips or non-motorised trips). However, a number of problems have not yet been fully solved; these relate to the automated detection of stops between trips, trip purposes and modes of transport, the high rate of refusal, etc.
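To make the problem of automated stop detection concrete, the following is a minimal sketch of one common heuristic: a stop is declared when consecutive GPS fixes stay within a given radius of an anchor point for at least a minimum duration. The thresholds (50 m, 5 minutes), the function name and the sample track are assumptions for illustration only and are not taken from any study cited in this chapter. Coordinates are assumed to be already projected to metres.

```python
import math


def detect_stops(points, radius_m=50.0, min_duration_s=300.0):
    """Find dwell episodes in a time-ordered list of (time_s, x_m, y_m) fixes.

    Returns a list of (start_time, end_time) tuples, one per detected stop.
    """
    stops = []
    i, n = 0, len(points)
    while i < n:
        t0, x0, y0 = points[i]
        j = i
        # Extend the episode while the next fix stays near the anchor fix.
        while j + 1 < n and math.hypot(points[j + 1][1] - x0,
                                       points[j + 1][2] - y0) <= radius_m:
            j += 1
        if points[j][0] - t0 >= min_duration_s:
            stops.append((t0, points[j][0]))
            i = j + 1          # continue after the stop
        else:
            i += 1             # not a stop; try the next fix as anchor
    return stops


# Hypothetical track sampled every 60 s: travel, a 10-minute dwell with
# a few metres of GPS jitter, then travel again.
track = [(t * 60, t * 200.0, 0.0) for t in range(5)]
track += [(300 + k * 60, 1000.0 + (k % 2) * 5.0, 0.0) for k in range(11)]
track += [(960, 1200.0, 0.0), (1020, 1400.0, 0.0)]

stops = detect_stops(track)
print(stops)
```

A rule of this kind identifies where and when a traveller paused, but not why; trip purpose and mode still have to be inferred from other data or asked of the respondent, which is part of the unresolved problem noted above.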

29.6.7. Interdisciplinary Data Collection

The lack of interdisciplinary cooperation regarding travel and attitudinal surveys is deplorable. Sociologists, psychologists, market researchers and traffic experts hardly ever cooperate in the design and conduct of travel behaviour and attitudinal surveys. This has nothing to do with methodology but has a strong impact on quality, as experiences cannot be exchanged and potential synergy effects are ignored.

29.6.8. Methods Used in Long-Term Surveys to Identify Rare Events

Some events, such as environmental alerts or natural disasters, occur infrequently but have significant effects (e.g. bans on driving cars or traffic closures). Such events are difficult or even impossible to predict, and it is difficult to prepare a survey which would take place at an unknown point in time. In general, retrospective survey methods can be used, but these capture past travel behaviour incompletely and inaccurately. New data collection techniques, such as GPS devices installed in cars, offer the possibility of analysing specific periods in time; furthermore, the data gathered might be combined with that from in-depth interactive interviews and supplemented by retrospective surveys after the event. Continuous surveys are also useful for capturing rare events, especially those that occur unpredictably.

29.6.9. Appropriate Survey Methods for Developing Countries

An important question relates to the potential need to differentiate between survey methods in developed and less developed countries. In developing countries data may be needed for planning an environmentally friendly basic transport infrastructure.


It should be possible to conduct a survey with simple cost-efficient methods. Compared with this requirement, the problems with data collection and surveys addressed in this chapter seem like ‘luxury demands’ of highly developed countries. At the workshop in the Puyehue conference, this discrepancy became apparent as 90% of participants came from more developed countries. The experience suggests that it may be better to cover the data needs and survey requirements of less developed countries separately.

29.7. Summary and Recommendations

This chapter considered a variety of themes and presented a range of methodological solutions to gather valid information for political decision-making. Key issues discussed can be summarised as follows:

• To make sure that traffic will still be possible in future we need to address its environmental compatibility and focus on mobility without fossil fuels;
• There is a need to develop efficient strategies in the transport sector to adapt to existing climate changes and avoid further climate changes;
• There is a need to develop procedures to make certain key but frequently unpopular policy measures significantly more acceptable to decision makers, stakeholders and the population as a whole.

To achieve this, a range of variables and data are required. First, existing data and methods need to be specified appropriately so that they cover all relevant issues related to the environment and to social questions; the goal is to allow consideration of sustainable development policies in our transport models. Second, there is a range of issues which require suitable survey methods. Examples are the behaviour of transport users with regard to new technological solutions (e.g. battery-electric vehicles), extreme energy price increases and energy shortages; the behaviour of individuals in the case of rare events such as natural disasters; and attitudes to, and increased awareness of, unpopular measures intended to improve traffic-related decisions.

The main current challenge is how to design a useful and cost-effective data collection process for the aforementioned cases. An evolutionary development of currently available procedures seems to be an option (e.g. including newly required variables in SP and RP surveys). This ‘pragmatic’ approach was favoured by the traffic experts participating in the workshop, but the development of completely new methods was also suggested by social scientists at the same workshop. Among these, controlled experiments and survey methods using new technologies were mentioned.

Workshop participants agreed that an interdisciplinary approach to the challenges offered by data needs in these cases was essential: if experience from different professional areas could be used, this should lead to synergy effects and improve quality. It seems advisable to cooperate in order to speed up the development of high-quality survey methods and so improve data quality even further. The variety of survey methods addressed makes it advisable to offer specialist workshops at the next ISCTSC Conference. Finally, developing countries may require specific survey methods which were not covered due to lack of time and expertise. Therefore, this subject could also be given more priority at the next ISCTSC Conference.

References

Ampt, E., Knight, S., & Adelaide, M. (2008, July 16–18). A personal responsibility to behaviour change. Proceedings of the 4th International Symposium on Travel Demand Management, Vienna–Semmering.

Arellana, J. A., Ortúzar, J. de D., & Rizzi, L. I. (2011, November 14–18). Survey data to model time-of-day choice. Paper presented at 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile.

Brög, W., & Erl, E. (2008, July 16–18). Global problems need global solutions: Behavioural changes in three continents. Proceedings of the 4th International Symposium on Travel Demand Management, Vienna–Semmering.

Daziano, R. A., & Chiew, E. (2011, November 14–18). Electric vehicles rising from the dead: Data needs for forecasting consumer response toward sustainable energy sources in personal transportation. Paper presented at 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile.

Galilea, P., & Ortúzar, J. de D. (2005). Valuing noise level reductions in a residential location context. Transportation Research, 10D, 305–322.

Hössinger, R., Link, Ch., Raser, E., Sammer, G., Stark, J., Lechner, … Sonntag, A. (2011). Emissionshandel im Strassenverkehr, Moeglichkeiten und Auswirkungen eines eu-weiten CO2-Zertifikatehandels fuer den Strassenverkehr in Oesterreich. Research project MACZE, Institute for Transport Studies, Vienna.

Jäggi, B., Dobler, C. H., & Axhausen, K. W. (2011, November 14–18). Surveying energy efficiency in housing and transport using a priority evaluator. Poster presented at 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile.

Kohla, B., & Meschik, M. (2011, November 14–18). Comparing trip diaries with GPS tracking: Results of a comprehensive Austrian study. Paper presented at 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile.

Moutou, C., Greaves, S., & Puckett, S. (2011, November 14–18). Response to sustainable transport initiatives: A survey of small business owners. Poster presented at 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile.

Sammer, G. (1997). A general 30 km/h speed limit in the city: A model project in Graz, Austria. In R. Tolley (Ed.), The greening of urban transport. Chichester: Wiley.

Sammer, G. (1999, September 27–29). Attitudes toward transport policy in the light of the result of long-term city transport policy. Proceedings European Transport Conference, Cambridge.

Sammer, G., & Hössinger, R. (2011, November 14–18). Contribution to identify and quantify the survey bias for stated response surveys caused by immanent knowledge and awareness raising information during the stated response interview. Poster presented at 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile.

Sammer, G., Hössinger, R., Mensik, K., & Voigt, H. Ch. (2002). Analyse und Erklaerung der verkehrspolitischen Einstellungen von Entscheidungstraegern, Interessensvertretern und Bürgern. Schriftenreihe des Bundesministeriums für Verkehr, Innovation und Technologie, Forschungsarbeiten aus dem Verkehrswesen, Band 122, Vienna.

Sillano, M., & Ortúzar, J. de D. (2005). Willingness-to-pay estimation with mixed logit models: Some new evidence. Environment and Planning, 37A, 525–550. Retrieved from http://dx.doi.org/10.1068/a36137

Stark, J., Link, Ch., Raich, U., & Sammer, G. (2011). Nutzer- und Marktpotential; verkehrs- und umweltrelevante Auswirkungen von Smart-Electric-Mobility (Chapter 2.6). Final Report of SEM Smart Electric Mobility, Research Project of New Energies 2020, Project Consortium Technical University Vienna, Austrian Institute of Technology and Institute for Transport Studies, Vienna.

Tuuli, J. (2011, November 14–18). Combining different methods for measuring road traffic volumes by car type and age (traffic counts, odometer reading, ANPR, NTS and fuel consumption) in Finland. Poster presented at 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile.

Wilmot, C., & Gudishala, R. (2011, November 14–18). Collection of time-dependent data using audio-visual stated choice. Paper presented at 9th International Conference on Transport Survey Methods (ISCTSC), Termas de Puyehue, Chile.

Yañez, M. F., Mansilla, P., & Ortúzar, J. de D. (2010). The Santiago Panel: Measuring the effects of implementing Transantiago. Transportation, 37, 125–149. Retrieved from http://dx.doi.org/10.1007/s11116-009-9223-y

Zumkeller, D., Madre, J. L., Chlond, B., & Armoogum, J. (2006). Panel surveys. In P. Stopher & C. Stecher (Eds.), Travel survey methods, quality and future directions. Amsterdam: Elsevier.

THEME 7 NEW PERSPECTIVES ON OBSERVING CHOICE PROCESSES: PSYCHOLOGICAL FACTORS

Chapter 30

Factors Affecting Respondents’ Engagement with Survey Tasks$ Peter Bonsall, Jens Schade, Lars Roessger and Bill Lythgoe

Abstract

Purpose: The research was designed to explore people’s willingness/ability to understand complex road user charges. However, the results raise issues about respondent engagement and ecological validity and so have important implications for questionnaire practice.

Methodology: Computer-based experiments administered in the United Kingdom and Germany gathered respondents’ estimates of road user charges along with their response latencies, personal characteristics, acceptance of road charging, assessments of task complexity and attitudes to analytical tasks.

Findings: The results demonstrate questionnaire learning effects and show the effect of personal characteristics on the accuracy and speed of questionnaire completion. The tendency of males, younger people and students to complete the task more quickly is interesting, as is the fact that fewer and smaller errors were made by participants who claimed to gain satisfaction from completing a task which has involved mental effort. Engagement was seen to vary with personal characteristics, attitudes to decision making, task complexity and acceptance of the policy being tested. A key finding is that disengagement was more evident among participants who were broadly supportive of road charging than among those who were not.

$ Revised and extended from a paper presented at the IATBR conference in Jaipur in December 2009 under the title: ‘Factors Affecting People’s Engagement the Assessment of Road Charges in an Experimental Setting’.



Implications: The findings have important implications for the design of data collection exercises and for the interpretation of resulting data. It is concluded that repeated choice experiments are an inappropriate source of data on responses to unfamiliar circumstances. The collection of data on response latencies and the inclusion of questions on respondents’ attitudes to task completion are strongly recommended additions to standard questionnaire practice. The extent to which disengagement in an experimental context is, or is not, indicative of real-world behaviour is an important and urgent subject for further research.

Keywords: Questionnaire; task engagement; task performance; road charging; acceptance; heuristic decision-making; ecological validity; psychology

30.1. Introduction

30.1.1. Background

This chapter presents evidence on people’s engagement with a questionnaire task: the estimation of road charges for a specified journey in a hypothetical context. The data came from an experiment developed as part of an EU-funded investigation1 of the extent to which complex charging schemes might fail to achieve their potential if people were unwilling or unable to deal with complex pricing signals. The original aim was to explore these issues and draw conclusions on their implications for policy. However, as the work progressed, it became clear that our results would have wider implications for data analysts and modellers. In particular, the results throw light on the issues of task engagement and learning effects and raise questions about the ecological validity of questionnaire tasks. These issues are addressed towards the end of the chapter.

30.1.2. Previous Work

Our work builds on previous research in three areas: decision-making strategies, decision latencies and the effect of topic acceptance on task engagement. Previous research on people’s response to complex decision tasks has identified the role of simplification strategies and the use of heuristics when tasks become too demanding or when there is low motivation to complete a full analysis of all the relevant information (e.g. Baron, 2008; Newell & Simon, 1972). In the current context, Bonsall, Shires, Maule, Matthews, & Beale (2007) found that many people have a limited ability or motivation to respond to complex price signals and a limited ability to assess the implications of complex pricing regimes.

1. In project DIFFERENT — see www.different-project.eu


Task engagement may be defined as a serious attempt to complete an exercise to the best of one’s ability. Previous research on factors affecting people’s engagement with tasks has identified the importance of their ability and motivation to perform any required analysis. The ‘ability’ issue can be subdivided into (i) their cognitive intellectual capacity and (ii) their access to any relevant facts. The ‘motivation’ issue is more complex. It includes not only the question of tangible external incentives or rewards for identifying the ‘right’ solution (or penalties for failing to do so), but also their attitude to the topic in question (e.g. their perception of its relevance to themselves or to their aspirations) and their general attitude to intellectual tasks (e.g. their need for cognition, their need to evaluate or their tolerance of ambiguity; see, respectively, Cacioppo & Petty, 1982; Jarvis & Petty, 1996; and Budner, 1962 or Lamberton, Fedorowitz, & Roohani, 2005).

Previous research on decision latencies (e.g. Bettman, Johnson, & Payne, 1990) has established that the time taken to complete a task is related to the nature and inherent complexity of that task and to the decision context, and that there can be a significant learning effect. Learning, in this context, is manifest as an increase in the speed and/or accuracy with which a given task is completed. Latencies and accuracies are known to vary significantly between individuals.

A previous paper (Bonsall & Lythgoe, 2010) used part of the current data set to explore the amount of effort which participants devoted to answering the questions and focused on the interaction between latency, complexity, propensity to error and decision-making style. The current chapter builds on that analysis, employing a larger data set and paying greater attention to the question of whether the participant’s performance is affected by (in psychological terms, moderated by) their attitudes towards road charging policy.

30.1.3. The Data Source

The data analysed here were derived from an experiment conducted in laboratory conditions in Leeds (United Kingdom) and Dresden (Germany). The experiment, conducted with just over 300 participants, was designed to explore the relationship between participants’ attitudes to a controversial policy intervention (road charging) and their engagement with the task of assessing the implications of complex examples of such policies and how they might respond to them. The experimental protocol, including a list of questions asked, is provided in Appendix 30.A.1 but, for convenience, is briefly summarised here.

There were three parts to the experiment. In the first part, the participant was presented with descriptions of road charging schemes in an entirely hypothetical network. Five such schemes, labelled A–E, were presented in randomised order, each with a different level of complexity; the most complex had three zones and two time periods. Figure 30.1 shows how, in the English language version, the simplest and most complex schemes (A and E respectively) were presented. Equivalent presentations were used in the German version. After each presentation the participant was asked to estimate what the charge would be for a specified journey in that network, how certain they were about that estimate (task certainty), and how easy it had been to understand the price structure (task difficulty).

Figure 30.1: Screen dumps for Schemes A (left) and E (right).

In the second part of the experiment, the participant was given a description, accompanied by an appropriate map, of a charging scheme which might be introduced in their home city and was asked to estimate what such a scheme might cost them per month (assuming no change in behaviour), how confident they were of that estimate, how effective they thought such a scheme might be, whether they expected to be better or worse off if such a scheme were introduced, what response they thought they would be most likely to make, whether they thought the charge fair or unfair, whether they would approve its introduction, and how complicated they thought it had been to understand. Five different schemes were used, each characterised by a different degree of complexity. In the UK experiment, the participant was shown all five schemes in randomised order. In the German version, only one, randomly selected, scheme was presented.

In the final part of the experiment, participants were asked questions about their decision-making style2 and to provide socio-economic data including gender, age, employment status, educational background and household income.
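The randomised presentation of Schemes A–E described above could be sketched as follows. This is only an illustration: the chapter does not describe how the experimental software generated its orders, so the function name, the per-participant seeding and the participant identifiers are all assumptions.

```python
import random

SCHEMES = ["A", "B", "C", "D", "E"]  # the five hypothetical schemes


def presentation_order(participant_id, seed=0):
    """Return a reproducible random order of the five schemes.

    Seeding by participant id is an assumption made here so that an
    order can be regenerated later when analysing order effects.
    """
    rng = random.Random(f"{seed}-{participant_id}")
    order = SCHEMES[:]          # copy so the master list is untouched
    rng.shuffle(order)
    return order


order_p001 = presentation_order("p001")
first_flags = {scheme: int(pos == 0) for pos, scheme in enumerate(order_p001)}
```

Storing the position of each scheme alongside the response is what later allows a first-presentation indicator (such as the FirstD variable used in the models) to be derived.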

2. Previous research into decision-making styles has employed extensive batteries of questions to explore people’s need for cognition, their need to evaluate and their tolerance of ambiguity (see, respectively, Cacioppo & Petty, 1982; Jarvis & Petty, 1996; and Budner, 1962 or Lamberton et al., 2005). However, as is generally the case with field questionnaires, it would not have been practical to extend an already lengthy questionnaire by requiring participants to answer almost 100 further questions. We sought instead to see whether the use of a single question drawn from each of the full batteries would provide a useful and practical approach by which to classify participants’ decision-making styles. In doing this we recognise, of course, that we cannot claim that our single questions are fully representative of the batteries from which they were drawn, and we therefore do not claim that our categorisation of decision-making styles is the same as that used in previous literature in that field.


The experimental software stored the decision latencies (time taken by the participant to answer each question). All data were collected in a controlled laboratory environment in which the computers were all of the same speed and displayed the briefing material in an identical way (thus minimising external influences on the latencies). The controlled environment also made it possible to ensure that participants were not interrupted during the task and that each participant had the same amount of preparation time.

The experiments were conducted in batches between June 2007 and January 2009. Participants were recruited using a variety of methods including emails, posters, personal networks and on-street recruitment. Participants were offered a small financial inducement (£5 in the United Kingdom, €5 in Germany) for participating in ‘a 15 minute questionnaire on a computer’ or ‘a survey about motoring costs’. The UK sample was limited to people who were drivers in the Leeds area but no equivalent restriction was applied to the Dresden sample.

Data were collected from 305 participants (for a detailed sample description see Appendix 30.A.2). Data from seven participants were rejected as incomplete or having unusually high latencies (>3 S.D. above the mean for the full sample). Although the samples were not designed to be representative of any specific population, it turns out that the Leeds sample is fairly representative of drivers within the Leeds area, albeit with a bias towards students, people with degrees, females and people on higher incomes, while the Dresden sample is fairly representative of the local student population.
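A rejection rule of the kind described above (latency more than 3 S.D. above the full-sample mean) might be sketched as follows. The data layout, the use of a single total latency per participant and the sample standard deviation are assumptions; the chapter does not say which latency measure or which S.D. formula was used, and the numbers are invented for illustration.

```python
from statistics import mean, stdev


def reject_slow_participants(total_latency, n_sd=3.0):
    """Split participants by whether their latency exceeds mean + n_sd * SD.

    total_latency: dict of participant id -> total seconds (hypothetical layout).
    Returns (kept, rejected, cutoff).
    """
    values = list(total_latency.values())
    cutoff = mean(values) + n_sd * stdev(values)  # sample SD (n - 1)
    kept = {p: t for p, t in total_latency.items() if t <= cutoff}
    rejected = {p: t for p, t in total_latency.items() if t > cutoff}
    return kept, rejected, cutoff


# Invented example: 30 ordinary participants and one very slow one.
sample = {f"p{i:02d}": 600.0 + 10 * (i % 7) for i in range(30)}
sample["p99"] = 5000.0
kept, rejected, cutoff = reject_slow_participants(sample)
```

Because the mean and S.D. are computed on the full sample, a single extreme value inflates the cutoff itself, so the rule only catches very pronounced outliers in small samples.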

30.2. Results

30.2.1. Introduction

The data from Leeds and Dresden were merged and the combined dataset was explored using tabulations, graphs, statistical tests and models to test for expected relationships.3 Where interesting results were apparent, further tests were run to explore possible explanations. Our findings are presented and discussed below under a series of headings with relevant graphs or tables included in the text where appropriate. However, for convenience and to avoid repetition, the results of the modelling work are all presented together in one table. Ideally, the models would permit detailed exploration of joint impacts, and allow for correlation between explanatory variables and the theoretical lower bound on latencies.4 Resources did not permit such an approach and our exploration of the data has therefore been limited to the construction of regression models. Table 30.1 contains 13 of the models which were tested and shows values for the variables which offered significant explanation (at 5%) of the dependent variable. Except where indicated otherwise, the models were run using SAS stepwise regression as described in the notes following Table 30.1.

3. There are some differences between the results obtained in Leeds and Dresden. It is not clear whether they are due to differences in the characteristics of the samples, differences between the cities (e.g. in terms of familiarity with road charging) or to minor differences in the experimental protocols, not least the impossibility of expressing questions in precisely the same way in two different languages.

Table 30.1: Regression models.

Model  Dependent variable  Subset    Constant          Adjusted R square  Root MSE  Number of obs
1      EstTim              -         –19.81 (–5.76)    0.46               27.59     1490
2a     TotTim              -         891.74 (48.13)    0.68               200.81    298
2b     TotTim              Leeds     917.06 (34.00)    0.06               239.5     195
2c     TotTim              Dresden   295.71 (24.60)    0.09               77.11     103
3      ProbErr             -         56.04 (3.29)      0.08               84.20     298
4      Err?                -         –4.55 (75.24)     (0.27)®            n.a.      1490
5      EstErr              -         35.06 (3.81)      0.05               141.1     1490
6      OverEst             -         2.86 (61.97)      0.01               0.918     1490
7      MeanAp              -         29.08 (10.52)     0.16               28.40     298
8      EstTim              -         59.20 (41.01)     0.01               37.51     1490
9a     EstTim              low ap    –19.93 (–3.93)    0.49               28.74     745
9b     EstTim              high ap   –19.95 (–4.43)    0.43               26.20     745
10     OverEst             -         2.97 (82.66)      0.02               0.911     1490

Retained explanatory variables, shown as coefficient (test statistic):

Model 1:  NNos 7.17 (16.05); Complic† 0.20 (6.96); FirstD 47.14 (26.07); OthertimU 0.05 (14.32); OthertimG 0.14 (12.50); DegreeD –3.72 (–2.27)
Model 2a: DresdenD –576.27 (–22.01); OldD 104.80 (3.91)
Model 2b: OldD 85.90 (2.32); StudentD –95.15 (–1.99)
Model 2c: FemD 40.23 (2.61); OldD 62.78 (2.33)
Model 3:  Complic† 0.583 (2.46); MentalSatD –34.09 (–3.35); EstTim –0.288 (–2.51)
Model 4:  NNos 0.40 (29.60); Complic† 0.03 (76.13); FirstD 2.25 (37.22); StudentD –0.93 (19.07); DegreeD –0.39 (5.41); RichD –0.48 (7.78); ExtraLat –0.73 (12.92)
Model 5:  Complic† 0.691 (5.20); FirstD 58.77 (5.61); MentalSatD –34.07 (–4.52); EstTim –0.33 (–2.84)
Model 6:  Complic† 0.002 (2.97); FemD 0.095 (1.96)
Model 7:  24.16 (7.20); –8.89 (–2.69)
Model 8:  –0.97 (–3.08)
Model 9a: NNos 8.06 (12.30); Complic† 0.21 (5.22); FirstD 51.47 (19.37); OthertimU 0.04 (9.40); OthertimG 0.129 (8.06); DegreeD –5.33 (–2.38)
Model 9b: NNos 6.24 (10.33); Complic† 0.18 (4.63); FirstD 42.69 (17.52); OthertimU 0.05 (10.49); OthertimG 0.17 (10.04)
Model 10: 0.004 (4.94); –0.269 (–4.99)

Explanatory variables: DresdenD, NZones, NPeriods, NNos, Complic†, FirstD, OthertimU, OthertimG, MeanAp, MentalSatD, FemD, OldD, StudentD, DegreeD, RichD, ExtraLat, EstTim.

Notes: The first figure in each pair is the regression coefficient. The figure in parentheses is the test statistic (t for all models except 4, for which it is χ²). Variables not listed for a model were either not offered to it or were offered but excluded because they added insufficient additional explanation. For Models 7, 8 and 10 the coefficients are listed without variable names because the row positions are not recoverable from the source layout. ® Model 4 is a binary logistic model for which we show Nagelkerke’s R square.
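The notes to Table 30.1 report Nagelkerke's R square for the binary logistic Model 4. As a reminder of what that statistic measures, the sketch below computes it from the log-likelihoods of a null (intercept-only) model and a fitted model. The function name and the example values are hypothetical; the chapter does not report the log-likelihoods behind its figure of 0.27.

```python
import math


def nagelkerke_r2(ll_null, ll_model, n_obs):
    """Nagelkerke's R square from two log-likelihoods.

    ll_null:  log-likelihood of the intercept-only model
    ll_model: log-likelihood of the fitted model
    n_obs:    number of observations
    """
    # Cox and Snell R square: 1 - (L_null / L_model) ** (2 / n)
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n_obs)
    # Its maximum attainable value for a discrete outcome
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n_obs)
    return cox_snell / max_cox_snell


# Invented log-likelihoods, for illustration only.
example = nagelkerke_r2(ll_null=-820.0, ll_model=-690.0, n_obs=1490)
```

The rescaling by the maximum attainable value is what lets the statistic reach 1 for a perfectly fitting model, which the unadjusted Cox and Snell measure cannot do.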

30.2.2. General Findings on Time Taken to Answer the Questions

On average, participants took about 55 seconds to estimate each charge. However, the time taken varied tremendously between different participants and for different schemes, and was highest for the first presented scheme. Figure 30.2 shows how the time taken varies with the order of presentation and shows clear evidence of a learning effect (the first presented scheme typically took about 90 seconds while the fifth took around 40 seconds). The difference between successive presentations is greatest between the first and second and decreases at a decelerating rate beyond that.

Figure 30.3 shows the time taken to estimate the charges for each of the five charging schemes.5 There is a progression from Scheme A, which took an average of 37 seconds, to Scheme E, which took an average of 82 seconds. The difference between times taken for Scheme B (which was spatially differentiated) and Scheme C (which was temporally differentiated) is minimal but, on balance, more time seems to have been spent on the temporally differentiated scheme. The significantly greater time required for Scheme E suggests that its greater complexity presents significant challenges for the respondents.

Figure 30.2: Decrease in latency with order of presentation. (Vertical axis: latency in seconds, 0 to 100; horizontal axis: presentation order, #1 to #5.)

4. Strictly speaking, the specification of explanatory models of latencies should allow for the fact that latencies cannot be less than zero. However, this omission does not affect our conclusions.

5. The schemes are defined in Appendix 30.A.1, but, to summarise: Scheme A had one zone and one time period; Scheme B had two zones and one time period; Scheme C had one zone but two time periods; Scheme D had two zones and two time periods; Scheme E had three zones and two time periods.

Figure 30.3: Time taken to estimate charge for each scheme. (Vertical axis: latency in seconds, 0 to 90; horizontal axis: Schemes A to E; series: mean and SD.)
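The summaries behind Figures 30.2 and 30.3 are group means of the stored latencies, taken once by presentation position and once by scheme. A minimal sketch follows; the record layout and the numbers are invented for illustration and are not the study's data.

```python
from collections import defaultdict
from statistics import mean


def mean_latency_by(records, key):
    """Mean latency grouped by a field such as 'position' or 'scheme'."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record["latency"])
    return {group: mean(values) for group, values in sorted(groups.items())}


# Hypothetical records: one row per participant and scheme.
records = [
    {"participant": "p1", "scheme": "A", "position": 1, "latency": 70.0},
    {"participant": "p1", "scheme": "E", "position": 2, "latency": 90.0},
    {"participant": "p2", "scheme": "E", "position": 1, "latency": 110.0},
    {"participant": "p2", "scheme": "A", "position": 2, "latency": 30.0},
]

by_position = mean_latency_by(records, "position")  # learning effect
by_scheme = mean_latency_by(records, "scheme")      # complexity effect
```

Because the schemes were presented in randomised order, the two groupings separate the learning effect from the complexity effect on average, although a regression such as Model 1 is needed to estimate them jointly.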

Model 1 indicates that the main determinants of the amount of time that a participant takes to estimate the cost of a given hypothetical scheme are: the amount of time they took to answer all other questions (Othertim), whether it was the first scheme presented to them (FirstD), how complex it was (NNos) and how difficult they had found it to understand (Complic). Over and above this, it appears that people with degrees took less time. These results are intuitively reasonable and confirm the importance of personal factors, the inherent complexity of the task and the presence of a learning effect. Given the importance of the Othertim variable in Model 1, a separate Model 2a was run to see if the total time taken could be explained by any personal characteristics. From Models 2a, 2b and 2c we learn that age (OldD), gender (FemD) and being a student (StudentD) all have an influence, with people under 35, males and students all tending to take less time.6

30.2.3. General Findings on the Accuracy of the Estimated Charges

We have explored the accuracy of charge estimates provided by the participants in several different ways. On average, 24% of the estimates of the hypothetical charges were incorrect. Figure 30.4 shows that this proportion was lowest for Scheme A and highest for Scheme E, thus confirming our expectation that errors would increase with scheme complexity.

6. The differences between Models 2a, 2b and 2c are largely explained by the fact that the Dresden experiment took less time overall (because it included only one set — compared to Leeds’ five sets — of the ‘local’ scheme questions) — hence the massive negative value for DresdenD in Model 2a. The absence of StudentD from Model 2c is probably explained by the fact that the Dresden sample was predominantly students.

Figure 30.4: Percentage of erroneous estimates, varying with scheme complexity. (Vertical axis: %, 0 to 60; horizontal axis: Schemes A to E.)

Figure 30.5: Percentage of erroneous estimates, varying with presentation order. (Vertical axis: %, 0 to 50; horizontal axis: presentation order, #1 to #5.)

Figure 30.5 shows that the proportion of errors was highest for the first presented scheme and lowest for the last presented, confirming our expectation that the likelihood of error would reduce with experience. The fact that errors did not re-emerge during the later presentations suggests that respondent fatigue was not a problem during the exercise, although it is possible that any fatigue effect was simply drowned out by a continued learning effect.

Model 3 seeks to explain the probability that a given participant provides an inaccurate estimate of the charge.7 It suggests that this probability will be higher for people who: report finding the schemes difficult to understand (Complic); do not claim satisfaction from completing a mental task (MentalSatD); and provided the estimate quickly (EstTim). These results are intuitively reasonable and may be interpreted as evidence of disengagement (feeling no desire for the mental challenge, finding it difficult and completing it quickly).

Model 4 seeks to explain the likelihood that a given estimate will be erroneous. It suggests that errors are more likely when the scheme is complicated (NNos); is perceived as difficult by the participant (Complic); is the first scheme presented (FirstD); and when the estimate was made unusually quickly (ExtraLat). It also suggests that participants who are students, hold a degree or come from the top three income categories are less likely to make errors.

Models 5 and 6 seek to explain the nature of errors made. Model 5 seeks to explain their size and suggests that they tend to be larger if the scheme was the first one tackled by the participant (FirstD) and the more difficulty the participant reported having experienced (Complic). However, given the difficulty, the less time the participant took to make the estimate (EstTim) the more error they made. This is interesting because it suggests that large errors are associated with attempts to produce estimates more quickly than is warranted by the difficulty of the task. Other interesting aspects of Model 5 are the findings that people who derive satisfaction from completing a mental task (MentalSatD) make more accurate estimates and that, having allowed for perceived difficulty (Complic), none of the objective measures of complexity (i.e. NZones, NPeriods and NNos) adds further explanation. Model 6 is statistically weak but suggests that the general tendency to overestimate charges was most marked for female participants (FemD) and when the participant reported finding the scheme difficult to understand (Complic).

7. Since each respondent provides five estimates, this probability will be 0.0, 0.2, 0.4, 0.6, 0.8 or 1.0.
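The participant-level error probability used as the dependent variable in Model 3 (and described in footnote 7) is simply the share of a participant's five estimates that were wrong. A minimal sketch, with invented charges and an assumed exact-match definition of a correct estimate:

```python
def error_proportion(estimates, true_charges):
    """Share of a participant's estimates that differ from the true charge.

    estimates, true_charges: dicts keyed by scheme label (hypothetical layout).
    With five schemes the result is one of 0.0, 0.2, 0.4, 0.6, 0.8 or 1.0.
    """
    wrong = sum(1 for scheme, value in estimates.items()
                if value != true_charges[scheme])
    return wrong / len(estimates)


# Invented values for illustration only.
true_charges = {"A": 2.0, "B": 3.0, "C": 4.0, "D": 5.5, "E": 8.0}
one_participant = {"A": 2.0, "B": 3.0, "C": 4.0, "D": 5.0, "E": 9.0}
p_error = error_proportion(one_participant, true_charges)
```

Because this variable can take only six values, an ordinary regression on it is an approximation, which is one reason the estimate-level logistic Model 4 is a useful complement.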

30.2.4. Further Findings on Effect of Increasing Complexity

We have already noted that participants took longer, and made more errors, when estimating the charges for the most complex schemes. Figure 30.6 shows that the tendency to make significant errors (i.e. at least 10% different from the true value) increases with the complexity of the scheme (reflecting what was seen in Figure 30.3). However, it appears that minor errors (within 10% of the true value) are non-existent for Schemes A, B and C; they occur only for the more complex Schemes D and E. This result could be interpreted as evidence of different influences being responsible for moderate and extreme errors. It may be that, for the simpler schemes (for which the estimation is quite simple), errors occur only as a result of lack of engagement whereas, for the more complex schemes, errors also occur as a result of lack of cognitive capacity. Lack of engagement can lead to lapses in

Figure 30.6: Percentage occurrence of error, overestimation and underestimation for each scheme. [Figure not reproduced. x-axis: Schemes A to E; y-axis: % (0 to 45); series: significant error, minor error, over-prediction, under-prediction.]

Peter Bonsall et al.



attention or to deliberate violations (both of which are likely to lead to significant errors), whereas errors due to constrained cognitive capacity are more likely to be minor (Reason, 1990).

Figure 30.6 also shows that the tendency to underestimate, and to overestimate, the true charge varies across schemes of differing complexity. Generally, the tendency to overestimate is greater than the tendency to underestimate; however, for Scheme B (and, to a much lesser extent, Scheme D) this tendency is reversed. The fact that B and D had two zones may be relevant here: perhaps the framing of the charge description encouraged under-prediction?

Figure 30.7 shows, for Schemes A and E, how the likelihood of error (an incorrect estimate of the true charge) varies with the length of time taken to estimate the charge (results for people in the fastest five percentiles are shown at the extreme left of the graph; those in the slowest five percentiles are at the extreme right). The likelihood of error is clearly greater for the more complicated scheme but, more interestingly, it seems that, for the simplest scheme (A), additional time taken is generally associated with an increased likelihood of error whereas, for the most complex scheme (E), except for the slowest 10% of people, additional time taken is generally associated with a reduced likelihood of error. Our interpretation of this result is that, when the task is simple, extra time taken is an indication that the respondent is having difficulty whereas, when the task is more complex, it generally indicates that greater effort is being made to obtain the correct answer (only for the slowest 10% of people does it seem to indicate difficulties which are not overcome by additional effort).

Figure 30.8 shows that, compared to the simplest scheme (A), the most complex scheme (E) is perceived to be more difficult to understand. This is unsurprising; however, it also reveals a very interesting ‘U’-shaped relationship between time taken and difficulty reported. It shows that, for both schemes, the faster respondents (in the left-hand part of the graph) tend to report less difficulty in understanding if they have spent more time estimating the charge, whereas the slower respondents (in the right-hand part of the graph) tend to report more difficulty the longer they

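Figure 30.7: Likelihood of error versus latency. [Figure not reproduced. x-axis: latency percentiles for specified scheme (1st to 5th, 6th to 10th, 11th to 25th, 26th to 50th, 51st to 75th, 76th to 90th, 91st to 95th, 96th to 100th); y-axis: % occurrence of error (0 to 80); series: Scheme A, Scheme E.]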


Figure 30.8: Perception of difficulty as a function of time taken. [Figure not reproduced. x-axis: latency percentiles for specified scheme (1st to 5th, 6th to 10th, 11th to 25th, 26th to 50th, 51st to 75th, 76th to 90th, 91st to 95th, 96th to 100th); y-axis: perception of difficulty (scale 1-100); series: Scheme A, Scheme E.]

Figure 30.9: Latency versus order of presentation, disaggregated by scheme complexity. [Figure not reproduced. x-axis: presentation order (#1 to #5); y-axis: latency in seconds (0 to 120); series: nnos = 1, nnos = 2, nnos = 4, nnos = 6.]

have spent trying to estimate the charge. It is not immediately clear how this relates to the patterns noted in Figure 30.7. The fastest participants are, perhaps, those who are not motivated to invest any cognitive effort and, having surmised that the task would require some thought, simply made a guess (and consequently made many errors — though less so in Scheme A because the answer was very easy to derive). Participants in the mid-range of times perhaps perceived less difficulty because they were initially less disengaged, invested some effort and consequently understood the scheme. Figure 30.9 summarises the same data as Figure 30.2 (showing how the time taken to estimate the charge falls with the order of presentation) but is disaggregated to


show how this differs according to the complexity of the scheme (separate lines are shown for Scheme A, whose description required one number; Schemes B and C, whose descriptions required two numbers; Scheme D, whose description required four numbers; and Scheme E, whose description required six numbers). It is clear that all schemes show a learning effect, with respondents taking longer for the first presentation than for any other. Further, if defined as a reduction in time with each successive presentation, the learning effect seems to persist longest for the most complex scheme (it decelerates markedly after the second presentation for Schemes A, B and C). Expressed in absolute terms, the learning effect is also strongest for the most complex scheme (the time taken to estimate the charge when it is a first presentation is a whole minute longer than when it is a fifth presentation).8
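The distinction between the absolute and the relative size of the learning effect can be illustrated with a minimal sketch. The latencies below are illustrative assumptions, not values read from the data: the Scheme E pair is chosen only to be consistent with "a whole minute" and "around 50%", and the Scheme A pair only to be consistent with "about 40%". The function name `learning_effect` is hypothetical.

```python
from typing import Tuple


def learning_effect(first_s: float, fifth_s: float) -> Tuple[float, float]:
    """Return (absolute reduction in seconds, fifth latency as a fraction of the first)."""
    return first_s - fifth_s, fifth_s / first_s


# Illustrative latencies in seconds (assumed for this sketch, not measured values).
latencies = {
    "Scheme A": (50.0, 20.0),   # assumed: fifth presentation takes 40% of the first
    "Scheme E": (120.0, 60.0),  # assumed: one minute saved, fifth takes 50% of the first
}

for scheme, (first_s, fifth_s) in latencies.items():
    absolute, ratio = learning_effect(first_s, fifth_s)
    print(f"{scheme}: saves {absolute:.0f} s; fifth takes {ratio:.0%} of first")
```

Under these assumed values the most complex scheme shows the larger absolute saving while the simplest scheme shows the larger proportional saving, which is the contrast drawn in the text and its footnote.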

30.2.5. Findings on Effect of Acceptance

In this section we investigate the effect of the acceptability of transport pricing strategies on decision latencies, on the accuracy of charge estimates and on related subjective evaluations (perceived task certainty and difficulty). This investigation was conducted in order to test the assumption that individuals’ attitude towards a policy instrument (‘acceptance’) might affect their willingness or motivation to engage with the assessment of that instrument (Schade & Schlag, 2003).9

The acceptance analysis is based on attitudinal information gathered during the second and third parts of the experiment (see Appendix 30.A.1). Each Dresden participant supplied one acceptance score; each Leeds participant supplied five (because, in the Leeds experiment, we wished to have a separate value for each of five different charging schemes). In order to test whether each Leeds participant’s five acceptance values indicate attitudes towards a specific scheme or reflect a general attitude towards road charges (independent of any specific scheme) we computed Cronbach’s Alpha for the five acceptance statements from each Leeds participant. The result was α = 0.93. This high value suggests that the statements serve as indicators for a general attitude towards charging schemes. Based on these results we computed a mean acceptance value for each Leeds participant. For further analysis we divided the whole sample into two groups of participants (low accepters, N = 149; high accepters, N = 149) using a median split of this acceptance value distribution.10
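The two steps described above (checking internal consistency with Cronbach's Alpha, then averaging and applying a median split) can be sketched as follows. This is a minimal illustration using the standard formula for alpha; the scores, the response scale, the function names and the handling of values exactly at the median are all assumptions made for the sketch and are not taken from the study.

```python
from statistics import median, variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of per-participant lists (one score per item)."""
    k = len(rows[0])
    # Variance of each item across participants.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Variance of each participant's total score.
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def median_split(rows):
    """Label each participant 'high' or 'low' by mean score relative to the median.

    Assumption: a mean exactly equal to the median is labelled 'low'.
    """
    means = [sum(row) / len(row) for row in rows]
    cut = median(means)
    return ["high" if m > cut else "low" for m in means]


# Hypothetical acceptance scores: six participants, five schemes each.
scores = [
    [1, 1, 2, 1, 1],
    [2, 2, 2, 3, 2],
    [3, 3, 4, 3, 3],
    [4, 4, 4, 4, 3],
    [2, 1, 2, 2, 2],
    [3, 4, 3, 3, 4],
]

alpha = cronbach_alpha(scores)
groups = median_split(scores)
print(f"alpha = {alpha:.2f}")
print(groups)
```

A high alpha on data of this shape indicates that the five statements move together across participants, which is what justifies collapsing them into a single mean score before splitting the sample.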

8. We note, however, that expressed in relative terms, the learning effect for Scheme A is actually greater than that for Scheme E: the time taken for the fifth presentation of Scheme A is about 40% of that for the first presentation, whereas the fifth presentation of Scheme E takes around 50% of the time required for its first presentation.

9. For a further discussion of additional variables see Hoffmann, Schade, Schlag, and Bonsall (2006).

10. We are aware that this method might produce weaker results but the low sample size prevented use of smaller extreme groups.


Before examining the effect of acceptance, it is worth noting that Model 7 (in Table 30.1) identifies a significant tendency for students to give higher acceptance scores and a lesser, but still significant, tendency for females to give lower acceptance scores. No other variable available to us had any significant influence. Analysis of the original Leeds scores (before they were averaged into one score per participant) allowed us to explore factors associated with acceptance of a scheme. This analysis showed that objective attributes of the scheme had no influence on acceptance but that participants were more likely to accept a scheme if they expected it to be effective in reducing congestion, and if they thought it fair (also, though not significant at 1%, if they expected to benefit from its introduction).

30.2.5.1. Effect on latency

We began by testing whether acceptance of road charging (as measured by questions which asked participants to indicate whether they would approve of a specified congestion charging scheme in their city) has an impact on the time taken by participants to estimate the charges. Figure 30.10 suggests that there is some impact, with the high acceptance group tending to take less time for all schemes except Scheme B, and most particularly for the most complex schemes (D and E). Regression Model 8 is statistically weak but suggests a negative association between approval (acceptance) and decision time. This is confirmed by an ANOVA (for repeated measurements) in a two-factorial design (the dependent variable being the time taken to estimate the charge, Factor 1 being the complexity of the charging scheme and Factor 2 being the acceptance group): there is an effect for Factor 1 (p = 0.000), Factor 2 (p = 0.045) and an interaction effect of Factor 1 × Factor 2 (p = 0.038).

Figure 30.11 shows that the high acceptance group is faster than the low acceptance group only in the first two rounds. In all later rounds there are no differences (ANOVA for repeated measurements: Factor 1 (order), p < 0.001; Factor 2 (acceptability), p = 0.045; Factor 1 × Factor 2, p = 0.02). Comparison of Models 9a and 9b shows that the latencies of the low acceptance group show a stronger learning effect, a closer link with reported difficulty, and more impact of having a degree, but less impact of the time taken to complete the rest of the questionnaire.

Figure 30.10: Latency (in seconds) versus complexity and acceptability. [Figure not reproduced. x-axis: Schemes A to E; y-axis: sec. (-10 to 90); series: low acceptance, high acceptance.]

Figure 30.11: Latency (in seconds) versus presentation order and acceptance. [Figure not reproduced. x-axis: presentation order (#1 to #5); y-axis: sec. (0 to 120); series: low acceptance, high acceptance.]

Figure 30.12: Percentage occurrence of error versus complexity and acceptability. [Figure not reproduced. x-axis: Schemes A to E; y-axis: % (0 to 50); series: low acceptance, high acceptance.]

30.2.5.2. Effect on probability of error

We then investigated whether the probability of making an erroneous estimate of the charge (irrespective of size or direction) depends on acceptance of road charging. Figure 30.12 shows that the high acceptance group is slightly more likely to make errors, but the effect occurs only in Schemes A, B and E and is very small. A regression model (not shown in Table 30.1) indicated that the acceptance score offered no significant explanation of the probability of making an erroneous estimate. Figure 30.13 suggests that the effect of presentation order on the likelihood of making errors is slightly different for the two acceptance groups during the latter stages of the experiment. For the first two rounds there are no differences between the groups but, from the third round onwards, high accepters tend to make more errors than low accepters. However, these differences were not significant (χ²-test, p > 0.201, two-tailed).

30.2.5.3. Effect on overestimation of charges

Our measure of error likelihood ignores the direction (and the size) of the error. Overestimations of true prices (and larger estimation errors) might have different behavioural effects than

Figure 30.13: Percentage occurrence of error versus presentation order and acceptance. [Figure not reproduced. x-axis: presentation order (#1 to #5); y-axis: % (0 to 50); series: low acceptance, high acceptance.]

Figure 30.14: Percentage occurrence of overestimation and underestimation for each scheme as a function of acceptability. [Figure not reproduced. x-axis: Schemes A to E; y-axis: 0 to 30; series: Over(Hi), Over(Lo), Under(Hi), Under(Lo).]

underestimations (and smaller errors). Model 10 is statistically weak but suggests that high acceptance is associated with overestimation. Figure 30.14 shows that people in the high acceptance group are more likely to overestimate and people in the low acceptance group are more likely to underestimate. Interestingly, the tendency for high accepters to overestimate increases with the complexity of the scheme. However, these effects do not quite achieve statistical significance (using Mann–Whitney U-test, Friedman test).

Figure 30.15 shows a general tendency for under-prediction and over-prediction to reduce as the experiment progresses (this seems to be a general learning effect). For under-predictions, both groups show a particularly marked reduction between the first and second presentations, and the level of under-prediction for the low acceptance group is consistently higher than that of the high acceptance group. For over-predictions, both groups make fewer as the experiment progresses but this reduction is much less marked for high accepters than for low accepters. Thus, high accepters begin by making a similar number of underestimations and overestimations and subsequently make far fewer underestimations, whereas low

accepters make progressively fewer errors but are never more likely to overestimate than to underestimate. Differences are significant for round #4 at the 5% level and for round #5 at the 10% level (χ²-tests, p = 0.021 and 0.058 respectively).

Figure 30.15: Percentage occurrence of overestimation and underestimation as a function of presentation order and acceptance. [Figure not reproduced. x-axis: presentation order (#1 to #5); y-axis: 0 to 25; series: Over (Hi), Under (Hi), Over (Lo), Under (Lo).]

Figure 30.16: Task difficulty depending on complexity and acceptance. [Figure not reproduced. x-axis: Schemes A to E; y-axis: task difficulty (0 to 100); series: low acceptance, high acceptance.]

30.2.5.4. Effect on perception of difficulty, and on reported uncertainty

Figure 30.16 shows the subjective evaluations of perceived task difficulty for each scheme separately for the two acceptance groups. It shows that Schemes A and C are perceived as less difficult by high accepters (not significant except for the main effect of complexity, p < 0.001). In all other cases there are no real differences in the perception of task difficulty. Figure 30.17 shows that high accepters’ perception of task difficulty falls dramatically after the first presented scheme whereas that of low accepters falls fairly steadily throughout the experiment (effect of presentation order p = 0.001). Figure 30.18 shows that high accepters are more uncertain about the accuracy of their estimates, and that the difference is fairly constant across all schemes (p = 0.209).
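The χ² comparisons between the two acceptance groups reported in Sections 30.2.5.2 and 30.2.5.3 could be reproduced along the following lines. This is only a sketch under stated assumptions: it uses the Pearson statistic for a 2 × 2 table without continuity correction (the text does not say which variant was used), the function name `chi_square_2x2` is hypothetical, and the counts are invented for illustration, with only the group sizes of 149 taken from the text.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for the 2 x 2 table [[a, b], [c, d]].

    Returns (statistic, p-value) with 1 degree of freedom and no
    continuity correction (an assumption of this sketch).
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts for one presentation round (not taken from the study):
# rows are acceptance groups, columns are erroneous versus correct estimates.
high_errors, high_correct = 30, 119   # high accepters, N = 149
low_errors, low_correct = 18, 131     # low accepters, N = 149

stat, p_value = chi_square_2x2(high_errors, high_correct, low_errors, low_correct)
print(f"chi-square = {stat:.2f}, p = {p_value:.3f}")
```

Running one such test per presentation round, as the text implies, raises the usual multiple-comparison caveat, so round-specific p-values close to the threshold are best read cautiously.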


Task Difficulty

50 45 40 35 #1

#3

#2

#4

low acceptance

#5

high acceptance

Uncertainty of estimation(1;very certain,2: