Marketing Research [4th Asia-Pacific edition] 9780170369824


English | Pages: xxx, 602 [633] | Year: 2017


Table of contents:
Prelims
Half Title Page
Title page
Copyright Page
Brief contents
Contents
Preface
Acknowledgements
About the authors
Guide to the text
Guide to the online resources
Part One: Introduction to the research process
Chapter 01: The role of marketing research and the research process
The nature of marketing research
Marketing research defined
Basic research and applied research
The managerial value of marketing research for strategic decision-making
Identifying and evaluating opportunities
Analysing and selecting target markets
Planning and implementing a marketing mix
Analysing marketing performance
When is marketing research needed?
Time constraints
Availability of data
Nature of the decision
Benefits versus costs
Business trends in marketing research
Global marketing research
Growth of the Internet and social media
Stages in the research process
Alternatives in the research process
Discovering and defining the problem
Uncertainty influences the type of research
Planning the research design
Descriptive research
Causal research
Sampling
Gathering data
Processing and analysing data
Drawing conclusions and preparing a report
The research program strategy
Summary
Key terms and concepts
Questions for review and critical thinking
Ongoing project
CourseMate online study tools
Written case study 1.1: Buying New Zealand made online in China
Written case study 1.2: The Quit campaign in Australia and Singapore
Part Two: Defining the problem
Chapter 02: Problem definition and the research process
The nature of marketing problems
The importance of proper problem definition
The process of defining the problem
Ascertain the decision-makers’ objectives
Understand the background of the problem
Isolate and identify the problem, not the symptoms
Determine the unit of analysis
Determine the relevant variables
State the research questions and research objectives
Clarity in research questions and hypotheses
Decision-oriented research objectives
How much time should be spent defining the problem?
The research proposal
Anticipating outcomes
Summary
Key terms and concepts
Questions for review and critical thinking
Ongoing project
CourseMate online study tools
Written case study 2.1: DrinkWise: Changing Australia’s drinking culture
Written case study 2.2: A tender for market research services by the Australian Communications and Media Authority (ACMA)
Ongoing case study: Mobile phone switching and bill shock
Part Three: Planning the research design
Chapter 03: Qualitative research
What is qualitative research?
Uses of qualitative research
Diagnosing a situation
Screening alternatives
Discovering new ideas
Qualitative versus quantitative research
Qualitative research orientations
Phenomenology
Ethnography
Grounded theory
Case studies
Analysing qualitative responses
Common techniques used in qualitative research
Focus group interviews
Depth interviews
Projective techniques
Modern technology and qualitative research
A warning about qualitative research
Summary
Key terms and concepts
Questions for review and critical thinking
CourseMate online study tools
Written case study 3.1: Up, up and away. Airborne focus groups with Air New Zealand
Written case study 3.2: Getting a grip: Focus groups and Beaurepaires Tyres
Ongoing case study
Chapter 04: Secondary research with big data
Secondary data research
Advantages
Disadvantages
Typical objectives for secondary data research designs
Fact-finding
Model building
Data mining
Database marketing and customer relationship management
Sources of secondary data
Internal and proprietary data sources
External data: the distribution system
Information as a product and its distribution channels
Single-source data-integrated information
Sources for global research
Summary
Key terms and concepts
Questions for review and critical thinking
Ongoing project
CourseMate online study tools
Written case study 4.1: Twitter: Truth or dare?
Written case study 4.2: The electric car in Australia
Chapter 05: Survey research
The nature of surveys
Survey objectives: Type of information gathered
Advantages of surveys
Errors in survey research
Random sampling error
Systematic error
Respondent error
Nonresponse error
Response bias
Administrative error
Rule-of-thumb estimates for systematic error
What can be done to reduce survey error?
Classifying survey research methods
Structured and disguised questions
Temporal classification
Different ways that marketing researchers conduct surveys
Human interactive media and electronic interactive media
Noninteractive media
Using interviews to communicate with respondents
Personal interviews
Telephone interviews
Self-administered questionnaires
Mail questionnaires
Self-administered questionnaires that use other forms of distribution
Email surveys
Internet surveys
Kiosk interactive surveys
Survey research that mixes modes
Selecting the appropriate survey research design
Pretesting
Ethical issues in survey research
Summary
Key terms and concepts
Questions for review and critical thinking
Ongoing project
CourseMate online study tools
Written case study 5.1: Google Consumer Surveys
Written case study 5.2: Pedal power in Auckland
Chapter 06: Observation
What is observation?
When is observation scientific?
What can be observed?
The nature of observation studies
Observation of human behaviour
Complementary evidence
Direct observation
Errors associated with direct observation
Scientifically contrived observation
Combining direct observation and interviewing
Ethical issues in the observation of humans
Observation of physical objects
Content analysis
Mechanical observation
Television monitoring
Monitoring website traffic
Scanner-based research
Measuring physiological reactions
Summary
Key terms and concepts
Questions for review and critical thinking
Ongoing project
CourseMate online study tools
Written case study 6.1: The Pepsi/Coke challenge and neuroscience
Written case study 6.2: Mazda and Syzygy
Notes
Ongoing case study
Chapter 07: Experimental research and test marketing
The nature of experiments
Step 1: Field and laboratory experiments
Step 2: Decide on the choice of independent and dependent variable(s)
Step 3: Select and assign test units
Step 4: Address issues of validity in experiments
Step 5: Select and implement an experimental design
Step 6: Address ethical issues in experimentation
Test marketing: An application of field experiments
Step 1: Decide whether to test market or not
Step 2: Work out the functions of test market
Step 3: Decide on the type of test market
Step 4: Decide on the length of the test market
Step 5: Decide where to conduct the test market
Step 6: Estimate and project test market results
Projecting test market results
Consumer surveys
Straight trend projections
Ratio of test product sales to total company sales
Market penetration × repeat-purchase rate
Summary
Key terms and concepts
Questions for review and critical thinking
Ongoing project
Written case study 7.1: Test marketing Guinness in Perth, Western Australia
Written case study 7.2: Kellogg Australia revamps Nutri-Grain
Ongoing case study
Chapter 08: Measurement
The measurement process
Step 1: Determine what is to be measured
Concepts
Step 2: Determine how it is to be measured
Step 3: Apply a rule of measurement
Types of scales
Mathematical and statistical analysis of scales
Step 4: Determine if the measure consists of a number of measures
Computing scale values
Step 5: Determine the type of attitude and scale to be used to measure it
Normative vs ipsative scales
Attitudes as hypothetical constructs
Measuring attitudes is important to managers
The attitude-measuring process
Method of summated ratings: The Likert scale
Semantic differential
Numerical scales
Stapel scale
Constant-sum scale
Graphic rating scales
Measuring behavioural intention
Behavioural differential
Ranking
Sorting
Randomised response questions
Other methods of attitude measurement
Selecting a measurement scale: Some practical decisions
Attitudes and intentions
Multi-attribute attitude score
Best–worst scaling
Sample balanced incomplete block (BIB) designs
Pricing
The net promoter score
Step 6: Evaluate the measure
Reliability
Validity
Reliability versus validity
Sensitivity
Practicality
Summary
Key terms and concepts
Questions for review and critical thinking
Ongoing project
CourseMate online study tools
Written case study 8.1: Measuring Australia’s social progress
Written case study 8.2: New Zealand consumer confidence edges up in 2015
Ongoing case study
Appendix 8A: Conjoint analysis: Measuring consumer utility
Background to conjoint analysis
Ratings-based conjoint
Advantages of ratings-based conjoint
Disadvantages of ratings-based conjoint
Choice-based conjoint
Advantages of choice-based conjoint
Disadvantages of choice-based conjoint
Conjoint analysis using best–worst scaling
Advantages of best–worst conjoint
Disadvantages of best–worst conjoint
Final comments on conjoint analysis
Chapter 09: Questionnaire design
Step 1: Specify what information will be sought
Step 2: Determine the type of questionnaire and survey research method
Step 3: Determine the content of individual questions
Asking sensitive questions
Step 4: Determine the form of response to each question
Types of closed-response questions
Step 5: Determine the wording of each question
Avoid complexity: use simple, conversational language
Avoid leading and loaded questions
Avoid ambiguity: Be as specific as possible
Avoid double-barrelled items
Avoid making assumptions
Avoid burdensome questions that may tax the respondent’s memory
Step 6: Determine question sequence
Provide good survey flow
Step 7: Determine physical characteristics of the questionnaire
Traditional questionnaires
Internet questionnaires
Step 8: Re-examine and revise steps 1–7 if necessary
Step 9: Pretest the questionnaire
Designing questionnaires for global markets
Summary
Key terms and concepts
Questions for review and critical thinking
Ongoing project
CourseMate online study tools
Ongoing case study
Written case study 9.1: Mobile phone switching and bill shock
Written case study 9.2: May the Force be with you! The worldwide growth of Jedi Knights as a religion
Part Four: Planning the sample
Chapter 10: Sampling: Sample design and sample size
Sampling terminology
Why sample?
Cost and time
Accurate and reliable results
Destruction of test units
Practical sampling concepts
Defining the target population
The sampling frame
Sampling units
Less than perfectly representative samples
Probability versus nonprobability sampling
Probability sampling
Simple random sampling
Systematic sampling
Stratified sampling
Proportional versus disproportional sampling
Cluster sampling
Nonprobability sampling
Convenience sampling
Judgement sampling
Quota sampling
Sampling rare or hidden populations
What is the appropriate sample design?
Degree of accuracy
Resources
Time
Advance knowledge of the population
National versus local project
Mobile devices and the Internet change everything
Website visitors
Panel samples
Recruited ad hoc samples
Opt-in lists
Sample size
Random error and sample size
Systematic error and sample size
Factors in determining sample size for questions involving means
The influence of population size on sample size
Determining sample size on the basis of judgement
Summary
Key terms and concepts
Questions for review and critical thinking
Ongoing project
CourseMate online study tools
Written case study 10.1: The Victorian Population Health Survey 2014
Written case study 10.2: The Australian Taxation Office
Part Five: Collecting the data
Chapter 11: Editing and coding: Transforming raw data into information
Stages of data analysis
Before the survey
During the survey
After the survey responses
After the survey
Coding
Code construction
Multiple-response questions
Coding open-ended questions
Coding and analysis of qualitative research data
Qualitative data sources
Coding qualitative data
Coding strategies
Software for qualitative research
Text-mining software
Data-base management software
Summary
Key terms and concepts
Questions for review and critical thinking
Ongoing project
CourseMate online study tools
Written case study 11.1: Questionnaire editing
Part Six: Analysing the data
Chapter 12: Univariate statistical analysis: A recap of inferential statistics
Descriptive and inferential statistics
Sample statistics and population parameters
Making data usable
Frequency distributions
Measures of central tendency
Measures of dispersion
The normal distribution
Why do I need to know about the standardised normal distribution?
Population distribution, sample distribution and sampling distribution
Central-limit theorem
Estimation of parameters
Point estimates
Confidence intervals
Determining sample size
Stating a hypothesis
What is a hypothesis?
Null and alternative hypotheses
Hypothesis testing
The hypothesis-testing procedure
An example of hypothesis testing
Type I and Type II errors
Choosing the appropriate statistical technique
Type of question to be answered
Number of variables
Scale of measurement
Parametric versus nonparametric hypothesis tests
Some practical univariate tests
The t-distribution
Calculating a confidence interval estimate using the t-distribution
Univariate hypothesis test using the t-distribution
Conducting a one-sample t-test in SPSS
Conducting a one-sample t-test in Excel
The Chi-square test for goodness of fit
Conducting a Chi-square test in SPSS
Conducting a Chi-square test in Excel
A reminder about statistics
Summary
Key terms and concepts
Questions for review and critical thinking
Ongoing project
CourseMate online study tools
Written case study 12.1: Attitudes towards water conservation in a drought-stricken area of Australia
Written case study 12.2: Gambling in your community: What do the locals think?
Written case study 12.3: Victoria’s legal outsourcing system and customer satisfaction
Ongoing case study
Chapter 13: Bivariate statistical analysis: tests of differences
What is the appropriate test of difference?
The independent samples t-test for differences of means
Conducting an independent samples t-test in SPSS
Conducting an independent samples t-test in Excel
Paired-samples t-test
Conducting a paired samples t-test in SPSS
Conducting a paired-samples t-test in Excel
Analysis of variance (ANOVA)
The F-test
Calculating the F-ratio
Conducting an ANOVA in SPSS
Conducting an ANOVA in Excel
Nonparametric statistics for tests
Statistical and practical significance for tests of differences
Summary
Key terms and concepts
Questions for review and critical thinking
Ongoing project
CourseMate online study tools
Written case study 13.1: Do social marketing initiatives work? The not so dark side of marketing
Written case study 13.2: Gambling in your community: How do perceptions vary by different consumers?
Ongoing case study
Chapter 14: Bivariate statistical analysis: Tests of association
The basics
Pearson’s correlation coefficient
An example
Coefficient of determination
Correlation matrix
Running a correlation in SPSS
Running a correlation in Excel
Nonparametric correlation
Regression analysis
Least-squares method of regression analysis
Drawing a regression line
Tests of statistical significance
Running a regression in SPSS
Running a regression in Excel
Cross-tabulations: The Chi-square test for goodness of fit
Cross-tabulation and Chi-square tests in SPSS
Cross-tabulation and Chi-square tests in Excel
Statistical and practical significance for tests of association
Summary
Key terms and concepts
Questions for review and critical thinking
Ongoing project
Written case study 14.1: Neighbourhood renewal and the use of marketing research
Written case study 14.2: How innovative is a new product and why?
Written case study 14.3: Food labelling and country of origin
Ongoing case study
Chapter 15: Multivariate statistical analysis
The nature of multivariate analysis
Classifying multivariate techniques
The analysis of dependence
The analysis of interdependence
Influence of measurement scales
Analysis of dependence
n-way cross-tabulation
Partial correlation analysis
n-way univariate analysis of variance (ANOVA)
Multiple regression analysis
Binary logistic regression
Analysis of interdependence
Exploratory factor analysis
Cluster analysis
Multidimensional scaling
Summary
Key terms and concepts
Questions for review and critical thinking
Ongoing project
CourseMate online study tools
Written case study 15.1: Coastal Star Sales Corporation
Part Seven: Formulating conclusions and writing the final report
Chapter 16: Communicating research results: research report, oral presentation and research follow-up
Insights from the communications model
The report in context
Report format
Body of the research report
Summary
Body of a formal descriptive research report
Two-stage research report – exploratory followed by descriptive
Additional parts of the report
Templates and styles
Presenting qualitative research
Deriving themes
Qualitative presentation styles
Checklist for reporting qualitative research
Graphic aids
Tables
Statistical output
Charts
Maximise the ‘data to ink’ ratio
The oral presentation
Learn by example and practice
Audio-visual aids
Summary
Key terms and concepts
Questions for review and critical thinking
Ongoing project
CourseMate online study tools
Appendix A: Statistical tables
Glossary
Index

Citation preview

4TH ASIA–PACIFIC EDITION

STUDY SMARTER. GO ONLINE. CourseMate Express at CENGAGEBRAIN.COM.

BUY NEW FOR 6-MONTH ACCESS!

Bring your learning to life with interactive learning, study, and exam preparation tools that support the printed textbook. Your comprehension soars as you work with the printed textbook and the textbook-specific website. CourseMate Express includes 6-month access to datasets, quizzes, flashcards and more. Also included is 4-month access to Qualtrics.

Students located in Asia: please go to www.cengage.com/login to access online resources.

Search me! Marketing – An online research library just for this subject. Now it is easy to explore current issues relevant to your studies and research your assignments, without getting lost

INSTRUCTORS: Resources to help you teach are available for this textbook. Visit cengage.com.au/instructors or cengage.co.nz/instructors

WILLIAM ZIKMUND STEVE D’ALESSANDRO HUME WINZAR BEN LOWE BARRY BABIN

MARKETING RESEARCH

ZIKMUND D’ALESSANDRO WINZAR LOWE BABIN

4TH ASIA–PACIFIC EDITION

AVAILABLE THROUGH >> CENGAGEBRAIN.COM

DON’T MISS OUT! BUY NEW FOR YOUR ONLINE RESOURCES!

For learning solutions, visit cengage.com


MARKETING RESEARCH

4TH ASIA–PACIFIC EDITION

WILLIAM ZIKMUND STEVE D’ALESSANDRO BEN LOWE HUME WINZAR BARRY J. BABIN

MARKETING RESEARCH

Australia • Brazil • Japan • Korea • Mexico • Singapore • Spain • United Kingdom • United States

Marketing Research
4th Edition
William Zikmund
Steve D’Alessandro
Hume Winzar
Ben Lowe
Barry Babin

Publishing manager: Dorothy Chiu
Publishing editor: Dorothy Chiu
Developmental editor: Lydia Crisp
Project editor: Kate McGregor
Art direction: Danielle Maccarone
Cover designer: Watershed Design/Leigh Ashforth
Text designer: Watershed Design/Leigh Ashforth
Permissions/Photo researcher: Helen Mammides
Editor: Helen Moore
Proofreader: James Anderson
Indexer: Julie King
Macmillan Publishing Solutions

Any URLs contained in this publication were checked for currency during the production process. Note, however, that the publisher cannot vouch for the ongoing currency of URLs.

© 2017 Cengage Learning Australia Pty Limited

Copyright Notice
This Work is copyright. No part of this Work may be reproduced, stored in a retrieval system, or transmitted in any form or by any means without prior written permission of the Publisher. Except as permitted under the Copyright Act 1968, for example any fair dealing for the purposes of private study, research, criticism or review, subject to certain limitations. These limitations include: Restricting the copying to a maximum of one chapter or 10% of this book, whichever is greater; providing an appropriate notice and warning with the copies of the Work disseminated; taking all reasonable steps to limit access to these copies to people authorised to receive these copies; ensuring you hold the appropriate Licences issued by the Copyright Agency Limited (“CAL”), supply a remuneration notice to CAL and pay any required fees.

For details of CAL licences and remuneration notices please contact CAL at Level 15, 233 Castlereagh Street, Sydney NSW 2000. Tel: (02) 9394 7600, Fax: (02) 9394 7601, Email: [email protected], Website: www.copyright.com.au

For product information and technology assistance, in Australia call 1300 790 853; in New Zealand call 0800 449 725.

For permission to use material from this text or product, please email [email protected]

Third edition published in Australia in 2014

National Library of Australia Cataloguing-in-Publication Data
Creator: Zikmund, William, author.
Title: Marketing research / William Zikmund, Steven D'Alessandro, Hume Winzar, Ben Lowe, Barry J. Babin.
Edition: 4th edition.
ISBN: 9780170369824 (paperback)
Notes: Includes index.
Subjects: Marketing research--Textbooks. Marketing research--Study and teaching.
Other Creators/Contributors: D'Alessandro, Steven, author. Winzar, Hume, author. Lowe, Ben, 1978- author. Babin, Barry, author.
Dewey Number: 658.83

Cengage Learning Australia
Level 7, 80 Dorcas Street
South Melbourne, Victoria Australia 3205

Cengage Learning New Zealand
Unit 4B Rosedale Office Park
331 Rosedale Road, Albany, North Shore 0632, NZ

For learning solutions, visit cengage.com.au

Printed in China by China Translation & Printing Services.
1 2 3 4 5 6 7  21 20 19 18 17

BRIEF CONTENTS

PREFACE  XVIII
ACKNOWLEDGEMENTS  XXIII
ABOUT THE AUTHORS  XXIV
GUIDE TO THE TEXT  XXVI
GUIDE TO THE ONLINE RESOURCES  XXX

PART ONE: INTRODUCTION TO THE RESEARCH PROCESS
01 » THE ROLE OF MARKETING RESEARCH AND THE RESEARCH PROCESS  4

PART TWO: DEFINING THE PROBLEM
02 » PROBLEM DEFINITION AND THE RESEARCH PROCESS  40

PART THREE: PLANNING THE RESEARCH DESIGN
03 » QUALITATIVE RESEARCH  62
04 » SECONDARY RESEARCH WITH BIG DATA  98
05 » SURVEY RESEARCH  126
06 » OBSERVATION  172
07 » EXPERIMENTAL RESEARCH AND TEST MARKETING  192
08 » MEASUREMENT  235
APPENDIX 8A: CONJOINT ANALYSIS: MEASURING CONSUMER UTILITY  285
09 » QUESTIONNAIRE DESIGN  301

PART FOUR: PLANNING THE SAMPLE
10 » SAMPLING: SAMPLE DESIGN AND SAMPLE SIZE  342

PART FIVE: COLLECTING THE DATA
11 » EDITING AND CODING: TRANSFORMING RAW DATA INTO INFORMATION  376

PART SIX: ANALYSING THE DATA
12 » UNIVARIATE STATISTICAL ANALYSIS: A RECAP OF INFERENTIAL STATISTICS  398
13 » BIVARIATE STATISTICAL ANALYSIS: TESTS OF DIFFERENCES  441
14 » BIVARIATE STATISTICAL ANALYSIS: TESTS OF ASSOCIATION  471
15 » MULTIVARIATE STATISTICAL ANALYSIS  516

PART SEVEN: FORMULATING CONCLUSIONS AND WRITING THE FINAL REPORT
16 » COMMUNICATING RESEARCH RESULTS: RESEARCH REPORT, ORAL PRESENTATION AND RESEARCH FOLLOW-UP  548

APPENDIX A: STATISTICAL TABLES  574
GLOSSARY  581
INDEX  591

CONTENTS

PREFACE  XVIII
ACKNOWLEDGEMENTS  XXIII
ABOUT THE AUTHORS  XXIV
GUIDE TO THE TEXT  XXVI
GUIDE TO THE ONLINE RESOURCES  XXX

PART ONE: INTRODUCTION TO THE RESEARCH PROCESS

01 » THE ROLE OF MARKETING RESEARCH AND THE RESEARCH PROCESS  4
The nature of marketing research  5
Marketing research defined  6
Basic research and applied research  7
The managerial value of marketing research for strategic decision-making  7
Identifying and evaluating opportunities  8
Analysing and selecting target markets  9
Planning and implementing a marketing mix  9
Analysing marketing performance  12
When is marketing research needed?  13
Time constraints  13
Availability of data  14
Nature of the decision  14
Benefits versus costs  14
Business trends in marketing research  15
Global marketing research  15
Growth of the Internet and social media  16
Stages in the research process  16
Alternatives in the research process  18
Discovering and defining the problem  18
Uncertainty influences the type of research  20
Planning the research design  21
Descriptive research  24
Causal research  25
Sampling  27
Gathering data  29
Processing and analysing data  29
Drawing conclusions and preparing a report  30
The research program strategy  31
Summary  32
Key terms and concepts  33
Questions for review and critical thinking  33
Ongoing project  34
CourseMate online study tools  35
Written case study 1.1 Buying New Zealand made online in China  35
Written case study 1.2 The Quit campaign in Australia and Singapore  35

PART TWO: DEFINING THE PROBLEM

02 » PROBLEM DEFINITION AND THE RESEARCH PROCESS  40
The nature of marketing problems  41
The importance of proper problem definition  42
The process of defining the problem  43
Ascertain the decision-makers’ objectives  43
Understand the background of the problem  44
Isolate and identify the problem, not the symptoms  45
Determine the unit of analysis  46
Determine the relevant variables  46
State the research questions and research objectives  48
Clarity in research questions and hypotheses  49
Decision-oriented research objectives  50
How much time should be spent defining the problem?  51
The research proposal  52
Anticipating outcomes  55
Summary  56
Key terms and concepts  56
Questions for review and critical thinking  56
Ongoing project  57
CourseMate online study tools  57
Written case study 2.1 DrinkWise: Changing Australia’s drinking culture  57
Written case study 2.2 A tender for market research services by the Australian Communications and Media Authority (ACMA)  58
Ongoing case study: Mobile phone switching and bill shock  59

PART THREE: PLANNING THE RESEARCH DESIGN

03 » QUALITATIVE RESEARCH  62
What is qualitative research?  63
Uses of qualitative research  63
Diagnosing a situation  64
Screening alternatives  64
Discovering new ideas  65
Qualitative versus quantitative research  66
Qualitative research orientations  66
Phenomenology  66
Ethnography  67
Grounded theory  68
Case studies  68
Analysing qualitative responses  70
Common techniques used in qualitative research  71
Focus group interviews  71
Depth interviews  79
Projective techniques  80
Modern technology and qualitative research  85
A warning about qualitative research  88
Summary  93
Key terms and concepts  94
Questions for review and critical thinking  94
Ongoing project  95
CourseMate online study tools  95
Written case study 3.1 Up, Up and Away. Airborne focus groups with Air New Zealand  95
Written case study 3.2 Getting a grip: Focus groups and Beaurepaires Tyres  95
Ongoing case study  96

04 » SECONDARY RESEARCH WITH BIG DATA  98
Secondary data research  99
Advantages  99
Disadvantages  100
Typical objectives for secondary data research designs  103
Fact-finding  103
Model building  105
Data mining  109
Database marketing and customer relationship management  111
Sources of secondary data  112
Internal and proprietary data sources  113
External data: the distribution system  113
Information as a product and its distribution channels  113
Single-source data-integrated information  121
Sources for global research  121
Summary  122
Key terms and concepts  123
Questions for review and critical thinking  123
Ongoing project  123
CourseMate online study tools  124
Written case study 4.1 Twitter: Truth or dare?  124
Written case study 4.2 The electric car in Australia  124
Ongoing case study  125

05 » SURVEY RESEARCH  126
The nature of surveys  127
Survey objectives: Type of information gathered  127
Advantages of surveys  127
Errors in survey research  128
Random sampling error  128
Systematic error  128
Respondent error  129
Nonresponse error  129
Response bias  131
Administrative error  134
Rule-of-thumb estimates for systematic error  135
What can be done to reduce survey error?  135
Classifying survey research methods  136
Structured and disguised questions  136
Temporal classification  136
Different ways that marketing researchers conduct surveys  138
Human interactive media and electronic interactive media  138
Noninteractive media  139
Using interviews to communicate with respondents  139
Personal interviews  139
Telephone interviews  145
Self-administered questionnaires  150
Mail questionnaires  150
Self-administered questionnaires that use other forms of distribution  157
Email surveys  157
Internet surveys  159
Kiosk interactive surveys  163
Survey research that mixes modes  163
Selecting the appropriate survey research design  163
Pretesting  165
Ethical issues in survey research  165
Summary  166
Key terms and concepts  167
Questions for review and critical thinking  168
Ongoing project  169
CourseMate online study tools  169
Written case study 5.1 Google Consumer Surveys  170
Written case study 5.2 Pedal power in Auckland  170
Ongoing case study  170

06 » OBSERVATION  172
What is observation?  173
When is observation scientific?  173
What can be observed?  173
The nature of observation studies  174
Observation of human behaviour  175
Complementary evidence  176
Direct observation  177
Errors associated with direct observation  178
Scientifically contrived observation  178
Combining direct observation and interviewing  178
Ethical issues in the observation of humans  179
Observation of physical objects  179
Content analysis  180
Mechanical observation  181
Television monitoring  181
Monitoring website traffic  183
Scanner-based research  183
Measuring physiological reactions  185
Summary  188
Key terms and concepts  188
Questions for review and critical thinking  189
Ongoing project  189
CourseMate online study tools  190
Written case study 6.1 The Pepsi/Coke challenge and neuroscience  190
Written case study 6.2 Mazda and Syzygy  190
Ongoing case study  191

07 » EXPERIMENTAL RESEARCH AND TEST MARKETING  192
The nature of experiments  193
Step 1: Field and laboratory experiments  194
Step 2: Decide on the choice of independent and dependent variable(s)  196
Step 3: Select and assign test units  199
Step 4: Address issues of validity in experiments  200
Step 5: Select and implement an experimental design  206
Step 6: Address ethical issues in experimentation  216
Test marketing: An application of field experiments  217
Step 1: Decide whether to test market or not  218
Step 2: Work out the functions of test market  219
Step 3: Decide on the type of test market  220
Step 4: Decide on the length of the test market  224
Step 5: Decide where to conduct the test market  224
Step 6: Estimate and project test market results  226
Projecting test market results  228
Consumer surveys  228
Straight trend projections  228
Ratio of test product sales to total company sales  228
Market penetration × repeat-purchase rate  228
Summary  229
Key terms and concepts  231
Questions for review and critical thinking  231
Ongoing project  232
Written case study 7.1 Test marketing Guinness in Perth, Western Australia  233
Written case study 7.2 Kellogg Australia revamps Nutri-Grain  233
Ongoing case study  233

08 » MEASUREMENT  235
The measurement process  236
Step 1: Determine what is to be measured  236
Concepts  237
Step 2: Determine how it is to be measured  238
Step 3: Apply a rule of measurement  239
Types of scales  239
Mathematical and statistical analysis of scales  241
Step 4: Determine if the measure consists of a number of measures  242
Computing scale values  242
Step 5: Determine the type of attitude and scale to be used to measure it  244
Normative vs ipsative scales  244
Attitudes as hypothetical constructs  245
Measuring attitudes is important to managers  246
The attitude-measuring process  246
Method of summated ratings: The Likert scale  248
Semantic differential  251
Numerical scales  253
Stapel scale  253
Constant-sum scale  254
Graphic rating scales  254
Measuring behavioural intention  256
Behavioural differential  258
Ranking  258
Sorting  259
Randomised response questions  260
Other methods of attitude measurement  261
Selecting a measurement scale: Some practical decisions  261
Attitudes and intentions  264
Multi-attribute attitude score  264
Best–worst scaling  266
Sample balanced incomplete block (BIB) designs  270
Pricing  272
The net promoter score  273
Step 6: Evaluate the measure  274
Reliability  274
Validity  276
Reliability versus validity  277
Sensitivity  278
Practicality  279
Summary  279
Key terms and concepts  280
Questions for review and critical thinking  280
Ongoing project  282
CourseMate online study tools  282
Written case study 8.1 Measuring Australia’s social progress  282
Written case study 8.2 New Zealand consumer confidence edges up in 2015  283
Ongoing case study  283

APPENDIX 8A: CONJOINT ANALYSIS: MEASURING CONSUMER UTILITY  285
Background to conjoint analysis  285
Ratings-based conjoint  286
Advantages of ratings-based conjoint  289
Disadvantages of ratings-based conjoint  289
Choice-based conjoint  290
Advantages of choice-based conjoint  294
Disadvantages of choice-based conjoint  294
Conjoint analysis using best–worst scaling  295
Advantages of best–worst conjoint  296
Disadvantages of best–worst conjoint  298
Final comments on conjoint analysis  299

09 » QUESTIONNAIRE DESIGN  301
Step 1: Specify what information will be sought  302
Step 2: Determine the type of questionnaire and survey research method  303
Step 3: Determine the content of individual questions  304
Asking sensitive questions  304
Step 4: Determine the form of response to each question  306
Types of closed-response questions  308
Step 5: Determine the wording of each question  310
Avoid complexity: use simple, conversational language  310
Avoid leading and loaded questions  311
Avoid ambiguity: Be as specific as possible  313
Avoid double-barrelled items  314
Avoid making assumptions  315
Avoid burdensome questions that may tax the respondent’s memory  316
Step 6: Determine question sequence  317
Provide good survey flow  319
Step 7: Determine physical characteristics of the questionnaire  321
Traditional questionnaires  321
Internet questionnaires  326
Step 8: Re-examine and revise steps 1–7 if necessary  331
Step 9: Pretest the questionnaire  332
Designing questionnaires for global markets  333
Summary  333
Key terms and concepts  334
Questions for review and critical thinking  335
Ongoing project  337
CourseMate online study tools  337
Ongoing case study  337
Written case study 9.1 May the Force be with you! The worldwide growth of Jedi Knights as a religion  337
Written case study 9.2 Marketing strategies in China  338

PART FOUR: PLANNING THE SAMPLE

10 » SAMPLING: SAMPLE DESIGN AND SAMPLE SIZE  342
Sampling terminology  343
Why sample?  343
Cost and time  343
Accurate and reliable results  344
Destruction of test units  344
Practical sampling concepts  344
Defining the target population  344
The sampling frame  345
Sampling units  348
Less than perfectly representative samples  349
Probability versus nonprobability sampling  350
Probability sampling  351
Simple random sampling  353
Systematic sampling  353
Stratified sampling  354
Proportional versus disproportional sampling  354
Cluster sampling  355
Nonprobability sampling  356
Convenience sampling  356
Judgement sampling  357
Quota sampling  357
Sampling rare or hidden populations  358
What is the appropriate sample design?  359
Degree of accuracy  359
Resources  361
Time  361
Advance knowledge of the population  361
National versus local project  361
Mobile devices and the Internet change everything  361
Website visitors  362
Panel samples  363
Recruited ad hoc samples  363
Opt-in lists  363
Sample size  364
Random error and sample size  364
Systematic error and sample size  365
Factors in determining sample size for questions involving means  365
The influence of population size on sample size  366
Determining sample size on the basis of judgement  366
Summary  368
Key terms and concepts  369
Questions for review and critical thinking  369
Ongoing project  370
CourseMate online study tools  370
Written case study 10.1 The Victorian Population Health Survey 2014  371
Written case study 10.2 The Australian Taxation Office  371

PART FIVE: COLLECTING THE DATA

11 » EDITING AND CODING: TRANSFORMING RAW DATA INTO INFORMATION  376
Stages of data analysis  377
Before the survey  377
During the survey  378
After the survey responses  378
After the survey  379
Coding  380
Code construction  381
Multiple-response questions  381
Coding open-ended questions  382
Coding and analysis of qualitative research data  386
Qualitative data sources  386
Coding qualitative data  387
Coding strategies  388
Software for qualitative research  389
Text-mining software  389
Data-base management software  390
Summary  392
Key terms and concepts  393
Questions for review and critical thinking  393
Ongoing project  394
CourseMate online study tools  394
Written case study 11.1 Questionnaire editing  394

PART SIX: ANALYSING THE DATA

12 » UNIVARIATE STATISTICAL ANALYSIS: A RECAP OF INFERENTIAL STATISTICS  398
Descriptive and inferential statistics  399
Sample statistics and population parameters  399
Making data usable  400
Frequency distributions  400
Measures of central tendency  400
Measures of dispersion  402
The normal distribution  405
Why do I need to know about the standardised normal distribution?  407
Population distribution, sample distribution and sampling distribution  408
Central-limit theorem  411
Estimation of parameters  413
Point estimates  413
Confidence intervals  414
Determining sample size  416
Stating a hypothesis  417
What is a hypothesis?  417
Null and alternative hypotheses  417
Hypothesis testing  417
The hypothesis-testing procedure  418
An example of hypothesis testing  418
Type I and Type II errors  420
Choosing the appropriate statistical technique  422
Type of question to be answered  422
Number of variables  422
Scale of measurement  422
Parametric versus nonparametric hypothesis tests  422
Some practical univariate tests  423
The t-distribution  423
Calculating a confidence interval estimate using the t-distribution  425
Univariate hypothesis test using the t-distribution  426
Conducting a one-sample t-test in SPSS  427
Conducting a one-sample t-test in Excel  428
The Chi-square test for goodness of fit  429
Conducting a Chi-square test in SPSS  431
Conducting a Chi-square test in Excel  432
A reminder about statistics  434
Summary  434
Key terms and concepts  435
Questions for review and critical thinking  435
Ongoing project  437
CourseMate online study tools  438
Written case study 12.1 Attitudes towards water conservation in a drought-stricken area of Australia  438
Written case study 12.2 Gambling in your community: What do the locals think?  438
Written case study 12.3 Victoria's legal outsourcing system and customer satisfaction  439
Ongoing case study  439

13 » BIVARIATE STATISTICAL ANALYSIS: TESTS OF DIFFERENCES  441
What is the appropriate test of difference?  442
The independent samples t-test for differences of means  442
Conducting an independent samples t-test in SPSS  446
Conducting an independent samples t-test in Excel  447
Paired-samples t-test  449
Conducting a paired-samples t-test in SPSS  449
Conducting a paired-samples t-test in Excel  450
Analysis of variance (ANOVA)  453
The F-test  456
Calculating the F-ratio  457
Conducting an ANOVA in SPSS  460
Conducting an ANOVA in Excel  462
Nonparametric statistics for tests of differences  463
Statistical and practical significance for tests of differences  463
Summary  464
Key terms and concepts  465
Questions for review and critical thinking  465
Ongoing project  468
CourseMate online study tools  468
Written case study 13.1 Do social marketing initiatives work? The not so dark side of marketing  468
Written case study 13.2 Gambling in your community: How do perceptions vary by different consumers?  469
Ongoing case study  469

14 » BIVARIATE STATISTICAL ANALYSIS: TESTS OF ASSOCIATION  471
The basics  472
Pearson’s correlation coefficient  473
An example  475
Coefficient of determination  476
Correlation matrix  477
Running a correlation in SPSS  478
Running a correlation in Excel  480
Nonparametric correlation  482
Regression analysis  483
Least-squares method of regression analysis  486
Drawing a regression line  489
Tests of statistical significance  490
Running a regression in SPSS  492
Running a regression in Excel  496
Cross-tabulations: The Chi-square test for goodness of fit  499
Cross-tabulation and Chi-square tests in SPSS  502
Cross-tabulation and Chi-square tests in Excel  504
Statistical and practical significance for tests of association  508
Summary  509
Key terms and concepts  510
Questions for review and critical thinking  510
Ongoing project  513
Written case study 14.1 Neighbourhood renewal and the use of marketing research  513
Written case study 14.2 How innovative is a new product and why?  514
Written case study 14.3 Food labelling and country of origin  514
Ongoing case study  515

15 » MULTIVARIATE STATISTICAL ANALYSIS  516
The nature of multivariate analysis  518
Classifying multivariate techniques  519
The analysis of dependence  519
The analysis of interdependence  519
Influence of measurement scales  520
Analysis of dependence  520
N-way cross-tabulation  520
Partial correlation analysis  522
N-way univariate analysis of variance (ANOVA)  523
Multiple regression analysis  525
Binary logistic regression  533
Analysis of interdependence  535
Exploratory factor analysis  535
Cluster analysis  537
Multidimensional scaling  540
Summary  543
Key terms and concepts  544
Questions for review and critical thinking  544
Ongoing project  544
CourseMate online study tools  544
Written case study 15.1 Coastal Star Sales Corporation  545

PART SEVEN: FORMULATING CONCLUSIONS AND WRITING THE FINAL REPORT

16 » COMMUNICATING RESEARCH RESULTS: RESEARCH REPORT, ORAL PRESENTATION AND RESEARCH FOLLOW-UP  548
Insights from the communications model  549
The report in context  550
Report format  550
Body of the research report  550
Summary  552
Body of a formal descriptive research report  552
Two-stage research report – exploratory followed by descriptive  552
Additional parts of the report  553
Templates and styles  554
Presenting qualitative research  555
Deriving themes  556
Qualitative presentation styles  557
Checklist for reporting qualitative research  557
Graphic aids  558
Tables  559
Statistical output  560
Charts  563
Maximise the ‘data to ink’ ratio  567
The oral presentation  568
Learn by example and practice  568
Audio-visual aids  570
Summary  571
Key terms and concepts  572
Questions for review and critical thinking  572
Ongoing project  572
CourseMate online study tools  572

APPENDIX A: STATISTICAL TABLES  574
GLOSSARY  581
INDEX  591

PREFACE

This is the fourth Asia-Pacific edition of Marketing Research, and it continues to reflect the importance of social media, 'big data', neuromarketing and the use of online technology in qualitative and quantitative data collection. Throughout this text, in sections called 'Survey this!', we refer to Qualtrics, a leading online survey software platform, and show how it can be used at each stage of the research process. Most importantly, we argue that an understanding of online software such as Qualtrics is crucial, not only to demonstrate how market research is done now, but also to show how it will increasingly be done in the future.

In this edition there is a greater focus on measurement issues in market research and on the use of qualitative software to identify themes in social media exchanges. This edition of Marketing Research also examines practical examples of market and social research, and what students can learn from the advantages and disadvantages of each research approach when they are applied in real life ('Real world snapshots'). We also provide tips for conducting research, flowcharts for each stage of the research process, and improved, more detailed worksheets that will greatly facilitate the understanding and application of market research techniques by the student and practitioner.

We have worked diligently and carefully to make this edition a book that reflects the fast-paced and dynamic practice of marketing research. We have retained our central approach of making the subject interesting and entertaining for the student, which we believe is consistent with the style and learning approach of the original author, Professor William G. Zikmund. Market research, we argue, can be the most fascinating subject for the student, as it is now applied in diverse fields such as health, politics, social marketing, media and law. We have therefore continued to include a wider set of examples that reflect this throughout the text.

NEW TO MARKETING RESEARCH

This edition places greater emphasis on applying marketing research and providing guidance when conducting marketing research. It also examines the effect of the increasing regulation of the market research industry. Where appropriate, ethical issues have also been addressed in this book. We believe that the addition of the trial version of the online survey software Qualtrics facilitates and empowers students' ability to undertake market and social research, as it greatly reduces the time and cost required to conduct it, as it does for many companies and organisations.

In this new edition we have included an ongoing series of case studies based on a real market research study, looking at mobile phone switching and bill shock in Australia. This provides students and lecturers with a structured learning approach to all the different aspects of the market research process and how they might be applied in a single commercial study. It also addresses the choices and trade-offs many researchers face when doing market research in the real world. We have also provided worksheets to assist in the development of each part of a research study, and detailed flowcharts for each major section of the research process. A reference to each worksheet is made at the end of each chapter. Access to the worksheets is provided online.

It was important to us that this new material not be lumped into added chapters at the end of the book or into a single chapter on survey research. There are unique aspects of online research that touch on qualitative research, observation, gathering of secondary data, survey design, sample selection, questionnaire design and many other topics.

→→ Chapter 1: The role of marketing research and the research process. This chapter begins with how commercial and not-for-profit organisations have effectively used market research, as judged by the Australian Market and Social Research Society. What is made clear is that usually more than one method of market research, often both qualitative and quantitative research, is used by market research companies in providing their reports to clients. Some of the award-winning market research programs have helped vulnerable communities, such as those suffering from mental illness (beyondblue) and Indigenous nations in Australia (Cultural and Indigenous Research Centre Australia and the Market Research Unit, Department of Health, research on the campaign to address hearing loss in Aboriginal communities). We examine some important trends influencing market research, including neuroscience and the growth of social media. The chapter introduces the research process and the structure of the book, and provides a number of examples of types of research conducted by government and non-profit as well as business organisations.

→→ Chapter 2: Problem definition and the research process. This chapter starts with the use of market research to assist a social marketing campaign supporting the transition from welfare to work. This detailed research study included a combination of qualitative approaches (ethnographic studies and group discussions) that provided a 360-degree perspective on the complex issue of welfare to work. Outcomes of the research were used by the Australian Government to formulate its communication strategy. The study demonstrates the crucial importance of understanding a research problem so that the correct tools for a research design can be applied. Problem definition is often the most problematic part of the research process, and guidelines are provided so that effective research, providing information that aids the decision-making of managers, can be developed. The chapter includes research proposals from NSW Trade & Investment and the Australian Communications and Media Authority.

→→ Chapter 3: Qualitative research. This chapter deals with one of the most common forms of market research. As one of the most varied research forms, it includes not only focus groups and in-depth interviews but also projective techniques, story-telling and the new phenomenon of 'social listening', where detailed qualitative research is collected and analysed from social media such as Twitter and Facebook. As well as dealing with these diverse approaches, we examine contributions from other social sciences such as phenomenology, ethnography, grounded theory and case studies. In this chapter we provide an example of how qualitative software such as Leximancer can be used to identify themes in social media exchanges. The applications of this suite of related methodologies are also discussed, and include health, media and politics. The chapter concludes with a caveat about the misuse of qualitative research. We have also included two new case studies: on the use of airborne focus groups by Air New Zealand, and on how the retail chain Beaurepaires Tyres was repositioned using focus groups and brand tracking.

→→ Chapter 4: Secondary research with big data. This chapter starts with the question: 'Who is afraid of metadata?' Big data, as discussed in this chapter, is of interest not only to market researchers but also to government in terms of national security. Interestingly, New Zealand research, presented in the opening vignette, suggests that consumers are more comfortable with government using this information than with industry doing so. The chapter also deals with the multitude of information collected by other parties that can answer many research problems. It includes a guide to assessing the quality of this information (publication currency). Examples of how secondary information may be used in descriptive as well as forecasting analyses are also included. Increasingly, large-scale data is being made available to organisations by social media and through the digitalisation of transactions and records, and we discuss some simple strategies and approaches that can be used to organise and interrogate such information in the age of big data.

→→ Chapter 5: Survey research. This chapter starts with the use of surveys to measure community moods and outlooks, and asks whether life in Sydney is any better now compared to five years ago. The chapter contains significant updates on the use of the Internet and mobile phones as means of collecting survey research data. There are additional advantages and problems in using these new research techniques (for example, response and sample biases), and these are addressed in this chapter. How survey biases occur when asking questions of respondents from different cultures is also discussed. The chapter includes a case study on Google Consumer Surveys, where responses to fairly straightforward questions can be obtained at rates as low as 10 cents per response, plus a new case study on the use of surveys to examine cycling behaviour in Auckland.

→→ Chapter 6: Observation. This chapter starts with a content analysis of advertisements in industry publications aimed at New Zealand doctors, and how this type of advertising has led some health experts in New Zealand to call for greater monitoring and regulation of pharmaceutical marketing. Problems with companies having too much observational data (or big data) are also discussed in this chapter. The chapter concludes with the growing importance of mechanical observation, particularly web metrics, which are important in assessing the effectiveness of online marketing strategies. Recent advances in eye- and brain-scanning technologies are also discussed.

→→ Chapter 7: Experimental research and test marketing. This chapter starts with a new form of test market: online test markets, which can cost clients less than $5000, compared to more expensive field designs that can cost upwards of $100 000 to administer. The trade-offs between the various forms of test markets and experiments, and between cost and time, are also discussed in this chapter. Ethical issues in experimental research are discussed, as are some of the major pitfalls within this methodology. The chapter concludes with an overview of test markets and the development of controlled and online test markets. We also suggest how virtual-reality simulations of markets may be constructed.

→→ Chapter 8: Measurement. This chapter starts by looking at measuring readership versus circulation, a simple yet perplexing issue in market research. A readership survey by Ipsos called emma (Enhanced Media Metrics Australia) claims to be able to measure not only what you read but also what you view on computers, smartphones and tablets. We go on to outline a six-stage measurement process and discuss a number of practical issues of measurement that need to be faced by researchers. Issues of multi-item measures are also discussed, as are the types of attitude scales that can be used in online surveys. The chapter is supplemented by an appendix that illustrates a technique for measuring consumer utility: conjoint analysis.

→→ Chapter 9: Questionnaire design. This chapter follows a simple nine-stage process for developing questionnaires, which is still as much an art as a science. The chapter also includes issues that need to be considered for online and email questionnaires. Poor questionnaire design is often the cause of many problems reported by field researchers in market research.

→→ Chapter 10: Sampling: Sample design and sample size. This chapter introduces basic concepts in sampling, raising pertinent questions such as 'How should the sample be selected?' and 'How many people should be in a sample?' The chapter opens by describing some typical political polls that use sample sizes of just over 1000 to predict the political preferences of a population of 23 million Australians (a small illustrative calculation of why this works follows this chapter list). Using some intuitive examples, the chapter then goes on to explain the rationale behind sampling, how to sample, and factors to take into consideration when determining sample size. The advantages and disadvantages of sampling techniques are discussed, along with the use of panels and 'opt-in' samples.

→→ Chapter 11: Editing and coding: Transforming raw data into information. This chapter explains the task of taking the responses from individuals and arranging them in a form that allows easy analysis using statistical and spreadsheet software. Common problems encountered by researchers are discussed, along with the relationship between question form, data scales and the resulting information.

→→ Chapter 12: Univariate statistical analysis: A recap of inferential statistics. This chapter begins with an intuitive example of the sampling error involved in establishing a sample statistic. After introducing some basic measures of central tendency and measures of dispersion, the chapter then discusses the concept of inference and how inferential statistics can be used to estimate sampling error. The normal distribution and central-limit theorem are discussed to establish the groundwork for further hypothesis testing.

→→ Chapter 13: Bivariate statistical analysis: Tests of differences. This chapter uses an example about consumer price perceptions in international markets to establish the rationale for testing for differences. Using t-tests and ANOVA, the chapter introduces basic tests and provides worked examples in SPSS and Excel to illustrate how to conduct and interpret these tests.

→→ Chapter 14: Bivariate statistical analysis: Tests of association. This chapter extends the discussion of typical statistical tests, examining the concept of association by highlighting a contemporary issue in marketing. Using correlation, regression and Chi-square analysis, the chapter discusses typical tests for association and, like Chapter 13, shows how to conduct and interpret these tests in SPSS and Excel.

→→ Chapter 15: Multivariate statistical analysis. This chapter extends our understanding of bivariate statistics to investigate the problem of multiple variables. The problem of confounding relationships in bivariate statistics is first introduced and then extended to cover problems in analysis of variance and multiple regression. Techniques for discovering patterns in data are covered in factor analysis and cluster analysis.

→→ Chapter 16: Communicating research results: Research report, oral presentation and research follow-up. This chapter completes the research process with the important function of ensuring that the valuable completed research reaches the right people in a way that is informative, persuasive and actionable. Importantly, the correct formatting of tables versus graphs is discussed in detail, along with the presentation of qualitative data and the effective use of graphic aids. These items are often the crucial elements of any market research report.
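A note for readers of the Chapter 10 and Chapter 12 descriptions above: the political polls mentioned there can predict the preferences of 23 million Australians from a sample of just over 1000 because random sampling error depends on the size of the sample, not the size of the population. A minimal sketch of that arithmetic in Python (our own illustration, not taken from the text; it assumes a simple random sample, a worst-case proportion of 0.5 and a 95 per cent confidence level):

    import math

    def margin_of_error(n, p=0.5, z=1.96):
        """Approximate 95% margin of error for a proportion estimated
        from a simple random sample of size n (worst case p = 0.5)."""
        return z * math.sqrt(p * (1 - p) / n)

    # A poll of about 1000 respondents is accurate to within roughly
    # +/- 3 percentage points, whether the population is 23 million
    # or 230 million.
    print(f"n = 1000: +/- {margin_of_error(1000):.1%}")  # about +/- 3.1%
    print(f"n = 2000: +/- {margin_of_error(2000):.1%}")  # about +/- 2.2%

Doubling the sample size does not halve the error; it shrinks it only by a factor of roughly the square root of two, which is why the cost-versus-accuracy trade-offs discussed in Chapter 10 matter.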

ORGANISATION OF THE BOOK
The organisation of the fourth Asia-Pacific edition of Marketing Research follows each step of the marketing research process that is introduced as a learning model in Chapter 1. The book is organised into seven parts. Each part represents a stage of the research process and discusses how that stage relates to decisions about conducting specific projects.
→→ Part 1: Introduction to the research process discusses the scope of marketing research, provides an overview of the entire marketing research process, and explains how the Internet and globalisation are changing the nature of information systems. Each stage in the research process is then examined.
→→ Part 2: Defining the problem covers problem definition and research proposals. This stage sets the research objectives used to design the research study.

→→ Part 3: Planning the research design examines the concepts and issues related to research designs, such as using exploratory/qualitative research, surveys, observation studies and experiments. Issues of measurement and questionnaire design are then addressed in this section.
→→ Part 4: Planning the sample explains why sampling is required, how to design samples and how to determine sample size.
→→ Part 5: Collecting the data discusses the importance of fieldwork, editing and coding, as the quality of data from a research study is determined by this stage of the process.
→→ Part 6: Analysing the data covers descriptive data analysis, inferential statistical analysis and multivariate analysis, and provides practical advice on how data can be analysed to answer research questions. A new appendix on conjoint analysis has been added to this section of the text for more advanced marketing research courses. The material in each chapter is comprehensive yet accessible to the student. Sample output from many examples using SPSS and Excel has also been included (a short open-source sketch of the same style of analysis appears at the end of this section).
→→ Part 7: Formulating conclusions and writing the final report discusses the communication of research results. Often this is the most important step in the research process, as information and findings that cannot be presented clearly are of little use to the client. The text ends with a final note on the use of marketing research.
The book also features two appendices and an extensive glossary. Two other appendices can be found online: Comprehensive cases with a computerised database (comprising six cases), which provides materials that challenge students to apply and integrate the concepts they have learned, and the Code of Conduct of the Australian Market and Social Research Society.
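
For readers who prefer open-source tools to SPSS or Excel, the sketch below shows the style of test-of-differences analysis covered in Part 6 as an independent-samples t-test in Python. The two small samples of price-perception ratings are hypothetical values invented purely for illustration; they are not drawn from the book's data sets.

from scipy import stats

# Hypothetical price-perception ratings (1-10 scale) from two markets; illustrative only.
perceptions_market_a = [6.8, 7.2, 6.5, 7.0, 6.9, 7.4, 6.7, 7.1]
perceptions_market_b = [6.1, 6.4, 6.0, 6.6, 6.3, 5.9, 6.5, 6.2]

# Independent-samples t-test: do the two markets differ in mean perceived price?
t_stat, p_value = stats.ttest_ind(perceptions_market_a, perceptions_market_b)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
# A p-value below the conventional 0.05 threshold would suggest the mean perceptions differ.

Chapters 13 and 14 walk through the equivalent output and its interpretation in SPSS and Excel.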

HELP US WRITE AN EVEN BETTER MARKETING RESEARCH TEXTBOOK
We have worked very hard to update, regionalise and make this an interesting textbook in marketing research, but we know that with hindsight things could always have been done better. If you have any suggestions for the next edition of Zikmund, D'Alessandro, Lowe, Winzar and Babin, please email us.
→→ For Chapters 1 to 9: Steven D'Alessandro [email protected]
→→ For Chapters 10, 11, 15 and 16: Hume Winzar [email protected]
→→ For Chapters 12, 13 and 14: Ben Lowe [email protected]

ACKNOWLEDGEMENTS
STEVEN D'ALESSANDRO
Steve dedicates this book to his entire family in Australia, Italy and the United States. It is a joy to work with co-authors such as Ben and Hume, and we are grateful for the guidance and support shown to us by Lydia Crisp and Dorothy Chiu at Cengage Learning Australia. To all the students of Marketing Research, we hope you find this an informative yet engaging read. Lastly, thank you Michelle for your support over the last 16 years, and to my two daughters Sophie and Stanton: I wish you all the happiness in life. La dolce vita!

HUME WINZAR
Thanks to Lydia Crisp and Dorothy Chiu at Cengage Learning Australia for their guidance and patience in this process. I do not think I could work with a better team of co-authors and support. Thanks to more than a generation of undergraduate and postgraduate students, whose often profound questions have driven much of what appears in this text. Finally, thanks to my wonderful partner, Trish, for your support and help setting priorities.

BEN LOWE
Thanks to my co-authors and the team at Cengage for another great edition of a great book. Thanks also to my wonderful family for all their support!

Reviewers
The authors also wish to acknowledge the following reviewers:
Dr Marthin Nanere – La Trobe University
Linda Robinson – RMIT University
Dr Jeffrey Lim – University of Sydney
Dr R Anne Sharp – University of South Australia
Abhishek Dwivedi – Charles Sturt University
Dr Md Akhtaruzzaman – University of Newcastle
Bopha Roden – Swinburne University of Technology

ABOUT THE AUTHORS
DR STEVEN D'ALESSANDRO
Steve is a Professor of Marketing at Charles Sturt University in New South Wales. He has published 94 refereed papers in leading international journals (including The European Journal of Marketing, International Marketing Review, Marketing Letters, Journal of Business Research, Journal of Services Marketing, Journal of Macromarketing, Food Quality and Preference, Psychology and Marketing, and Applied Economics), books and conferences. Steve has also worked as a market research consultant for blue-chip companies such as Pacific Dunlop, ANZ, Challenge Bank, BHP, Telstra and Ford. His research interests include the effect of country-of-origin information on consumer judgements and choice; brand switching; complex systems; luxury consumption; consumer durable purchasing behaviour; service quality; marketing strategy and the environment; diffusion of innovations; marketing of pharmaceuticals and body image; global branding; wine marketing; and privacy research. In 2012, he received the ANZMAC Distinguished Educator of the Year award, in recognition of his expertise and innovation in marketing education.
BEN LOWE
Ben Lowe is Professor of Marketing at Kent Business School, University of Kent. Ben has been teaching courses in marketing research and marketing at undergraduate and postgraduate level for over 10 years in the UK, Europe and Australia, and has consulted on market research projects for organisations in Australia and the UK. Ben's primary research interests relate to consumer behaviour and consumer acceptance of innovations. Specifically, Ben's research interests are in pricing, consumer evaluations of introductory promotions, diffusion of innovations, pioneer brand advantage, the Theory of Planned Behaviour, the Technology Acceptance Model and consumer acceptance of innovations in developing countries. Ben has published more than 30 refereed articles in journals such as Psychology & Marketing, the European Journal of Marketing, Technovation, the Journal of Interactive Marketing, the Journal of Marketing Management, the Journal of Consumer Behaviour, the American Journal of Agricultural Economics, the Journal of Marketing Education and others. Ben's research method expertise lies in survey design and experimentation.
HUME WINZAR
Hume is Associate Professor of Business at Macquarie University, Sydney. Before joining academia, Hume worked in outdoor education, banking, public relations, advertising and product management. Over the last 30 years he has taught a range of marketing courses in universities across Australia, South-East Asia, Eastern Europe, Canada and the United States. His expertise lies in marketing research techniques, data analysis and consumer behaviour. Hume's current research interests are business analytics techniques, consumer product evaluation and the application of complexity theory to market models.
BARRY J. BABIN
Barry Babin has authored more than 70 research publications in some of the most prestigious research periodicals, including the Journal of Marketing, the Journal of Consumer Research, the Journal of Business Research, the Journal of Retailing, Psychological Reports, Psychology and Marketing and the Journal of the Academy of Marketing Science, among others. Barry is currently Max P. Watson, Jr. Professor of Business and Department Chair of the Department of Marketing and Analysis at Louisiana Tech University.

He has won numerous honours for his research, including the USM Louis K. Brandt Faculty Research Award (which he won on three occasions while a member of that faculty), the 1996 Society for Marketing Advances (SMA) Steven J. Shaw Award, and the 1997 Omerre Deserres Award for Outstanding Contributions to Retail and Service Environment Research. He is also an affiliate member of the Scientific Research Committee at Reims Management School in France. Barry is Past-President of the Academy of Marketing Sciences and former President of the Society for Marketing Advances. He is also the Marketing Editor of the Journal of Business Research. Barry's research focuses on the effect of the service environment in creating value for both employees and customers. His expertise is in building and understanding value that leads to long-lasting, mutually beneficial relationships with employees and customers. He also has expertise in creative problem solving and in wine marketing. His primary teaching specialties involve consumers and service quality, marketing research and creative problem solving. He is well respected internationally and has lectured in many countries outside the United States, including Australia, South Korea, France, Germany, Canada, Sweden and the United Kingdom, among others.

In Remembrance
PROFESSOR WILLIAM G. ZIKMUND
William G. Zikmund, former professor of marketing at Oklahoma State University, received his Bachelor of Science in marketing from the University of Colorado, a Master of Science from Southern Illinois University and a PhD in business administration from the University of Colorado. Professor Zikmund worked in marketing research for Conway/Millikien Company and Remington Arms Company before beginning his academic career. In addition, he had extensive consulting experience with many business and not-for-profit organisations. Professor Zikmund published many articles and successful textbooks. His books include Marketing, Effective Marketing, Exploring Marketing Research and Business Research Methods. He was an active teacher who strove to be creative and innovative in the classroom. His books have been used in universities in Europe, Asia, Africa, South America and North America. More than half a million students have read his books. Professor Zikmund died in 2002.

Guide to the text
As you read this text you will find a number of features in every chapter to enhance your study of marketing research and help you understand how the theory is applied in the real world.

PART OPENING FEATURES

CHAPTER OPENING FEATURES

Understand how key concepts are connected across all chapters in the part by viewing the concept map.

Identify the key concepts that the chapter will cover with the learning objectives.

Prepare for what the chapter will cover through the chapter vignette.

FEATURES WITHIN CHAPTERS

Definitions of important key terms are located in the margin for quick reference.

SURVEY THIS!

Explore questioning techniques and test the effectiveness of market surveys using Qualtrics and the Survey this! questions. Qualtrics provides a robust platform to design surveys, and to distribute and evaluate survey results.

REAL WORLD SNAPSHOT

Analyse practical examples of how chapter concepts are applied in a business context through the Real world snapshot boxes.

EXPLORING RESEARCH ETHICS

Explore the real world ethical issues that can affect the marketing research process with the Exploring research ethics boxes.

TIPS OF THE TRADE

Get helpful, practical hints on how to conduct market research successfully with the Tips of the trade feature.

WHAT WENT RIGHT?
WHAT WENT WRONG?

Learn about the successes and failures of various marketing research strategies in the real business environment with the What went right? and What went wrong? boxes.

ICONS

The Ongoing project icon highlights the key concepts relevant to the Ongoing project found at the end of each chapter.

Discover current research articles by accessing the Search me! marketing database and searching the suggested key terms in each chapter.

END-OF-CHAPTER FEATURES

At the end of each chapter you will find several tools to help you to review, practise and extend your knowledge of the key learning objectives. Review your understanding of the key chapter topics with the Summary.

Expand your knowledge by using the Key terms and concepts to conduct further research on the Search me! marketing database. Test your knowledge and consolidate your learning through the Questions for review and critical thinking.

Apply market research techniques holistically by completing the Ongoing project. The Ongoing project is a continuing activity that builds upon each step of the marketing research process. Refer to the Ongoing project icon throughout the chapters to locate key concepts relevant to the project, and download the relevant worksheets from the CourseMate Express website.

CourseMate Express online study tools help you revise and extend your understanding of the key chapter concepts as you complete the activities on CourseMate Express and use the Search me! marketing database.

Reinforce the topics covered in the chapter and apply the concepts you have learnt with the Written case studies, which illustrate recent, relevant applications of marketing research in practice.

ONGOING CASE STUDY

Analyse and follow the steps of the market research process from start to finish with the new Ongoing case study on mobile phones.


Guide to the online resources
FOR THE INSTRUCTOR
Cengage Learning is pleased to provide you with a selection of resources that will help you prepare your lectures and assessments. These teaching tools are accessible via cengage.com.au/instructors for Australia or cengage.co.nz/instructors for New Zealand.

CourseMate Express is your one-stop shop for learning tools and activities that help students succeed. As they read and study the chapters, students can access data sets, project worksheets, videos and research activities, review with flashcards and check their understanding of the chapter with interactive quizzing. CourseMate Express also features Engagement Tracker, a first-of-its-kind tool that monitors student engagement with the content. Ask your Learning Consultant for more details.

Qualtrics allows you to create and deploy surveys, and provides data for analysis. A survey included in the Chapter 1 Survey this! box invites students to respond to a sample survey. The sample survey data collected is then made available for learning exercises. Exercises and questions stemming from the survey in the earlier chapters encourage students to critically evaluate survey items and questionnaire construction. In the later chapters, the collected data provides a resource for hands-on analytics, revealing insights into students’ attitudes and behaviours.

INSTRUCTOR'S MANUAL
The Instructor's Manual includes:
• learning objectives
• chapter summary
• teaching notes
• suggested solutions to the end-of-chapter questions and case studies
• additional research activities
• and more!

ARTWORK FROM THE TEXT
Add the digital files of figures, graphs and pictures into your course management system, use them in student handouts, or copy them into your lecture presentations.

LOCAL VIDEOS
Engage students and promote discussion by showing new local video interviews with industry professionals from a range of marketing research fields. These visual resources are available to instructors prescribing the text.
POWERPOINT™ PRESENTATIONS
Use the chapter-by-chapter PowerPoint presentations to enhance your lectures and handouts in order to reinforce the key principles of your subject.
WORD-BASED TEST BANK
This bank of questions has been developed with the text for the creation of quizzes, tests and exams for your students. Deliver tests from your learning management system or in your classroom.

FOR THE STUDENT
New copies of this text come with an access code that gives you a 6-month subscription to the CourseMate Express website and Search me! marketing. Also included is a 4-month subscription to Qualtrics. Visit http://login.cengagebrain.com and log in using the code card.

Access your CourseMate Express website, which includes a suite of interactive resources designed to support your learning, revision and further research. Includes:
• Data sets
• Project worksheets
• Videos
• Research activities
• Revision quizzes

Qualtrics is a tool that makes survey creation easy enough for a beginner while at the same time sophisticated enough for the most demanding academic. Qualtrics allows you to create and deploy surveys, and provides data for analysis. The Survey this! box in Chapter 1 invites you to respond to a sample survey. The sample survey data collected is then made available for learning exercises to allow you to critically evaluate survey items and questionnaire construction and to later enable hands-on analytics of the actual data.

Expand your knowledge with Search me! marketing. Fast and convenient, this resource provides you with 24-hour access to relevant full-text articles from hundreds of scholarly and popular journals and newspapers, including The Australian and The New York Times. Search me! marketing allows you to explore topics further and quickly find current references.

ONE

INTRODUCTION TO THE RESEARCH PROCESS
01 » THE ROLE OF MARKETING RESEARCH AND THE RESEARCH PROCESS

PART 1: Introduction to the research process
[Part-opening concept map showing the stages: the marketing decision, problem definition, research design, measurement, fieldwork and data collection, data analysis and report presentation]


01 » WHAT YOU WILL LEARN IN THIS CHAPTER

To understand the importance of marketing research as a management decision-making tool.
To define marketing research.
To understand the difference between basic and applied marketing research.
To understand the managerial value of marketing research and its role in the development and implementation of marketing strategy.
To understand when marketing research is needed and when it should not be conducted.
To list the stages in the marketing research process.
To classify marketing research as exploratory research, descriptive research or causal research.
To explain the difference between a research project and a research program.


THE ROLE OF MARKETING RESEARCH AND THE RESEARCH PROCESS
What made market research effective in 2014?

One way in which the impact of marketing research can be evaluated is by examining some of the winners of the Australian Market and Social Research Society's Research Excellence Awards for 2014. These awards also show the wide application of market research to the commercial, government and not-for-profit sectors. The 2014 Award1 for Consumer Insights was awarded to Hall & Partners Open Mind for work done for beyondblue, which provided research aimed at encouraging the uptake of mental health services, which is currently low (only 30 per cent of those with a mental illness use such services, and men account for 80 per cent of deaths by suicide in Australia). The research used qualitative methods, including interviews, case studies and online bulletin boards, to challenge and engage diverse cultural and social male groups struggling with mental illness. This research led to substantial changes in the way beyondblue engaged with men. These included a change of language from help-seeking to taking action, talking man-to-man about experiencing anxiety, and a new man-focused online service, the Man Therapy website https://www.mantherapy.org.au. The award for Communications Strategy Effectiveness went to Jem Wallis and Ainslie Williams of Vivid for 'CommBank CAN', which used eight rounds of qualitative research over 19 months to provide information to shape advertising strategy, especially copy that shows triggers of customer satisfaction. The researchers used qualitative research in diverse formats, including stealomatics, key frames, narrative tapes, basic headlines and partially formed scripts that were read out in the groups. The research provided important real-time feedback on creative elements of the

campaign by advertising company M&C Saatchi, which has been effective in promoting the customer service benefits of the Commonwealth Bank. In 2014 the winner for Policy/Social Research went to Anne Redman, Mary Raftos, Helen Price and Nick Connelly, of the Cultural and Indigenous Research Centre Australia and the Market Research Unit, Department of Health, for research entitled 'Listening to the community – research and evaluation for the National Indigenous Ear Health Campaign'. This research aimed to address the higher incidence of hearing loss among Aboriginal and/or Torres Strait Islander people(s). The research used both qualitative and quantitative approaches. The qualitative component included group discussions with Aboriginal and Torres Strait Islander mothers and carers, fathers, and in-depth interviews with grandparents and elders in 14 urban, regional and remote areas. Such research was backed by a detailed literature review, case studies and consultations. The outcome of the research led to a two-pronged communications strategy which focused on distribution of Care for Kids' Ears resources, including resources for parents and carers, resource kits for teachers and teachers' aides, early childhood and community groups and health professionals, and the Care for Kids' Ears website. The second approach was a cross-cultural media partnership, which delivered social marketing initiatives that were developed locally by Indigenous people. As can be seen, market research uses a wide set of approaches (surveys, focus groups, and online and offline research) that can help organisations to better meet important outcomes (profits, better mental health, less hearing loss and more effective advertising and communications) and even help save people's lives. This is an exciting and dynamic field of business. This chapter examines the nature of market research and how it can be used, and introduces a step-by-step process of conducting research that can be applied for any type of organisation or client.

THE NATURE OF MARKETING RESEARCH


Marketing research covers a wide range of phenomena. In essence, it fulfils the marketing manager’s need for knowledge of the market. The manager of a food company may ask, ‘Will a package change improve my sales?’ A competitor may ask, ‘How can I monitor my sales and retail trade activities?’ A marketing manager in the banking and finance industry may ask, ‘To whom am I losing sales? From whom am I taking sales?’ All of these marketing questions, as well as others related to specific marketing decisions, require information about how customers, distributors and competitors will respond to marketing decisions.

Marketing research is one of the principal tools for answering such questions because it links the consumer, the customer and the public to the marketer with information used to identify and define marketing opportunities and problems; generate, refine, and evaluate marketing actions; monitor marketing performance; and improve understanding of marketing as a process.2 Marketing research specifies the information required to address these issues, designs the method for collecting information, manages and implements the data collection process, analyses the results, and communicates the findings and their implications. The task of marketing research is to help specify and supply accurate information to reduce the uncertainty in decision-making.

Although marketing research provides information about consumers and the marketplace for developing and implementing marketing plans and strategies, it is not the only source of information. Every day, marketing managers translate their experiences with marketing phenomena into marketing strategies. Information from a manager’s experiences frequently is used in an intuitive manner because of time pressures on business decisions or because the problem does not warrant more formal research methods. However, the primary task of marketing management is effective decision-making. Relying on seat-of-the-pants decision-making – decision-making without systematic inquiry – is like betting on a long shot at the racetrack because the horse’s name is appealing. Occasionally there are successes, but in the long run intuition without research can lead to disappointment. Marketing research helps decision-makers shift from intuitive information gathering to systematic and objective investigating.

SURVEY THIS!

This book introduces the reader to the world of marketing research. Marketing research represents the eyes and ears of a competitive business firm. The researcher’s job includes determining what information is needed so that data can be analysed and become intelligence. Consumers play a crucial role in this process. They often are used as research participants and, with or without their knowledge, they provide the information. One way that consumers (and sometimes employees or managers) take part is by participating in surveys. Most readers have probably participated in surveys previously. Here is another chance to do so, only this time you will first play the role of a research participant. Later, you will fill the role of a research analyst and even a key marketing decision-maker as you try to make sense of data provided by the many users of this textbook. Your first interaction with the ‘Survey this!’ feature is simply to play the role of respondent and respond to the entire survey as honestly and completely as possible. Access the Qualtrics survey by following the instructions on the tear-out card in the front of this book. Your answers will be anonymously stored in the database. Once you’ve completed the survey, you can visit the course website and get a copy of the questions contained in the questionnaire.


Marketing research defined

marketing research: The systematic and objective process of generating information to aid in making marketing decisions.

Marketing research is defined as the systematic and objective process of generating information to aid in making marketing decisions. This process includes specifying what information is required, designing the method for collecting information, managing and implementing the collection of data, analysing the results, and communicating the findings and their implications. This definition suggests first that marketing research information is not intuitive or haphazardly gathered. Literally, research (re-search) means ‘to search again’. The term connotes patient study and scientific investigation wherein the researcher takes another, more careful look at the data to discover all that is known about the subject. Second, if the information generated or data collected is to be accurate, marketing researchers must be objective. Researchers should be detached and impersonal rather than biased and attempting to support their preconceived ideas. If bias enters into the research process, the value of the research is considerably reduced.

As an example, a developer owned a large area of land and wished to build a high-prestige shopping centre. He wanted a research report to demonstrate to prospective retailers that there was a large market potential for such a centre. He conducted his survey exclusively in an elite suburb. Not surprisingly, the findings showed that a large percentage of the respondents wanted a high-prestige shopping centre. Results of this kind are misleading and should be disregarded. In this example, had the prospective retailers discovered how the results had been obtained, the developer would have lost credibility. Had the retailers been ignorant of the bias in the research design and unaware that the researchers were not impartial, their decision may have had more adverse consequences than one made strictly on intuition. The importance of striving for objectivity cannot be over-emphasised: without objectivity, research is valueless.

This definition of marketing research is not restricted to any one aspect of the marketing mix. The objective of research is to facilitate the managerial decision-making process for all aspects of the firm’s marketing mix: pricing, promotion, distribution and product decisions. By providing the necessary information on which to base decisions, marketing research can reduce the uncertainty of a decision and thereby decrease the risk of making the wrong decision. However, research should be an aid to managerial judgement and not a substitute for it. Management is more than conducting marketing research; applying the research remains a managerial art.

For example, a few years ago research indicated that women who bought frozen dinners tended to lead hectic lives and had trouble coping with everyday problems. Using this information, an advertising agency developed an ad for Beef Short Cuts in Australia showing a run-down working mother flopping into a chair just before her family was to arrive home for dinner. Suddenly realising that she had a problem, the woman had the bright idea of cooking a Beef Short Cuts meal. The beginning of the ad turned out to be a terrible mistake. The company quickly found out that the last thing working mothers wanted to be reminded of was how tired they were. Research can suggest directions for changes in the marketing mix, but it cannot ensure correct marketing execution.

Finally, this definition of marketing research is limited by one’s definition of marketing.
Although research in the marketing area of a for-profit corporation is marketing research, a broader definition of marketing research includes non-profit organisations such as the Transport Accident Commission (TAC Victoria), the Singapore Art Museum and the West Australian Symphony Orchestra. Each of these organisations exists to satisfy social needs, and each requires marketing skills to produce and distribute the products and services that people want. Hence, marketing research may be conducted by organisations that are not business organisations. National governments, for example, perform many functions that are similar, if not identical, to those of business organisations.


Public service managers may use research techniques for evaluative purposes in much the same way as managers at Samsung or Mazda. This book explores marketing research as it applies to all types of organisations and institutions that engage in some form of marketing activity.

Basic research and applied research

One purpose of conducting marketing research is to develop and evaluate concepts and theories. Basic or pure research attempts to expand the limits of knowledge; it is not aimed at solving a particular pragmatic problem. It has been said that there is nothing so practical as a good theory. Although this is true in the long run, basic marketing research findings generally cannot be immediately implemented by a marketing executive. Basic research is conducted to verify the acceptability of a given theory or to learn more about a certain concept. Applied research is conducted when a decision must be made about a specific real-life problem. Our focus is on applied research – studies that are undertaken to answer questions about specific problems or to make decisions about particular courses of action or policies.

REAL WORLD SNAPSHOT

GOODE CHOICE, DAVID JONES!3

Research from Roy Morgan Research found that David Jones’ use of Indigenous AFL footballer Adam Goodes as a brand ambassador has received popular support from its customers. While 73.1 per cent of Australians aged 14 and over agree that ‘Aboriginal culture is an essential component of society’, this figure rises to 78.1 per cent among people who shop at David Jones. This is especially the case for David Jones customers over 25 years of age, who were found to be above the population average in their respect for Aboriginal culture.


Applied research is emphasised in this discussion because most students will be oriented towards the day-to-day practice of marketing management, and most students and researchers will be exposed to short-term, problem-solving research conducted for businesses or non-profit organisations. However, the procedures and techniques used by applied and basic researchers do not differ substantially: both employ the scientific method to answer the question at hand. Broadly defined, the term scientific method refers to the techniques and procedures used to recognise and understand marketing phenomena. In the scientific method, empirical evidence (facts from observation or experimentation) is analysed and interpreted to confirm or disprove prior conceptions. In basic research, testing these prior conceptions or hypotheses and then making inferences and conclusions about the phenomena lead to the establishment of general laws about the phenomena. Use of the scientific method in applied research ensures objectivity in gathering facts and testing creative ideas for alternative marketing strategies. The essence of research, whether basic or applied, lies in the scientific method. Much of this book deals with scientific methodology. Thus, the techniques of basic and applied research differ largely in degree rather than substance.

basic (pure) research: Research conducted to expand the limits of knowledge, to verify the acceptability of a given theory or to learn more about a certain concept.
applied research: Research conducted when a decision must be made about a real-life problem.
scientific method: The techniques and procedures used to recognise and understand marketing phenomena.

THE MANAGERIAL VALUE OF MARKETING RESEARCH FOR STRATEGIC DECISION-MAKING

Effective marketing management requires research. Club Méditerranée found that extensive market research was required over an extended period of time in order to reposition itself as an attractive destination for Japanese tourists.4 Important changes made were the use of Japanese staff in each location, the choice of Japanese and French food, provision of resorts on the Japanese mainland (since the time Japanese have for holidays is limited) and greater facilities for Japanese children (as many Japanese take their children on holiday with them).

The prime managerial value of marketing research comes from reduced uncertainty: the information it provides facilitates decision-making about marketing strategies and tactics to achieve an organisation’s strategic goals. Developing and implementing a marketing strategy involves four stages:
1 Identifying and evaluating opportunities
2 Analysing market segments and selecting target markets
3 Planning and implementing a marketing mix that will satisfy customers’ needs and meet the objectives of the organisation
4 Analysing marketing performance.5

EXPLORING RESEARCH ETHICS

IS NEUROMARKETING RIGHT OR WRONG?6

Market researchers are now turning to new technologies such as EEG, fMRI (functional magnetic resonance imaging), galvanic skin response measures and eye-tracking tools to measure the hierarchy of effects of consumer emotion driving consumption, rather than relying on traditional methods such as focus groups and surveys. This approach is often called neuromarketing. Experimental research suggests that consumers are more likely to believe that neuromarketing is ethical, even if it breaches concerns of privacy and informed consent, when it is used by a not-for-profit organisation (e.g. an organisation seeking to reduce alcohol consumption amongst university students) rather than a profit-orientated company (e.g. a brewery targeting university students). In fact, around 60 per cent of respondents reacted favourably to the use of neuromarketing techniques, despite their concerns, when it was used by a not-for-profit organisation.


Identifying and evaluating opportunities

Before developing a marketing strategy, an organisation must ask where it wants to go and how to get there. Marketing research can help answer these questions by investigating potential opportunities to identify attractive areas for company action. Marketing research may provide diagnostic information about what is occurring in the environment. A mere description of some social or economic activity, such as trends in consumer purchasing behaviour, may help managers recognise problems and identify opportunities for enriching marketing efforts.

One reason for Mattel Toys’ success in the rapidly changing toy market is the company’s commitment to consumer research. Much of its success may be traced to the way it goes about identifying opportunities for its new products. For example, marketing research showed that instead of military, spy or sports heroes, young boys preferred fantasy figures.


Boys spend much time fantasising about good versus evil. Research showed that timeless fantasy figures, both ancient and futuristic, were visually exciting; and because they were timeless, boys could do more with them. Several lines of action figures, including Rock ’Em Sock ’Em Robots and Max Steel toys, benefited from this research. Video games from Nintendo and Sony reflect an awareness of this research finding.

The purpose of a research study on running shoes was to investigate the occasions or situations associated with product use – that is, when individuals wore their running shoes. The researchers found that most owners of running shoes wore the shoes while walking, not running. Also, most of this walking was part of a normal daily activity like shopping or commuting to work. Many of the people who wore running shoes for routine activities considered the shoes an alternative to other casual shoes. This research ultimately led to the development of the walking shoe designed for comfortable, everyday walking.7

Market opportunities may be evaluated using many performance criteria. For example, the performance criterion of market demand typically is estimated using marketing research techniques. Estimates of market potential or predictions about future environmental conditions allow managers to evaluate opportunities. Accurate sales forecasts are among the most useful pieces of planning information a marketing manager can have. Complete accuracy in forecasting the future is not possible because change is constantly occurring in the marketing environment. Nevertheless, objective forecasts of demand or changing environments may be the foundations on which marketing strategies are built.

Analysing and selecting target markets

The second stage of marketing strategy development is to analyse market segments and select target markets. Marketing research is a major source of information for determining which characteristics of market segments distinguish them from the overall market. Market segmentation studies can be used, for example, to identify new segments of wine consumers. Treasury Wine Estates, an Australian company that exports to the United States, identified generation Y, or millennial, women as an important market segment for its products. Research showed these women found shopping for wine intimidating. The company introduced four new wine styles to appeal to this market: Moscato, Riesling, Chardonnay and Pinot Grigio. Retailing for under US$12.99 (A$18.00), each bottle has a ‘mood’ description such as ‘radiant’ for Riesling.8

Planning and implementing a marketing mix

Using the information obtained in the two previous stages, marketing managers plan and execute a marketing mix strategy. However, marketing research may be needed to support specific decisions about virtually any aspect of the marketing mix. Often the research is conducted to evaluate an alternative course of action. For example, advertising research investigated whether an actress, one of Hollywood’s most beautiful women, would make a good spokesperson for a specific brand of hair colouring. She was filmed in some test advertisements to endorse the brand, but the advertisements were never aired because, although viewers recognised her as an outstanding personality in the test advertisements, they did not perceive her as a user of home hair-colouring kits or as an authority on such products.

Managers face many diverse decisions about marketing mixes. The following examples highlight selected types of research that might be conducted for each element of the marketing mix.


PRODUCT RESEARCH

Product research takes many forms and includes studies designed to evaluate and develop new products and to learn how to adapt existing product lines. Concept testing exposes potential customers to a new product idea to judge the acceptance and feasibility of the concept. Product testing reveals a product prototype’s strengths and weaknesses or determines whether a finished product performs better than competing brands or according to expectations. Brand-name evaluation studies investigate whether a name is appropriate for a product. Package testing assesses size, colour, shape, ease of use and other attributes of a package. Product research encompasses all applications of marketing research that seek to develop product attributes that will add value for consumers.

Before Cheetos became the first major brand of American snack food to be made and marketed in China, product taste tests revealed that traditional cheese-flavoured corn puffs did not appeal to Chinese consumers. So the company conducted consumer research with 600 different flavours to learn which flavours would be most appealing. Among the flavours Chinese consumers tested and disliked were ranch dressing, nacho, Italian pizza, Hawaiian barbecue, peanut satay, North Sea crab, chilli prawn, coconut milk curry, smoked octopus, caramel and cuttlefish. However, research showed that consumers did like some flavours. So, when Cheetos were introduced in China, they came in two flavours: savoury American cream and zesty Japanese steak.9

WHAT WENT RIGHT? LOW CARB BEERS, ANYONE?10

Since 2004, the demand for low carb beers in Australia has increased sevenfold. Low carb beers are made with less residual sugar and fewer carbohydrates. But do low carb beers help you lose weight? Unfortunately, no. Research by VicHealth suggests that 87 per cent of consumers believe drinking low carb beers helps avoid weight gain, but in fact the calories are in the alcohol content. In the same survey, 37 per cent of respondents also believed that low carb beer had fewer kilojoules than normal beer, and 71 per cent thought it was a healthier option than other beers.

Research by cosmetic companies in Asia such as L’Oréal and Procter & Gamble revealed that Western tanning products were of no value to Asians. While a suntanned face in Western culture is associated with a hedonistic indulgence for the beach or the snow, Asian women generally avoid the sun as much as possible and desire a whiter complexion, which is seen as more pure and beautiful. For Indonesians, a dark complexion suggests that a person is probably a peasant, construction worker or fisherman exposed to the sun on a daily workaday basis. Consequently, L’Oréal and Procter & Gamble have developed whitening creams for Asian markets, which have no demand in the West.11

PRICING RESEARCH

Most organisations conduct pricing research. A competitive pricing study is a typical marketing research project of this type. However, research designed to learn the ideal price for a product, or to determine if consumers will pay a price high enough to cover cost, is not uncommon. For example, a Bausch & Lomb survey of 5000 contact lens wearers showed that more than 60 per cent of them would be interested in one-day disposable lenses if the price came down to about a dollar a day. This led to the introduction of the SofLens brand, which proclaimed a significant price advantage over competitors. Pricing research may also investigate when discounts or coupons should be offered, explore whether there are critical product attributes that determine how consumers perceive value, or determine if a product category (such as soft drinks) has price gaps among national brands, regional brands and private labels.12


Research into the pricing of airline tickets between Perth and Singapore found that consumers expect to pay less for airline tickets purchased online. The brand name of the airline providing these tickets online was also an important signal of quality.13 A well-known airline brand was not only associated with higher quality, but also with higher expected prices and a greater likelihood of purchase. Well-known brands, such as Virgin Australia, were also found to be more credible to consumers when offering discounts online or offline than less well-known budget brands such as Tiger Airways.

Research by INSEAD14 shows that price in South-East Asia (SEA) is often associated with higher quality, although the degree does vary. High prices in SEA are often important as a means of social acceptance in gift giving and entertainment. Another reason for paying a high price in Asia concerns status; for example, being the first to have the latest, more expensive technology. In China, there is a saying ‘xian wei ju’ (‘whoever comes first becomes the master’): thus paying more for new technology is associated with greater prominence in society.

Research may answer many questions about price. Is there a need for seasonal or quantity discounts? Are coupons more effective than price reductions? Is a brand price elastic or price inelastic? How much of a price difference is optimal to differentiate items in the product line?

REAL WORLD SNAPSHOT

HOW LEGO BECAME HOT AGAIN15

The success of The Lego Movie, with Lego sales concurrently at an all-time high, suggests that the company understands its consumers well; however, this was not always the case. Just a decade ago, sales of Lego were at an all-time low, and the company was losing $1 million a day and posting record losses. Lego then engaged a market research firm specialising in anthropological research to observe the roots of play among its customers (parents and children) in their own homes. Teams of anthropological researchers observed play in homes in Los Angeles, New York, Chicago, Munich and Hamburg. They made photo diaries, and interviewed parents and their children. The research revealed that play habits had not changed; children just wanted the freedom to experiment on their own with Lego and build something masterful, or as Lego’s Paal Smith-Meyer puts it, ‘Lego takes time’.

DISTRIBUTION RESEARCH

Golden Books traditionally distributed its small hardcover children’s books with golden spines to upmarket book retailers. When it researched where its customers would prefer to purchase Golden Books, the company learned that mass merchandisers, supermarkets and pharmacies would be just as popular distribution channels as the upmarket stores. Coles, Sogo and Toys“R”Us are among the many major retailers that have researched home shopping services via the Internet. New interactive media and home delivery as a means of distribution have the potential to revolutionise channel-of-distribution systems, and millions of dollars are being spent to research this alternative.

Although most distribution research does not have the dramatic impact of the research on Internet shopping systems, research focused on developing and improving the efficiency of channels of distribution is important to many organisations. Research has shown that in China, many consumers ‘window shop’ in more expensive joint-venture department stores in order to educate themselves about the latest products and fashions. Later, they buy these same products in the more hectic state-run department stores, which have poor service but cheaper prices.16 A typical study in the distribution area may be conducted to select retail sites or warehouse locations. A survey of retailers or wholesalers may be conducted because the actions of one channel member can greatly affect the performance of other channel members. Distribution research often is needed to gain knowledge about retailers’ and wholesalers’ operations, and to learn their reactions to a manufacturer’s marketing policies.


PROMOTION RESEARCH

Research that investigates the effectiveness of premiums, coupons, sampling deals and other sales promotions is classified as promotion research. Promotion research includes buyer motivation studies to generate ideas for copy development, media research and studies of advertising effectiveness. However, the most time, money and effort are spent on advertising research.

Marketing research has found that different appeals are effective in mainland China and Hong Kong. Chinese consumers prefer advertisements that emphasise concrete, functional and practical product benefits, rather than symbolic themes. The ‘more sophisticated’ Hong Kong Chinese regard such advertising themes as dull and are more interested in emotional advertisements that are entertaining and communicate a sense of personal relevance.17 Based on such research, marketers therefore need to adopt separate media campaigns for regions within China.

Media research helps an advertiser decide whether television, newspapers, magazines or other media are best suited to convey the advertiser’s message. Choices among media alternatives may be based on research that shows how many people in the target audience each advertising vehicle can reach. Although the population of New Zealand is small, at around 4.5 million, there are a number of local newspapers that are widely read by many local communities. These are not owned by the two dominant publishing groups. Consumers believe they reinforce a sense of belonging and are likely to purchase products and services from companies that advertise in them.

Research in Japan suggests that most Japanese favour the 15-second advertising slot for image-based and peripheral messages. Long commute times of around 70 minutes a day to the office mean they have ample time to read newspapers and magazines. Therefore, marketers in Japan, based on this research, run two campaigns: an awareness and corporate branding campaign on television, and a more detailed and informative campaign in print.

THE INTEGRATED MARKETING MIX

The individual elements of the marketing mix do not work independently. Hence, many research studies investigate various combinations of marketing ingredients to gather information to suggest the best possible marketing program.

Analysing marketing performance

performance-monitoring research: Research that regularly provides feedback for evaluation and control of marketing activity.

After a marketing strategy has been implemented, marketing research may serve to inform managers whether planned activities were properly executed and are accomplishing what they were expected to achieve. In other words, marketing research may be conducted to obtain feedback for evaluation and control of marketing programs. This aspect of marketing research is especially important for successful total quality management.

Performance-monitoring research refers to research that regularly, sometimes routinely, provides feedback for evaluation and control of marketing activity. For example, most firms continuously monitor wholesale and retail activity to ensure early detection of sales declines and other anomalies. In the grocery and pharmaceutical industries, sales research may use Universal Product Codes (UPC) on packages read by electronic cash registers and computerised checkout counters to provide valuable market-share information to store and brand managers interested in the retail sales volumes of their products. Market-share analysis and sales analysis are the most common forms of performance-monitoring research. Almost every organisation compares its current sales with previous sales and with competitors’ sales. However, analysing marketing performance is not limited to the investigation of sales figures.


Other forms of performance-monitoring research include the ‘Tell Coles’ customer feedback surveys and research for the Victorian Child Protection Agency, which used exit surveys of former staff to identify potential problems within the organisation. In many universities today, student surveys on unit evaluation and teaching form an important part of the performance monitoring of teaching. Increasingly, the evaluation of health programs is also conducted by market research companies: another type of performance monitoring.

When analysis of marketing performance indicates that things are not going as planned, marketing research may be required to explain why something went wrong. Detailed information about specific mistakes or failures is frequently sought. If a general problem area is identified, breaking down industry sales volume and a firm’s sales volume into different geographical areas may explain specific problems. Exploring problems in greater depth may indicate which managerial judgements were erroneous.
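To make the idea of market-share and sales analysis concrete, the sketch below computes brand shares and period-on-period sales changes from a small set of invented scan figures. It is an illustrative example only; the brands and numbers are hypothetical and do not come from the text.

```python
# Market-share and sales-trend calculation of the kind that underlies
# performance-monitoring research. Brands and dollar figures are invented.

scan_data = {
    # brand: (sales this quarter, sales last quarter), in dollars
    "Our brand":    (1_250_000, 1_400_000),
    "Competitor A": (2_100_000, 1_950_000),
    "Competitor B": (  900_000,   850_000),
}

total_current = sum(current for current, _ in scan_data.values())

for brand, (current, previous) in scan_data.items():
    share = current / total_current * 100
    change = (current - previous) / previous * 100
    flag = "  <-- investigate" if change < 0 else ""
    print(f"{brand}: share {share:.1f}%, sales change {change:+.1f}%{flag}")
```

A real performance-monitoring system would of course draw on continuous scan data rather than two hand-entered periods, but the underlying comparisons are of this simple kind.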


WHEN IS MARKETING RESEARCH NEEDED?

A marketing manager confronted with two or more alternative courses of action faces the initial decision of whether or not to conduct marketing research. The determination of the need for marketing research centres on:
1 time constraints
2 the availability of data
3 the nature of the decision to be made
4 the value of the research information in relation to costs.

Time constraints

Systematic research takes time. In many instances management believes that a decision must be made immediately, allowing no time for research. Decisions sometimes are made without adequate information or a thorough understanding of market situations. Although making decisions without researching a situation is not ideal, sometimes the urgency of a situation precludes the use of research.


PROCTER & GAMBLE: RESEARCH SHOWS THE DIFFERENCES IN PACKAGING IN JAPAN

The type of packaging Western products need in Asian markets is very important. A large proportion of washing powder in Indonesia, for example, is sold in small sachets rather than large cardboard boxes, due to the limited purchasing power in a country where shopping is done on a daily basis. When pioneering disposable nappies in Japan, Procter & Gamble faced stiff competition from Japanese companies that entered the market with smaller boxes that contained the same number of nappies. The Japanese product was better suited to the market because of the general lack of space in Japan, both geographically and in the home. Western companies therefore need to adapt product sizes in Asian markets to take into account purchasing power and patterns and available space for storage at home. This again means that market research is of crucial importance in the Asia Pacific region.


Availability of data

Often managers already possess enough information to make sound decisions without marketing research. When they lack adequate information, however, research must be considered. Managers must ask themselves if the research will provide the information needed to answer the basic questions about a decision. Furthermore, if a potential source of data exists, managers will want to know how much it will cost to obtain the data. If the data cannot be obtained, research cannot be conducted. For example, many African nations have never conducted a population census. Organisations engaged in international business often find that data about business activity or population characteristics that are readily available in Australia and New Zealand are non-existent or sparse in developing countries. Imagine the problems facing marketing researchers who wish to investigate market potential in places such as East Timor and Rwanda.

Nature of the decision

The value of marketing research will depend on the nature of the managerial decision to be made. A routine tactical decision that does not require a substantial investment may not seem to warrant a substantial expenditure for marketing research. For example, a computer company must update its operator’s instruction manual when it makes minor product modifications. The research cost of determining the proper wording to use in the updated manual is likely to be too high for such a minor decision. The nature of the decision is not totally independent of the next issue to be considered: the benefits versus the costs of the research. In general, however, the more strategically or tactically important the decision, the more likely it is that research will be conducted.

Benefits versus costs

There are both costs and benefits to conducting marketing research. Earlier we discussed some of the managerial benefits of marketing research. Of course, conducting research to obtain these benefits requires an expenditure of money. In any decision-making situation managers must identify alternative courses of action and then weigh the value of each alternative against its cost. Marketing research can be thought of as an investment alternative. When deciding whether to make a decision without research or to postpone the decision in order to conduct research, managers should ask three questions:
1 Will the payoff or rate of return be worth the investment?
2 Will the information gained by marketing research improve the quality of the marketing decision enough to warrant the expenditure?
3 Is the proposed research expenditure the best use of the available funds?

For example, the Mazda MX-5, or Miata, sports car was developed without detailed marketing research. The Japanese product development team wanted to present management with a fully working model that met engineering and design innovations, rather than being constrained by more conservative market research findings. The team was also worried that upper management may not have accepted on paper what was, for the company, a radical design. The MX-5, unlike other Mazdas at the time, had rear-wheel drive, which meant it would be a risky proposition since production costs would be higher. However, since its launch in the late 1980s the car has been an unqualified success. Nowadays though, Mazda, in the development of new MX-5 models, conducts more detailed marketing research, especially with respect to body shape and design.


Nevertheless, without the luxury of hindsight, managers made a reasonable decision not to conduct research. They analysed the cost of the information (that is, the cost of test marketing) relative to the potential benefits of the information. The situation is different now as Mazda may have more to lose in terms of sales of a popular brand than was the case in the past, when the MX-5 was originally designed as a car for the speciality market. Exhibit 1.1 outlines the criteria for determining when to conduct marketing research.

EXHIBIT 1.1 DETERMINING WHEN TO CONDUCT MARKETING RESEARCH

Time constraints: Is sufficient time available before a managerial decision must be made?
Availability of data: Is the information already on hand inadequate for making the decision?
Nature of the decision: Is the decision of considerable strategic or tactical importance?
Benefits versus costs: Does the value of the research information exceed the cost of conducting research?

If the answer to every question is ‘yes’, conduct marketing research; a ‘no’ answer at any point indicates that marketing research should not be conducted.
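The decision logic summarised in Exhibit 1.1 can also be read as a short checklist. The Python snippet below is only an illustrative sketch of that reading, not part of the original exhibit; the function name, parameters and example figures are hypothetical.

```python
# A sketch of the Exhibit 1.1 decision logic as a simple checklist.
# The function name, parameters and example figures are hypothetical.

def should_conduct_research(time_available: bool,
                            existing_data_inadequate: bool,
                            decision_is_important: bool,
                            expected_value_of_information: float,
                            research_cost: float) -> bool:
    """Return True only if every Exhibit 1.1 question is answered 'yes'."""
    return (time_available
            and existing_data_inadequate
            and decision_is_important
            and expected_value_of_information > research_cost)

# Example: time and a data gap exist, the decision matters, and the research
# information is expected to be worth more than the study would cost.
print(should_conduct_research(True, True, True,
                              expected_value_of_information=80_000,
                              research_cost=25_000))   # prints True
```

In practice the value of the information is itself an estimate, so the final question is a managerial judgement rather than a mechanical comparison.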

BUSINESS TRENDS IN MARKETING RESEARCH

Marketing research, like all business activity, has been strongly influenced by two major trends in business: increased globalisation, and rapid growth of the Internet and other information technologies. These trends will continue, and likely accelerate, as the 21st century progresses. We consider their significance for marketing research here.

Global marketing research

Marketing research has become increasingly global. Some companies have extensive international marketing research operations. Upjohn conducts marketing research in 160 different countries. Nielsen, known for its television ratings, is the world’s largest marketing research company. Two-thirds of its business comes from outside the USA. Companies that conduct business in foreign countries must understand the nature of those particular markets and judge whether they require customised marketing strategies.

For example, although the nations of the European Union share a single formal market, marketing research shows that Europeans do not share identical tastes for many consumer products. Marketing researchers have found no such thing as a typical European consumer: language, religion, climate and centuries of tradition divide the nations of the European Union. A British firm that advised companies on colour preferences found inexplicable differences in Europeans’ preferences in medicines. The French prefer to pop purple pills, but the British and Dutch favour white ones. Consumers in all three countries dislike bright red capsules, which are big sellers in the USA. This example illustrates that companies that do business in Europe must judge whether they need to adapt to local customs and buying habits.18

Although the nature of marketing research can differ around the globe, the need for marketing research is universal. Throughout this book, we discuss the practical problems involved in conducting marketing research in Asia, Europe, the Middle East and elsewhere.


Growth of the Internet and social media

The Internet is transforming society. Time is collapsing. Distance is no longer an obstacle. Crossing oceans requires only a mouse click. People are connected 24 hours a day, seven days a week. Responsiveness and service have new meaning. Many people believe that the Internet is the most important communications medium since television. It has certainly changed the way millions of people think about getting and distributing information. And, of course, obtaining and communicating information is the essence of marketing research.

Consider that a researcher seeking facts and figures about a marketing issue may find more extensive information on the Internet more quickly than by visiting a library. Another researcher who is questioning people from around the globe may do so almost instantaneously with an Internet survey and get responses 24 hours a day, seven days a week. Visitors to an organisation’s website may find that online questions are personalised, because the site incorporates information technology that remembers the particular ‘click stream’ of the web pages visited. These few examples illustrate how the Internet and other information technologies are dramatically changing the face of marketing research.

Another important aspect is the growth of social media. In the last quarter of 2015, for example, there were more than 1.59 billion active users of Facebook.19 Companies are increasingly concerned about viral communication, online blogging and word of mouth. Consumers may also communicate directly to companies and organisations through brand Facebook pages, or post to their friends the positive and negative experiences they have had with that brand. This means that many organisations may wish to monitor online communication, but may lack the resources to do so. Increasingly there is a new area of market research – that of online ethnography, or netnography – where consumer posts and comments are monitored and interpreted on a daily basis. There is also the requirement that this information be presented quickly to clients, so the impact of the growth of the Internet and social media is that it is increasing the pace at which market research is done.

In the 21st century, marketing research on the Internet is moving out of the introductory stage of its product life cycle into the growth stage. The rest of this book reflects this change. Throughout the book, we use the latest information technologies and their application to marketing research. Marketing research via the Internet has come of age.
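As a toy illustration of the kind of routine monitoring that netnography and social listening automate, the sketch below counts crude positive and negative cues in a handful of invented posts. The posts, brand name and word lists are hypothetical, and real studies rely on far richer text analytics and human interpretation.

```python
# A toy sketch of social listening: crude tone-flagging of brand mentions.
# Posts, the brand name and the word lists are invented for illustration.

posts = [
    "Love the new mobile app from BrandX - so easy to use!",
    "BrandX support kept me on hold for an hour. Terrible.",
    "Anyone tried BrandX's rewards program?",
]

positive_words = {"love", "easy", "great"}
negative_words = {"terrible", "hold", "worst"}

for post in posts:
    words = set(post.lower().replace("!", "").replace(".", "").split())
    tone = ("positive" if words & positive_words
            else "negative" if words & negative_words
            else "neutral")
    print(f"{tone:8s} | {post}")
```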

STAGES IN THE RESEARCH PROCESS

As previously noted, marketing research can take many forms, but systematic enquiry is a common thread. Systematic enquiry requires careful planning of an orderly investigation. Marketing research, like other forms of scientific enquiry, involves a sequence of highly interrelated activities. The stages of the research process overlap continuously, and it is somewhat of an oversimplification to state that every research project has exactly the same ordered sequence of activities. Nevertheless, marketing research often follows a general pattern. The stages are:
1 defining the problem
2 planning a research design
3 planning a sample
4 collecting the data
5 analysing the data
6 formulating the conclusions and preparing the report.


For example, the objectives of the research outlined in the problem definition will have an impact on the selection of the sample and how the data will be collected. The decision concerning who will be sampled will affect the wording of questionnaire items. If the research concentrates on respondents with low educational levels, the wording of the questionnaire will be simpler than it would be if the respondents were university graduates. Exhibit 1.2 portrays these six stages as a cyclical, or circular-flow, process. The circular-flow concept is used because the conclusions from research studies usually generate new ideas and problems that need to be investigated.

EXHIBIT 1.2 FLOWCHART OF THE MARKETING RESEARCH PROCESS

Problem discovery and definition: problem discovery; selection of an exploratory research technique (secondary (historical) data, experience survey, pilot study, case study); problem definition (statement of research objectives).
Planning the research design: selection of the basic research method (survey – interview or questionnaire; experiment – laboratory or field; secondary data study; observation).
Sampling: selection of the sample design (probability or nonprobability sampling).
Data gathering: collection of data (fieldwork).
Data processing and analysis: editing and coding data; data processing and analysis; interpretation of findings.
Drawing conclusions and preparing the report: report.
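For readers who find it helpful to see the flowchart condensed, the sketch below restates the six stages of Exhibit 1.2 as an ordered data structure listing the main alternatives at each stage. The Python representation is an illustrative device added here, not part of the exhibit itself.

```python
# The six stages of Exhibit 1.2 as an ordered structure, pairing each stage
# with the main alternatives named in the exhibit. Illustrative sketch only.

RESEARCH_PROCESS = [
    ("Problem discovery and definition",
     ["secondary (historical) data", "experience survey", "pilot study", "case study"]),
    ("Planning the research design",
     ["survey", "experiment", "secondary data study", "observation"]),
    ("Sampling",
     ["probability sampling", "nonprobability sampling"]),
    ("Data gathering",
     ["collection of data (fieldwork)"]),
    ("Data processing and analysis",
     ["editing and coding data", "data processing and analysis", "interpretation of findings"]),
    ("Drawing conclusions and preparing the report",
     ["report"]),
]

for number, (stage, alternatives) in enumerate(RESEARCH_PROCESS, start=1):
    print(f"Stage {number}: {stage} -> {', '.join(alternatives)}")
```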

Note that the organisation of this textbook follows each stage in the research process, with Chapter 2, which discusses defining the problem, followed by a series of chapters on research designs such as exploratory/qualitative research (Chapter 3), secondary research (Chapter 4), survey research (Chapter 5), observational research (Chapter 6) and experiments (Chapter 7). This is then followed by a discussion of measurement (Chapter 8) and questionnaire design (Chapter 9). Chapters 3 to 9 thus make up the section of the text that addresses planning a research design. The next step, collecting a sample, is addressed by Chapter 10. Chapter 11 deals with the next step in the research process: fieldwork, editing and coding, or collecting the data. Chapters 12, 13, 14 and 15 discuss analysing data, with an overview of basic to advanced statistical analysis. Finally, Chapter 16 discusses the presentation of research reports, including deriving important conclusions for managers.


Note that there is no one overall approach (methodology) – in terms of research design, means of collecting data, analysing data and presenting findings – that is perfect. In each stage of the research process, decisions and trade-offs must be made within budget and time constraints. What can be said is that some research designs, samples, means of collecting data, analyses and market research reports are better than others. The choices a researcher must make throughout the research process are discussed briefly in the next section and in greater detail throughout this textbook.

Alternatives in the research process

The researcher must choose among a number of alternatives during each stage of the research process. The research process can be compared to a map. On a map some paths are better charted than others; some are difficult to travel and some are more interesting and beautiful than others. Rewarding experiences may be gained during the journey. It is important to remember that there is no single right or best path for all journeys. The road one takes depends on where one wants to go and the resources (money, time, labour and so on) available for the trip. The map analogy is useful for the marketing researcher because at each stage of the research process there are several paths to follow. In some instances, the quickest path will lead to appropriate research because of time constraints. In other circumstances, when money and human resources are plentiful, the appropriate path may be quite different. Exploring the various paths of marketing research decisions is the primary purpose of this book.

The following sections briefly describe the six stages of the research process. (Each stage is discussed in greater depth in later chapters.) Exhibit 1.2 shows the decisions that researchers must make in each stage. This discussion of the research process begins with problem discovery and definition, because most research projects are initiated to remedy managers’ uncertainty about some aspect of the firm’s marketing program.

Discovering and defining the problem

Exhibit 1.2 shows that the research process begins with problem discovery. Identifying the problem is the first step towards its solution. In general usage, the word ‘problem’ suggests that something has gone wrong. Actually, the research task may be to clarify a problem, define an opportunity, or monitor and evaluate current operations. The concept of problem discovery and definition must encompass a broader context that includes analysis of opportunities. It should be noted that the initial stage is problem discovery rather than definition. The researcher may not have a clear-cut statement of the problem at the outset of the research process; often, only symptoms of the problem are apparent at that point. Sales may be declining, but management may not know the exact nature of the problem. Thus, the problem statement often is made only in general terms; what is to be investigated is not yet specifically identified.

DEFINING THE PROBLEM

In marketing research, the adage ‘a problem well defined is a problem half solved’ is worth remembering. This adage emphasises that an orderly definition of the research problem lends a sense of direction to the investigation. A decision-maker must recognise the nature of the problem or opportunity, identify how much information is available and determine what information is needed. The major aspects of defining a problem in market research are the degrees of certainty and uncertainty, and the level of ambiguity.

Certainty

Complete certainty means that all the information the decision-maker needs is available; the decision-maker knows the exact nature of the marketing problem or opportunity.


For example, an advertising agency may need to know the demographic characteristics of subscribers to magazines in which it may place a client’s advertisements. The agency knows exactly what information it needs and where to find the information. If a manager is completely certain about both the problem or opportunity and future outcomes, then research may not be needed at all. However, perfect certainty, especially about the future, is rare.

Uncertainty

Uncertainty means that the manager grasps the general nature of desired objectives, but the information about alternatives is incomplete. Predictions about forces that shape future events are educated guesses. Under conditions of uncertainty, effective managers recognise that spending additional time to gather information to clarify the nature of a decision can be valuable.

Ambiguity

Ambiguity means that the nature of the problem to be solved is unclear. Objectives are vague and decision alternatives are difficult to define. This is by far the most difficult decision situation. Marketing managers face a variety of problems and decisions. Complete certainty and predictable future outcomes may make marketing research a waste of time. However, under conditions of uncertainty or ambiguity, marketing research becomes more attractive to decision-makers. The more ambiguous a situation is, the more likely it is that additional time must be spent on marketing research.

problem definition stage: The stage in which management seeks to identify a clear-cut statement of the problem or opportunity.

Careful attention to the problem definition stage allows the researcher to set the proper research objectives. If the purpose of the research is clear, the chances of collecting necessary and relevant information, and not collecting surplus information, will be much greater. Albert Einstein noted that ‘the formulation of a problem is often more essential than its solution’. This is good advice for marketing managers. Too often they concentrate on finding the right answer rather than asking the right question. Many managers do not realise that defining a problem may be more difficult than solving it. In marketing research, if the data are collected before the nature of the marketing problem is carefully thought out, they probably will not help solve the problem. To be efficient, marketing research must have clear objectives and definite designs. Unfortunately, little or no planning goes into the formulation of many research problems. One example is the low uptake of electric cars in many countries. Research in the United Kingdom suggests motorists are hesitant to switch to electric cars because of concerns about the lack of charging points and the perceived cost of replacing the battery. For car manufacturers, the concern is over the lack of a standard interchangeable battery across models.20

It should be emphasised that the word ‘problem’ refers to the managerial problem (which may be a lack of knowledge about consumers or advertising effectiveness) and the information needed to help solve the problem. Defining the problem must precede determination of the purpose of the research. Frequently the marketing researcher will not be involved until line management has discovered that some information about a particular aspect of the marketing mix is needed. Even at this point the exact nature of the problem may be poorly defined. Once a problem area has been discovered, the marketing researcher can begin the process of precisely defining it. Although the problem definition stage of the research process probably is the most important one, it frequently is a neglected area of marketing research. Too many researchers forget that the best place to begin a research project is at the end. Knowing what is to be accomplished determines the research process. An error or omission in problem definition is likely to be a costly mistake that cannot be corrected in later stages of the process. Chapter 2 discusses problem definition in greater detail.

Marketing research provides information to reduce uncertainty. It helps focus decision-making. Sometimes marketing researchers know exactly what their marketing problems are and design careful studies to test specific hypotheses. For example, a soft drink company introducing a new clear cola might want to know whether a gold or a silver label would make the packaging more effective. This problem is fully defined and an experiment may be designed to answer the marketing question with little preliminary investigation. In more ambiguous circumstances, management may be totally unaware of a marketing problem. For example, McDonald’s may notice that Mo’s Burgers, a competitor in the Japanese market, has introduced Mo’s Roast Katsu Burger, a roast pork cutlet drenched in traditional Japanese katsu sauce and topped with shredded cabbage. The managers may not understand much about Japanese consumers’ feelings about this menu item. Some exploratory research may be necessary to gain insights into the nature of such a problem.

To understand the variety of research activity, it is beneficial to categorise types of marketing research. Marketing research can be classified on the basis of either technique or function. Experiments, surveys and observational studies are just a few common research techniques. Classifying research by its purpose or function shows how the nature of the marketing problem influences the choice of methods. The nature of the problem will determine whether the research is (1) exploratory, (2) descriptive or (3) causal.

SELECTION OF THE BASIC RESEARCH METHOD

Here again, the researcher must make a decision. Table 1.1 shows the four basic design techniques for descriptive and causal research: surveys, experiments, secondary data and observation. The objectives of the study, the available data sources, the urgency of the decision and the cost of obtaining the data will determine which method should be chosen. The managerial aspects of selecting the research design will be considered later.

TABLE 1.1 RELATIONSHIP OF UNCERTAINTY TO TYPES OF MARKETING RESEARCH

Exploratory research (ambiguous problem) – possible situations:
‘Our sales are declining and we don’t know why.’
‘Would people be interested in our new product idea?’

Descriptive research (partially defined problem) – possible situations:
‘What kind of people are buying our product? Who buys our competitors’ products?’
‘What features do buyers prefer in our product?’

Causal research (problem clearly defined) – possible situations:
‘Will buyers purchase more of our product in a new package?’
‘Which of two advertising campaigns is more effective?’

Note: The degree of uncertainty of the research problem determines the research methodology.

Uncertainty influences the type of research

The uncertainty of the research problem is related to the type of research project. Table 1.1 illustrates that exploratory research is conducted during the early stages of decision-making, when the decision situation is ambiguous and management is very uncertain about the nature of the problem. When management is aware of the problem but lacks some knowledge, descriptive research is usually conducted. Causal research requires sharply defined problems.
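The relationship in Table 1.1 can be summarised as a simple lookup. The snippet below is an illustrative sketch only; the function and its category labels are hypothetical conveniences, and real problems rarely fall so neatly into one category.

```python
# A lookup that mirrors Table 1.1: how well the problem is defined suggests
# the broad type of research. Hypothetical helper for illustration only.

def suggested_research_type(problem_status: str) -> str:
    mapping = {
        "ambiguous": "exploratory research",
        "partially defined": "descriptive research",
        "clearly defined": "causal research",
    }
    return mapping.get(problem_status, "spend more time defining the problem")

print(suggested_research_type("ambiguous"))        # exploratory research
print(suggested_research_type("clearly defined"))  # causal research
```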

STATEMENT OF RESEARCH OBJECTIVES

A researcher must initially decide precisely what to research. After identifying and clarifying the problem, with or without exploratory research, the researcher should make a formal statement of the problem and the research objectives. This statement delineates the type of information that should be collected and provides a framework for the scope of the study. A typical research objective might seek to answer a question such as: ‘To what extent did the new pricing program achieve its objectives?’ In this sense the statement of the problem is a research question.

The best expression of a research objective is a well-formed, testable research hypothesis. A hypothesis is a statement that can be refuted or supported by empirical data. For example, an exploratory study might lead to the hypothesis that a market share decline recognised by management is occurring predominantly among households in which the head of the household is 45 to 65 years old with an income of $65 000 per year or less. Another hypothesis might be that concentrating advertising efforts in monthly waves (rather than conducting continuous advertising) will cause an increase in sales and profits. Once the hypothesis has been developed, the researcher is ready to select a research design.
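Although this text does not assume any particular software, a short worked example can make the idea of a testable hypothesis concrete. The Python sketch below is purely illustrative: the survey counts are invented, not taken from any study described in this chapter, and the two-proportion comparison is just one of several ways such a hypothesis could be examined.

```python
import math

# Hypothetical survey counts (illustrative only).
# 'Lapsed' = households that stopped buying the brand in the past year.
lapsed_target, n_target = 86, 400    # head of household 45-65, income <= $65 000
lapsed_other, n_other = 52, 600      # all other households

p_target = lapsed_target / n_target
p_other = lapsed_other / n_other

# Two-proportion z-test: is the lapse rate higher in the target segment?
p_pool = (lapsed_target + lapsed_other) / (n_target + n_other)
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_target + 1 / n_other))
z = (p_target - p_other) / se
p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value

print(f"Lapse rate, 45-65 / lower income: {p_target:.1%}")
print(f"Lapse rate, other households:     {p_other:.1%}")
print(f"z = {z:.2f}, p = {p_value:.4f}")
# A small p-value supports (but does not prove) the hypothesis that the
# market share decline is concentrated in the target segment.
```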

Planning the research design
After the researcher has formulated the research problem, the research design must be developed as part of the research design stage. A research design is a master plan that specifies the methods and procedures for collecting and analysing the needed information; it is a framework for the research plan of action. The objectives of the study determined during the early stages of the research are included in the design to ensure that the information collected is appropriate for solving the problem. The researcher also must determine the sources of information, the design technique (survey or experiment, for example), the sampling methodology, and the schedule and cost of the research.

research design A master plan that specifies the methods and procedures for collecting and analysing needed information.

EXPLORATORY RESEARCH
Exploratory research usually is conducted during the initial stage of the research process. This is generally qualitative research. The preliminary activities undertaken to refine the problem into researchable form need not be formal or precise. Exploratory research is not intended to provide conclusive evidence from which to determine a particular course of action. Mostly exploratory research is conducted with the expectation that subsequent research will be required to provide such conclusive evidence. Rushing into detailed surveys before less expensive and more readily available sources of information have been exhausted can lead to serious mistakes. Marketing researchers must be aware of potential problems when deciding exactly what research design will best solve their research problems.

research design stage The stage in which the researcher determines a framework for the research plan of action by selecting a basic research method.
exploratory research Initial research conducted to clarify and define a problem.

REAL WORLD SNAPSHOT

HOW REALISTIC ARE RESEARCH RESULTS?

Many managers view consumer research as a necessary precursor to product introduction. Unfortunately, innovative products that lack much in common with existing products often prove this attitude to be wrong. Hairstyling mousse is now a massive hit, yet its initial US market tests flopped. People said it was ‘goopy and gunky’, and that they did not like its feel when it ‘mooshed’ through their hair. Similarly, when the telephone answering machine was consumer tested, it faced an almost universally negative reaction, since most individuals felt that using a mechanical device to answer a phone was rude and disrespectful. Today, of course, many people regard their answering machines as indispensable and would dread scheduling daily activities without them. In the same vein, the computer mouse flunked its initial testing. Surveys indicated that potential customers found it awkward and unnecessary.

Surveys about new food products face terrible problems. A person’s desire for food is powerfully influenced by the ambience of the location, dining companions and what foods were eaten recently, all of which confound and confuse the results of survey research. Even more erratic results come from studies of children’s food, such as a new cereal or snack. Kids’ responses are strongly swayed by how well they like the people doing the test and the playthings available. Worse, kids quickly change their minds, and in a taste test of several foods a child can judge one food the best but an hour later proclaim the same food ‘icky’.

For example, suppose a Chinese fast-food restaurant chain is considering expanding its hours and product line with a breakfast menu. Exploratory research with a small number of current customers might find a strong negative reaction to eating a spicy vegetable breakfast at a Chinese fast-food outlet. Thus, exploratory research might help crystallise a problem and identify information needed for future research. The purpose of the exploratory research process is to progressively narrow the scope of the research topic and transform ambiguous problems into well-defined ones that incorporate specific research objectives. By investigating any existing studies on the subject, talking with knowledgeable individuals and informally investigating the situation, the researcher can progressively sharpen the concepts. After such exploration, the researcher should know exactly which data to collect during the formal phases of the project and how to conduct the project. Exhibit 1.2 indicates that managers and researchers must decide whether to use one or more exploratory research techniques. As Exhibit 1.2 indicates, this stage is optional. The marketing researcher can employ techniques from four basic categories to obtain insights and gain a clearer idea of the problem: secondary data analysis, pilot studies, case studies and experience surveys. Here we will briefly discuss secondary data and focus group interviews, the most popular type of pilot study.

Secondary data

Secondary data, or historical data, are data previously collected and assembled for some project other than the one at hand. (Primary data are data gathered and assembled specifically for the project at hand.) Secondary data often can be found inside the company, at a public or university library, or on the Internet. In addition, some firms specialise in providing various types of information, such as economic forecasts, that are useful to many organisations. The researcher who gathers data from the Australian Bureau of Statistics (see http://www.abs.gov.au) or from the Economist Intelligence Unit (see http://www.eiu.com) is using secondary sources.

A literature review – a survey of published articles and books that discusses theories and past empirical studies about a topic – is an almost universal first step in academic research projects. A literature survey also is common in many applied research studies. Students who have written term papers should be familiar with using computer search systems, indexes to published literature and other library sources to compile bibliographies of past research.


Suppose, for example, that a bank is interested in determining the best site for additional automated teller machines. A logical first step would be to investigate the factors that bankers in other parts of the country consider important. By reading articles in banking journals, management might quickly discover that the best locations are inside supermarkets located in residential areas where people are young, highly educated and earning higher-than-average incomes. These data might lead the bank to investigate census information to determine where in the city such people live. Reviewing and building on the work already compiled by others is an economical starting point for most research. Secondary data can almost always be gathered more quickly and more inexpensively than primary data. However, secondary data may be outdated and may not exactly meet researchers’ needs because they were collected for another purpose. Nevertheless, secondary sources often prove to be very valuable in exploratory research. Investigating such sources has saved many a researcher from ‘reinventing the wheel’ in primary data collection.

Pilot studies
The term pilot study covers a number of diverse research techniques. Pilot studies collect data from the ultimate consumers or the actual subjects of the research project to serve as a guide for a larger study. When the term ‘pilot study’ is used in the context of exploratory research, it refers to a study whose data collection methods are informal and whose findings may lack precision. For instance, a city association concerned with revitalisation of the central business district conducted a very flexible survey using open-ended questions. The interviewers were given considerable latitude to identify changes needed in the shopping area. The results of this survey suggested possible topics for formal investigation.
The focus group interview is a more elaborate kind of exploratory pilot study that has become increasingly popular in recent years. The focus group session brings together six to 10 people in a loosely structured format; the technique is based on the assumption that individuals are more willing to share their ideas when they are able to hear the ideas of others. Information obtained in these studies is qualitative and serves to guide subsequent quantitative study. For example, the National Drugs campaign, which aims to reduce the consumption of illicit drugs in Australia (see http://www.nationaldrugstrategy.gov.au), used focus groups to help identify attitudes and motivations of youth towards the use of illicit drugs. Fifty-seven focus groups were used, each consisting of three friends who had similar attitudes and behaviours towards drugs. These focus groups captured the variety of users of drugs, from light users of tobacco and alcohol, to cannabis users and through to intravenous users. The research suggested there were six distinct patterns when it came to the use of illegal drugs by youth:
1 considered rejectors
2 cocooned rejectors
3 careful curious
4 risk controllers
5 thrill seekers
6 reality swappers.21
Four basic methods of exploratory research have been identified, but such research does not have to follow a standard design. Because the purpose of exploratory research is to gain insights and discover new ideas, researchers may use considerable creativity and flexibility. Data generally are collected using several exploratory techniques. Exhausting these sources usually is worth the effort because the expense is relatively low. Furthermore, insights into how and how not to conduct research may be gained from activities during the problem definition stage. If the conclusions made during this stage suggest marketing opportunities, the researcher is in a position to begin planning a formal, quantitative research project.

pilot study A collective term for any small-scale exploratory research technique that uses sampling but does not apply rigorous standards.


Descriptive research
descriptive research Research designed to describe characteristics of a population or phenomenon.

The major purpose of descriptive research, as the name implies, is to describe characteristics of a population. Marketing managers frequently need to determine who purchases a product, portray the size of the market, identify competitors’ actions and so on. Descriptive research seeks to determine the answers to who, what, when, where and how questions.

Descriptive research often helps to segment and target markets. For example, Australian Consolidated Press, publisher of Woman’s Day and Cleo, found by survey research that there was a market niche for a magazine (which they christened Madison) for generation X and Y women who have successful careers, are self-empowered and not waiting around for a husband.22 Descriptive research is often used to reveal the nature of shopping or other consumer behaviour. Research by Market21 found that 60 per cent of supermarket transactions in Australia were less than $20. Convenience, rather than price, was more valued by shoppers, and shoppers in Australia are increasingly using non-supermarket forms of retailing for their secondary or top-up shopping. This change in consumer behaviour in part explains the more than $30 billion spent in Australia annually on fast-moving consumer goods outside of the traditional retail sector.23

Accuracy is of paramount importance in descriptive research. While they cannot completely eliminate errors, good researchers strive for descriptive precision. Suppose a study seeks to describe the market potential for 4G smartphones. If the study does not precisely measure sales volume, it will mislead the managers who are arranging production scheduling and budgeting, and making other decisions based upon it.

Unlike exploratory research, descriptive studies are based on some previous understanding of the nature of the research problem. Although the researcher may have a general understanding of the situation, the conclusive evidence that answers questions of fact necessary to determine a course of action has yet to be collected. Many circumstances require descriptive research to identify the reasons consumers give to explain the nature of things. In other words, a diagnostic analysis is performed when consumers are asked questions such as ‘Why do you feel that way?’ Although they may describe why consumers feel a certain way, the findings of a descriptive study such as this, sometimes called diagnostics, do not provide causal evidence. Frequently, descriptive research attempts to determine the extent of differences in needs, attitudes and opinions among subgroups.

SURVEYS Surveys are the most common method of descriptive research. Most people have seen the results of political surveys by Newspoll or Roy Morgan Research, and some have been respondents (members of a sample who supply answers) to marketing research questionnaires. A survey is a research technique in which information is gathered from a sample of people using a questionnaire. The task of writing a list of questions and designing the format of the printed or written questionnaire is an essential aspect of the development of a survey research design. Research investigators may choose to contact respondents by telephone or mail, on the Internet or in person. An advertiser spending considerable money in buying advertisement time during the 2018 World Cup soccer final may use an online panel to quickly gather information concerning their responses to the advertising. A forklift truck manufacturer trying to determine a cause for low sales in the wholesale grocery industry might choose a mail questionnaire because the appropriate executives are hard to reach by email. A manufacturer of a birth-control device for men might determine the need for a versatile survey method wherein an interviewer can ask a variety of personal questions in a flexible format. While personal interviews are expensive, they are valuable because investigators can


use visual aids and supplement the interviews with observations. Each of these survey methods has advantages and disadvantages. A researcher’s task is to find the most appropriate way to collect the needed information.

REAL WORLD SNAPSHOT

AUDIENCE SURVEYS

Film marketers often use survey research. Gauging audience responses to three or four versions of trailers or television commercials is a typical research project designed to bring more people in on opening night. Audience previews have been responsible for the decisions about the final version of many films.

Secondary data
Like exploratory research studies, descriptive and causal studies use previously collected data. Although the terms ‘secondary’ and ‘historical’ are interchangeable, we use the term secondary data here. An example of a secondary data study is the use of a mathematical model to predict sales on the basis of past sales or a correlation with related variables. Manufacturers of digital cameras may find that sales are highly correlated with discretionary personal income. To predict future market potential, projections of disposable personal income may be acquired from the government or a university. This information can be manipulated mathematically to forecast sales. Formal secondary data studies have benefits and limitations similar to those of exploratory studies that use secondary data, but generally the quantitative analysis of secondary data is more sophisticated.
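To show how simple such a secondary data model can be, the sketch below fits a least-squares line relating sales to disposable personal income and projects sales at a forecast income figure. The numbers, and the use of Python, are illustrative assumptions only; they are not data from any source cited in this chapter.

```python
# Hypothetical historical series (illustrative figures only).
income = [310, 325, 343, 360, 382]   # disposable personal income ($ billion)
sales = [118, 126, 131, 140, 151]    # digital camera sales (thousands of units)

n = len(income)
mean_x = sum(income) / n
mean_y = sum(sales) / n

# Ordinary least squares: fit sales = intercept + slope * income.
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(income, sales))
         / sum((x - mean_x) ** 2 for x in income))
intercept = mean_y - slope * mean_x

income_forecast = 400  # e.g. a published projection for next year
sales_forecast = intercept + slope * income_forecast

print(f"sales = {intercept:.1f} + {slope:.3f} x income")
print(f"Forecast at income {income_forecast}: {sales_forecast:.0f} thousand units")
```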

Observation The objective of many research projects is merely to record what can be observed – for example, the number of vehicles that pass by a proposed site for a petrol station. This can be mechanically recorded or observed by people. Research personnel known as mystery shoppers may act as customers to observe actions of sales personnel or do comparative shopping to learn prices at competing outlets. The main advantage of the observation technique is that it records behaviour without relying on reports from respondents. Observational data are often collected unobtrusively and passively without a respondent’s direct participation. For instance, the Nielsen company uses a ‘people meter’ attached to television sets to record the programs being watched by each household member. This eliminates the possible bias of respondents stating that they watched the Prime Minister’s address rather than a sitcom on another station. Observation is more complex than mere ‘nose counting’, and the task is more difficult than the inexperienced researcher would imagine. Several things of interest – such as attitudes, opinions, motivations and other intangible states of mind – simply cannot be observed.

Causal research
The main goal of causal research is to identify cause-and-effect relationships among variables. Exploratory and descriptive research normally precedes cause-and-effect relationship studies. In causal studies, researchers typically have an expectation about the relationship to be explained, such as a prediction about the influence of price, packaging, advertising and the like on sales.

causal research Research conducted to identify cause-and-effect relationships among variables.


Thus, researchers must be quite knowledgeable about the subject. Ideally, the manager wants to establish that one event (say, a new package) is the means for producing another event (an increase in sales). Causal research attempts to establish that when we do one thing, another thing will follow. The word ‘cause’ is common in everyday conversation, but from a scientific research perspective, a true causal relationship is impossible to prove. Nevertheless, researchers seek certain types of evidence to help them understand and predict relationships. A typical causal study has management change one variable (for example, advertising) and then observe the effect on another variable (such as sales). Some evidence for causality comes from the fact that the cause precedes the effect. In other words, having an appropriate causal order of events, or temporal sequence, is one criterion for causality that must be met to be able to establish a causal relationship. If a consumer behaviour theorist wishes to show that an attitude change causes a behaviour change, one criterion that must be established is that attitude change precedes the behaviour change.

EXPERIMENTS Marketing experiments hold the greatest potential for establishing cause-and-effect relationships. Experimentation allows investigation of changes in one variable (such as sales), while manipulating one or two other variables (perhaps price or advertising) under controlled conditions. Ideally, experimental control provides a basis for isolating causal factors by eliminating outside, or exogenous, influences. Test marketing is a frequently used form of marketing experimentation. Increasingly, because of cost, this is being conducted online. Global Test Market.com (http://www.globaltestmarket.com) provides a panel of consumers from over 200 countries and examines, both qualitatively and quantitatively, reactions to new brands and products. Respondents are paid in terms of an online currency, which they can later redeem for products and services. This may include products and brands they were exposed to in the test market. Thus a realistic yet affordable option is currently available to marketers to examine the worldwide acceptance of their products. An experiment controls conditions so that one or more variables can be manipulated in order to test a hypothesis. Many companies in the fast-moving consumer goods industry conduct experiments that simply determine consumer reactions to different types of packaging. Results from field experiments can also lead to a deliberate modification of the marketing mix. Retailers using scanner data can determine if the use of loss leaders, or specials, generates greater sales in other product categories. Other experiments – laboratory experiments, for example – are deliberate modifications of an environment created for the research itself. One example of a laboratory experiment is a toy company showing alternative versions of a proposed television advertisement to groups of children and observing which keeps their attention the longest. Most basic scientific studies in marketing (for example, the development of consumer behaviour theory) ultimately seek to identify cause-and-effect relationships. One often associates science with experiments. To predict a relationship between, say, price and perceived quality of a product, causal studies often create statistical experiments with controls that establish contrast groups. A number of marketing experiments are conducted by both theory developers and pragmatic business people. More is said about experiments and causal research in Chapter 7.
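The logic of a simple marketing experiment can be made concrete with a small, hypothetical illustration. In the sketch below the weekly sales figures are invented and Python is used purely for exposition: stores randomly assigned the new package are compared with control stores, and a permutation test asks how often a difference this large would arise by chance if the package truly had no effect.

```python
import random
from statistics import mean

random.seed(42)

# Hypothetical weekly unit sales per store (illustrative only).
new_package = [212, 198, 225, 240, 205, 231, 219, 244]   # treatment stores
old_package = [201, 186, 214, 208, 195, 217, 190, 211]   # control stores

observed_diff = mean(new_package) - mean(old_package)

# Permutation test: repeatedly shuffle the pooled sales and re-split them to
# see how unusual the observed difference is if the package has no effect.
pooled = new_package + old_package
n_treat = len(new_package)
trials, more_extreme = 10_000, 0
for _ in range(trials):
    random.shuffle(pooled)
    if mean(pooled[:n_treat]) - mean(pooled[n_treat:]) >= observed_diff:
        more_extreme += 1

print(f"Observed lift: {observed_diff:.1f} units per store per week")
print(f"Approximate one-sided p-value: {more_extreme / trials:.3f}")
```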


WHAT WENT RIGHT? REINVENTING CONVENIENCE STORE FOOD AT 7-ELEVEN24

The winner of the ESOMAR research effectiveness awards in Australia for 2015 was 7-Eleven and research firm Bergent Australia. Changes in food packaging and advertising were made as the result of a shopper-focused study that used qualitative, quantitative and controlled experimental methods to uncover surprising yet usable attitudes and behaviours in the food-to-go market. Shopper-generated videos were also used to illustrate the keys to optimising range, service, packaging and marketing. Halfway through implementing the research recommendations, since June 2015, 7-Eleven has already achieved 117 per cent gross profit, 125 per cent sales volume and 118 per cent sales value for Food on the Go (FotG). Brand image also improved significantly and the first new-design store doubled FotG sales value (like-for-like stores). However, controversy in 2015, continuing into 2016, over not paying its staff award wages may well diminish the success of this campaign.

THE ‘BEST’ RESEARCH DESIGN It is argued that there is no single best research design and there are no hard-and-fast rules for good marketing research. This does not mean, however, that the researcher faces chaos and confusion. It means that the researcher can choose among many alternative methods for solving a problem. Consider the researcher who must forecast sales for the upcoming year. Some commonly used forecasting methods are surveying executive opinions, collecting sales force composite opinions, surveying user expectations, projecting trends and analysing market factors. The ability to select the most appropriate research design develops with experience. Inexperienced researchers often jump to the conclusion that the survey method is the best design because they are most familiar with this method. When Chicago’s Museum of Science and Industry wanted to determine the relative popularity of its exhibits, it could have conducted a survey. Instead, a creative researcher familiar with other research designs suggested a far less expensive alternative: an unobtrusive observation technique. The researcher suggested that the museum merely keep track of the frequency with which the floor tiles in front of the various exhibits had to be replaced, indicating where the heaviest traffic occurred. When this was done, the museum found that the chick-hatching exhibit was the most popular. This method provided the same results as a survey but at a much lower cost. After determining the proper design, the researcher moves on to the next stage – planning the sample.

Sampling
Although the sampling plan is outlined in the research design, the sampling stage is a distinct phase of the research process. For convenience, however, we will treat the sample planning and the actual sample generation processes together in this section. If you take your first bite of a steak and conclude that it needs salt, you have just conducted a sample. Sampling involves any procedure that uses a small number of items or a portion of the population to make a conclusion regarding the whole population. In other words, a sample is a subset from a larger population. If certain statistical procedures are followed, a researcher need not select every item in a population because the results of a good sample should have the same characteristics as the population as a whole. Of course, when errors are made, samples do not give reliable estimates of the population.

An infamous example of error due to sampling is the 1936 Literary Digest fiasco. The magazine conducted a survey and predicted that Republican Alf Landon would win by a landslide over Democrat Franklin D. Roosevelt in that year’s presidential election. This prediction was wrong – and the error was due to sample selection. Post-mortems showed that Literary Digest had sampled its readers as well as telephone subscribers. In 1936, these people were not a representative cross-section of voters, because in those days, people who could afford magazine subscriptions and a phone service were generally well-to-do – and a disproportionate number of them were Republicans.

sampling stage The stage in which the researcher determines who is to be sampled, how large a sample is needed, and how sampling units will be selected.
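A brief simulation shows why the poll failed even with an enormous sample. The population shares below are invented for illustration, not historical estimates, and Python is used only as a convenient notation; the point is that a sampling frame which over-represents one group gives a biased answer no matter how many questionnaires come back.

```python
import random

random.seed(1936)

# Invented electorate (shares chosen for illustration, not historical figures).
# Wealthier voters (those with telephones and magazine subscriptions) favour
# candidate A far less than other voters do.
wealthy = [random.random() < 0.30 for _ in range(30_000)]   # True = votes for A
others = [random.random() < 0.66 for _ in range(70_000)]
population = wealthy + others

def support(votes):
    """Proportion of a group voting for candidate A."""
    return sum(votes) / len(votes)

biased_sample = random.sample(wealthy, 20_000)     # huge sample, skewed frame
random_sample = random.sample(population, 1_000)   # small sample, full frame

print(f"True support for A:      {support(population):.1%}")
print(f"Huge but biased sample:  {support(biased_sample):.1%}")
print(f"Small random sample:     {support(random_sample):.1%}")
```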

WHAT WENT WRONG?

ELECTION RESULT DIFFICULT TO DIGEST

Surveys should be representative. In 1936, telephone subscribers and subscribers to Literary Digest were disproportionately Republicans who did not support Roosevelt.

Literary Digest 1936


This example suggests that the first sampling question to ask is: ‘Who is to be sampled?’ The answer to this primary question requires the identification of a target population. Defining this population and determining the sampling units may not be so easy. If, for example, a bank surveys people who already have accounts for answers to image questions, the selected sampling units will not represent potential customers. Specifying the target population is a crucial aspect of the sampling plan. The next sampling issue concerns sample size. How big should the sample be? Although management may wish to examine every potential buyer of a product or service, doing so may be unnecessary as well as unrealistic. Typically, larger samples are more precise than smaller ones, but proper probability sampling can allow a small proportion of the total population to give a reliable measure of the whole. A later discussion will explain how large a sample must be in order to be truly representative of the universe or population. The final sampling decision concerns choosing how to select the sampling units. Students who have taken a statistics course generally understand simple random sampling, in which every unit in the population has an equal and known chance of being selected. However, this is only one type of sampling. For example, a cluster sampling procedure may reduce costs and make data-gathering procedures more efficient. If members of the population are found in close geographical clusters, a sampling procedure that selects area clusters rather than individual units in the population will reduce costs. Rather than selecting 2000 individuals throughout New Zealand, it may be more economical to first select three local government areas and then sample within those areas. This will substantially reduce travel, hiring and training costs. In determining the appropriate sampling plan, the researcher will have to select the most appropriate sampling procedure for meeting the established study objectives. There are two basic sampling techniques: probability sampling and nonprobability sampling. A probability sample is a sample in which every member of the population has a known, nonzero probability of selection. If sample units are selected on the basis of personal judgement (for example, a test market city is selected because it appears to be typical), the sample method is a nonprobability sample. In reality, the sampling decision is not a simple choice between two methods. Simple random samples, stratified samples, quota samples, cluster samples and judgemental samples are some of the many methods for drawing a sample. Chapter 10 gives a full discussion of these techniques.
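For readers who want to see the mechanics, the short sketch below contrasts the two selection approaches just described: a simple random sample drawn from the whole frame versus a cluster sample that first selects a few local government areas and then samples households within them. The frame, area names and sample sizes are all hypothetical.

```python
import random

random.seed(7)

# Hypothetical sampling frame: household identifiers grouped by area.
frame = {
    "Auckland": [f"AKL-{i:04d}" for i in range(4_000)],
    "Wellington": [f"WLG-{i:04d}" for i in range(1_500)],
    "Christchurch": [f"CHC-{i:04d}" for i in range(1_200)],
    "Dunedin": [f"DUD-{i:04d}" for i in range(600)],
    "Hamilton": [f"HAM-{i:04d}" for i in range(700)],
}
all_households = [h for area in frame.values() for h in area]

# 1. Simple random sample: every household has an equal, known chance.
srs = random.sample(all_households, 300)

# 2. Cluster sample: choose areas at random first, then households within
#    them, which concentrates fieldwork and cuts travel and training costs.
chosen_areas = random.sample(list(frame), 3)
cluster = [h for area in chosen_areas for h in random.sample(frame[area], 100)]

print("Clusters selected:", chosen_areas)
print("First SRS units:", srs[:3])
print("First cluster units:", cluster[:3])
```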


Gathering data
Once the research design (including the sampling plan) has been formalised, the process of gathering or collecting information, the data-gathering stage, may begin. Data may be gathered by humans or recorded by machines. Scanner data illustrate electronic data collection by machine. Obviously, the many research techniques involve many methods of data gathering. The survey method requires some form of direct participation by the respondent. The respondent may participate by filling out a questionnaire or by interacting with an interviewer. If an unobtrusive method of data gathering is used, the subjects do not actively participate. For instance, a simple count of motorists driving past a proposed franchising location is one kind of data-gathering method. Whichever way the data are collected, it is important to minimise errors in the process. For example, data gathering should be consistent in all geographical areas. If an interviewer phrases questions incorrectly or records a respondent’s statements inaccurately (not verbatim), major data collection errors will result.

Often there are two phases to the process of gathering data: pretesting and the main study. A pretesting phase using a small subsample may determine whether the data-gathering plan for the main study is an appropriate procedure. Thus, a small-scale pretest study provides an advance opportunity for an investigator to check the data collection form to minimise errors due to improper design, such as poorly worded or organised questions. There is also a chance to discover confusing interviewing instructions, learn if the questionnaire is too long or too short, and uncover other such field errors. Tabulation of data from the pretests provides the researcher with a format for the knowledge that may be gained from the actual study. If the tabulation of the data and statistical results does not answer the researcher’s questions, the investigator may need to redesign the study.

Processing and analysing data

EDITING AND CODING
After the fieldwork has been completed, the data must be converted into a format that will answer the marketing manager’s questions. This is part of the data processing and analysis stage. Data processing generally begins with editing and coding the data. Editing involves checking the data collection forms for omissions, legibility and consistency in classification. The editing process corrects problems such as interviewer errors (an answer recorded on the wrong portion of a questionnaire, for example) before the data are transferred to the computer. Before data can be tabulated, meaningful categories and character symbols must be established for groups of responses. The rules for interpreting, categorising, recording and transferring the data to the data storage media are called codes. This coding process facilitates computer or hand tabulation. If computer analysis is to be used, the data are entered into the computer and verified. Computer-assisted (online) interviewing is an example of the impact of technological change on the research process. Telephone interviewers, seated at computer terminals, read survey questions displayed on the monitor. The interviewer asks the questions and then types in the respondents’ answers. Thus, answers are collected and processed into the computer at the same time, eliminating intermediate steps that could introduce errors.
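A simplified, hypothetical example may help. In the sketch below an invented codebook is applied to three made-up questionnaire records: the editing step standardises raw answers, the coding step converts them into numeric codes, and any omission or unusable response is flagged for follow-up.

```python
# Invented codebook: rules for turning verbatim answers into analysable codes.
GENDER_CODES = {"male": 1, "female": 2, "prefer not to say": 9}
INTENT_CODES = {"definitely": 5, "probably": 4, "maybe": 3,
                "probably not": 2, "definitely not": 1}

raw_responses = [
    {"id": 101, "gender": "Female", "intent": "Probably"},
    {"id": 102, "gender": "male", "intent": ""},            # item omitted
    {"id": 103, "gender": "M", "intent": "definitely"},     # needs editing
]

def edit_and_code(record):
    """Edit one questionnaire record, then apply the codebook."""
    gender = record["gender"].strip().lower()
    gender = {"m": "male", "f": "female"}.get(gender, gender)   # editing step
    intent = record["intent"].strip().lower()
    return {
        "id": record["id"],
        "gender_code": GENDER_CODES.get(gender),   # None flags a problem
        "intent_code": INTENT_CODES.get(intent),   # None = item omitted
    }

for row in (edit_and_code(r) for r in raw_responses):
    flag = "" if None not in row.values() else "  <- refer back to fieldwork"
    print(row, flag)
```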

data-gathering stage The stage in which the researcher collects the data. data processing and analysis stage The stage in which the researcher performs several interrelated procedures to convert the data into a format that will answer management’s questions.


ANALYSIS
Analysis is the application of reasoning to understand the data that have been gathered. In its simplest form, analysis may involve determining consistent patterns and summarising the relevant details revealed in the investigation. The appropriate analytical technique for data analysis will be determined by management’s information requirements, the characteristics of the research design and the nature of the data gathered. Statistical analysis may range from portraying a simple frequency distribution to very complex multivariate analysis, such as multiple regression. Later chapters discuss three general categories of statistical analysis: univariate analysis, bivariate analysis and multivariate analysis.
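At its most basic, this can be no more than tallying. The sketch below, using invented coded responses, produces a one-way frequency distribution (a univariate summary) and a basic cross-tabulation of preference by age group, a first step towards bivariate analysis; both Python and the data are purely illustrative.

```python
from collections import Counter

# Invented coded survey data: (age group, preferred package colour).
responses = [
    ("18-34", "gold"), ("18-34", "silver"), ("18-34", "gold"),
    ("35-54", "silver"), ("35-54", "silver"), ("35-54", "gold"),
    ("55+", "silver"), ("55+", "silver"), ("18-34", "gold"),
]

# Univariate analysis: simple frequency distribution of colour preference.
freq = Counter(colour for _, colour in responses)
print("Frequency distribution:", dict(freq))

# Bivariate analysis: cross-tabulate preference by age group.
crosstab = {}
for age, colour in responses:
    crosstab.setdefault(age, Counter())[colour] += 1
for age, counts in sorted(crosstab.items()):
    print(f"{age:>6}: {dict(counts)}")
```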


Drawing conclusions and preparing a report
As mentioned earlier, most marketing research is applied research aimed at making a marketing decision. An important but often overlooked aspect of the marketing researcher’s job is to look at the analysis of the information collected and ask: ‘What does this mean to management?’ The final stage in the research process, the conclusions and report preparation stage, consists of interpreting the information and making conclusions for managerial decisions.

The research report should effectively communicate the research findings. All too many reports are complicated statements of technical aspects and sophisticated research methods. Frequently, management is not interested in detailed reporting of the research design and statistical findings, but wishes only a summary of the findings. If the findings of the research remain unread on the marketing manager’s desk, the study will have been useless. The importance of effective communication cannot be overemphasised. Research is only as good as its applications. Marketing researchers must communicate their findings to a managerial audience. The written report serves another purpose as well – it is a historical document that will be a record that may be referred to later if the research is to be repeated or if further research is to be based on what has gone before.

Now that we have outlined the research process, note that the order of topics in this book follows the flowchart of the research process presented in Exhibit 1.3. Keep this flowchart in mind while reading later chapters.

conclusions and report preparation stage The stage in which the researcher interprets information and draws conclusions to be communicated to decision-makers.

EXHIBIT 1.3 → STAGES IN THE RESEARCH PROCESS
1 Defining the problem → 2 Planning the research design → 3 Planning the sample → 4 Collecting the data → 5 Analysing the data → 6 Formulating conclusions and writing the final report


WHAT WENT WRONG? HARLEY-DAVIDSON MEANS FAMILY25

A good family doesn’t keep secrets. Harley-Davidson treats customers like family and believes that in a good relationship one doesn’t keep secrets from the other. Harley practises a wide variety of research and one recent tool involves mining social media sites. For instance, Harley-Davidson discovered a link between Kirchart skateboards and ‘hogs’. Searching for the brand’s name on social network sites like Facebook, they discovered a link to a YouTube video in which professional skateboarders for Heath Kirchart rode Harley-Davidson motorcycles across the country on their summer tour. As a result, Harley leveraged this information by becoming an official sponsor of Kirchart’s tours.

Harley, an iconic American brand, is an international company. Before Harley-Davidson goes overseas, Harley needs considerable research on each international market. In late 2009, Harley opened its first dealerships in India. Indian consumers were more accustomed to scooters than ‘hogs’, but the rising middle class and a new demand for luxury products created opportunity. Survey research among Indian consumers showed a favourable opinion since the early 2000s. However, high duties imposed by the Indian government and difficult emission regulations proved too much of a barrier for Harley for several years. Successful lobbying of the Indian government eased some of the restrictions and Harley now offers 12 models ranging from a smallish 883 cc model to the 1800 cc model, more fitting of the title ‘hog’. In this way, Harley can address the needs of the luxury market and those desiring a more practical bike that still makes a statement.

THE RESEARCH PROGRAM STRATEGY Our discussion of the marketing research process began with the assumption that the researcher wished to gather information to achieve a specific marketing objective. We have emphasised the researcher’s need to select specific techniques for solving one-dimensional problems, such as identifying market segments, selecting the best packaging design or test marketing a new product. However, if you think about a firm’s marketing mix activity in a given period of time (such as a year), you’ll realise that marketing research is not a one-shot activity – it is a continuous process. An exploratory research study may be followed by a survey, or a researcher may conduct a specific research project for each aspect of the marketing mix. If a new product is being developed, the different types of research might include market potential studies to identify the size and characteristics of the market; product usage testing to record consumers’ reactions to prototype products; brand name and packaging research to determine the product’s symbolic connotations; and test marketing the new product. Because research is a continuous process, management should view marketing research at a strategic planning level. The program strategy refers to a firm’s overall plan to use marketing research. It is a planning activity that places a series of marketing research projects in the context of the company’s marketing plan. Organisations like David Jones, ANZ Bank, Meat and Livestock Australia and TAC Victoria all have used a research program to monitor and improve their performance of marketing and social campaigns. The marketing research program strategy can be likened to a term insurance policy. Conducting marketing research minimises risk and increases certainty. Each research project can be seen as a series of term insurance policies that makes the marketing manager’s job a bit safer.

program strategy The overall plan to conduct a series of marketing research projects; a planning activity that places each marketing project in the context of the company’s marketing plan.


TIPS OF THE TRADE

Throughout this text, a ‘Tips of the Trade’ section is provided to give hints for using and doing marketing research. The first tip is to pay attention to these sections as helpful references.
»» Customers and employees are valuable sources for input that leads to innovation in the marketplace and in the workplace.
»» Business problems ultimately boil down to information problems because with the right information, the business can take effective action.
»» Good marketing research is as rigorous as good research in other fields, including the physical sciences.
»» Research plays a role before, during and after key marketing decisions.
»» Research helps design marketing strategies and tactics before action is taken.
»» Once a plan is implemented, research monitors performance with key metrics providing valuable feedback.
»» After a plan is implemented, research assesses performance against benchmarks and seeks explanations for the failure or success of the action.
»» Research that costs more than it could ever return should not be conducted.


SUMMARY

UNDERSTAND THE IMPORTANCE OF MARKETING RESEARCH AS A MANAGEMENT DECISION-MAKING TOOL
Marketing research is a tool companies use to discover consumers’ wants and needs so that they can satisfy those wants and needs with their product offerings. Marketing research is the marketing manager’s source of information about market conditions. It covers topics ranging from long-range planning to near-term tactical decisions.

DEFINE MARKETING RESEARCH
Marketing research is the systematic and objective process of generating information – gathering, recording and analysing data – to aid marketing decision-making. The research must be conducted systematically, not haphazardly. It must be objective to avoid the distorting effects of personal bias.

UNDERSTAND THE DIFFERENCE BETWEEN BASIC AND APPLIED MARKETING RESEARCH
Applied marketing research seeks to facilitate managerial decision-making. Basic or pure research seeks to increase knowledge of theories and concepts.

UNDERSTAND THE MANAGERIAL VALUE OF MARKETING RESEARCH AND ITS ROLE IN THE DEVELOPMENT AND IMPLEMENTATION OF MARKETING STRATEGY
The development and implementation of a marketing strategy consist of four stages:
1 identifying and evaluating opportunities
2 analysing market segments and selecting target markets
3 planning and implementing a marketing mix that will satisfy customers’ needs and meet the objectives of the organisation
4 analysing marketing performance.
Marketing research helps in each stage by providing information for strategic decision-making. Managers use marketing research to define problems, identify opportunities and clarify alternatives. They also use it to determine what went wrong with past marketing efforts, describe current events in the marketplace or forecast future conditions.


UNDERSTAND WHEN MARKETING RESEARCH IS NEEDED AND WHEN IT SHOULD NOT BE CONDUCTED

Marketing managers determine whether marketing research should be conducted based on:
1 time constraints
2 availability of data
3 the nature of the decision to be made
4 the benefit of the research information versus its cost.
Decision-making is the process by which managers resolve problems or choose among alternative opportunities. Decision-makers must recognise the nature of the problem or opportunity, identify how much information is available, and recognise what information they need. Every marketing decision can be classified on a continuum ranging from complete certainty to absolute ambiguity.

LIST THE STAGES IN THE MARKETING RESEARCH PROCESS

Research proceeds in a series of six interrelated phases. The first is problem definition, which may include exploratory research using secondary data, experience surveys or pilot studies. Once the problem is defined, the researcher selects a research design. The major designs are surveys, experiments, secondary data analysis and observation. Creative research design can minimise the cost of obtaining reliable results. After the design has been selected, a sampling plan is chosen, using a probability sample, a nonprobability sample or a combination of the two. The design is put into action in the data-gathering phase. This phase may involve a small pretest before the main study is undertaken. In the analysis stage the data are edited and coded, then processed, usually by computer. The results are interpreted in light of the decisions that management must make. Finally, the analysis is presented to decision-makers in a written or oral report. This last step is crucial, because even an excellent project will not lead to proper action if the results are poorly communicated.

CLASSIFY MARKETING RESEARCH AS EXPLORATORY RESEARCH, DESCRIPTIVE RESEARCH OR CAUSAL RESEARCH

Exploratory, descriptive and causal research are three major types of marketing research projects. The clarity with which the research problem is defined determines whether exploratory, descriptive or causal research is appropriate. Exploratory research is appropriate when management knows only the general nature of a problem; it is used not to provide conclusive evidence but to clarify problems. Descriptive research is conducted when there is some understanding of the nature of the problem; such research is used to provide a more specific description of the characteristics of a problem. Causal research identifies cause-and-effect relationships when the research problem has been narrowly defined.

EXPLAIN THE DIFFERENCE BETWEEN A RESEARCH PROJECT AND A RESEARCH PROGRAM

Quite often research projects are conducted together as parts of a research program. Such programs can involve successive projects that monitor an established product or a group of projects undertaken for a proposed new product to determine the optimal form of various parts of the marketing mix. A major problem facing students of marketing research is that they must consider each stage in the research process separately. However, without concentrated emphasis on the total process, understanding the individual stages is difficult. Thus, learning marketing research is like walking a tightrope between too broad and too narrow a focus.

KEY TERMS AND CONCEPTS
applied research
basic (pure) research
causal research
conclusions and report preparation stage
data-gathering stage
data processing and analysis stage
descriptive research
exploratory research
marketing research
performance-monitoring research
pilot study
problem definition stage
program strategy
research design
research design stage
sampling stage
scientific method

QUESTIONS FOR REVIEW AND CRITICAL THINKING
1 Discuss a situation when the use of market research cannot be justified.
2 What do you think were some potential problems faced by researchers mentioned in the opening vignette of this chapter who were developing research on hearing loss in Australian Aboriginal and Torres Strait communities?
3 Name some products or services that logically might have been developed with the help of marketing research.
4 In your own words, define marketing research and describe its task. How different is it from the social sciences?


5 Which of the following organisations are likely to use marketing research? Why? How?
a a manufacturer of breakfast bars
b a manufacturer of nails, bolts and screws
c TAC Victoria
d The Australia Council
e a mobile phone service provider
6 An automobile manufacturer is conducting research in an attempt to predict the use of electric cars by consumers over the next 20 years. Is this basic or applied research? Explain.
7 The owner of 22 restaurants was asked how he does marketing research. He answered that he does it after midnight, driving around in a ute: ‘I stay up late. If it’s midnight and I don’t have anything else to do, I drive around town and look at the queues in front of places. I’ll look at the rubbish and see if a guy’s doing business. If he’s got a really clean bunch of rubbish bins and an empty Dumpmaster, he’s not doing any business. I find out a lot by talking to my suppliers. I ask the bread guy how many boxes of rolls my competitor down the street is buying. Very few restaurateurs do that. But that’s the way I research my market.’ Is this marketing research?
8 Comment on the following statements:
a Marketing managers are paid to take chances with decisions. Marketing researchers are paid to reduce the risk of making those decisions.
b A marketing strategy can be no better than the information on which it is formulated.
c The purpose of research is to solve marketing problems.
9 In what specific ways can marketing research influence the development and implementation of marketing strategy?
10 Outline some of the main considerations that a Western company would have if it were to commission market research in the South-East Asian region.
11 For each of the following situations, decide whether the research should be exploratory, descriptive or causal:
a establishing the relationship between online visits and sales
b investigating consumer reactions to the idea of a smokeless, electronic cigarette
c identifying target market demographics for a tourist theme park
d estimating sales potential for fencing equipment in a New Zealand sales territory.
12 Describe a research situation that allows one to infer causality.
13 A researcher is interested in knowing the answer to a ‘why’ question, but does not know beforehand what sort of answer will satisfy. Will answering this question involve exploratory, descriptive or causal research? Explain.
14 Do the stages in the research process follow the scientific method?
15 Why is the problem definition stage of the research process probably the most important stage?
16 How have technology and internationalisation affected marketing research?
17 Which research design seems appropriate for the following studies?
a The manufacturer and marketer of flight simulators and other pilot training equipment wish to forecast sales volume for the next five years.
b A local chapter of the Stroke Foundation in New Zealand wishes to identify the demographic characteristics of individuals who donate more than $1500 per year.
c A major petroleum company is concerned with the increased costs of the ‘non-sniffable’ fuel, Opal, and considering dropping this product.
d A food company researcher wishes to know what types of food are taken as packed lunches to learn if the company can capitalise on this phenomenon.
e A researcher wishes to identify who plays the Call of Duty video game at home and for how long and with whom?
18 Should the marketing research program strategy be viewed as a strategic planning activity?

ONGOING PROJECT DOING A MARKET RESEARCH PROJECT? CONSULT THE CHAPTER 1 PROJECT WORKSHEET FOR HELP

Download the Chapter 1 project worksheet from the CourseMate website. It is used to determine the type of research project you may wish to do. There are project worksheets in each chapter of the textbook to help you with each stage of the research process.


COURSEMATE ONLINE STUDY TOOLS
Flip to the start of your textbook and use the tear-out card to log in to CourseMate for Marketing Research. There you can test your understanding and revise chapter concepts with:
☑ interactive quizzes
☑ flashcards
☑ crosswords on key concepts
☑ research activities
☑ videos.

WRITTEN CASE STUDY 1.1 BUYING NEW ZEALAND MADE ONLINE IN CHINA26 In China, 10 per cent of all retail transactions are now online, and there are still many small corner-store-like bricks-and-mortar retailers. Chinese online retailer Alibaba has been estimated to be as large as the top 100 bricks-and-mortar retailers in China, and holds around 80 per cent of the e-commerce market in that country. Entering the online retail space thus presents an opportunity for New Zealand firms. There is a reported interest in New Zealand products and services in China, because of their authenticity and traceability. As one New Zealand exporter noted, ‘The Chinese consumer wants to know what they are buying is genuine and hasn’t been counterfeited in any way’. Issues for New Zealand exports are meeting the scale of demand and having a near perfect logistic network. Already products such as

infant baby formula, abalone products, kiwifruit juice and honey are becoming popular imports from New Zealand.

QUESTIONS

1 What kind of research do you think needs to be done by New Zealand exporters in this case in order to enter the market in China?
2 Who should commission the research? The New Zealand government? Exporters? Provide a justification.
3 Are there any potential cultural and technological issues involved in this research study?

WRITTEN CASE STUDY 1.2 THE QUIT CAMPAIGN IN AUSTRALIA AND SINGAPORE Children in Australia spend on average $60 million a year on cigarettes, with some 70 000 teenagers starting to smoke every year. Health effects associated with smoking are well known – 75 per cent of all lung cancers are associated with smoking – but there are also immediate problems. Quit Victoria has found in research that children who smoke are more likely to be absent from school because of smoking-related ailments or because of truancy and suspension. In order to deal with the significant health issues associated with this, Quit Victoria has designed a series of health campaigns to try to reduce the prevalence of smoking among children. Both qualitative and quantitative research have been used to develop campaigns targeting schools, promotion in sport (which appeals to young males) and advertising campaigns that are used to encourage young girls not to take up the habit. Research has not only examined how to discourage demand for cigarettes but also the supply of what is for children an illegal product to purchase. Research that evaluates the success of the campaign, both in attitudes to smoking and behaviour, has also been very important in demonstrating to government the effectiveness of such campaigns and how they might be improved in the future.

Evaluation research of anti-smoking campaigns has also been important in New Zealand. A similar campaign has been developed in Singapore, which is called the National Smoking Control Programme. Its main activities: • A national smoking control campaign is held annually to raise awareness on the harmful effects of smoking and encourage smokers to quit smoking.

• Mass media is used extensively, and innovative publicity events and programs are organised to elicit maximum media coverage.

• Interpersonal activities are conducted throughout the year at various settings such as schools, workplaces, and healthcare and community venues.

• The QuitLine is another feature of the program. Staffed by trained nurse–counsellors, callers can seek advice and/or information on how to quit smoking or how to help someone quit.

• Quit services are provided by three hospitals, 16 polyclinics and some non-government organisations. Teachers who are interested in helping their students stop smoking are trained to conduct a school-based smoking cessation program.

As in Australia, such campaigns have developed on the basis of market research, using a number of approaches (exploratory, descriptive and causal). The campaigns are also evaluated regularly by government so that their effectiveness can be determined.

QUESTIONS

1 What type of market research do you think is appropriate in order to develop and evaluate the Quit Campaign in Australia?
2 What differences, if any, do you think would occur between market research in Singapore and Australia in this case?


PART TWO: DEFINING THE PROBLEM

02 » PROBLEM DEFINITION AND THE RESEARCH PROCESS

Chapter opener flow chart – the process of problem definition: ascertain the decision-makers' objectives; understand the background of the problem; isolate and identify the problem, not the symptoms; determine the unit of analysis; determine the relevant variables; state the research questions (hypotheses) and research objectives.

02 » WHAT YOU WILL LEARN IN THIS CHAPTER

To discuss the nature of decision-makers’ objectives and their role in defining the research problem.

To understand that proper problem definition is essential for effective marketing research.
To understand the importance of identifying key variables.
To discuss how formulation of research questions and hypotheses clarifies problem definition.
To discuss the influence of the statement of the marketing problem on the specific research objectives.
To explain the purpose of the research proposal.

The welfare-to-work story: Getting employers to look outside the square1

It is estimated that the federal government in Australia spends over $3.6 billion a year on welfare-to-work initiatives to increase workforce participation. However, the benefits of this program need to be communicated clearly to employers and potential employees. Open Mind, an independent market research company, was commissioned to help provide information that would be used to refine key messages, engage target audiences, and benchmark and track the campaign. The research company used a combination of qualitative approaches to provide a 360-degree perspective on the complex issue of welfare to work. This included group discussions allowing people in broadly similar situations to consider the changes and the implications for their own lives; ethnographic case studies conducted in homes to establish a closer engagement with the lived experience of these groups of job seekers; and a combination of small groups and qualitative interviews conducted over the phone with businesses. The research prompted a major shift in communication focus, moving from motivating job seekers to motivating employers to choose from a broader group of society. It revealed that job seekers wanted to work but lacked confidence and believed that employers wouldn't give them a go. The research showed how, by appearing to target employers, the campaign in fact spoke more effectively to job seekers. This example shows that in dealing with a research brief, or statement of a management problem or decision, the researcher must employ the appropriate tools or research design. The researcher must also consider the dimensions of the research problem, who will be studied and how. While many social and market research studies are not as complex as this study, the example shows that understanding the decision-maker's problem, the context of the problem and employing the right research design are crucial steps in the research process.


THE NATURE OF MARKETING PROBLEMS

Chapter 1 indicated that a decision-maker's degree of uncertainty influences decisions about the type of research that will be conducted. In this chapter we elaborate on the conditions under which decision-making occurs and the process by which managers clearly define marketing problems and opportunities.
Remember that a marketing manager may be completely certain about a business situation. For example, a retail store that has been recording and analysing optical scanner data for years knows exactly what information its scanners need to record every day. A contrasting example is a manager who wants to know how effective promotions can be in social media. Given that social media has only become popular in the last ten years, and many managers may not be active users of social media, there is an important role for qualitative research to better identify the issues of advertising on social media. In other situations, routine research techniques are regularly used to investigate routine problems that have already been defined. At the other extreme, a manager or researcher may face an absolutely ambiguous decision-making situation. The nature of the problem to be solved is unclear. The objectives are vague and the alternatives are difficult to define. This is by far the most difficult decision situation.
Most marketing decision situations fall between these two extremes. Managers usually grasp the general nature of the objectives they wish to achieve, but they often remain uncertain about the full details of the problem. Important information is missing. Ambiguity or uncertainty needs to be cleared up before making a formal statement of the marketing problem. There is also a responsibility for market researchers to make sure that they meet the decision-maker's objectives, as noted by a senior market research consultant in Singapore: 'The problem is that when market researchers do sit at the table, they usually don't say anything interesting, often getting lost in the data and failing to deliver concise, clear thinking. What's more, they are often disconnected with what the decision-makers care about.'2

SURVEY THIS!

Now that you've been through the Zikmund AsiaPac Qualtrics online survey (which you responded to in Chapter 1 and which you will view results for later), you can use it to better learn course concepts. In later chapters, you will analyse actual data to address key research questions and hypotheses. In this chapter, you will learn the basic research process, which is composed of six stages. List each stage in the basic research process. Based on this survey project, try to describe activities that went on, are going on, or will go on that correspond to each of the six stages. Use the flow chart at the start of this chapter to help with this task. Also, provide a list of deliverables that you will be able to produce for a client aiming at better understanding the university student market. Finally, comment on how you believe a sample could be obtained from students using this book in Australia, New Zealand and the Asia-Pacific region.


THE IMPORTANCE OF PROPER PROBLEM DEFINITION

The formal quantitative research process should not begin until the problem has been clearly defined. However, properly and completely defining a marketing problem is easier said than done. When a problem or opportunity is discovered, managers may have only vague insights about a complex situation. For example, suppose market share is declining in Western Australia and management does not know the reason. If quantitative research begins before the manager learns exactly what is important, the investigation may yield false conclusions. The right answer to the wrong question may be absolutely worthless – indeed, it may even be harmful.
The Disneyland theme park in Hong Kong was not popular with Chinese tourists when it opened in 2005, as it was too small and lacked rides, which was disappointing as the Chinese like the biggest of everything. Also, there had been no Disney channel in China, so there was little attachment to the many characters in the theme park. The lack of emotional resonance with the Disney brand and the small size of the theme park in Hong Kong meant that in the 2010 fiscal year it suffered a net loss of HK$718 million (A$92.4 million).3
Consider also what happened when Coca-Cola made the decision to change its Coke formula. Management's definition of the problem focused on a need to improve Coke's taste because its competitor's 'Pepsi Challenge' advertising campaign touted Pepsi's superior taste. The research question was to investigate the ultimate consumer reaction to the taste of reformulated Coke. The results of the taste test led to the introduction of 'new' Coke and the withdrawal from the market of regular Coke. As soon as consumers learned the company's original formula was no longer available, there were emotional protests from Coca-Cola loyalists. The consumer protests were so passionate and determined that the original formula was quickly brought back as Coca-Cola Classic. Later, the company learned that the consumer protests indicated a larger problem: Coke's marketing research had been too narrow in scope and the problem had been inadequately defined. The company had carried out a series of taste tests in shopping malls. No take-home taste tests had been conducted, and consumers had not been asked if the new Coke should replace the original. The marketing research had failed to identify consumers' emotional attachment and loyalty to the brand as a problem for investigation. The Coca-Cola mistake teaches a valuable lesson: do not neglect to investigate the emotional aspects of buying behaviour.

REAL WORLD SNAPSHOT

WHAT’S NEW IS OLD, WHEN IT COMES TO PROBLEM DEFINITION4

Tracy Rankin, an experienced market research director at the Australian market research company Yellow Door, outlines five important guidelines for defining market research problems. These are based on her understanding of consumer psychology. The five guidelines are:
1 We are emotional beings first, rational thinkers second.
• Consumers make decisions based on emotional reasons, and researchers should not simply accept what consumers tell them. As Ms Rankin notes, 'People will forget what you did, but people will not forget how you made them feel.'
2 Give a man a mask and he will tell you all about himself.
• People care too much to reveal their inner thoughts: projective techniques (as discussed in Chapter 3) provide access to a consumer's 'true story' and are more likely to reveal the real reasons for behaviour.
3 We are a symbolic species and attach meaning to everything around us.
• Even ingrained habit is laden with hidden symbolism that can provide a wealth of insight into behaviour and decisions.
4 Human thought is image based – it stems from the unconscious and we use language to verbalise it.
• We get more value when we allow respondents to express thoughts and feelings through images (pictures, collages, objects), both real and visualised.
5 People are natural storytellers – storytelling is what we do.
• Stories help us make sense of our world and our experiences. If you let people talk, they will eventually reveal themselves.
Ms Rankin's guidelines show the importance of qualitative techniques, such as projective techniques, storytelling and the use of images and pictures, to define research problems from the consumer's perspective, an important basis of any market research study.

THE PROCESS OF DEFINING THE PROBLEM

Just because a problem has been discovered or an opportunity recognised does not mean that the problem has been defined. A problem definition indicates a specific marketing decision to be clarified or problem to be solved. It specifies research questions to be answered and the objectives of research. The process of defining the problem involves several interrelated steps, as shown in Exhibit 2.1:
1 Ascertain the decision-makers' objectives.
2 Understand the background of the problem.
3 Isolate and identify the problem, not the symptoms.
4 Determine the unit of analysis.
5 Determine the relevant variables.
6 State the research questions (hypotheses) and research objectives.

decision-makers' objectives Managerial goals expressed in measurable terms.
problem definition The crucial first stage in the research process – determining the problem to be solved and the objectives of the research.

EXHIBIT 2.1 THE PROCESS OF PROBLEM DEFINITION: ascertain the decision-makers' objectives → understand the background of the problem → isolate and identify the problem, not the symptoms → determine the unit of analysis → determine the relevant variables → state the research questions and research objectives.

Ascertain the decision-makers' objectives

The research investigator must attempt to satisfy the decision-makers' objectives – those of the brand manager, sales manager and others who requested the project. Management and organisational theorists suggest that the decision-maker should express goals to the researcher in measurable terms. Unfortunately, expecting this to happen is overly optimistic:

Despite a popular misconception to the contrary, objectives are seldom clearly articulated and given to the researcher. The decision-maker seldom formulates his objectives accurately. He is likely to state his objectives in the form of platitudes which have no operational significance. Consequently, objectives usually have to be extracted by the researcher. In so doing, the researcher may well be performing the most useful service to the decision-maker.5

Researchers who must conduct investigations when a marketing manager wants the information ‘yesterday’ usually get little assistance when they ask: ‘What are your objectives for this study?’


Nevertheless, even decision-makers who have only a gut feeling that marketing research might be a good idea will benefit greatly if they work with the marketing researcher to articulate precise research objectives.6 Both parties should attempt to gain a clear understanding of the purpose for undertaking the research. One effective technique for uncovering elusive research objectives is to present the marketing manager with each possible solution to a problem and ask which, if any, of these courses of action should be followed. A blanket 'none' can prompt further questioning to determine why the courses of action are inappropriate; this usually will help formulate objectives. By illuminating the nature of the marketing opportunity or problem, exploratory research also helps managers to clarify their research objectives.
Why do many marketing research projects begin without clear objectives or adequate problem definitions? Marketing managers are logical people, and it seems logical that definition of the problem is the starting point for any enterprise. Frequently, however, marketing researchers and managers cannot discover actual problems because they lack sufficiently detailed information. The iceberg principle serves as a useful analogy (see Exhibit 2.2). A sailor on the open sea knows that only 10 per cent of an iceberg extends above the surface of the water, while 90 per cent is submerged. The dangerous part of many marketing problems, like the submerged portion of the iceberg, is neither visible to nor understood by marketing managers. If the submerged portions of the problem are omitted from the problem definition (and subsequently from the research design), the decisions based on the research may be less than optimal. The example of new Coke is a case in point. Omission of important information or faulty assumptions about the situation can be extremely costly.

iceberg principle The idea that the dangerous part of many marketing problems is neither visible to nor understood by marketing managers.
situation analysis A preliminary investigation or informal gathering of background information to familiarise researchers or managers with the decision area.

EXHIBIT 2.2 THE ICEBERG PRINCIPLE – the obvious symptoms of marketing management problems sit above the surface, while the problem definition lies submerged beneath them. Based on Inside Story Knowledge Management's 'What Makes us Different?', http://insidestory.com.au/what-makes-us-different

Understand the background of the problem

Although no textbook outline for identifying a marketing problem exists, the iceberg principle illustrates that understanding the background of the problem is vital. Often experienced managers know a great deal about a situation and can provide researchers with considerable background information about previous events and why they occurred. Under these circumstances, when the decision-makers' objectives are clear, the problem may be diagnosed exclusively by exercising managerial judgement. On other occasions, when information about what has happened before is inadequate or when managers have trouble identifying the problem, a situation analysis is the logical first step in defining the problem. A situation analysis involves the informal gathering of background information to familiarise researchers or managers with the decision area. Gaining an awareness of marketplace conditions and an appreciation of the situation often requires exploratory research. The many exploratory research techniques developed to help formulate clear definitions of the problem are covered in Chapter 3.

IDENTIFY THE BACKGROUND OF THE PROBLEM WITH A LITERATURE REVIEW

Another way to understand the background of the problem is to look at past research in the area, which is often called a literature review. A literature review is an examination of peer-reviewed (scientific and objective) research. The purpose of a literature review is to show that you have studied existing work in the background of the problem – with insight. It is not enough merely to summarise other research studies. You need to examine previous research with insight and to review it critically. An effective review analyses and synthesises material, and it should meet the following requirements:
1 Important variables (factors) that are likely to influence the problem situation are not left out of the study.
2 A clearer idea emerges as to which variables would be the most important to consider.
3 Testability and replicability of the findings of the current research are enhanced.
4 The problem statement can be made with precision and reliability.
5 Applicable methods and units of analysis are identified.
An example of an extract of a literature review by a market research student is shown below. The study aims to determine which factors influence the choice of low-carbohydrate (low-carb) beer.


Consumers tend to purchase products primarily where the personality of the brand is a match with their own or has traits they'd like to be associated with (Author, Year). This means that they purchase brands as a reflection or an enhancement of themselves. Consumers are generally not consciously aware that they do this (Author, Year). On a general level this means that someone who perceives themselves to be 'sophisticated' will prefer to purchase brands that are positioned in their mind as 'sophisticated'. This general rule is subject to limitations, however, such as product category (Author, Year). For example, a product category of cleaning products might be demanded by consumers to be 'competent', irrespective of their own personality. However, this limitation still identifies the importance of establishing and maintaining the brand image to effectively appeal to the target market. This concept can also be applied to determining how low-carb beer is and should be positioned in Australia for male consumers. A study of Australian males has found that a relationship exists between their preference in beer and the management of their own personal image (Author, Year). This study also found that there was a social need to monitor other people's reactions to their choice of beer brand.

These findings highlight various requirements regarding the marketing strategy for low-carb beer. One can see the justification for inclusion of personality as a determinant in future research. Personality measures useful in this research study have also been identified as part of the literature review. There are, of course, other factors that influence the choice of a brand of beer and these would need to be addressed in the literature review.

Isolate and identify the problem, not the symptoms

Anticipating the many influences and dimensions of a problem is impossible for any researcher or executive. For instance, a firm may have a problem with its advertising effectiveness. The possible causes of this problem may be low brand awareness, the wrong brand image, use of the wrong media or perhaps too small a budget. Management's job is to isolate and identify the most likely causes. Certain occurrences that appear to be the problem may be only symptoms of a deeper problem. Table 2.1 illustrates how symptoms may be mistaken for the true problem.


TABLE 2.1 » SYMPTOMS CAN BE CONFUSING

Organisation: Twenty-year-old local swimming association in a major city
Symptoms: Membership has been declining for years. New water park with wave pool and water slides moved into town a few years ago.
Problem definition based on symptom: Local residents prefer the more expensive water park and have a negative image of swimming pool.
True problem: Demographic changes – children in this 20-year-old local area have grown up; older residents no longer swim.

Organisation: Manufacturer of palm-sized computer with wireless Internet access
Symptoms: Distributors complain prices are too high.
Problem definition based on symptom: Investigate business users to learn how much prices need to be reduced.
True problem: Sales management – distributors do not have adequate product knowledge to communicate product's value.

Organisation: Boutique brewery
Symptoms: Consumers prefer the taste of a competitor's brand.
Problem definition based on symptom: What type of reformulated taste is needed?
True problem: Package – old-fashioned package influences taste perception.

Other problems may be identified only after gathering background information and conducting exploratory research. How does one ensure that the fundamental problem has been identified, rather than its symptoms? There is no easy answer to this question. Executive judgement and creativity must be exercised.

Determine the unit of analysis

Defining the problem requires that the researcher determines the unit of analysis for the study. The researcher must specify whether the investigation will collect data about individuals, households, organisations, departments, geographical areas or objects. In studies of home buying, for example, the husband–wife dyad typically is the unit of analysis, rather than the individual, because many purchase decisions are made jointly by husband and wife. Researchers who think carefully and creatively about situations often discover that a problem may be investigated at more than one level of analysis. Determining the unit of analysis, although relatively straightforward in most projects, should not be overlooked during the problem definition stage of the research. It is a fundamental aspect of problem definition. In many marketing research studies, the family rather than the individual is the appropriate unit of analysis.

Determine the relevant variables

Another aspect of problem definition is identification of key variables. The term 'variable' is important in research. A variable is anything that varies or changes in value. Because a variable represents a quality that can exhibit differences in value, usually in magnitude or strength, it may be said that a variable generally is anything that may assume different numerical or categorical values. For example, attitudes towards airlines may be a variable ranging from positive to negative. Each attribute of airline services – such as safety, seat comfort and baggage handling – is a variable.

variable Anything that may assume different numerical or categorical values.


EXHIBIT 2.3 A FISHBONE DIAGRAM FOR AIRLINE SERVICE QUALITY: the independent variables – customer expectations, customer's mood, price discount versus standard fare, actual performance (past and present), demographics and purpose of flight – form the ribs of the diagram, all feeding into the dependent variable at its head, airline service quality.


One way to ensure that all relevant variables are included in a study is to conduct a literature review and then draw a diagram that shows the important independent variables (factors) that influence the dependent variable (outcome). This is often called a fishbone diagram (see Exhibit 2.3) and includes at its head the dependent variable and on its ribs the independent variables.

REAL WORLD SNAPSHOT

THEY MAY WANT TO STRANGLE YOU!7

Hugh Dubberly, a manager with the Times-Mirror Company, advocates the following step-by-step process to help clearly define the problem to be solved.

How do we define the problem? Begin by assembling all the relevant players in a room. Ask each player to describe the unmet need or, in other words, to suggest the cause of the problem. Write down each suggestion. Nothing you will do on the project will be more important. With each suggestion, ask in turn for its cause. And then the cause of the cause. And then the cause of the cause of the cause. Keep at it like a two-year-old. By the time everyone in the room wants to strangle you, you will very likely have found the root cause of the problem. After you've developed the problem statement, you need to be sure to gain consensus on it from all the relevant parties. Failure to get 'buy in' from all the right people at this stage creates the potential for trouble later in the process. Someone who hasn't agreed on the definition up front is likely to want to change it later.

In statistical analysis, a variable is identified by a symbol, such as X. Categories or numerical values may then be associated with this symbol. The variable 'gender' may be categorised as male or female; gender is a categorical or classificatory variable, since it has a limited number of distinct values. On the other hand, 'sales volume' may encompass an infinite range of numbers; it is a continuous variable, one that can have an infinite number of values.

categorical (classificatory) variable A variable that has a limited number of distinct values. continuous variable A variable that has an infinite number of possible values.


In causal research, the terms 'dependent variable' and 'independent variable' are frequently encountered. A dependent variable is a criterion or a variable that is to be predicted or explained. An independent variable is a variable that is expected to influence the dependent variable. For example, average sales compensation may be a dependent variable that is influenced or predicted by an independent variable, such as number of years of experience. These terms are discussed in greater detail in the chapters on experimentation and data analysis.

dependent variable A criterion or variable to be predicted or explained.
independent variable A variable that is expected to influence a dependent variable.
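The distinction between dependent and independent variables maps directly onto the analysis that eventually follows. The Python sketch below is purely illustrative: the figures are invented, and the names years_experience and sales_compensation are hypothetical stand-ins for the example above rather than data from any study in this text.

# A minimal, hypothetical sketch: sales compensation as the dependent
# variable (Y) and years of experience as the independent variable (X).
# All figures are invented for illustration only.
from scipy import stats

years_experience = [1, 2, 3, 5, 7, 10, 12, 15]         # independent variable (X)
sales_compensation = [42, 45, 50, 58, 63, 74, 80, 91]  # dependent variable (Y), in $000s

# Fit a simple linear regression: compensation = intercept + slope * experience
result = stats.linregress(years_experience, sales_compensation)

print(f"slope = {result.slope:.2f} ($000 per additional year of experience)")
print(f"intercept = {result.intercept:.2f}")
print(f"r = {result.rvalue:.2f}, p-value = {result.pvalue:.4f}")

A positive, statistically significant slope would be consistent with the expectation that the independent variable (experience) influences the dependent variable (compensation); it would not, of course, prove causation.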

WHAT WENT WRONG?

Marketing problems are often people problems. Disgruntled marketing employees can lack creativity and lose the motivation that drives a strong work ethic. What appears to be a marketing problem can really be an internal, people-provoked problem. With this in mind, what kind of things do co-workers do that irritate other co-workers? Several descriptive research studies address this by surveying employees and having them rate potentially problematic behaviours based on how annoying they actually are. Perhaps the resulting list isn't surprising, but some of the most problematic co-worker habits and practices include the following:
»» Slacking – the perception that a co-worker is simply not pulling his or her share of the work is one of the most frequently mentioned annoying behaviours.
»» Abusing the printer or copier – printing things unnecessarily, wasting paper, occupying the printer or copier, or both, with personal documents and slowing down work.
»» Leaving used tea bags in the office sink.
»» Currying favour with the boss.
»» Constantly eating and leaving behind crumbs, and warming things in the microwave that leave behind strong odours (pickled cabbage, fish, popcorn, etc.).
»» Over air-conditioning the office, making it as cold as a fridge.
»» Being fortunate enough to have a window and then keeping the shade closed all day.
»» Coming in late but leaving on time.
»» Noisiness – squeaky chairs, loud laughter, loud phone calls, aggravating smartphone noises and annoying body noises.
The question becomes, how should management deal with these annoying habits? Should employees confront each other, or should management create policies that restrict all possible annoying behaviour? Is that possible? Maybe this calls for more research. Do you know an annoying co-worker?8

Managers and researchers must be careful to identify all relevant variables necessary to define the managerial problem. Likewise, variables that are superfluous (not directly relevant to the problem) should not be included. The process of identifying the relevant variables overlaps with the process of determining the research objectives. Typically, each research objective will mention a variable or variables to be measured or analysed.


State the research questions and research objectives

Both managers and researchers expect problem definition efforts to result in statements of research questions and research objectives. At the end of the problem definition stage, the researcher should prepare a written statement that clarifies any ambiguity about what the research hopes to accomplish.


CLARITY IN RESEARCH QUESTIONS AND HYPOTHESES

Formulating a series of research questions and hypotheses adds clarity to the statement of the marketing problem. A personal computer company made the following statement about an advertising problem: 'In the broadest sense, the marketing problem is to determine the best ways [name of the company] can communicate with potential purchasers of laptop computers.' Research questions that flow from this broad statement might include:
→→ How familiar are consumers with the various brands of computers?
→→ What attitudes do consumers have towards these brands?
→→ How important are the various factors for evaluating the purchase of a laptop computer?
→→ How effective are the communications efforts of the various competitive marketers in terms of message recognition?
Research questions make it easier to understand what is perplexing managers and to indicate what issues have to be resolved. A research question is the researcher's translation of the marketing problem into a specific inquiry. A research question can be too vague and general if stated in terms such as: 'Is advertising copy X better than advertising copy Y?' Advertising effectiveness can be variously measured by sales, recall of sales message, brand awareness, intention to buy and so on. Asking a more specific research question (for example: 'Which advertisement has a higher day-after recall score?') helps the researcher design a study that will produce pertinent information. The answer to the research question should be a criterion that can be used as a standard for selecting alternatives. This stage of the research obviously is related to problem definition. The goal of defining the problem is to state the research questions clearly and to develop well-formulated hypotheses.
A hypothesis is an unproven proposition or possible solution to a problem. Hypothetical statements assert probable answers to research questions. A hypothesis is a statement about the nature of the world; in its simplest form it is a guess. A sales manager may hypothesise that salespeople who show the highest job satisfaction will be the most productive. An advertising manager may believe that if consumers' attitudes towards a product are changed in a positive direction, consumption of the product will increase.

hypothesis An unproven proposition or supposition that tentatively explains certain facts or phenomena; a probable answer to a research question.

Problem statements and hypotheses are similar. Both state relationships, but problem statements are interrogative whereas hypotheses are declarative. Sometimes the two types of statements are almost identical in substance. An important difference, however, is that hypotheses usually are more specific than problem statements; typically, they are closer to the actual research operations and testing. Hypotheses are statements that can be empirically tested. A formal statement of a hypothesis has considerable practical value in planning and designing research. It forces researchers to be clear about what they expect to find through the study, and it raises crucial questions about the data that will be required in the analysis stage. When evaluating a hypothesis, researchers should ensure that the information collected will be useful in decision-making. Notice how the following hypotheses express expected relationships between variables:
→→ There is a positive relationship between buying on the Internet and the presence of younger children in the home.
→→ Sales are lower for salespeople in regions that receive less advertising support.
→→ Consumers will experience cognitive dissonance after the decision to purchase a plasma rather than an LCD widescreen television.
→→ Opinion leaders are more affected by mass media communication sources than are non-leaders.
→→ Among non-exporters, the degree of perceived importance of overcoming barriers to exporting is related positively to general interest in exporting (export intentions).9
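Because hypotheses are empirically testable, each statement above maps onto a statistical procedure once data are collected. The Python sketch below is a hypothetical illustration only: it tests the first hypothesis in the list against invented survey counts using a chi-square test of independence.

# Hypothetical illustration: is Internet buying related to the presence of
# younger children in the home? The counts below are invented.
from scipy.stats import chi2_contingency

#                       buys online   does not buy online
observed = [[180, 120],   # younger children present in the home
            [140, 160]]   # no younger children in the home

chi2, p_value, dof, expected = chi2_contingency(observed)

print(f"chi-square = {chi2:.2f}, df = {dof}, p-value = {p_value:.4f}")
# A small p-value (e.g. below 0.05) would support the hypothesised
# relationship; a large p-value would fail to support it.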


DECISION-ORIENTED RESEARCH OBJECTIVES

research objective The researcher's version of the marketing problem; it explains the purpose of the research in measurable terms and defines standards for what the research should accomplish.

The research objective is the researcher’s version of the marketing problem. After the research questions or hypotheses have been stated, the research project objectives are derived from the problem definition. They explain the purpose of the research in measurable terms and define standards for what the research should accomplish. In addition to explaining the reasons for conducting the project, research objectives help ensure that the research project will be manageable in size. Table 2.2 illustrates how the marketing problem of a nationwide retail chain store (whether it should offer an in-home shopping service using Internet ordering) is translated into research objectives. In some instances the marketing problem and the project’s research objectives are identical. However, the objectives must specify the information needed to make a decision. Identifying the needed information may require that managers or researchers be extremely specific, perhaps even listing the exact wording of the question in a survey or explaining exactly what behaviour might be observed or recorded in an experiment. Statements about the required precision of the information or the source of information may be required to clearly communicate exactly what information is needed. Many product buying decisions, for example, are made by both husband and wife. If this is the case, the husband–wife decision-making unit is the unit of analysis. The objective of obtaining X information about research questions from this unit should be specifically stated. Table 2.2 translates the broad research objective – to determine consumers’ perceived need for an in-home shopping service – into specific objectives; namely, to determine consumer awareness, to obtain ranked preferences for alternative forms of the service, to compare the needs of various market segments and so on. The specific objectives influence decisions about the research design because they indicate the type of information needed.

TABLE 2.2 » MARKETING PROBLEM TRANSLATED INTO RESEARCH OBJECTIVES

Marketing management problem/question: Should the retail chain store offer in-home shopping via the Internet?
Research questions: Are consumers aware of Internet home shopping systems? What are consumers' reactions to Internet shopping?
Research objectives: To determine consumer awareness with aided recall. To measure consumer attitudes and beliefs about home shopping systems.

Marketing management problem/question: In which of several possible forms should the service be offered?
Research questions: How do consumers react to service form A? B? C? What are the perceived benefits of each form of service?
Research objectives: To obtain ratings and rankings of each form of service. To identify perceived benefits of and perceived objections to the system.

Marketing management problem/question: What market segment should be the target market?
Research questions: Will consumers use the service? How often? Do the answers to the above questions differ depending on demographic group? Who are the best prospects?
Research objectives: To measure purchase intentions; to estimate likelihood of usage. To compare – using cross-tabulations – levels of awareness, evaluations, purchasing intentions etc. of men versus women, high-income versus low-income groups, young consumers versus older consumers etc.

Marketing management problem/question: What pricing strategy should we follow?
Research questions: How much do prospective customers think the service will cost? Do prospective customers think this product will be priced higher or lower than competitive offerings? Is the product perceived as a good value?
Research objectives: To ascertain consumers' knowledge and expectations about prices. To learn how the price of this service is perceived relative to competitors' pricing. To determine the perceived value of the service.

Note: For simplicity, hypotheses are omitted from the table.


A research objective is useful if it is a managerial action standard that specifies the performance criterion to be used. If the criterion to be measured (for example, sales or attitude changes) turns out to be higher than some predetermined level, management will do A; if it is lower, management will do B.10 This type of objective leaves no uncertainty about the decision to be made once the research is finished.

managerial action standard A performance criterion or objective that expresses specific actions that will be taken if the criterion is achieved.
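A managerial action standard can be written down as a simple decision rule agreed before any data are collected. The short Python sketch below is a hypothetical illustration: the 30 per cent purchase-intention threshold and the survey result are invented, not drawn from the text.

# Hypothetical action standard: if at least 30% of surveyed consumers intend
# to use the in-home shopping service, management will do A (pilot launch);
# otherwise management will do B (shelve the idea). Figures are invented.
ACTION_STANDARD = 0.30               # agreed performance criterion
measured_purchase_intention = 0.27   # proportion observed in the survey

if measured_purchase_intention >= ACTION_STANDARD:
    decision = "A: proceed with a pilot launch"
else:
    decision = "B: do not launch; revisit the service concept"

print(f"Measured intention = {measured_purchase_intention:.0%} -> decision {decision}")

Agreeing on the rule in advance is what removes the uncertainty: whatever the survey finds, the course of action is already determined.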

The research objectives should be limited to a manageable number. Fewer study objectives make it easier to ensure that each will be addressed fully. Exhibit 2.4 shows how the statement of a marketing problem influences the research objectives. The specific objectives, in turn, become the basis for the research design. Exhibit 2.4 also shows how exploratory research can help managers in the overall definition of the marketing problem. However, in routine situations or when managers are quite familiar with the background information, it is quite likely that the problem definition will be based exclusively on the decision-maker's objectives. Once the research has been conducted, the results may show an unanticipated aspect of the problem and suggest a need for additional research to satisfy the main objective. Accomplished researchers who have had the experience of uncovering additional aspects of a marketing problem after finishing fieldwork recommend designing studies that include questions designed to reveal the unexpected.

EXHIBIT 2.4 INFLUENCE OF THE STATEMENT OF THE MARKETING PROBLEM ON OBJECTIVES AND RESEARCH DESIGN: the statement of the marketing problem (informed, optionally, by exploratory research) leads to broad research objectives, which are translated into specific objectives 1, 2 and 3; these drive the research design, which in turn produces the results.

HOW MUCH TIME SHOULD BE SPENT DEFINING THE PROBLEM?

Budget constraints usually influence the amount of effort that will be spent defining the problem. Most marketing situations are complex, and numerous variables may have some influence. Searching for every conceivable cause and minor influence is impractical. The importance of the recognised problem will dictate a reasonable amount of time and money to spend determining which possible explanations are most likely. Marketing managers, being responsible for decision-making, may wish the problem definition process to proceed quickly. Researchers, who can take a long time to carefully define problems, may frustrate managers. However, the time taken to identify the correct problem is time well spent.


THE RESEARCH PROPOSAL

research proposal A written statement of the research design that includes a statement explaining the purpose of the study and a detailed, systematic outline of procedures associated with a particular research methodology.

The research proposal is a written statement of the research design. It always includes a statement explaining the purpose of the study (research objectives) or a definition of the problem. It systematically outlines the particular research methodology and details the procedures that will be followed during each stage of the research process. Normally a schedule of costs and deadlines is included in the research proposal. Exhibit 2.5 illustrates a proposal for a short research project.
Preparation of a research proposal forces the researcher to think critically about each stage of the research process. Vague plans, abstract ideas and sweeping generalisations about problems or procedures must become concrete and precise statements about specific events. Information to be obtained and research procedures to be implemented have to be clearly specified so others may understand their exact implications. All ambiguities about why and how the research will be conducted must be clarified before the proposal is complete.
Because the proposal is a clearly outlined plan submitted to management for acceptance or rejection, it initially performs a communication function: it serves as a mechanism that allows managers to evaluate the details of the proposed research design and determine if alterations are necessary. The proposal helps managers decide if the proper information will be obtained and if the proposed research will accomplish what is desired. If the marketing problem has not been adequately translated into a set of specific research objectives and a research design, the client's assessment of the proposal will help to ensure that the researchers revise it to meet the client's information needs.

EXHIBIT 2.5 AN ABBREVIATED VERSION OF A RESEARCH PROPOSAL FOR NSW TRADE & INVESTMENT11

NSW Trade & Investment wishes to engage a service provider to develop a customer satisfaction measurement methodology for the Department and undertake research to establish a baseline for customer satisfaction and general awareness of and use of current services. The scope includes:
• Development of methodology
• Provision of a customer satisfaction metric and net promoter score
• Delivery of a single measure that will be used as a corporate driver for customer engagement improvements and act as a measurement for the Department's performance
• Research design, customer data collection and implementation
• Demonstrated capability to recruit target audience from various industry sectors and geographic locations (regional and international)
• Analysis and interpretation of results
• Online access (i.e. via service provider's portal) to information by service type, industry sector, international market and regional location
• Recommendation of ongoing customer satisfaction measurement

Requirements include the following:
• Development of methodology
• Provision of a customer satisfaction metric and net promoter score
• Delivery of a single measure that will be used as a corporate driver for customer engagement improvements and act as a measurement for the Department's performance
• Research design, customer data collection and implementation. Research design to be informed by:
– Service types
– Input from NSW Trade & Investment subject matter experts
– Target audience, which includes businesses (SMEs, large organisations, investors, buyers), community and industry representatives as they relate to the Department's service types, priority sectors and markets
• Analysis and interpretation of results
• Provision of insights that will drive strategic and tactical business actions
• Expectation is that insights will inform customer service improvements, and business process and service design
• Identification of competitors and awareness of their level of service to the market
• Develop a single measure that will represent the performance of strategic priority one (1) of the corporate plan – 'Engage with business and communities'
• Online access to information by service type, industry sector, international market and regional location
• Ability for executive leaders and senior managers to access and mine results (i.e. via service provider's portal) as it relates to a specific service, industry sector and/or international market
• Must have experience in developing a similar methodology/tool for similar organisations.

The proposal must communicate exactly what information will be obtained, where it will be obtained and how it will be obtained. For this reason, it must be explicit about sample selection, measurement, fieldwork and so on. For instance, most survey proposals will include a copy of the proposed questionnaire (or at least some sample questions) to ensure that managers and researchers agree on the information to be obtained and on the wording of questions.
The format for the research proposal in Exhibit 2.5 follows the six stages in the research process outlined in Exhibit 1.3. At each stage, one or more questions must be answered before the researcher can select one of the various alternatives. For example, before a proposal can be completed, the researcher needs to know what is to be measured. A simple statement like 'market share' may not be enough; market share may be measured by auditing retailers' or wholesalers' sales, using trade association data, or asking consumers what brands they buy. What is to be measured is just one of many important questions that must be answered before setting the research process in motion. This issue is addressed in greater detail in Chapter 8; for now, Table 2.3 presents an overview of some of the basic questions that managers and researchers typically must answer when planning a research design. Included in the online materials with this text is a suggested table of contents for a market research report. Review the research proposal in Exhibit 2.5 to see how some of the questions in Table 2.3 were answered in a specific situation.12 However, you will have to read this entire book before you can fully understand these issues.
In business, one often hears the adage: 'Don't say it, write it.' This is wise advice for the researcher who is proposing a research project to management. Misstatements and faulty communication may occur if the parties rely only on each individual's memory of what occurred at a planning meeting. Writing a proposal for a research design, specifying exactly what will be done, creates a record to which everyone can refer and eliminates many problems that might arise after the research has been conducted. With a written proposal, management and researchers alike are less likely to discover, after completion of the research, that information related to a particular variable was omitted or that the sample size was too small for a particular subgroup. Furthermore, as a statement of agreement between marketing executives and researchers, the formal proposal will reduce the tendency for someone reading the results to say: 'Shouldn't we have had a larger sample?' or 'Why didn't you do it this way?'


TABLE 2.3 » BASIC QUESTIONS TYPICALLY ASKED WHEN PLANNING A RESEARCH DESIGN

Problem definition: What is the purpose of the study? How much is already known? Is additional background information necessary? What is to be measured? How? Can the data be made available? Should research be conducted? Can a hypothesis be formulated?

Selection of basic research design: What types of questions need to be answered? Are descriptive or causal findings required? What is the source of the data? Can objective answers be obtained by asking people? How quickly is the information needed? How should survey questions be worded? How should experimental manipulations be made?

Selection of sample: Who or what is the source of the data? Can the target population be identified? Is a sample necessary? How accurate must the sample be? Is a probability sample necessary? Is a national sample necessary? How large a sample is necessary? How will the sample be selected?

Data gathering: Who will gather the data? How long will data gathering take? How much supervision is needed? What procedures will data collectors need to follow?

Data analysis and evaluation: Will standardised editing and coding procedures be used? How will the data be categorised? Will computer or hand tabulation be used? What is the nature of the data? What questions need to be answered? How many variables are to be investigated simultaneously? What are the criteria for evaluation of performance?

Type of report: Who will read the report? Are managerial recommendations requested? How many presentations are required? What will be the format of the written report?

Overall evaluation: How much will the study cost? Is the time frame acceptable? Is outside help needed? Will this research design attain the stated research objectives? When should the research begin?


As a record of the researcher's obligation, the proposal also provides a standard for determining whether the actual research was conducted as originally planned. When the research will be conducted by a consultant or an outside research supplier, the written proposal serves as that person's bid to offer a specific service. Typically, a client solicits several competitive proposals, and these written offers help management judge the relative quality of alternative research suppliers.
One final comment needs to be made about the nature of research proposals: not all proposals follow the same format. The researcher must adapt the proposal to the target audience. An extremely brief proposal submitted by an organisation's internal marketing research department to its own marketing executives bears little resemblance to a complex proposal submitted by a university professor to a federal government agency to research a basic consumer issue.

EXPLORING RESEARCH ETHICS

GIVE THEM WHAT THEY WANT OR WHAT THEY NEED?

Market research is a competitive business, and research findings may often be used merely to confirm or support decisions management has already taken. Likewise, there is often pressure on consultants to produce favourable findings for the client so that repeat business can be assured. These pressures may well be reflected in the nature of the research proposal. Good researchers must protect the objectivity and the true value of market research by providing only proposals that aid, rather than merely confirm, decision-making.

ANTICIPATING OUTCOMES

The description of the data processing and analysis stage in Table 2.3 is extremely brief because this topic is not discussed until Chapter 11. However, at this point some advice about data analysis is in order. One aspect of problem definition often lacking in research proposals is anticipating the outcomes (that is, the statistical findings) of the study. The use of a dummy table in the research proposal often helps the manager gain a better understanding of what the actual outcome of the research will be. Dummy tables are representations of the actual tables that will appear in the findings section of the final report. They get the name because the researcher populates, or 'dummies up', the tables with likely but fictitious data. In other words, the researcher anticipates what the final research report will contain (table by table) before the project begins.
A research analyst can present dummy tables to the decision-maker and ask: 'Given findings like these, will you be able to make a decision to solve your managerial problem?' If the decision-maker says yes, the proposal may be accepted. However, if the decision-maker cannot glean enough information from the dummy tables to make a decision about what the company would do with the hypothetical outcome they suggest, he or she must rethink what outcomes and data analyses are necessary to solve the problem. In other words, the marketing problem is clarified by deciding on action standards or performance criteria and recognising the types of research findings necessary to make specific decisions.

dummy tables Representations of the actual tables that will be in the findings section of the final report; used to provide a better understanding of what the actual outcomes of the research will be.
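One way to 'dummy up' such a table before any fieldwork begins is simply to build it with fictitious numbers. The Python sketch below, using pandas, is a hypothetical illustration only: the row and column labels echo the in-home shopping example from earlier in the chapter, and every figure is invented.

# Hypothetical dummy table: anticipated purchase intention for an in-home
# shopping service, broken down by age group. All values are fictitious.
import pandas as pd

dummy_table = pd.DataFrame(
    {"Definitely would use": [34, 22, 11],
     "Might use": [41, 38, 27],
     "Would not use": [25, 40, 62]},
    index=["18-34 years", "35-54 years", "55+ years"],
)
dummy_table.index.name = "Age group (% of respondents)"

print(dummy_table)
# The researcher shows this mock-up to the decision-maker and asks:
# 'Given findings like these, could you make your decision?'

If the decision-maker cannot act on a table like this, the research objectives, the planned breakdowns or the action standards need to be rethought before the proposal is finalised.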


SUMMARY

DISCUSS THE NATURE OF DECISION-MAKERS' OBJECTIVES AND THEIR ROLE IN DEFINING THE RESEARCH PROBLEM
The first step in any marketing research project is to define the problem or opportunity. Decision-makers must express their objectives to researchers to avoid getting the right answer to the wrong question. Defining the problem often is complicated in that portions of the problem may be hidden from view. The research must help management isolate and identify the problem to ensure that the real problem, rather than a symptom, is investigated.

UNDERSTAND THAT PROPER PROBLEM DEFINITION IS ESSENTIAL FOR EFFECTIVE MARKETING RESEARCH
The type and nature of the research problem will determine the research design and unit of analysis. There is no such thing as a research design that provides all the information for all types of problems. Problems that are not well defined or complex may require exploratory and/or qualitative research. More straightforward requests to know the extent of something may be answered by descriptive designs such as surveys. Lastly, wanting to know what causes changes in consumer behaviour and how this might be influenced by changes in the marketing mix is best answered by causal or experimental designs. It's a 'horses for courses' approach.

UNDERSTAND THE IMPORTANCE OF IDENTIFYING KEY VARIABLES
A variable is anything that changes in value. Variables may be categorical or continuous. One aspect of problem definition is the identification of the key dependent and independent variables.

DISCUSS HOW FORMULATION OF RESEARCH QUESTIONS AND HYPOTHESES CLARIFIES PROBLEM DEFINITION
Research questions and hypotheses are translations of the marketing problem into marketing research terms. A hypothesis is an unproven proposition or a possible solution to the problem. Hypotheses state relationships between variables that can be tested empirically. Hypotheses and research questions should be stated in clear, unambiguous terms for the research to deliver relevant results.

DISCUSS THE INFLUENCE OF THE STATEMENT OF THE MARKETING PROBLEM ON THE SPECIFIC RESEARCH OBJECTIVES
Research objectives specify information needs. For the research project to be successful, the research problem must be stated in terms of clear and precise research objectives. These should be provided by the client, but can be determined in partnership with the researcher. They should always be written as part of the brief and link back to the management decision or problem that triggered the need for research in the first place.

EXPLAIN THE PURPOSE OF THE RESEARCH PROPOSAL
The research proposal is a written statement of the research design that will be followed in addressing a specific problem. The research proposal allows managers to evaluate the details of the proposed research and determine if alterations are needed. Most research proposals include the following sections: purpose of the research, research design, sample design, data gathering and/or fieldwork techniques, data processing and analysis, budget, and time schedule.

KEY TERMS AND CONCEPTS
categorical (classificatory) variable, continuous variable, decision-makers' objectives, dependent variable, dummy tables, hypothesis, iceberg principle, independent variable, managerial action standard, problem definition, research objective, research proposal, situation analysis, variable

QUESTIONS FOR REVIEW AND CRITICAL THINKING
1 Assume you are the marketing manager for MasterCard. Prepare a market research brief that addresses the impact of cyberfraud on purchasing behaviour online. Include a statement of the management problem, research objectives and possible research design. How would your brief change if you were told by management that research findings had to be obtained in two weeks' time?
2 In its broadest context, what is the task of problem definition?
3 Define market symptoms. Give an example of how they apply to the hospitality and restaurant industries.
4 What is the iceberg principle?
5 A government department wants to reduce youth binge drinking. List some variables that might be investigated to solve this problem.
6 Go to the library, find the business journals, and record and evaluate some hypotheses that have been investigated in recent years. Identify the key independent and dependent variables.
7 Consider the following list, and indicate and explain whether each best fits the definition of a problem, opportunity or symptom.
a A 12.5 per cent decrease in store traffic for a department store in a medium-sized shopping centre.
b A furniture manufacturer and retailer in China reads a research report indicating consumer trends towards Australian jarrah and karri wood. The export of these products is very expensive.
c A health food manufacturer reads a research report by Euromonitor International that indicates that consumers in Malaysia are becoming more concerned about their health, as the median age of the population increases.
d DHL fuel costs increased 200 per cent between 2015 and 2016.
e The Hilton hotel group faces exchange rate pressures between Australia, the United States, Europe and Japan that have changed dramatically between 2005 and 2016.
8 What purpose does a research proposal serve?
9 What role should managers play in the development of the research proposal?
10 Comment on the following statements:
a 'The best marketing researchers are prepared to rethink and rewrite their proposals.'
b 'The client's signature is an essential element of the research proposal.'
11 You have been hired by a group of hotel owners, restaurant owners and other people engaged in businesses that benefit from tourism at Hawkes Bay, New Zealand. They wish to learn how they can attract a large number of international students there on their summer break. Define the marketing research problem. (You may substitute a beach town in your country or state if you prefer.)
12 You have been hired by the Victorian Country Fire Volunteers to learn how they can increase the number of people who volunteer as country firefighters. Define your research objectives.
13 You have solicited research proposals from several firms. The lowest bidder has the best questionnaire and proposal. However, there is one feature that you particularly like in the proposal submitted by a firm that will not receive the job. How should you handle this situation?
14 You are asked by management to provide a justification for last year's marketing campaign by conducting research that will show that it had a positive effect on sales. There is a very good chance that if your report is well received you will be considered for promotion. What should you do?

ONGOING PROJECT DOING A RESEARCH PROJECT? CONSULT THE CHAPTER 2 PROJECT WORKSHEET FOR HELP

Download the Chapter 2 project worksheet from the CourseMate website and see if you are able to address each of the six steps in defining the research problem. If you are unable to complete these six steps, then a literature review is required (see page 45, Tips of the trade).

COURSEMATE ONLINE STUDY TOOLS
Flip to the start of your textbook and use the tear-out card to log in to CourseMate for Marketing Research. There you can test your understanding and revise chapter concepts with:
☑ online research activities
☑ flashcards
☑ case projects
☑ interactive quizzes
☑ videos.

WRITTEN CASE STUDY 2.1 DRINKWISE: CHANGING AUSTRALIA'S DRINKING CULTURE13
DrinkWise Australia is an independent, not-for-profit organisation focused on promoting change towards a healthier and safer drinking culture in Australia. As an evidence-based organisation, DrinkWise relies on key independent research and clinical advice. The information supporting their campaigns and contained on their website has been gathered from studies and knowledge gained in consultation with experts in the fields of public health, neuroscience, epidemiology, and child and adolescent psychology. The research program of DrinkWise also includes market research. One market research study, entitled 'What a great night: The cultural drivers of drinking practices among 14–24-year-olds', was conducted by researchers at Monash and Deakin universities. The study used a number of research methodologies:
1 a drinking biographies study that involved in-depth interviews
2 a quantitative questionnaire with young people, aged 20–24
3 a sporting clubs study that involved in-depth interviews and focus groups with young people and key informants at Victorian sporting clubs.
The key results of the research were that alcohol consumption is a central feature of Australian culture and identity, particularly for young people. Rather than following guidelines and government recommendations regarding alcohol intake, young people emphasise responsibility when drinking. The preoccupation of alcohol researchers and public health practitioners with accurately measuring alcohol consumption and defining standard drinks is not shared by the young people in these research projects. Young adults did not count drinks as a measure of harm control; rather, it was the physical and emotional risks of being out of control that were considered by them. University colleges were places where heavy drinking occurred, while sporting clubs encouraged a greater sense of community and responsible drinking.

QUESTIONS

1 What do you think were the management issues/problems which triggered the need for this research?
2 Outline some key research objectives you think would have been developed from the management problems/issues in this research study.
3 How well do you think the research design meets the research objectives of the study? Justify your answer.

WRITTEN CASE STUDY 2.2 A TENDER FOR MARKET RESEARCH SERVICES BY THE AUSTRALIAN COMMUNICATIONS AND MEDIA AUTHORITY (ACMA)
Below is a tender from ACMA for market research services. The ACMA wishes to establish a panel of research provider organisations to supply social and market research services. The panel is expected to operate for 3 years with the possibility of two 1-year extensions. It is anticipated that the organisations that will make up the social and market research panel will be required to undertake research services falling within the major categories of:
a qualitative research
b quantitative research
c academic and other specialist research services
d off-the-shelf research, reports and/or data sets.
Within these categories will be a range of activities that will need to be supported by an appropriate match of research skills. The activities include, but may not be limited to:
a quantitative surveys (including computer-assisted telephone interviewing (CATI), online, mobile only, face-to-face, mail-out and omnibus surveys)
b qualitative research such as:
i focus group discussions (including in person and online groups)
ii in-depth interviews (i.e., via telephone or face-to-face)
c literature review
d peer review
e evaluation research (i.e., program evaluation, evaluation of effectiveness of brand marketing)
f other services (i.e., ethnographic research, desk research, specialist research services and/or advice/input into research development)
g production of research reports.
The ACMA requires primary research to be undertaken with a range of participant groups which may include, but is not limited to, the following:
a communities, consumers, citizens, audiences
b general population, those with a landline at home, those with a mobile only, Internet users
c specific sub-groups (i.e., non-English speaking background, Aboriginal and/or Torres Strait Islander, disability, older people, parents, children and young people)
d small, medium and large businesses and organisations
e metropolitan, regional and remote communities (as well as remote Indigenous communities)
f industry sectors (education, health, farming, etc.)
g related professional and stakeholder organisations.

QUESTIONS
1 Why is there no research problem in this tender?
2 Why is there no discussion of budget, or expected time of completion in this proposal?
3 How might a market research company go about preparing a proposal for this tender?


ONGOING CASE STUDY: MOBILE PHONE SWITCHING AND BILL SHOCK
David, Leanne and Steve, marketing academics at a large university, were approached by an Australian telecommunication provider, AusBargain. The company has developed a prepaid SIM card with improved simplicity and enhanced customer value. Consumers can also switch seamlessly between prepaid and contract plans with the provider. The prepaid SIM card provided by AusBargain is backed by an Australian call centre. The CEO of AusBargain wants David, Leanne and Steve to develop a research proposal, for no more than $50 000, to examine why consumers switch mobile providers and how they deal with bill shock (unexpected large charges on their account). He hopes this will trigger consumers to consider changing providers and therefore save money.

QUESTIONS
1 Develop a research proposal for this client.
2 What kind of research designs are best suited? Why?
3 Are there any ethical issues for David, Leanne and Steve to consider with this proposal and contract, and how could they be avoided?

NOTES
1 Australian Market and Social Research Society (2010) Research effectiveness awards: Winner 2010 for Social Impact: The welfare to work story getting employers to look outside the square. Accessed at http://researcheffectiveness.com.au/2012-winners on 5 November 2015.
2 The Economist Intelligence Unit (2011) 'Turning market research focus away from the shovel and to the hole', Executive Briefing, The Economist Intelligence Unit, 16 March, accessed at http://www.factiva.com on 10 July 2012.
3 Reuters News (2011) 'Disney's Magic Kingdom comes to the Middle Kingdom', Reuters News, 9 April, accessed at http://www.factiva.com on 10 July 2012.
4 Rankin, Tracey (2011) 'What's new is old', Research News, September, pp. 30–1, Australian Market and Social Research Society.
5 Ackoff, Russell (1962) Scientific method, New York: John Wiley and Sons, p. 71.
6 Chapman, Randall (1989) 'Problem definition in marketing research studies', Journal of Marketing Research, Spring, pp. 51–9; Yang, Yoo, Leone, Robert & Alden, Dala (1992) 'A market expansion ability approach to identify potential exporters', Journal of Marketing, January, p. 88.
7 Coyne, Patrick (1995) 'The ACD Conference on Interactive Media', Communication Arts, January/February, pp. 135–40. Excerpted with permission from Coyne & Blanchard, Inc.
8 Sources: 'What Your Workers Find Most Annoying', Legal Alert for Supervisors, 3, no. 64 (2008) 3; Piccolo, C. 'Irritating Coworkers', Medhunters.com (2008), http://www.medhunters.com/articles/irritatingCoworkers.html, accessed 20 July 2008; 'Xerox Survey Reveals Environmental Pet Peeves Among Office Workers', Graphic Arts Online (17 April 2008), accessed at http://www.graphicartsonline.com on 18 June 2008; Nudd, T. (2005) 'Pet Peeves', Adweek, 46 (26 September), p. 33.
9 Aulino, John (1975) 'Will the real market research manager stand up?', Marketing Review, April, p. 12.
10 Holbert, Bruce (1976) 'Research: The ways of academe and business', Business Horizons, February, p. 38.
11 NSW Government (2015) Archived Tender DTIRISS15/247. Accessed at https://tenders.nsw.gov.au/?event=public.rft.showArchived&RFTUUID=4BDC6099-C08FD7C2-48867065C46EFF78 on 6 November 2015.
12 Space restrictions do not permit the inclusion of a complete research proposal. Often entire questionnaires appear as exhibits in proposals. Students interested in additional information on writing research proposals should review Chapter 16 on writing the research report.
13 Lindsay, Jo, Kelly, Peter, Harrison, Lyn, Hickey, Christopher, Advocat, Jenny, & Cormack, Sue (2009) 'What a great night: The cultural drivers of alcohol consumption 14–24 year-old Australians', Australian Government, Department of Health and Ageing. Accessed at http://www.drinkwise.org.au on 15 September 2012.

PART THREE
PLANNING THE RESEARCH DESIGN
03 » QUALITATIVE RESEARCH
04 » SECONDARY RESEARCH WITH BIG DATA
05 » SURVEY RESEARCH
06 » OBSERVATION
07 » EXPERIMENTAL RESEARCH AND TEST MARKETING
08 » MEASUREMENT
09 » QUESTIONNAIRE DESIGN

PART 3: Planning the research design
[Flowchart: the nature of the research problem (undefined and exploratory, e.g. 'Why don't people buy the product?'; descriptive, e.g. 'Who is the market?'; causal, e.g. 'Does packaging influence sales?') guides the choice among the designs covered in this part: qualitative research, secondary research, observational research, survey research and experimental research, supported by measurement and questionnaire design.]

03 » QUALITATIVE RESEARCH

WHAT YOU WILL LEARN IN THIS CHAPTER
» To compare and contrast qualitative research and quantitative research.
» To understand the use of qualitative research in exploratory research designs.
» To describe the basic orientations of qualitative research.
» To recognise common qualitative research tools and know the advantages and limitations of their use.
» To prepare a focus group outline.
» To recognise technological advances in the application of qualitative research approaches.
» To appreciate the role of qualitative research in management decision-making.

Engaging with the young disenchanted voter in Australia1

As voting is compulsory in Australia, there is a much greater chance that people with low involvement in (or no interest in) politics have to engage in the political process. Despite voting being compulsory, voter turnout in Australia is lowest among younger (18–29-year-old) people. There is evidence from overseas research that such young voters have a much lower trust of politicians and 'a cynical view of the political system'.2 Other possible reasons for a lack of engagement may include that younger voters do not yet feel like stakeholders in the system, as they are less likely to own a house, have a stable residence or be parents. Lack of engagement with the political process is clearly a concern in Australia, with around 25 per cent not registering to vote in the 2013 election.3

Qualitative research, in this case in-depth personal interviews with a sample of 29 respondents, revealed possible reasons for a lack of engagement in politics by young Australians. These included:
Stability – younger voters, being less involved, are less party loyal:
1 satisfaction and trust of the political party
2 confidence in the political party and system
3 attitude to compulsory voting
4 attitudes to and use of media
5 impact on family and friends.
As one more engaged respondent noted:
I'm interested in what happens to my country to making sure that the best decisions are made, not only the ones that affect me, but impact on my family and my community and my workplace.4

This research, although using a small sample, suggests the basis for future research and the political strategy necessary to engage young adults in Australia. As will be seen in this chapter, qualitative research provides a depth of responses not possible with quantitative research. It is often the basis for problem definition (in this case, how to engage young adult Australian voters), by identifying key issues and possible causes of behaviour that can be confirmed in later quantitative, especially survey, research.


WHAT IS QUALITATIVE RESEARCH?

qualitative research Initial research and interpretative research that is not based on numerical analysis.

Qualitative research is a methodology that addresses research objectives through techniques that allow the researcher to provide elaborate interpretations of phenomena of interest without depending on numerical measurement. Qualitative research is less structured than quantitative approaches such as experiments and surveys. Its focus is on in-depth understanding and insight, rather than on more
generalisable findings associated with quantitative research. When a researcher has a limited amount of experience with or knowledge about a research issue, qualitative research is a useful preliminary step. It helps ensure that a more rigorous, conclusive future study will not begin with an inadequate understanding of the nature of the marketing problem. For this reason, qualitative research is often used to define key issues for more detailed and larger sample-based quantitative research that follows it, as was shown in the opening vignette of this chapter. Qualitative research relies more on the skill of the researcher, as he or she must extract meaning that is actionable from unstructured responses such as text, a recorded interview, stories from consumers, web logs, video recordings and transcripts. The researcher then summarises and extracts meaning from these raw data and converts them to information such as a research report or presentation for the client to act on. Clients are generally more familiar with qualitative findings than statistical findings from quantitative research; indeed, they can even be present when the interviews take place, such as in a focus group. For these reasons, qualitative research is one of the most popular research approaches in commercial marketing and social research today. Many advertising companies also use qualitative research as part of the process of the development of their campaigns.

SURVEY THIS!
Although we most often think of surveys as ways of collecting quantitative data, we can also use them to collect qualitative data. Take a look at the question in the screenshot that was part of the in-class survey. Find at least three responses from students in the data and try to interpret the results. What approach best fits your attempt to interpret this data? What do you think (what theory) can be learned from the responses to this question? Compare your interpretation with those of other students. How much do you agree with other students and what do you think is the source of disagreement, if any?

USES OF QUALITATIVE RESEARCH

ONGOING PROJECT

The purpose of qualitative research is intertwined with the need for a clear and precise statement of the recognised problem. Researchers conduct qualitative research for three interrelated purposes:
1 diagnosing a situation
2 screening alternatives
3 discovering new ideas.
For these reasons qualitative research is also often called exploratory research; the two terms are used interchangeably.


Diagnosing a situation Much has already been said about the need for situation analysis to clarify a problem's nature. Qualitative research helps to diagnose the dimensions of problems, so that successive research projects will be on target; it helps set priorities for research. In some cases exploratory research helps to orient management by gathering information on an unfamiliar topic. A research project may not yet be planned, but information about an issue will be needed before the marketing strategy can be developed. For example, focus group research for the National Drug Strategy in Australia showed that there were significant changes occurring in the 'drug landscape'. These included 'seeing drugs everywhere': in popular culture through movies, TV shows and music, and in the online environment. People also see and hear about illicit drugs at school, and talk about them with their peers and parents.5 The research showed that the use of ecstasy was waning and that there was now a greater use of hallucinogenic drugs such as LSD as cheaper replacements. Such research is invaluable to health and police authorities in understanding how the risks and criminal behaviour surrounding illicit drugs are changing, and forms an important part of any future strategy to reduce demand for these substances.

Screening alternatives When several opportunities, such as new product ideas, arise at once, but budgets don't allow trying all possible options, exploratory research may be used to determine the best alternatives. Qualitative research can help reveal which of several new product ideas are the best to pursue. Many good products are not on the market because a company chose to market something better. Equally, qualitative research combined with exploratory research may indicate that some new product ideas are unworkable. An exploratory look at market data (size, number and so on) may depict a product alternative as not feasible because the market of buyers is too small, while qualitative research may also show that there is not much interest or support for this new product. Concept testing is a frequent reason for conducting qualitative research. Concept testing is a general term for many different research procedures, all of which have the same purpose: to test some sort of stimulus as a proxy for a new, revised or repositioned product, service or strategy. Typically, consumers are presented with a written statement or filmed representation of an idea and asked if they think it is new and different, if they would use it, whether they like it, and so on. Concept testing is a means of evaluating ideas by providing a feel for their merits prior to the commitment of any research and development, manufacturing or other company resources. In the problematic area of organ donation it has been suggested that qualitative research could be useful in deciding how governments go about marketing this cause. What types of promotional appeals should be used? What style of public policy should be pursued, and should incentives be provided to donors?6

concept testing Any qualitative research procedure that tests some sort of stimulus as a proxy for an idea about a new, revised or repositioned product, service or strategy.

REAL WORLD SNAPSHOT

WHAT STUDENTS THINK ABOUT THE MARKET RESEARCH INDUSTRY7

Focus group research and street interviews with young adults, aged mainly 18–24, paint a poor picture of the market research industry in Australia, with many describing it as boring or difficult, and few as fun and interesting. Market research was seen as 'less sexy' than other marketing careers. Specifically, the respondents described the market research industry in terms of statistics and fieldwork. The market research industry was seen as paying less than other marketing professions. Students in this research were, however, unaware of the size of the industry ($2.6 billion), its growth in the Asia-Pacific area, and the internal use of market research by many companies and organisations.


Researchers look for trouble signals in consumer evaluations of concepts to reduce the number of concepts under consideration or improve them to avoid future problems. Concept testing portrays the functions, uses and possible applications for the proposed good or service. For example, marketers scrapped a concept for a men’s shampoo that claimed to offer a special benefit to hair damaged by overexposure to the sun, heat from a hair dryer or heavy perspiration, after exploratory research showed that consumers thought the product was a good idea for someone with an outdoor lifestyle, but not for themselves.8 Early research indicated that although the product was seen as unique, the likelihood of persuading men that it matched their self-image was low. If a concept is flawed, but the product has not been evaluated negatively, researchers may learn that the product concept needs to be refined or repositioned. For example, in 1990 Kelvinator launched the MagiCook, a microwave oven, on the Indian market. The product failed to take off, so the company brought out a series of new microwave ovens with a convection facility to brown food. Kelvinator also included a series of cookbooks on microwave cooking with these new models. Demonstrations were used to convince people that microwave ovens can be used for cooking everyday Indian meals.9


Discovering new ideas Marketers often conduct qualitative research to generate ideas for new products, advertising copy and so on. For example, automobile marketers have consumers design their dream cars using computerised design systems similar to those used by automotive designers. This qualitative research might generate ideas that would never have occurred to the firms' own designers. Uncovering consumer needs is a great potential source of product ideas. One goal of exploratory research is to first determine what problems consumers have with a product category. When research has to determine what kinds of products people will buy, there is a difference between asking people about what they want or need and asking them about their problems. When you ask a customer what he or she wants in a dog food, the reply likely will be: 'Something that is good for the dog.' If you ask what the problems with dog food are, you may learn that: 'The dog food smells bad when it is put into the refrigerator.' Once problems have been identified through research, the marketing job is to find ways to solve them.

REAL WORLD SNAPSHOT
THE OTHER SIDE OF THE COIN10
Some brands get all the good publicity and seem to have a virtual halo. Business succeeds when consumers like the firm's products – right? After all, how many social networking sites are dedicated to accumulating 'likes'? Product developers need information on what consumers like about brands' current products, but they also need to reliably guess at what people might like. Obviously, brand managers can learn from what customers like, but brand managers can also learn from what the consumers who don't like the brand have to say. A great many social network sites are devoted to dislike of brands. One Facebook site seeks to find one million consumers who hate Heineken. The site's wall and discussion board are filled with sometimes humorous and sometimes crude stories about consumers' true feelings about Heineken beer. Much of it reflects a disdain for all mass-marketed products. Few brands are more written about than Apple. A qualitative analysis of comments about Apple and anti-Apple social network sites and blogs reveals a striking contrast between those who 'like' Apple and others. Apple fans' comments focus more on design (appearance) and the hedonic gratification that comes from owning the latest electronics. In contrast, the comments left by Apple dissidents reflect a more utilitarian concern.


QUALITATIVE VERSUS QUANTITATIVE RESEARCH Quantitative research answers questions of fact necessary to determine a course of action. Qualitative research, on the other hand, never has this purpose. Usually, qualitative research provides greater understanding of a concept or crystallises a problem rather than providing precise measurement or quantification. The focus of qualitative research is not on numbers but on words and observations: stories, visual portrayals, meaningful characterisations, interpretations and other expressive descriptions (see Exhibit 3.1).

EXHIBIT 3.1 → COMPARING QUALITATIVE AND QUANTITATIVE RESEARCH
Qualitative research | Research aspect | Quantitative research
Discover ideas; used in exploratory research with general research objectives | Common purpose | Test hypotheses or specific research questions
Observe and interpret | Approach | Measure and test
Unstructured, free-form | Data collection approach | Structured response categories provided
Researcher is intimately involved; results are subjective | Researcher independence | Researcher is an uninvolved observer; results are objective
Small samples – often in natural settings | Samples | Large samples to produce generalisable results (results that apply to other situations)
Exploratory research designs | Most often used | Descriptive and causal research designs

Conversely, the purpose of quantitative research is to determine the quantity or extent of some phenomenon in the form of numbers. Most qualitative research is not quantitative. Qualitative research may be a single investigation or a series of informal studies intended to provide background information. Researchers must be creative in the choice of information sources to be investigated. They must be flexible enough to investigate all inexpensive sources that may possibly provide information to help managers understand a problem. This flexibility does not mean that researchers need not be careful and systematic when designing qualitative studies. Most of the techniques discussed in this chapter have limitations. Researchers should be keenly aware of the proper and improper uses of the various techniques.

ONGOING PROJECT

QUALITATIVE RESEARCH ORIENTATIONS
There are many ways in which qualitative research can be conducted. Four major schools of thought currently influence the choice of a technique:
1 phenomenology – originating in philosophy and psychology
2 ethnography – originating in anthropology
3 grounded theory – originating in sociology
4 case studies – originating in psychology and business research.

phenomenology A philosophical approach to studying human experiences based on the idea that human experience itself is inherently subjective and determined by the context in which people live.

Phenomenology Phenomenology is both a philosophy and a methodology and is used to develop an understanding of complex issues that are not apparent in direct responses.11 Usually this approach collects responses in the context in which people live.12 The phenomenological researcher focuses on how a consumer's behaviour is shaped by the relationship he or she has with the physical environment, objects, people and situations. Phenomenological inquiry seeks to describe, reflect upon and interpret experiences. Researchers using this approach rely on conversational interview tools, and the interviews are often recorded in audio and/or video. The phenomenological interviewer does not seek to ask direct questions, but only makes the respondent comfortable enough to tell their story. In order to gain such trust, researchers may become members of the respondent's peer group (for example, becoming an extreme sports participant) or ask the person to reveal their true name (which is in direct conflict with the research ethics of the Australian Market and Social Research Society). This, however, may be an issue when dealing with sensitive issues, such as shoplifting, drug taking or sexual behaviour. Another name for the analysis of texts in respondent stories is hermeneutics. The aim of hermeneutics is to find key themes, patterns or archetypes in a respondent's story; these are often called hermeneutic units. Stories or inputs into hermeneutics can describe characters, locations and events; they do not involve opinions per se or mere facts. Stories13 can be collected by:
→→ anecdote circles – where you simply invite a group of people to come together and ask them to tell you their experiences on a particular topic
→→ naive interviewers – where they collect stories for you: this is particularly useful if interviewing children; as children love to collect stories, you can ask them to collect stories for you
→→ mass narrative capture – where people contribute stories online.
Information collected by stories can then be analysed by experts, machines (qualitative software) or by the participants themselves. Participants may identify archetypes or patterns. Because of the subjective nature of this approach, it is recommended that story interpretation should be done by a number of means and groups; this is often called triangulation in qualitative research. Phenomenological or story-based approaches seem best suited to make sense of complex, ambiguous situations, or where direct questioning may be counterproductive; for example, situations with an emotional or cultural component.

hermeneutics An approach to understanding phenomenology that relies on analysis of texts through which a person tells a story about him- or herself.
hermeneutic unit Refers to a text passage from a respondent's story that is linked with a key theme from within this story or provided by the researcher.
ethnography Represents ways of studying cultures through methods that involve becoming highly active within that culture.

Ethnographic (participant observation) approaches may be useful to understand how children obtain value from their experiences with toys.

Ethnography Ethnography is a research approach from anthropology that studies cultures by participant observation. Participant observation means that the researcher becomes immersed in the culture they are studying. A culture can be a broad culture like the Chinese culture or it can be a narrower subculture, such as skateboarders or computer gamers. Ethnographic research relies more on observation of ‘natural behaviour’ than direct questioning and is particularly useful when a certain culture or subculture cannot verbalise their thoughts and feelings. Ethnographic research, for example, may be particularly useful when studying children, as it does not involve language and avoids biases of direct question and answering of respondents who may wish to please the interviewer. Instead the researcher can simply become part of the environment and watch what children do (for example, in play) and record their behaviour. Like many qualitative approaches, ethnography relies on researcher interpretation. Therefore, findings in ethnographic studies should be checked (triangulated) with a number of experts and checked with respondents to see if the correct meanings have been assigned to their behaviours.

A researcher can understand much about children’s motivations by observing the way they play.


REAL WORLD SNAPSHOT

DOING ETHNOGRAPHIC RESEARCH IN A MARKET RESEARCH SUBJECT14

Students at the University of Technology Sydney have successfully used ethnographic research to understand the consumer behaviour of shoppers in Parramatta. Over a period of four weeks the students were rostered on four-hour observation periods of shoppers. Students were trained in the ethics of research and in the interpretation of their findings. Results from the student project showed that lone shoppers rarely purchased but returned later with family. Fewer items were examined when there was interaction with sales staff. Salespeople were seen as annoying if they approached people, but the reverse was true when the customer was there to make a purchase. Finally, single customers exhibited fewer emotional states than those who visited with others. The students compiled a reflective journal of their findings and used the results to inform later focus group or survey research. The authors suggest that, given the industry demand for the type of skills developed in this project, it is unfortunate that more opportunity to develop ethnographic skills is not provided to students. A new and emerging area of ethnography is netnography, which is the study of online posts in discussion groups and online communities. The researcher observes the text-based conversation for a period of time, collecting posts and threads and analysing the discussions, and then reports on the findings.15 Netnography has been successfully used to examine co-creation of value in health forums16 and in the value of user support forums for organisational customers.17
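As a minimal illustration of the kind of 'social listening' that netnography involves (this sketch is ours, not part of the UTS project or any tool described in this chapter), the Python fragment below counts how often a set of tracked topic terms appears across a small collection of forum posts. The posts and keywords are invented for illustration only.

```python
from collections import Counter
import re

# Hypothetical forum posts a netnographer has collected by observation.
posts = [
    "The support forum solved my billing problem in a day - great community.",
    "Billing errors again. The help pages are useless, but forum members helped.",
    "Love how members co-create fixes; the vendor just watches the forum.",
]

# Illustrative topic keywords a researcher might track across the discussion.
keywords = ["billing", "forum", "community", "members", "vendor"]

def keyword_mentions(texts, terms):
    """Count occurrences of each tracked term across all posts (case-insensitive)."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z']+", text.lower())
        for term in terms:
            counts[term] += tokens.count(term)
    return counts

if __name__ == "__main__":
    for term, n in keyword_mentions(posts, keywords).most_common():
        print(f"{term}: {n}")
```

In practice such counts would only be a starting point; the researcher would still read and interpret the threads themselves, as the chapter emphasises.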

Grounded theory Grounded theory, though not widely used in market research, represents a more direct approach to qualitative research than does ethnography or phenomenology. With grounded theory the researcher poses questions about information provided from historical records; for example, a family history of travel records. The researcher asks the questions to her- or himself and repeatedly questions the responses to derive deeper explanations. Grounded theory is particularly useful for highly dynamic situations involving rapid and significant changes; for example, financial and travel decisions that result from career changes or retirement, or significant life events such as the birth of a first child. Two key questions asked by grounded theory are: 'What is happening here?' and 'How is it different?' The defining characteristic of grounded theory is that it does not begin with a theory but instead extracts one from whatever emerges from an area of inquiry. While this means grounded theory is a very flexible approach to qualitative research, problems can remain; for example, the inquiry can proceed in many directions other than the stated purpose of the research, because of the search for meaningful explanation. Grounded theory does have practical application and has been used to develop new product concepts for mobile phones.18

netnography Netnography is a type of ethnography that analyses the free behaviour of individuals on the Internet. In particular, the posts, tweets and other online contributions of participants are studied, usually by observation. In the market research field it is also known as 'social listening'.
grounded theory Represents an inductive investigation in which the researcher poses questions about information provided by respondents or taken from historical records; the researcher asks the questions to him- or herself and repeatedly questions the responses to derive deeper explanations.
case study method The qualitative research technique that intensively investigates one or a few situations similar to the problem situation.

Case studies The purpose of the case study method is to obtain information from one or a few situations that are similar to the researcher’s problem situation. For example, a bank in New South Wales may intensively investigate the marketing activities of an innovative bank in New Zealand. A shirt manufacturer interested in surveying retailers may first look at a few retail shops to identify the nature of any problems or topics that a larger study should investigate. Clinical interviews of individual consumers can also represent a case study. These may focus on their experiences with certain brands or products. The case studies are then analysed for important themes. Themes are identified by the frequency with which the same term (or a synonym) arises in the narrative description. The themes may be useful for identifying variables (or factors) that are relevant to potential explanations.


An example of case study research occurred recently when an Australian communications agency, Naked,19 hired former Big Brother psychologist Simon Thatcher in order to understand what areas of psychological theories influence consumer behaviour. This couch-based series of case studies was completed for an advertising project on a traditional grocery product with an additional health benefit. Relevant consumer behaviour insights were based on the applications of psychological theories such as: cognitive behavioural therapy (how you think dictates how you feel and how you behave); existential psychology (life is about reconciling death, aloneness, meaning and freedom); positive psychology (learning to be optimistic); and mindfulness (learning to live in the present). The primary advantage of the case study is that an entire organisation or entity can be investigated in depth with meticulous attention to detail. This highly focused attention enables the researcher to carefully study the order of events as they occur or to concentrate on identifying the relationships among functions, individuals or entities. A fast-food restaurant may test a new menu item or a new shop design in a single location, to learn about potential operating problems that could hinder service quality, before launching the change throughout the chain. Conducting a case study often requires the cooperation of the party whose history is being studied. A successful franchisee may be willing to allow the franchisor access to records and reports. Intensive interviews or long discussions with the franchisee and his or her employees may provide an understanding of a situation. The researcher has no standard procedures to follow: he or she must be flexible and attempt to glean information and insights wherever possible. This freedom to search for whatever data an investigator deems important makes the success of any case study highly dependent on the alertness, creativity, intelligence and motivation of the individual performing the case analysis. As with all qualitative research, the results from case studies should be seen as tentative. Generalising from a few cases can be dangerous, because most situations are atypical in some sense. For example, the Hong Kong and Shanghai Banking Corporation (HSBC) in Malaysia may not be in a market comparable to the one in Singapore. Even if situations are not directly comparable, however, a number of insights can be gained and hypotheses suggested for future research. Obtaining information about competitors may be very difficult, because they generally like to keep the secrets of their success to themselves. The exact formula of Coca-Cola, for example, is known by only a few top executives in the firm; they feel that confidentiality is a definite competitive edge in their product strategy. Thus, researchers may have limited access to information from other firms.

REAL WORLD SNAPSHOT
THE USE OF A CASE STUDY: DESTINATION BRANDING IN THE HIGH COUNTRY OF VICTORIA, AUSTRALIA20
Case study research can be useful in tourism when designing a branding campaign. An examination of a successful branding strategy for the high country in Victoria showed that a broader suite of values (social, cultural, historic, geographic, symbolic, environmental and economic) were important aspects of the destination brand. The case study suggested that sustainable brands are those that are developed organically, driven by the values held by local brand communities and networks, rather than a more limited consumer-based value set being imposed upon a destination.


ANALYSING QUALITATIVE RESPONSES As in quantitative analysis, qualitative information (verbal responses, content analysis, blogs, SMS, video and audio) can be structured into summated findings. This is especially so given the now widespread use of social media and online interactions and the development of software packages such as NVivo and Leximancer. Both packages use thematic analysis, a technique where phrases or concepts are joined by their co-occurrence across respondents, events and time. A six-step approach is recommended.

EXHIBIT 3.2 → SAMPLE LEXIMANCER MAP OF THEMES FROM A SELECTION OF FASHION BLOGS
[Concept map: themes and concepts extracted from fashion blog posts (fashion, style, brand, beauty, body, price, shopping, designer labels, blogging and related terms), linked by their co-occurrence across the blogs.]

The first step in thematic analysis is to become familiar with the data. The sample Leximancer map uses blogs on fashion, selected according to criteria of interest to the researcher (length, number of reviews, gender of blogger, age, ethnicity, geographical location, industry expertise, to name a few). The responses are then coded. These codes are used to systematically represent the social constructions of the social processes. The third step of thematic analysis is to assign sets of codes to potential themes. The fourth step is to review these themes and to assign additional themes or include any missing codes into themes. It is often the case here that the researcher tries to take the perspective of the respondent and develop a thematic analysis consistent with his or her stories and experiences. The fifth step is to name and define themes, so that they are reflective of the respondents' perspectives. The final stage of thematic analysis is to produce a written report. Note that thematic analysis often involves moving backwards and forwards between the data and the report. The data should be presented in such a way that they tell a story, rather than providing simply a summary of responses and content. Exhibit 3.2 shows the result of such an approach.
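To give a rough sense of the co-occurrence idea that underlies software such as Leximancer (this is not the package's actual algorithm, only a simplified sketch), the following Python fragment counts how often pairs of researcher-assigned codes appear in the same response. Pairs that co-occur frequently are candidates to be grouped into a theme in steps three and four. The coded responses are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical coded responses: each respondent's transcript has already been
# tagged with one or more codes by the researcher (step 2 of thematic analysis).
coded_responses = [
    {"price", "brand", "quality"},
    {"brand", "style", "quality"},
    {"price", "discount"},
    {"style", "brand", "body image"},
]

def code_cooccurrence(responses):
    """Count how often each pair of codes appears together in one response."""
    pair_counts = Counter()
    for codes in responses:
        for pair in combinations(sorted(codes), 2):
            pair_counts[pair] += 1
    return pair_counts

if __name__ == "__main__":
    # Frequently co-occurring pairs suggest candidate themes to review and name.
    for (a, b), n in code_cooccurrence(coded_responses).most_common(5):
        print(f"{a} + {b}: {n}")
```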

COMMON TECHNIQUES USED IN QUALITATIVE RESEARCH There are many and evolving techniques in qualitative research. These techniques may be used for a number of research approaches discussed previously, although each category of qualitative approach will have a preference for a particular technique. It is also important to keep in mind that various approaches and associated techniques may be best suited to a particular type of research problem or objective, in line with a management decision. There is no one perfect research design that can meet all research objectives. Exhibit 3.3 outlines characteristics of some of the common qualitative research techniques.

focus group interview An unstructured, free-flowing interview with a small group of people.

Focus group interviews

The focus group interview has become so popular that many advertising and research agencies consider it to be the only qualitative research tool. As noted in Chapter 1, a focus group interview is an unstructured, free-flowing interview with a small group of people. It is not a rigidly constructed question-and-answer session but a flexible format that encourages discussion of a brand, advertisement or new-product concept. The group meets at a central location at a designated time; typically, it consists of a moderator or interviewer and six to 10 participants, although larger groups are sometimes used. The participants may range from consumers talking about hair colouring or petroleum engineers talking about problems in the 'oil patch', to children talking about toys. The moderator introduces the topic and encourages group members to discuss the subject among themselves.

Traditional focus group facilities typically include a comfortable room for respondents, recording equipment and a viewing room via a two-way mirror.


EXHIBIT 3.3 → COMMON QUALITATIVE RESEARCH TOOLS
Tool | Description | Type of approach (category) | Key advantages | Key disadvantages
Focus group interviews | Small group discussions led by a trained moderator | Ethnography, Case studies | Can be done quickly; gain multiple perspectives; flexibility; inexpensive | Results do not generalise to larger population; difficult to use for sensitive topics
Depth interviews | One-on-one, probing interview between a trained researcher and a respondent | Ethnography, Grounded theory, Case studies | Gain considerable insight from each individual; good for understanding unusual behaviours | Results do not generalise; very expensive per interview
Conversations | Unstructured dialogue recorded by a researcher | Phenomenology, Grounded theory | Gain unique insights from enthusiasts; can cover sensitive topics; less expensive than depth interviews or focus groups | Easy to get off course; interpretations are very researcher-dependent
Semi-structured interviews | Open-ended questions, often in writing, that ask for short essay-type answers from respondents | Grounded theory, Ethnography | Can address more specific issues; results can be easily interpreted; cost advantages over focus groups and depth interviews | Lack the flexibility that is likely to produce truly creative or novel explanations
Word association/Sentence completion | Records the first thoughts that come to a consumer in response to some stimulus | Grounded theory, Case studies | Economical; can be done quickly | Lack the flexibility that is likely to produce truly creative or novel explanations
Observation | Recorded notes describing observed events | Ethnography, Grounded theory, Case studies | Can be unobtrusive; can yield actual behaviour patterns | Can be very expensive with participant-observer series
Collages | Respondent assembles pictures that represent their thoughts/feelings | Phenomenology, Grounded theory | Flexible enough to allow novel insights | Highly dependent on the researcher's interpretation
Thematic apperception/Cartoon tests | Researcher provides an ambiguous picture and respondent tells a story about it | Phenomenology, Grounded theory | Projective, allows researchers to get at sensitive issues; flexible | Highly dependent on the researcher's interpretation


Ideally, the discussion topics emerge at the group's initiative. Focus groups allow people to discuss their true feelings, anxieties and frustrations, as well as the depth of their convictions, in their own words. The primary advantages of focus group interviews are that they are relatively
fast, easy to execute and inexpensive. In an emergency situation, three or four group sessions can be conducted, analysed and reported in less than a week at a cost substantially lower than that of other attitude-measurement techniques. Remember, however, that a small group of people will not be a representative sample no matter how carefully they are recruited. Focus group interviews cannot take the place of quantitative studies.

ONGOING PROJECT

The flexibility of focus group interviews has some advantages, especially when compared with the rigid format of a survey. Numerous topics can be discussed and many insights can be gained, particularly with regard to the variations in consumer behaviour in different situations. Responses that would be unlikely to emerge in a survey often come out in group interviews: ‘If it is one of the three brands I sometimes use and if it is on sale, I buy it; otherwise, I buy my regular brand’ or ‘If the day is hot and I have to serve the whole neighbourhood, I make cordial; otherwise, I give them Pepsi or Coke.’ If a researcher is investigating a target group to determine who consumes a particular beverage or why a consumer purchases a certain brand, situational factors must be taken into account. If the researcher does not realise the impact of the occasion on which the particular beverage is consumed, the results of the research may be general rather than portraying the consumer’s actual thought process. Focus groups often are used for concept screening and concept refinement. The concept may be continually modified, refined and retested until management believes it is acceptable. Political parties in Australia, for example, formulate, test and refine policies based on focus group research. This is discussed in the ‘Exploring research ethics’ box.

EXPLORING RESEARCH ETHICS
FOCUS GROUPS: THE DEATH OF POLITICAL LEADERSHIP IN AUSTRALIA?21
All political parties in Australia use focus group research to monitor the mood of the voters. The research is paid for by Australian taxpayers. Commenting on their widespread use, a leading marketing academic, Professor Pascale Quester, former Executive Dean of the Faculty of Professions at the University of Adelaide, noted: 'The use of focus groups in politics is actually the death of the conviction politician.' Professor Quester has studied focus groups and conducted them as well. In pure marketing terms she thinks they can be useful in the early development of a product to find out its strengths and weaknesses. But politicians should avoid them. Nevertheless, focus groups are widely used to canvass policy changes and to examine the political mood of voters in marginal seats. The danger is that the small numbers involved may not reflect overall community sentiment, and politicians may become extremely short term and reactive in their thinking.

The specific advantages of focus group interviews have been categorised as follows:22 →→ Synergy: The combined effort of the group will produce a wider range of information, insights and ideas than the accumulation of separately secured responses from a number of individuals. →→ Snowballing: A bandwagon effect often operates in a group interview situation. A comment by one individual often triggers a chain of responses from the other participants. Brainstorming of ideas frequently is encouraged in focus group sessions. →→ Serendipity: It is more likely that an idea will drop out of the blue in a focus group than in an individual interview. The group also affords a greater opportunity to develop an idea to its full potential.


→→ Stimulation: Usually, after a brief introductory period, the respondents want to express their ideas and expose their feelings as the general level of excitement over the topic increases. →→ Security: In a well-structured group, the individual usually can find some comfort in the fact that his or her feelings are similar to those of others in the group and that each participant can expose an idea without being obliged to defend it or follow through and elaborate on it. One is more likely to be candid because the focus is on the group rather than on the individual; the participant soon realises that the things said are not necessarily being identified with him or her. →→ Spontaneity: Because no individual is required to answer any given question in a group interview, the individual’s responses can be more spontaneous and less conventional. They should provide a more accurate picture of the person’s position on some issue. In the group interview, people speak only when they have definite feelings about a subject, not because a question requires a response. →→ Specialisation: The group interview allows the use of a more highly trained interviewer (moderator) because certain economies of scale exist when a number of individuals are interviewed simultaneously. →→ Structure: The group interview affords more control than the individual interview regarding the topics covered and the depth in which they are treated. The moderator has the opportunity to reopen topics that received too shallow a discussion when initially presented. →→ Speed: The group interview secures a given number of interviews more quickly than interviewing individual respondents. →→ Scientific scrutiny: The group interview allows closer scrutiny in several ways. First, the session can be observed by several people, which allows some check on the consistency of the interpretations. Second, the session can be electronically monitored and recorded. Later, detailed examination of the recorded session can offer additional insight and help clear up any disagreements about what happened.

GROUP COMPOSITION The ideal size of the focus group is six to 10 relatively similar people. If the group is too small, one or two members may intimidate the others. Groups that are too large may not allow for adequate participation by each group member. Homogeneous groups seem to work best, because they allow researchers to concentrate on consumers with similar lifestyles, experiences and communication skills. The session does not become rife with too many arguments and different viewpoints stemming from diverse backgrounds. When the US Centers for Disease Control and Prevention tested public service announcements about AIDS through focus groups, it discovered that single-race groups and multicultural groups reacted differently. By conducting separate focus groups, the organisation was able to gain important insights about which creative strategies were most appropriate for targeted versus broad audiences.

REAL WORLD SNAPSHOT

FOCUS GROUPS GIVEN NEW ROLE TO PLAY23

A new research concept that removes the mirrored walls and television monitors used during research focus groups and puts marketers in the same room as consumers led to several new product concepts for Red Rock Deli, a brand owned by The Smith’s Snackfood Company. The concept, called ‘co-creating focus groups’, was created by Ken Hudson from High Performance Thinking. He claimed that traditional focus groups were tired, sterile and placed artificial barriers between marketers and consumers.


‘The co-creating process was born out of a frustration from marketing people who said that they do a lot of qualitative research yet rarely get new ideas or opportunities from consumers,’ said Mr Hudson, a former marketing director at American Express. ‘Traditional focus groups give them insights but not new concepts. They are partly designed to support researchers’ claims that only they can really interpret what consumers say. But they’re just people.’ Although Mr Hudson claimed his approach was new, other researchers disagreed. Graham Chant from Chant Link & Associates said some interaction between marketers and consumers was not unusual. ‘There probably are situations when the best-quality information allows marketers to interact with consumers,’ he said. ‘Depending on the situation, [co-creating focus groups] may have legs and we’d certainly consider doing that, depending on the problem.’ Researcher Barry Elliott said marketers and consumers have been jointly participating in research groups for 20 years. ‘It’s wrong to call it research; it’s more creative brainstorming. My gut feeling is that there won’t be a terribly big market for it.’ Mr Hudson launched the co-creating concept [in 2004] and has conducted groups for Smith’s, a building products company, a confectionery brand aimed at teenagers and a breakfast cereal. Some of the ideas developed for Red Rock Deli, a brand of premium-priced chips, were being assessed through quantitative research.

For example, a typical homogeneous group might be made up of married, full-time homemakers with children at home; the group would not include unmarried, working women. Having first-time mothers in a group with women who have three or four children reduces the new mothers’ participation; they look to the more experienced mothers for advice. Although they may differ in their opinions, they defer to the more experienced mothers; thus, first-time mothers and experienced mothers would be in separate groups. Researchers who wish to collect information from different types of people should conduct several focus groups; for example, one focus group might consist only of men and another only of women. Thus, a diverse sample may be obtained even though each group is homogeneous. Most focus group experts believe that four focus group sessions (often in different cities) can satisfy the needs of exploratory research.

ENVIRONMENTAL CONDITIONS The group session may take place at the research agency, the advertising agency, a hotel or one of the subjects’ homes. Research suppliers that specialise in conducting focus groups operate from commercial facilities that have cameras in observation rooms behind two-way mirrors, plus microphone systems connected to recorders and speakers to allow observation by others who are not in the room. Some researchers suggest that the atmosphere be established in the commercial research facility to ensure that the mood of the sessions will be as relaxed and natural as possible. They expect more open and intimate reports of personal experiences and sentiments to be obtained under these conditions. Often food and drinks are provided. The provision of alcohol, though, is not a good idea.

THE MODERATOR
Exhibit 3.4 is a partial transcript of a focus group interview held in 2011. Notice how the moderator ensures that everyone gets a chance to speak and how he or she contributes to the discussion.

moderator The person who leads a focus group discussion.


EXHIBIT 3.4 →    WHAT HAPPENS IN A FOCUS GROUP? RESULTS FROM A MOBILE PHONE SWITCHING STUDY: 18–24 YEAR OLDS24

Introduction...What kind of mobile phone that you have got at the moment, the plan...?
I have had my current phone for a while. A fair few months, I guess. It’s a Samsung that I got – I think it was on contract. Basically, the story of it was that my mum gave it to me as a gift and she put me onto a plan and volunteered to pay for it. So I was pretty lucky there. So I am on a Crazy John’s service.
So the phone was part of the contract; you didn’t buy the phone?
No, it all came as a contract.
What about you?
I was with 3G before. I changed probably about five months ago.
That is Telstra 3G?
No, just as in 3 mobile, in response to the – when I used to watch the cricket. So I was with them. I just live in Lane Cove and for some reason we just get dead zones and things like that with that company. So it was just shocking. So I changed over. I am with Vodafone now. I have been with them in the past. Basically, 3G came out with pretty cheap phones and good deals at the time. So I switched to them and, yeah, changed back.
How cheap were the phones; can you remember?
It’s before 1999. I had a silver flip phone. It had a camera but no net access or anything like that.
What about you?
I have got a Samsung Galaxy. I changed about six months ago because my plan was ending. My provider is 3. I went from Optus to 3.
What was different about your current plan compared to your old one?
I just wanted the phone, actually. The new phone.
You wanted a new phone and then provider?
Yeah, my plan was ending and also I was interested in the phone. That was the best plan between the providers at the time. That was before six months ago that I changed.
What about you?
I have a pretty bad history of losing phones. So at the moment I am not on a contract. I am prepaid with Optus. I used to be with Vodafone. Then I went overseas for a bit, came back and I didn’t know where my phone was. I thought my contract is running out. I am swapping anyway, so I am prepaid at the moment. Prepaid with Optus.
What kind of handset?
Right now, BlackBerry.
What about you?
I was with Vodafone for about 12 years and had several unsatisfactory experiences and the primary one was, I had barely any reception at my house: that deteriorated over time and then I got fed up. I switched to Telstra. Now I am on iPhone 4, $80 a month. I can’t remember what I get for that, but I don’t usually pay more than that.

The moderator’s job is to develop a rapport with the group and to promote interaction among its members. The moderator should be someone who is really interested in people, who listens carefully to what others have to say, and who can readily establish rapport, gain people’s confidence and make them feel relaxed and eager to talk. Careful listening is especially important because the group interview’s purpose is to stimulate spontaneous responses. The moderator’s role is also to focus the discussion on the areas of concern. When a topic is no longer generating fresh ideas, the effective moderator changes the flow of discussion. The moderator does not give the group total control of the discussion, but normally has prepared questions on topics that concern management. However, the timing of these questions in the discussion and the manner in which they are raised are left to the moderator’s discretion. The term ‘focus group’ thus stems from the moderator’s task: he or she starts out by asking for a general discussion but usually focuses in on specific topics during the session.

discussion guide A document prepared by the focus group moderator that contains remarks about the nature of the group and outlines the topics or questions to be addressed.

PLANNING THE FOCUS GROUP OUTLINE
Effective focus group moderators prepare discussion guides to help ensure that the groups cover all topics of interest. The discussion guide begins with a written statement of the introductory remarks to inform the group about the nature of the focus group; it then outlines topics or questions to be addressed in the group session.


A group of marketing researchers who wanted to examine switching behaviour in the mobile phone market used the discussion guide shown in Exhibit 3.5. Note that some sample responses from this study among Gen Y consumers are shown in Exhibit 3.4. The marketing researchers designed the question guide in the following way:
→→ The first questions set the scene of the focus group, and described to the respondents how the focus group would be run by the moderator.
→→ The second set of questions described the context; usually the current market behaviour and attitudes of the respondents.
→→ Succeeding questions were asked, first on an open-ended basis, about what factors might influence them to switch mobile phone providers (push/pull factors).
→→ Finally, the ‘bottom line’ questions were asked about the difficulty of switching and any post-sales complaints respondents had and how they were handled.
Notice that the researchers who planned the outline established certain objectives for each part of the focus group. The initial effort was to break the ice and establish rapport within the group. The logical flow of the group session then moved from general discussion about mobile phones to more focused discussion on what would influence participants to leave a current provider (push factors) and what would attract them to a new mobile phone provider (pull factors).

EXHIBIT 3.5 EXCERPT OF A DISCUSSION GUIDE FOR A MOBILE PHONE SWITCHING STUDY25

1 INTRODUCTION
a Introduce moderator and thanks for coming.
b Mobiles OFF.
c Directions to the toilets and safety.
d Permission to record audio and video.
e Only one person to speak at once to ensure audio recording clear.
f Advise that the group is also being viewed live.
g Interested in all the opinions, experiences and perceptions on both current and previous mobile provider (be it positives or negatives).
h No right or wrong answers – everybody has a different opinion and circumstances.
i Self-introduction of group participants.
We would like to discuss your experiences with both your current and previous mobile service provider and the things that pushed you away from your previous mobile service provider. Moderator to probe for full understanding of words/descriptions used, for example: ‘What do you mean by “rude” staff?’ ‘What do you mean by “difficult to communicate with them”?’

2 DESCRIBES THE CONTEXT
a How long have you been with your mobile service provider?
b Describe the details of the plan you were previously on before you switched.
c Prompt if necessary: Who with? How much? What does it include (data, SMS etc.)?
d What’s different about your current plan compared to the old one?
e Do you use more than one mobile provider? If so, why?
f Do you know the difference between a network carrier (e.g., Telstra, Optus) and a service provider (e.g., Boost, Crazy John’s)? Does it matter?
g How often do you monitor your bill? If so, how do you monitor?

3 DESCRIBES THE PUSH FACTORS AWAY FROM YOUR PREVIOUS MOBILE PHONE SERVICE PROVIDER
a Describe your experience with your previous mobile service provider.
b Was there anything that really ‘disappointed’ or ‘frustrated’ you, or made you unhappy about your previous mobile service provider (e.g. too hard to compare plans, they’re all the same, I want to keep my mobile number, I’ve been with my current provider for so long I can’t be bothered changing)?
c Were there any incidents that occurred that led you to say, ‘That’s enough. I have to change’?
d Probe: What were the biggest areas of service failure (examples include billing errors, technical problems, unexpected costs, lack of customer service)?
e Did you incur any unexpected costs (or savings; i.e., you underspent) with your previous mobile phone service provider? What happened? How did you feel?
f Did you find it easy or difficult to understand all of the costs involved in using your previous mobile service provider? Was your previous service provider open in telling you all the costs involved? Have you incurred any unexpected costs (or savings) with your current mobile service provider? What happened? How did you feel? What did you do about it?

4 DESCRIBES THE PULL FACTORS TOWARDS YOUR CURRENT NEW MOBILE PHONE SERVICE PROVIDER
a Now that you have switched, what attracted you to move to this new mobile phone service provider?
b Describe your experience and the steps you took in searching for a new mobile phone service provider.
c How difficult was it for you to compare the different service provider offers? Was the information easy to find and understand? [Prompt if necessary.]
d Is there anything which caused confusion?
e Describe your experiences switching into the new provider. [Probe if necessary.]

5
a Where did you buy your mobile service from? In person? Over the phone? Or on the Internet? Why did you choose this source? Were there any hassles along the way?
b Did you read the contract? Any issues with it?
c How difficult to understand was the pre-sales marketing and product information provided to you by your new provider?
d Did the mobile phone provider follow through on its customer service promises? How did the mobile phone provider handle these issues?
e Having now switched over, describe your experience with the mobile phone service provider since you switched. Were your expectations met? Did the new provider tell you all the relevant details about your new plan?
f Have you saved money or experienced other benefits since switching mobile service providers? If so, what benefits have you experienced?
g Did you have any post-sale complaints? Can you tell me about those? How were they handled?

FOCUS GROUPS AS DIAGNOSTIC TOOLS
Researchers predominantly use focus groups as a means of conducting exploratory research. However, focus groups can also be helpful in later stages of a research project, particularly when the findings from surveys or other quantitative techniques raise more questions than they answer. Managers who are puzzled about the meaning of survey research results may use focus groups to better understand what consumer surveys indicate. In such a situation, the focus group supplies diagnostic help after quantitative research has been conducted.

EXPLORING RESEARCH ETHICS

TYPICAL CONSUMERS OR PROFESSIONAL RESPONDENTS?

Clients that lack physical facilities for conducting focus groups regularly hire research suppliers that specialise in focus group research. What is a research supplier’s responsibility when recruiting individuals to participate in a focus group? Should respondents be recruited because they will make the session go well or because they are typical consumers? An example of a lack of objectivity in research occurred when managers of a client organisation observed a focus group interview being conducted by a research supplier that had previously worked for the client on other projects. They noticed that some of the respondents looked familiar. A review of the video recordings of the session found that, to make the session go smoothly, the focus group moderators had solicited subjects who in the past had been found to be very articulate and cooperative. It is questionable whether such ‘professional respondents’ can avoid playing the role of expert.


SHORTCOMINGS
The shortcomings of focus groups are similar to those of most qualitative research techniques, as discussed later in this chapter. However, here we must point out two specific shortcomings of bringing people together for focus groups. First, focus groups require sensitive and effective moderators; without a good moderator, self-appointed participants may dominate a session, giving somewhat misleading results. If participants react negatively towards the dominant member, a ‘halo effect’ on attitudes towards the concept or topic of discussion may occur. This situation should be carefully avoided. Second, some unique sampling problems arise with focus groups. Researchers often select focus group participants because they have similar backgrounds and experiences or because screening indicates that the participants are more articulate or gregarious than the typical consumer. Such participants may not be representative of the entire target market (see the ‘Exploring research ethics’ box above). For this reason it is suggested that the use of focus groups follow the suggested guidelines of a professional association; for example, the qualitative research guidelines of the Australian Market and Social Research Society26 or the equivalent organisation in the country in which the research is being conducted.

Depth interviews
Motivational researchers who want to discover reasons for consumer behaviour may use relatively unstructured, extensive interviews during the primary stages of the research process. A depth interview is similar to a client interview conducted by a clinical psychologist or psychiatrist. The researcher asks many questions and probes for additional elaboration after the subject answers. In a depth interview, in contrast to projective techniques, the subject matter is generally undisguised.

depth interview A relatively unstructured, extensive interview in which the interviewer asks many questions and probes for in-depth answers.

The interviewer’s role is extremely important in the depth interview. He or she must be a highly skilled individual who can encourage the respondent to talk freely without influencing the direction of the conversation. Probing questions such as ‘Can you give me an example of that?’ and ‘Why do you say that?’ stimulate the respondent to elaborate on the topic. An excerpt from a depth interview is given in Exhibit 3.6.

EXHIBIT 3.6 EXCERPT FROM A DEPTH INTERVIEW27

An interviewer (I) talks with Marsha (M) about furniture purchases. Marsha indirectly indicates she delegates the buying responsibility to a trusted antique dealer. She has already said that she and her husband would write to the dealer telling him the piece they wanted (e.g., bureau, table). The dealer would then locate a piece that he considered appropriate and would ship it to Marsha from his shop in another state.

M: We never actually shopped for furniture since we state what we want and (the antique dealer) picks it out and sends it to us. So we never have to go looking through stores and shops and things.
I: You depend on his (the antique dealer’s) judgement?
M: Um, hum. And, uh, he happens to have the sort of taste that we like and he knows what our taste is and always finds something that we’re happy with.
I: You’d rather do that than do the shopping?
M: Oh, much rather, because it saves so much time and it would be so confusing for me to go through stores and stores looking for things, looking for furniture. This is so easy that I just am very fortunate.
I: Do you feel that he’s a better judge than …
M: Much better.
I: Than you are?
M: Yes, and that way I feel confident that what I have is very, very nice because he picked it out and I would be doubtful if I picked it out. I have confidence in him; [the antique dealer] knows everything about antiques, I think. If he tells me something, why I know it’s true – no matter what I think. I know he is the one that’s right.

This excerpt is most revealing of the way in which Marsha could increase her feeling of confidence by relying on the judgement of another person, particularly a person she trusted. Marsha tells us quite plainly that she would be doubtful (i.e., uncertain) about her own judgement, but she ‘knows’ (i.e., is certain) that the antique dealer is a good judge, ‘no matter what I think’. The dealer once sent a chair that, on first inspection, did not appeal to Marsha. She decided, however, that she must be wrong and the dealer right, and grew to like the chair very much.

International marketing researchers find that in certain cultures, depth interviews work far better than focus groups. They provide a quick means to assess buyer behaviour in foreign lands. The depth interview may last more than an hour and requires an extremely skilled interviewer; hence, it is expensive. In addition, the topic for discussion is largely at the discretion of the interviewer, so the success of the research depends on the interviewer’s skill – and, as is so often the case, good people are hard to find. A third major problem stems from the necessity of recording both surface reactions and subconscious motivations of the respondent. Analysis and interpretation of such data are highly subjective, and it is difficult to settle on a true interpretation. An example of conflicting claims is illustrated by a study of prunes done by two organisations. One study used projective techniques to show that people considered prunes shrivelled, tasteless and unattractive; symbolic of old age and parental authority (thus disliked); and associated with hospitals, boarding houses, peculiar people and the army. The other study stated that the principal reason people did not like prunes was the fruit’s laxative property. Finally, alternative techniques, such as focus groups, can provide much the same information as depth interviews.

Projective techniques

projective technique An indirect means of questioning that enables a respondent to project beliefs and feelings onto a third party, an inanimate object or a task situation.

There is an old story about asking a man why he purchased a Mercedes. When asked directly why he purchased a Mercedes, he responds that the car holds its value and does not depreciate much, that it gets better fuel economy than you would expect, or that it has a comfortable ride. If you ask the same person why a neighbour purchased a Mercedes, he may well answer, ‘Oh, that status seeker!’ This story illustrates that individuals may be more likely to give true answers (consciously or unconsciously) to disguised questions. Projective techniques seek to discover an individual’s true attitudes, motivations, defensive reactions and characteristic ways of responding. The assumption underlying these methods lies in Oscar Wilde’s observation: ‘A man is least himself when he talks in his own person. Give him a mask, and he will tell the truth.’ In other words, advocates of projective techniques assume that when directly questioned, respondents do not express their true feelings because they are embarrassed about answers that reflect negatively on their self-concept; they wish to please the interviewer with the ‘right’ answer; or they cannot reveal unconscious feelings of which they are unaware. However, if respondents are presented with unstructured, ambiguous stimuli, such as cartoons or inkblots, and are allowed considerable freedom to respond, they will express their true feelings.

A projective technique is an indirect means of questioning that enables respondents to project beliefs and feelings onto a third party, an inanimate object or a task situation. Respondents are not required to provide answers in any structured format. They are encouraged to describe a situation in their own words with little prompting by the interviewer. Individuals are expected to interpret the situation within the context of their own experiences, attitudes and personalities, and to express opinions and emotions that may be hidden from others and possibly themselves. The most common projective techniques in marketing research are word association tests, sentence completion methods, third-person techniques and role-playing, and thematic apperception tests,28 discussed below.

WORD ASSOCIATION TESTS During a word association test, the subject is presented with a list of words, one at a time, and asked to respond with the first word that comes to his or her mind. Both verbal and nonverbal responses (such as hesitation in responding) are recorded. For example, a researcher who reads a list of job tasks to sales employees expects that the word association technique will reveal each individual’s true feelings about the job tasks. A sales representative’s first thought presumably is a spontaneous answer because the subject does not have enough time to think about and avoid making admissions that reflect poorly on him- or herself. Word association frequently is used to test potential brand names. For example, a liquor manufacturer attempting to market a clear-coloured light whisky tested the brand names Frost, Verve, Ultra and Master’s Choice. Frost was seen as upbeat, modern, clean and psychologically right. Verve was too modern, Ultra was too common and Master’s Choice was not upbeat enough. Interpreting word association tests is difficult, and the marketing researcher should make sure to avoid subjective interpretations. When there is considerable agreement in the free-association process, the researcher assumes that the test has revealed the consumer’s inner feelings about the subject. Word association tests are also analysed by the amount of elapsed time. For example, if the researcher is investigating alternative advertising appeals for a method of birth control, a hesitation in responding may indicate that the topic arouses some sort of emotion (and the person may be seeking an ‘acceptable’ response). The analysis of projective technique results takes into account not only what consumers say, but also what they do not say. Word association tests can also be used to pretest words or ideas for questionnaires. This enables the researcher to know beforehand whether and to what degree the meaning of a word is understood in the context of a survey.
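To make the analysis concrete, the short Python sketch below tallies hypothetical word-association data of the kind described above. The stimulus words, responses and hesitation threshold are invented for illustration and are not taken from the studies reported in the text; the sketch simply counts the most frequent first responses for each stimulus and flags hesitant (slow) answers.

```python
# Illustrative sketch only: summarising word-association responses and latencies.
# The records and the hesitation threshold below are hypothetical.
from collections import Counter, defaultdict
from statistics import mean

# Each record: (stimulus word, first response given, response latency in seconds)
responses = [
    ("frost", "cold", 1.2),
    ("frost", "clean", 1.0),
    ("frost", "modern", 1.4),
    ("verve", "energy", 2.8),
    ("verve", "modern", 3.1),
]

HESITATION_THRESHOLD = 2.5  # assumed cut-off for a 'hesitant' answer

by_stimulus = defaultdict(list)
for stimulus, word, latency in responses:
    by_stimulus[stimulus].append((word, latency))

for stimulus, items in by_stimulus.items():
    words = Counter(word for word, _ in items)          # agreement in free association
    latencies = [latency for _, latency in items]        # elapsed-time analysis
    hesitant = sum(1 for latency in latencies if latency > HESITATION_THRESHOLD)
    print(f"{stimulus}: most common associations {words.most_common(3)}, "
          f"mean latency {mean(latencies):.1f}s, hesitant answers {hesitant}")
```

In practice the counts only summarise where agreement and hesitation occur; the researcher would still interpret what the associations mean.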

SENTENCE COMPLETION METHOD
The sentence completion method is also based on the principle of free association. Respondents are required to complete a number of partial sentences with the first word or phrase that comes to mind. For example:

People who drink beer are __________.

A man who drinks a dark beer is __________.

Imported beer is most liked by __________.

The woman in the advertisement __________.

Answers to sentence completion questions tend to be more extensive than responses to word association tests. The intent of sentence completion questions is more apparent, however.

word association test A projective technique in which the subject is presented with a list of words, one at a time, and asked to respond with the first word that comes to mind.

sentence completion method A projective technique in which respondents are required to complete a number of partial sentences with the first word or phrase that comes to mind.


REAL WORLD SNAPSHOT

USING PROJECTIVE TECHNIQUES IN CONSUMER RESEARCH IN SOUTH-EAST ASIA29

Leading qualitative researcher Professor Russell Belk notes that projective techniques are particularly suited to research in Asian countries, given the visual nature of these countries and the ability of these techniques to provide deep insight into the nature of consumer motivations and developing materialism. He provides a nice example of research by Chan,30 who asked Chinese children to draw pictures of children who had lots of expensive toys compared to those who had few, in order to study their materialism and its relationship to various feelings. Two of the drawings are reproduced below; sometimes a picture is worth a thousand words.

[Child’s drawing captioned: ‘This child has a lot of new and expensive toys.’]

[Child’s drawing captioned: ‘This child does not have a lot of toys.’]

Source: Belk, R. (2013) ‘Visual and projective methods in Asian research’ Qualitative Market Research: An International Journal, 16 (1), p. 97.

THIRD-PERSON TECHNIQUE AND ROLE PLAYING

third-person technique A projective technique in which the respondent is asked why a third person does what she does or what she thinks about a product. The respondent is expected to transfer her attitudes to the third person.

Almost literally, providing a mask is the basic idea behind the third-person technique. Respondents are asked why a third person (for example, a neighbour) does what she does, or what she thinks about a product. For example, male homeowners might be told: ‘We are talking to a number of homeowners like you about this new type of lawnmower. Some men like it the way it is; others believe that it should be improved. Please think of some of your friends or neighbours, and tell us what they might find fault with on this new type of lawnmower.’ Respondents can transfer their attitudes to neighbours, friends or co-workers. They are free to agree or disagree with an unknown third party.

The best-known and certainly a classic example of a study that used this indirect technique was conducted in the USA in 1950, when Nescafé Instant Coffee was new to the market. Two shopping lists, identical except for the brand of coffee, were given to two groups of women:
→→ pound and a half of hamburger
→→ two loaves of Wonder bread
→→ bunch of carrots
→→ one can of Rumford’s Baking Powder
→→ Nescafé Instant Coffee [or Maxwell House Coffee, drip grind]
→→ two cans Del Monte peaches
→→ five pounds potatoes.
The instructions were:
→→ Read the shopping list.
→→ Try to project yourself into the situation as far as possible until you can more or less characterise the woman who bought the groceries.
→→ Then write a brief description of her personality and character.
→→ Whenever possible indicate what factors influenced your judgement.
Forty-eight per cent of the housewives given the list that included Nescafé described the Nescafé user as lazy and a poor planner. Other responses implied that the instant coffee user was not a good wife and spent money carelessly. The Maxwell House user, however, was thought to be practical, frugal and a good cook.

Role playing is a dynamic re-enactment of the third-person technique in a given situation. The role-playing technique requires the subject to act out someone else’s behaviour in a particular setting. For example, a child in a role-playing situation might use a pretend telephone to describe the new biscuit she has just seen advertised. She projects herself into a mother role. Many researchers who specialise in research with children believe the projective play technique can be used to determine a child’s true feelings about a product, package or advertisement. Young children frequently have their own meaning for many words. A seemingly positive word such as ‘good’, for example, can be a child’s unflattering description of the teacher’s pet in his class. In a role-playing game, the child can show exactly what he thinks ‘good’ means. Role-playing is particularly useful in investigating situations in which interpersonal relationships are the subject of the research; for example, salesperson–customer, husband–wife or wholesaler–retailer relationships.

[Photo caption: A child placed in a role-playing situation may be better able to express her true feelings. A child may be told to pretend she is a parent talking to a friend about toys, food or clothing. Thus, the child does not feel pressure to express her own opinions and feelings directly.]

THEMATIC APPERCEPTION TEST (TAT)
A thematic apperception test (TAT) presents subjects with a series of pictures in which consumers and products are the centre of attention. The investigator asks the subject to describe what is happening in the pictures and what the people might do next. Hence, themes (thematic) are elicited on the basis of the perceptual-interpretative (apperception) use of the pictures. The researcher then analyses the contents of the stories that the subjects relate.

The picture or cartoon stimulus must be sufficiently interesting to encourage discussion, but ambiguous enough not to disclose the nature of the research project. Clues should not be given to the character’s positive or negative predisposition. A pretest of a TAT investigating why men might purchase chainsaws used a picture of a man looking at a very large tree. The subjects of the research were homeowners and weekend woodcutters. When confronted with the picture of the imposing tree, they almost unanimously said that they would get professional help from a tree surgeon. Thus, early in the pretesting process, the researchers found out that the picture was not sufficiently ambiguous for the subjects to identify with the man in the picture. If subjects are to project their own views into the situation, the environmental setting should be a well-defined, familiar problem, but the solution should be ambiguous.

Frequently, the TAT consists of a series of pictures with some continuity so that stories may be constructed in a variety of settings. The first picture might portray two people discussing a product in a supermarket; in the second picture, a person might be preparing the product in the kitchen; the final picture might show the product being served at the dinner table.

role-playing technique A projective technique that requires the subject to act out someone else’s behaviour in a particular setting.

thematic apperception test (TAT) A projective technique that presents a series of pictures to research subjects and asks them to provide a description of or a story about the pictures.


CARTOON TESTS

picture frustration A version of the TAT that uses a cartoon drawing for which the respondent suggests dialogue the characters might engage in.

The picture frustration version of the TAT uses a cartoon drawing in which the respondent suggests a dialogue in which the characters might engage. Exhibit 3.7 is a purposely ambiguous illustration of an everyday occurrence. A man and a woman are shaking hands and the man is saying: ‘Thanks for doing business with us today. Is there a way we can serve you better in the future?’ The respondent is then asked to complete the response from the woman, with a particular company or organisation in mind. This type of research is therefore useful in identifying underlying causes of service quality in business-to-business markets.

EXHIBIT 3.7 → PICTURE FRUSTRATION VERSION OF TAT

Thanks for doing business with us today. Is there a way we can serve you better in the future?

Several other projective techniques apply the logic of the TAT. Construction techniques request that consumers draw a picture, construct a collage or write a short story to express their perceptions or feelings. For example, children hold in their heads many pictures that they are unable to describe in words. Asking a child to ‘draw what comes to your mind when you think about going shopping’ enables the child to use his or her visual vocabulary to express feelings.31

TIPS OF THE TRADE

»» Qualitative research tools are most helpful when:
• research questions are not very specific
• some specific behaviour needs to be studied in depth
• the value of a product changes dramatically from situation to situation or consumer to consumer
• exploring a research area with the intent of studying it further
• using concept testing.
»» The focus group moderator is key to a successful interview. Not just anyone can moderate a focus group. Generally speaking, a good moderator can get more out of a respondent by saying less.
»» Focus group questions should start with more general questions and work to the more specific.
»» Don’t be afraid to use props such as advertisements, photos or actual products to get respondents talking.
»» Modern technology makes a tremendous amount of qualitative information available via the Internet. Formal interviews can sometimes be replaced by data pulled from blogs and social networking sites. Consumers can also be interviewed using Internet video technology.
»» Exploratory research designs do not lend themselves well to hypothesis testing or scientifically concluding that one alternative is better than another.
»» The overall value of a research tool is not determined by whether it is quantitative or qualitative but by the value that it produces. Qualitative tools are irreplaceable for many marketing research situations.

Modern technology and qualitative research
Technological advances have greatly improved researchers’ ability to quickly gather and analyse qualitative data. The Internet has made it possible for qualitative research to be conducted globally, and the use of software packages has greatly assisted researchers in finding patterns, themes and archetypes in qualitative data, be it text, voice or video.

VIDEOCONFERENCING AND STREAMING MEDIA
The videoconferencing industry has grown dramatically in recent years, and as the ability to communicate via telecommunications and videoconferencing links has improved in quality, the number of companies using these systems to conduct focus groups has increased. With traditional focus groups, marketing managers and creative personnel often watch the moderator lead the group from behind two-way mirrors. If the focus group is being conducted out of town, the marketing personnel usually have to spend more time in aeroplanes, hotels and taxis than they do watching the group session. With videoconferenced focus groups, marketing managers can stay at home. These types of focus groups may be particularly useful when interviewing people in industry or government, or experts, about a new innovation.

US-based marketing research company Focus Vision Network provides videoconferencing equipment and services for clients. The Focus Vision system is modular, which allows it to be wheeled around to capture close-ups of each group member. The system operates via a remote keypad that allows observers in a far-off location to pan the focus group room or zoom in on a particular participant. The system allows marketing managers at remote locations to send messages to the moderator. For example, while new product names were being tested in one focus group, an observing manager had an idea and contacted the moderator, who tested the new name on the spot.

Streaming media consist of multimedia content such as audio or video that is made available in real time over the Internet or a corporate intranet, with no download wait and no file to take up space on a viewer’s hard disc.32 This technology for digital media delivery allows researchers to ‘broadcast’ focus groups that can be viewed online. The offsite manager uses RealPlayer or Microsoft Media Player to view a focus group on a computer rather than at a remote location. Except for a decrease in quality of the video when there are bandwidth problems, the effect is similar to videoconferencing.

streaming media Multimedia content, such as audio or video, which can be accessed on the Internet without being downloaded first.

online focus group A focus group whose members use Internet technology to carry on their discussion.

INTERACTIVE MEDIA AND ONLINE FOCUS GROUPS
The use of the Internet for qualitative exploratory research is growing rapidly. The term online focus group refers to a qualitative research effort in which a group of individuals provide unstructured comments by entering their remarks into a computer connected to the Internet. The group participants keyboard their remarks either during a chat room format or when they are alone at their computers. Because respondents enter their comments into the computer, transcripts of verbatim responses are

available immediately after the group session. Online groups can be quick and cost-efficient. However, because there is less interaction between participants, group synergy and snowballing of ideas may be diminished. A research company may set up a private chat room on its company website for focus group interviews. Participants in these chat rooms feel that their anonymity is very secure. Often they will make statements or ask questions they would never pose under other circumstances.33 This can be a major advantage for a company investigating sensitive or embarrassing issues. Many focus groups using the chat room format involve a sample of participants who are online at the same time, typically for about 60 to 90 minutes. Because participants do not have to be together in the same room at a research facility, the number of participants in these online focus groups can be much larger than in traditional focus groups. Twenty-five participants or more is not uncommon for the simultaneous chat room format. Participants can be at widely separated locations, even in different time zones, because the Internet does not have geographical restrictions. Of course, a major disadvantage is that only individuals with Internet access can be selected for an online group. (The nature of Internet samples is discussed in depth in Chapters 5 and 10.) The job of an online moderator resembles that of an in-person moderator. However, the online moderator should possess fast and accurate keyboard skills or be willing to hire an assistant who does. Ideally, the discussion guide is downloaded directly onto the site so the moderator can, with one click, enter a question into the dialogue stream. A problem with online focus groups is that the moderator cannot see body language and facial expressions (bewilderment, excitement, interest etc.) to interpret how people are reacting. Also, the moderator’s ability to probe and ask additional questions on the spot is reduced in online focus groups, especially those in which group members are not participating simultaneously.34 Research that requires focus group members to actually touch something (such as a new easy-opening packaging design) or taste something cannot be performed online. The complexity of the subject will determine the exact nature and length of an online focus group. For many online projects, the group discussion can continue for 24 or 48 hours or even longer. Cross Pen Computing Group tested the appeal of an advertising campaign for a new product called CrossPad with an online brainstorming group that ran for five days.35 As the session’s time expands, so may the number of participants. Some sessions involve quite a large number, perhaps as many as 200 participants. Whether these online chat sessions are true focus groups or not is a matter of some minor debate.36 However, these online research projects do have their purpose. For example, Nickelodeon uses an online format to learn about a variety of subjects from a group of viewers. These kids use personal computers and the Internet to talk with each other and with network researchers about pets, parents, peeves and pleasures. Kids post notes on the computer bulletin board whenever they want to. 
Three times a week they log on for scheduled electronic conferences, during which Nickelodeon researchers lead discussions to answer questions such as: ‘Is this a good scoring methodology for a game show?’ or ‘Do kids understand if we show a sequence of program titles and air times?’ On one occasion, the kids told researchers they were confused by the various locations shown in a segment of The Tomorrow People, a futuristic series with events occurring around the world. Realising that the sight of a double-decker bus wasn’t enough to allow a modern kid to identify London, the producers wrote the name of the city on the screen.37 Several companies have established an informal ‘continuous’ focus group by using an Internet blog. The purpose of such blogs may be for companies to build brand communities with consumers and harvest new product ideas. An example is the toy company Lego. Lego blogs can be found at http://www.brothers-brick.com. With focus groups conducted at a market research agency people are paid to attend, but bloggers in online focus groups may participate for no fee at all.


In Australia, two types of online focus groups are becoming popular. One type is called an ‘off-time’ online focus group and consists of a series of bulletin boards, where respondents at their own convenience type in answers to open-ended questions and responses to other postings. Usually, there is not a moderator present. The other is called an ‘on-time’ focus group, where respondents use a chat room to type in responses in real time, and a moderator is present. Although we have not yet discussed Internet surveys, it is important to make a distinction between online focus groups, which provide qualitative information, and Internet surveys, which provide quantitative findings. Chapter 5 discusses technological challenges and how to administer Internet surveys. (Much of that discussion is also relevant for researchers wishing to conduct online focus groups.)

SOCIAL NETWORKING Social networking sites like Facebook, Instagram, LinkedIn, Twitter and others can provide a wealth of qualitative data for researchers. Companies such as Nielsen can assign research assistants to monitor these sites for information about particular brands. Other companies like Ford and Procter & Gamble may maintain their own social networking sites for the purposes of gathering research data. Information gathered in such a way is considered more realistic and less influenced by response biases than direct questioning. There are, however, serious ethical issues when collecting information from respondents without their knowledge or consent. Market research company TNS Australia conducted research for the Mars corporation in order to find ‘creative and cool’ young consumers using social networking sites.38 The research consisted of sending out invitations to friends of the researchers, in the 20–22 age group, to join a social networking site. The researchers then asked a creative question on a specially designed page. Seventy-five people were chosen who conveyed creative ideas and whose personalities fitted the demographic requirements. Telephone interviews were then used to select from this group to find 15 to 20 people who fitted the ‘Future Shaper’ category desired by the client – new, savvy consumers to act as advocates for brands in the market. From this second group Mars selected 12 to attend a workshop with their marketing team to discuss ideas and concepts.

SOFTWARE DEVELOPMENT
Computerised qualitative analysis software is now often used. Three commonly used programs are NUD*IST, ATLAS and NVivo. These can save a lot of time in identifying themes and connections within the text. In fact, today’s programs can also assist in interpreting voice, photographs and video for meaning. Computerised analysis of depth interviews with service providers and their customers reveals interesting key themes dealing with friendship or bonds that form between them. Some themes to emerge have included the feeling that meetings were more like get-togethers with a friend, the feeling that the service provider wants to give something back to a client, and the belief that one can share one’s true thoughts and feelings with a client. On the not-so-positive side, a theme that also emerged was that sometimes the friendships were not mutual.39 Comments such as ‘I thought she would never leave’ or ‘Won’t you give me a break?’ would be consistent with that theme.

There are many other software programs that can assist with basic qualitative analysis. Some are available as freeware. These are listed on the Impoverished Social Scientist Guide to Free Software, at Harvard University (see http://www.umass.edu/qdap), which is regularly updated. Trial and student versions of the more popular packages are also available.
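As a rough illustration of the kind of theme coding that such packages automate, the Python sketch below applies a hypothetical keyword dictionary to a few invented interview comments. It is a minimal sketch only; packages such as NVivo provide far more sophisticated coding, retrieval and visualisation tools.

```python
# Minimal sketch of keyword-based theme coding. The theme dictionary and the
# comments are hypothetical, loosely echoing the service-provider friendship study.
from collections import Counter

themes = {
    "friendship": ["friend", "get-together", "mate"],
    "reciprocity": ["give something back", "return the favour"],
    "one_sided": ["never leave", "give me a break"],
}

comments = [
    "Our meetings feel more like a get-together with a friend.",
    "I thought she would never leave.",
    "He always wants to give something back to me as a client.",
]

counts = Counter()
for comment in comments:
    text = comment.lower()
    for theme, keywords in themes.items():
        # Code the comment with a theme if any of its keywords appear
        if any(keyword in text for keyword in keywords):
            counts[theme] += 1

print(counts.most_common())
```

The output is only a starting point: the researcher still has to read the coded passages and judge whether the automatically assigned themes make sense.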


Text mining
It is not only company and website information that can be mined for patterns and possible causes of behaviour. Modern predictive analysis software allows text data to be mined from various sources including social networking sites, recorded conversations from call centres, email contacts and many more sources. Large companies like Sikorsky Aircraft, one of the largest helicopter companies in the world, and Cablecom, a Swiss telecommunication firm, have used text mining software to help interpret issues related to consumers choosing alternative providers.40 Leading software companies such as SAS and SPSS now offer text mining capabilities. Although these programs can be expensive, they offer the ability to extract meaning from the mountains of verbal and text-based information generated by companies, customers, partners and even competitors.
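The toy Python sketch below gives a flavour of the simplest form of text mining: counting recurring words and two-word phrases in a handful of invented call-centre notes about customers considering a switch. Commercial text mining tools such as those mentioned above add linguistic parsing, sentiment scoring and predictive modelling on top of this kind of frequency analysis.

```python
# Toy illustration of text mining call-centre notes for recurring terms.
# The notes and the stop-word list are invented for the example.
import re
from collections import Counter

notes = [
    "Customer unhappy about billing error, considering another provider",
    "Poor reception at home, asked about exit fees",
    "Billing error again, wants to cancel contract",
]

stop_words = {"about", "at", "an", "and", "the", "to", "again", "another"}

def tokens(text):
    # Lower-case the note, keep alphabetic words, drop common stop words
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in stop_words]

unigrams, bigrams = Counter(), Counter()
for note in notes:
    words = tokens(note)
    unigrams.update(words)
    bigrams.update(zip(words, words[1:]))  # adjacent word pairs

print("Frequent terms:", unigrams.most_common(5))
print("Frequent phrases:", bigrams.most_common(3))
```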

A WARNING ABOUT QUALITATIVE RESEARCH
Qualitative research cannot take the place of more generalisable quantitative research. Nevertheless, firms often use what should be qualitative and exploratory studies as final, conclusive research projects. This can lead to incorrect decisions. The most important thing to remember about qualitative research techniques is that they have limitations. Most of them provide subjective information, and interpretation of the findings typically is judgemental. For example, the findings from projective techniques can be vague. Projective techniques and depth interviews were frequently used decades ago by practitioners who categorised themselves as motivational researchers. They produced some interesting and occasionally bizarre hypotheses about what was inside a buyer’s mind, such as the following:
→→ A woman is very serious when she bakes a cake because unconsciously she is going through the symbolic act of giving birth.
→→ A man buys a convertible as a substitute mistress.
→→ Men who wear suspenders are reacting to an unresolved castration complex.41
Unfortunately, bizarre hypotheses cannot be relegated to history as long-past events. Several years ago researchers at the McCann-Erickson advertising agency interviewed low-income women about their attitudes towards insecticides. The women indicated that they strongly believed a new brand of cockroach killer sold in little plastic trays was far more effective and less messy than traditional insect sprays. Rather than purchase the new brand, however, they remained stubbornly loyal to their old insect sprays. Baffled by this finding, the researchers did extensive qualitative research with female consumers. After reviewing the women’s drawings and in-depth descriptions of cockroaches, the researchers concluded that women subconsciously identified cockroaches with men who had abandoned them. Spraying the cockroaches and watching them squirm and die was enjoyable – so by using the spray, the women both gained control over the cockroaches and vented their hostility towards men.42

Conclusions based on qualitative research may be subject to considerable interpreter bias. Findings from focus group interviews likewise may be ambiguous. How should a facial expression or nod of the head be interpreted? Have subjects fully grasped the idea or concept behind a non-existent product? Have respondents overstated their interest because they tend to like all new products? Because of such problems in interpretation, exploratory findings should be considered preliminary.

Another problem with qualitative studies deals with the ability to make projections from the findings. Most qualitative techniques use small samples, which may not be representative because they have not been selected on a probability basis. Case studies, for example, may have been selected because they represent extremely good or extremely bad examples of a situation rather than the average situation. Before making a scientific decision, the researcher should conduct a quantitative study with an adequate sample to ensure that measurement will be precise.

This is not to say that exploratory research lacks value; it simply means that such research cannot deliver what it does not promise. The major benefit of exploratory research is that it generates insights and clarifies the marketing problems for hypothesis testing in future research. One cannot determine the most important attributes of a product until one has identified those attributes. Thus, exploratory research is extremely useful, but it should be used with caution. However, occasions do arise where the research process should stop at the qualitative and exploratory stage. If a cheese producer conducts a focus group interview to get a feel for consumers’ reactions to a crispy snack food made from whey (what is left over from cheese making), and exploratory findings show an extremely negative reaction by almost all participants, the cheese manufacturer might decide not to continue the project.

Some researchers suggest that when qualitative research is used to evaluate alternative advertising copy, new product concepts and so on, the greatest danger is not that a poor idea will be marketed – successive steps of research will prevent that. The real danger is that a good idea with promise may be rejected because of findings at the qualitative and exploratory stage. On the other hand, when everything looks positive in the qualitative research, the temptation is to market the product without further research. Instead, after conducting qualitative research, marketing management should determine whether the benefits of the additional information would be worth the cost of further research. In most cases when a major commitment of resources is at stake, conducting the quantitative study is well worth the effort. Many times good marketing research only documents the obvious. However, the purpose of business is to make a profit, and decision-makers want to be confident that they have made the correct choice.

Hugh Mackay, a respected qualitative researcher and commentator on social and marketing issues with over 20 years’ experience, outlines some of the pitfalls of qualitative research in the ‘Real world snapshot’ box below. It is useful to consider this in detail and avoid making these common mistakes when implementing an exploratory or qualitative research project.

REAL WORLD SNAPSHOT

HUGH MACKAY’S SEVEN DEADLY SINS OF QUALITATIVE RESEARCH43

1 Gathering a collection of strangers together and calling them a ‘group’ You have to ask yourself what the point is of the group discussion technique if it is not to harness the dynamics of natural group interaction, to catch a glimpse of peer group pressure at work and to observe the phenomenon of opinion leadership. None of these things can be realistically observed when a collection of strangers is brought together for the purposes of research, because they are not a group in any meaningful sense of that term. In marketing, we know that many consumer attitudes are shaped by social pressure, and we know that a great deal of consumer behaviour is conformist behaviour. The use of genuine, natural, existing social groups taps into those processes, as they occur in the real life of the market place. But that raises the obvious question: why bring a collection of strangers together when what you want is a group? Some researchers answer that by creating an opportunity for a warm-up discussion, they begin to create some of the dynamics which are characteristic of group interactions. That is, of course,


arrant nonsense. It doesn’t take minutes to generate group dynamics; in most cases, it takes years to achieve the kind of networking that has any impact on the attitudes, values and behaviour of consumers. Of course, it remains true that there are many things which consumers would prefer not to talk about even within the intimacy and security of an established group of friends, neighbours or workmates. In such cases, the answer is not to plonk people down in a roomful of strangers, but to use the alternative qualitative research technique of the unstructured personal interview. (Indeed, there is another deadly sin buried here somewhere – the sin of believing that ‘qualitative research’ only means ‘discussion groups’: what about all the other qualitative techniques?) Some people talk about the debate between the exponents of ‘affinity groups’ and ‘non-affinity groups’. I would prefer the terms ‘group’ and ‘non-group’, and I see no signs of any serious professional debate about it all, simply because the arguments in favour of calling a collection of strangers a ‘group’ must be terribly hard to marshal. The message is clear: if you want to use the group discussion technique, first find your group. 2 Herding respondents into a central location In the physical sciences, we have long been aware of the dangers of the so-called ‘experimental effect’; that is, the effect on the results of an experiment created by the very fact that the experiment is being conducted. Precisely the same hazard exists in social research. As currently practised, the group discussion technique generally seems to consist not only of bringing a collection of strangers together, but also putting them in a central location where it is convenient for a researcher to work with them (and where, even more bizarrely, it is convenient for clients and other interested spectators to gaze at them from behind a one-way screen). A few questions spring to mind: what is the effect on people’s willingness to reveal their innermost thoughts of putting them in a totally unfamiliar (and possibly threatening) environment? How do people feel when they know that they are being observed by some unknown people ‘on the other side of that mirror’? What is the effect on participants in a ‘focus group’ discussion of knowing that they are being filmed by a video camera? Even if we can’t quantify our answers to those questions, it doesn’t require much imagination to accept that there will be some ‘experimental effect’ from putting people into such an unfamiliar environment, when the whole purpose of the exercise is for them to feel relaxed, secure and confident enough to be able to tell the truth. It is hard to escape the conclusion that, if you put a collection of strangers in a strange place, they are very likely to say strange things. My view is that, in a well-conducted group discussion, the only strange element should be the researcher: members of the group should be comfortable with each other and with the location, which should, preferably, be the home of one member of the group (or an office tearoom or wherever the group normally meets and talks). If you want to study the behaviour of animals, you must go into the wild: putting them into a zoo makes life easier for the observer, but is very likely to distort their patterns of behaviour. It’s a rough analogy, but it emphasises the very real dangers that arise when we attempt to study consumer attitudes or behaviour in an environment where such attitudes and behaviour would not normally occur. 
In some cases, of course, people design ‘group discussion rooms’ that attempt to recreate the atmosphere of a domestic living room. Such attempts are a tacit admission of the hazard of dragging people out of their natural environment, but no simulated living room is as comfortable or secure as the one in which we actually sit and chat with our friends. If we acknowledge that a group discussion is a subtle and delicate process, then we must also acknowledge that the atmosphere in which it is conducted is crucial to the success – and the validity – of the outcome. In any case, the current obsession with central locations means that qualitative sampling has become hopelessly distorted. Far too little emphasis is being given to non-metropolitan consumers, or to those who, for any number of reasons, find it difficult or inconvenient to travel to a central location.


There is a ‘volunteer bias’ inherent in all social research; surely, the challenge is to minimise – rather than maximise – the bias. (The counter-argument to all this, of course, is that highly paid researchers do not have the time to go driving around the suburbs – let alone the countryside – in order to meet with natural, existing groups in their natural, existing habitats. That is certainly not a scientific defence of central locations, but only an argument based on the comfort and convenience of researchers who run their central locations rather like dental surgeries.) 3 Asking questions – especially ‘Why?’ Qualitative research evolved as a means of overcoming many of the problems inherent in the use of structured questionnaires. The whole point of qualitative research is to listen, to explore and to explain. Over the years, we have found that the more questions we ask, the harder it is to get to the truth. It is easy to explain this apparent paradox: When you ask a question, you get an answer – but you never know whether the answer is only the product of the question or whether it is an accurate representation of an existing attitude or point of view in the person who gave the answer. One of the joys of qualitative research is that it allows us to break free of the shackles of questions. It allows us to shatter the illusion of rationality which surrounds the asking of formal questions – especially those which begin with the dreaded word ‘Why’. If you ask someone why they did something, you have immediately created the expectation that there is an explanation for what they did and that the explanation should sound reasonably rational. The truth about human behaviour, of course, is that we do many things for reasons that are not entirely rational, or for no ‘reason’ at all. The challenge for qualitative research is to explore and interpret consumers’ motivations and aspirations without forcing the consumer into the straitjacket of our agenda, as expressed by our questions. Qualitative research allows us to bypass questions so that we can listen to people saying what they want to say, unconstrained by our assumptions or our expectations. Certainly, we may want to probe things that people have said, but, if we use questions to do it, we risk stemming the flow of data which we are so anxious to obtain. When it comes to group discussions in particular, the term ‘non-directive’ means what it says. Discussions peppered with questions asked by a researcher are a sure sign that the technique is not understood. 4 Intimidating the respondents Surely – I hear you ask – no researcher would ever want to intimidate her/his respondents? Surely it is obvious that, if respondents are intimidated, they will also be inhibited and that the quality of data will be adversely affected? Of course that is true, and yet it is surprising how often qualitative researchers manage to intimidate their respondents, even without trying. The female researcher who sweeps into a group discussion looking as if she had stepped out of the pages of Vogue magazine is, almost certainly, intimidating her respondents. (It is a fundamental rule of qualitative research that the researcher should be as inconspicuous as possible and, when it comes to dress, should be at about the same standard – or a shade below – the respondents themselves.) The researcher who adopts a confident and authoritative manner – actively, and even aggressively, leading the discussion – is intimidating his/her respondents. 
The researcher who looks comfortable and in control of the situation because the respondents are clearly on her/his ‘patch’ is intimidating the respondents. Even the researcher who ventures into the field to talk to respondents on their home ground may intimidate them by an inappropriate choice of vehicle parked outside in the street. If respondents are intimidated, they are much more likely to say what they think the researcher wants to hear or simply to become defensive. The goal for the qualitative researcher who is serious about minimising the experimental effect is to aim for invisibility; to attract as little attention as possible; to be devoted only to the process of facilitation; in short, to be nondescript.


5 Giving ‘top-line’ results The very term ‘top-line’ shows just how poorly qualitative research is still understood. Top-line results come straight out of statistical processing, and that’s where they belong. When it comes to qualitative research, ‘top-line’ often means ‘top-of-the-head’ or ‘first impressions’ and any researcher who is prepared to give a quick analysis of qualitative data is falling for one of the biggest traps of all. Part of the essence of qualitative research is the extended time that needs to be devoted to reflection, rumination and interpretation. What else is a qualitative researcher paid for if not for the combined ability to elicit sensitive data and to interpret it sensitively? I have never found a way of quickly interpreting qualitative data. Indeed, I would never be prepared to discuss findings with a client in anything less than a week after the work was done. Qualitative researchers who are worthy of the name are always building theories, relating their data to existing structures or testing what they think they have found against the professional literature. And that all takes time. Qualitative research, after all, is social science, not journalism. There is no place for ‘quick and dirty’ research in this field, because the very nature of the research process is time-consuming. The researcher who is prepared to give ‘top-line’ results has failed to understand what the process involves. 6 Presenting clear and simple answers Yes, sometimes the answers are clear and simple but the complexity of human attitudes and behaviour (and the complexity of the relationship between attitudes and behaviour) means that explanations and interpretations are usually not simple. Paradox, contradiction and uncertainty are the stuff of qualitative data and qualitative diagnosis because they are the stuff of life. Black-and-white answers are easy for clients to understand and easy for researchers to present. But, unwelcome though the news may be, the truth is usually to be found in subtle greys. A qualitative researcher was recently heard to brief a colleague with the comment: ‘Don’t confuse the client with complexity’. If I were the client, I would want all the complexity I could lay my hands on, since my role in life is to respond to the way consumers are, not to some simplified silhouette. The qualitative researcher has the responsibility to report on all the shades of opinion that emerge in a qualitative study; a single, simple finding often conceals a less convenient truth. 7 Using just one researcher to do the job The ‘guru’ syndrome is alive and well in qualitative research. Many qualitative researchers believe that they are so smart, so intuitive or so experienced that the question of methodological rigour does not apply to them: ‘I can get to the truth, regardless of how I obtain my data’ is a position often stated, or implied, by those in the industry who perpetuate the guru myth. The problem is that we are in the realm of social research – not mysticism or magic. There is no place in this business for gurus, but only for hard-working professionals who attend to the principles involved in what they do. Of course, it is true that qualitative research is a highly subjective – and often intuitive – business, particularly at the interpretation stage of the research process, and that should concern research buyers. But the solution is not to deal with a ‘guru’: a far better course of action is to ensure that research is never handled by just one person. 
The best safeguard against rampant subjectivity or prejudice posing as intuition is to have at least two researchers working on every project so that they are forced to pool their data, collaborate on the analysis and negotiate their way to an interpretation of the data that makes sense to both of them. Apart from anything else, the answer will never seem to be black-and-white when two researchers have to agree upon it. The ultimate question, of course, is why clients allow these sins to be committed on a daily basis. One cynical answer might be that they are persuaded by researchers who prefer to do things the easy way because it is more profitable to do so.


Qualitative researchers themselves are generally the last to agonise out loud over questions of philosophy or methodology. In the end, though, clients have the power to improve the quality of qualitative research by insisting, for example, on the use of real groups (when the group discussion is the appropriate technique), by insisting that researchers go to their respondents (rather than vice versa), by insisting that a minimum of two researchers be involved in every project, and by disciplining themselves to wait until the researchers have had time to think before demanding answers.

SUMMARY

COMPARE AND CONTRAST QUALITATIVE RESEARCH AND QUANTITATIVE RESEARCH
This chapter focused on qualitative exploratory research. Qualitative research is subjective in nature: much of the measurement depends on evaluation by the researcher rather than rigorous mathematical analysis. Quantitative research determines the quantity or extent of an outcome in numbers and provides an exact approach to measurement.

UNDERSTAND THE USE OF QUALITATIVE RESEARCH IN EXPLORATORY RESEARCH DESIGNS
Qualitative research may be conducted to diagnose a situation, screen alternatives or discover new ideas. It is a highly flexible design that provides greater depth of analysis than quantitative research. Qualitative research is particularly useful when research problems or important research issues are unclear, and for understanding complex situations. Most companies today use initial qualitative research studies for these diagnostic purposes, followed by more generalisable quantitative studies, in order to provide more objective advice to management.

DESCRIBE THE BASIC ORIENTATIONS OF QUALITATIVE RESEARCH
Phenomenology is a philosophical approach to studying human experiences based on the idea that human experience itself is inherently subjective and determined by the context within which a person experiences something. It lends itself to conversational or story-telling-based research. Ethnography represents ways of studying cultures through methods of high involvement with those cultures; participant observation is a common ethnographic approach. Grounded theory represents inductive qualitative investigation in which the researcher continually poses questions about a respondent's discourse in an effort to derive an explanation of their behaviour. Collages are sometimes used in an effort to develop a deep explanation of behaviour. Case studies are simply documented histories of a particular person, group, organisation or event.

RECOGNISE COMMON QUALITATIVE RESEARCH TOOLS AND KNOW THE ADVANTAGES AND LIMITATIONS OF THEIR USE
There are three basic sets of qualitative research tools:
1 Focus group interviews are unstructured, free-flowing group sessions that allow individuals to initiate and elaborate on the topics of discussion. Interaction among respondents is synergistic and spontaneous, characteristics that have been found to be highly advantageous.
2 Depth interviews are unstructured, extensive interviews that encourage a respondent to talk freely and in depth about an undisguised topic.
3 Projective techniques are an indirect means of questioning respondents. Examples include word association tests, sentence completion tests, the third-person technique, the role-playing technique and thematic apperception tests.

PREPARE A FOCUS GROUP OUTLINE
A focus group outline should begin with introductory comments followed by very general opening questions that do not lead the respondent. More specific questions should be listed until a blunt question directly related to the study is asked. It should conclude with a debriefing session and a chance for questions and answers with respondents.

RECOGNISE TECHNOLOGICAL ADVANCES IN THE APPLICATION OF QUALITATIVE RESEARCH APPROACHES
As the ability to communicate via the Internet, telecommunications and video conferencing links improves, companies are using these new media to conduct focus group research. Advances in computer software and text mining programs mean that companies can examine qualitative information from web logs, call centres, emails and chat rooms in order to discover important themes or patterns of consumer behaviour.

APPRECIATE THE ROLE OF QUALITATIVE RESEARCH IN MANAGEMENT DECISION-MAKING
Although qualitative research has many advantages, it also has several shortcomings and should not take the place of conclusive, quantitative research. Knowing where and how to use qualitative research is important. Many firms make the mistake of using a qualitative study as a final, conclusive research project; this can lead to decisions based on incorrect assumptions. Qualitative research techniques have limitations: the interpretation of the findings is based on judgement, samples are not representative, the techniques rarely provide precise quantitative measurement, and the ability to generalise the qualitative results is limited.

KEY TERMS AND CONCEPTS
case study method, concept testing, depth interview, discussion guide, ethnography, focus group interview, grounded theory, hermeneutic unit, hermeneutics, moderator, netnography, online focus group, phenomenology, picture frustration, projective technique, qualitative research, role-playing technique, sentence completion method, streaming media, thematic apperception test (TAT), third-person technique, word association test

QUESTIONS FOR REVIEW AND CRITICAL THINKING
1 Comment on the following remark by a marketing consultant: 'Qualitative research is subjective and uses a few biased respondents; it does not constitute market research.'
2 What type of qualitative research would you suggest in the following situations?
a A product manager suggests development of a new interactive game for the Apple Watch.
b A research project has the purpose of evaluating potential brand names for a new ready-to-drink cocktail mixer.
c A movie producer is interested in the future of cinema attendance versus downloaded movies.
d An advertiser wishes to identify the symbolism associated with using a power tool.
3 What are the differences between 'real-time' and 'off-time' focus groups? Suppose a manager wants to examine students' attitudes to buying digital textbooks. Which type of online focus group would be better suited for them?
4 What benefits can be gained from case studies? What dangers, if any, do they present? In what situations are they most useful?
5 What is the function of a focus group? What are its advantages and disadvantages?
6 If a researcher wanted to conduct a focus group with children, what special considerations might be necessary?
7 A focus group moderator plans to ask respondents to taste new wine varieties before starting the group discussion about wine consumption habits. Is this a good idea? Explain.

8 Discuss the advantages and disadvantages of the following focus group techniques:
a A videoconferencing system that allows marketers to conduct focus groups in two different locations with participants who interact with each other.
b A system that uses telephone conference calls for group sessions.
c An online focus group which allows respondents to join the discussion at their own convenience and type in their responses.
9 An online retailer receives many thousands of customer emails a year. Some are complaints, some are compliments, and they cover a broad range of topics. Are these messages a possible source for exploratory research? Why or why not?
10 How might qualitative research be used to screen various ideas for advertising copy in television advertisements?
11 Most projective techniques attempt to assess a respondent's true feelings by asking indirect questions rather than using direct questions that could give the respondent a good idea about the researcher's true motives. Does the use of this technique constitute deception?
12 What are the potential problems with focus groups? How might they be improved?
13 Comment on the following approach to qualitative research. Does it breach any of Hugh Mackay's seven deadly sins? If the research was successful, does it need to be changed or improved?
Dr Clotaire Rapaille, a 'cultural psychologist', has made a fortune by applying a brand of Jungian psychoanalysis to market research. He does not believe in focus groups. Instead, he uses a technique called 'imprint sessions'. They take three hours, and for the last hour participants lie on the floor in a dark room, listening to relaxing music, taking their minds back to childhood. When he worked for Chrysler he asked his subjects to go back to the 'very first time in your life, your mental imprint, when you thought "car". There is an emotion attached to that.' What came out of this research was the PT Cruiser, a 1940s retro-looking sedan. Chrysler sold more than one million units.44

14 A researcher interested in understanding the illicit drug habits of 18–29 year olds in New Zealand is wondering whether to use depth interviews or focus groups. What would you advise, and why?

ONGOING PROJECT DOING A QUALITATIVE ANALYSIS PROJECT? CONSULT THE CHAPTER 3 PROJECT WORKSHEET FOR HELP

Download the Chapter 3 project worksheet from the CourseMate website. It is used to determine what type of qualitative research design you can use. You should be able to justify why you prefer one approach and methodology over another and why this addresses the research problem better than other approaches (see also the flowchart at the start of Chapter 2).

COURSEMATE ONLINE STUDY TOOLS
Flip to the start of your textbook and use the tear-out card to log in to CourseMate for Marketing Research. There you can test your understanding and revise chapter concepts with:
☑ interactive quizzes
☑ crosswords on key concepts
☑ Search me! activities
☑ online video activities
☑ flashcards

WRITTEN CASE STUDY 3.1
UP, UP AND AWAY. AIRBORNE FOCUS GROUPS WITH AIR NEW ZEALAND45
To showcase the airline's 'Business Premier' service on its daily route from Los Angeles to London, Air New Zealand conducted what it calls the first in-flight focus group to address the perceived pros and cons of international travel. The focus group took place on a flight scheduled for 9 April 2013 and consisted of frequent fliers and travel experts. The respondents were asked a series of questions covering their in-flight comfort, jet lag prevention and flight preparation while they were in flight from Los Angeles to London.

QUESTIONS
1 What are the advantages and disadvantages of using an airborne focus group in this case?
2 How could the design of the focus group be improved, if at all?

WRITTEN CASE STUDY 3.2
GETTING A GRIP: FOCUS GROUPS AND BEAUREPAIRES TYRES46
In 2007, former head of Angus and Robertson Bookworld Judith Swales was given the job of restructuring the joint venture between Ansell Australia and Goodyear and Dunlop in the tyre manufacturing and retailing business in Australia and New Zealand. The joint venture owns more than 700 stores – the company-owned Beaurepaires, franchised Goodyear Autocare and licensed Dunlop dealers. Swales sums up her approach under three headings: simplify, engage, innovate. 'We had to make a shift from being a manufacturing organisation to ask what is it that consumers want,' she says.


The company accounts for one in every four tyres sold. When Swales joined, its market share was falling. What it needed was less manufacturing 'push' and more consumer 'pull'. When Swales joined the business, she says, 'You did what everyone else does, you go out there and have a good look. I was looking at Beaurepaires stores and they may as well have said "men only" over the door, frankly. They were painted blue, they had high counters that the store managers used to hide behind and when you went in they'd say, "G'day love". They weren't great places to shop.' She discovered half the company's cash customers were women. 'We had stores that if you went in and looked on the table you would find the latest trucking magazines and Top Gear, but no reason why a woman would want to spend half an hour waiting for her tyres, in a shop which – incidentally – was full of tyres. How many times have you been and browsed a rack of tyres? So we did extensive market research, we did focus groups, we did brand tracking.'

They rebranded the Beaurepaires stores, gave them a coat of paint and revamped them with a more lively, female-friendly look. Out went the tyres, except for a few the customers could check out, and in came a clean look with educational information to help inform customers about likely problems: ‘It was so people could feel more confident the things they were being told were actually true and they weren’t being ripped off.’ Swales says that tyres also had lots of acronyms. That’s changing. Now the tyre names give some indication of what you can expect; for example, Fuel Max.

QUESTIONS
1 What role did focus groups play in the rebranding of Beaurepaires?
2 Did the focus groups guide management decision-making, or did they confirm it?
3 Why was it important to do market research in this case?
4 How might this research design be improved?

ONGOING CASE STUDY
MOBILE PHONE SWITCHING AND BILL SHOCK
Following on from the research proposal with AusBargain, David, Leanne and Steve were asked to develop qualitative research to examine mobile phone switching and the causes of bill shock. They were given a limited budget by the client and only two months to complete this stage of the research.

QUESTIONS
1 What kind of qualitative research do you think is best here? Why?
2 Develop a potential discussion guide for this research.
3 What are some potential problems with your qualitative research design, and how could these be avoided?

NOTES
1 Winchester, T. M., Binney, W., & Hall, J. (2014) 'Young adults and politics: Investigating factors influencing voter decision making', Journal of Nonprofit & Public Sector Marketing, 26(3), 226–57.
2 Winchester, T. M., Binney, W., & Hall, J. (2014) 'Young adults and politics: Investigating factors influencing voter decision making', Journal of Nonprofit & Public Sector Marketing, 26(3), p. 231.
3 McGrath, C. (2013) Statistics show 25 per cent of young people failed to enrol to vote in September election. Retrieved from http://www.abc.net.au/news/2013-08-21/figures-show-25-per-cent-of-young-people-failed-to-enrol-to-vote/4903292.
4 Winchester, T. M., Binney, W., & Hall, J. (2014) 'Young adults and politics: Investigating factors influencing voter decision making', Journal of Nonprofit & Public Sector Marketing, 26(3), p. 249.
5 Department of Health and Ageing (2015) Developmental research for the national drug campaign 2012–2014. Accessed at http://www.drugs.health.gov.au/Internet/drugs/publishing.nsf/content/ndc-dev-res-rep2012-14 on 22 July 2015.
6 Ramaseshan, B., Wong, D., & Turner, B. (2011) Emerging issues in sustainability and green marketing: Striking the right chord with organ donation. AMA Summer Educators' Conference Proceedings, 22, 119–120.
7 Brush, G. J., Frethey-Bentham, C., Ayre, M., Elmslie, S., Fowler, C., Howell, A., & Walsh, E. (2014) 'Understanding student perceptions of a career in the marketing research industry: Implications for positioning and engagement', Market & Social Research, 22(2), 32–52.
8 Fusaro, Dave (1996) 'Food products of the new millennium', Prepared Foods, 1 January.
9 Business World (1996) 'Selling ideas before time, India: Intense market research is required before launching a new product concept as firms have to find out receptiveness', 23 July, p. 54.

10 Sources: Cliff, E. (2010) ‘Tablet Mania: What’s Now and What’s Next?’ Bloomberg Businessweek, 4201 (29 October), p. 46; Dolliver, M. (2009) ‘Mac/PC Duality ‘Isn’t so Strict in Real Life’, Adweek, 50 (October 19), p. 19. 11 Lee, N., Saunders, J., & Goulding, C. (2005) ‘Grounded theory, ethnography and phenomenology: A comparative analysis of three qualitative strategies for marketing research’, European Journal of Marketing, 39(3/4), 294–308. 12 Thompson, Craig (1997) ‘Interpreting consumers: A hermeneutical framework for driving marketing insights from the tests of consumer consumption stories’, Journal of Market Research, 34 (November), pp. 438–55. 13 Moore, Matt (2009) ‘Storytelling for consumer insight’, Research News, Australian Market and Social Research Society, July, 26(2), pp. 16–17. 14 Freeman, L., & Spanjaard, D. (2012) Bridging the gap: The case for expanding ethnographic techniques in the marketing research curriculum. Journal of Marketing Education, 34(3), 238–250. 15 Chen, Y. & Xie, J. (2008) ‘Online Consumer Review: Word-of-Mouth as a New Element of Marketing Communication Mix’, Management Science, 54(3), pp. 447–49. 16 Stewart-Loane, Susan, Webster, Cynthia & D’Alessandro, Steven (2014) ‘Identifying consumer value co-created through social support within online health communities, Journal of Macromarketing’, 35(3): 353–67. 17 Baron, S. & Warnaby, G. (2011) ‘Individual customers’ use and integration of resources: Empirical findings and organizational implications in the context of value co-creation’, Industrial Marketing Management, 40(2), pp. 211–18. 18 Strang, K., David. (2011) A grounded theory study of cellular phone new product development. International Journal of Internet & Enterprise Management, 7(4), 366–87. 19 Sinclair, Lara (2008) ‘Naked gets on the couch with its consumers’, The Australian, 5 June, p. 36.


20 Wheeler, F., Frost, W., & Weiler, B. (2011) ‘Destination brand identity, values, and community: A case study from rural Victoria, Australia’, Journal of Travel & Tourism Marketing, 28(1), 13–26. 21 Quester (2010) ‘No vision marketing’, The Advertiser Magazine, p. 1, 4 December. 22 Hess, John M. (1968) ‘Group interviewing’, in New science of planning, R. L. King, ed., Chicago: American Marketing Association, p. 194. 23 Burbury, Rochelle (2004) ‘Focus groups given new role to play’, Australian Financial Review, 17 May, p. 56. 24 Gray, David, D’Alessandro, Steven & Carter, Leanne (2012) ‘State of the mobile phone nation’, available at: http://www.prepaidmvno.com/wp-content/uploads/2012/05/ Macquarie-University-report-State-of-the-Mobile-Nation.pdf, accessed on 31 August 2012. 25 Gray, David, D’Alessandro, Steven and Carter, Leanne (2012) ‘State of the mobile phone nation’, available at: http://www.prepaidmvno.com/wp-content/ uploads/2012/05/Macquarie-University-report-State-of-the-Mobile-Nation.pdf, accessed on 31 August 2012. 26 See the Australian, Market and Social Research Society’s webpage for a copy of this guide at www.amsrs.com.au. 27 From Donald F. Cox, ed. (1967) Risk taking and information handling in consumer behavior, Boston: Division of Research, Harvard Business School, pp. 65–6. 28 Contemporary researchers are in the process of creating new projective techniques. See Pich, C., & Dean, D. (2015) ‘Qualitative projective techniques in political brand image research from the perspective of young adults’, Qualitative Market Research: An International Journal, 18(1), 115–144; McNeill, L., & Graham, T. (2014) ‘Mother’s choice: An exploration of extended self in infant clothing consumption’, Journal of Consumer Behaviour, 13(6), 403–410. 29 Belk, R. (2013) ‘Visual and projective methods in Asian research’, Qualitative Market Research: An International Journal, 16(1), 94–107, 28. 30 Chan, Kara and McNeal, James (2006) ‘Rural Chinese children as consumers: consumption experience and information sourcing’, Journal of Consumer Behaviour, (5) 3 182–192. 31 McNeal, James (1999) The kids market: Myths and realities, Ithaca, NY: Paramount Market Publishers, p. 245.


32 Heather, Rebecca Piirto (1994) ‘Future focus groups’, American Demographics, 1 January, p. 6. 33 www.realnetworks.com/getstarted/index.html, downloaded 30 June 2001. 34 Negroponte, Nicholas (1998) ‘Being anonymous’, Wired, October, p. 216. 35 Maddox, Kate (1998) ‘Virtual panels add real insights for marketers’, Advertising Age, 29 June, p. 34. 36 Rubin, Jon (2000) ‘Online marketing research comes of age’, BrandWeek, 30 October, p. 28. 37 Adapted from Speer, Tibbett (1994) ‘Nickelodeon puts kids online’, American Demographics, 1 January, p. 16. 38 Briggs, Jon (2008) ‘Research 2.0: How interactive approaches help guide deeper understanding for marketers’, Research News, October, accessed at www.amsrs.com. au/index.cfm?a=detail&eid=148&id=2824 on 20 July 2009. 39 Smith, A., Bolton, R. and Wagner, J. (1999) ‘A model of customer satisfaction with service encounters involving failure and delivery’, Journal of Marketing Research, 36 (August), pp. 356–72. 40 Business Wire (2008) ‘SPSS text mining reveals customer insights as organisations worldwide tap into unstructured data’, 16 June, accessed through ProQuest document number 14956884601, 14 July 2008. 41 Kotler, Philip (1965) ‘Behavioral models for analyzing buyers’, Journal of Marketing, October, pp. 37–45. 42 Alsop, Ronald (1998) ‘Advertisers put consumers on the couch’, Wall Street Journal, 13 May, p. 19. 43 Mackay, Hugh (1993) ‘Hugh Mackay’s Seven Deadly Sins’, Australian Financial Review, 1 June, p. 36. 44 Coultan, Mark (2005) ‘Consumers on the couch’, Sydney Morning Herald, 18 June, p. 28. 45 MediPost.com (2013) Air New Zealand Gives Focus Group High-Altitude ‘Premier’, accessed from http://Fativa.com, 25 July 2015. 46 Hooper, Narelle. (2010) ‘Getting a grip: sometimes it takes a fresh eye’, Australian Financial Review, 9 November, p. 58.

04 SECONDARY RESEARCH WITH BIG DATA

WHAT YOU WILL LEARN IN THIS CHAPTER
To discuss the advantages and disadvantages of secondary data.
To understand the types of objectives that can be achieved using secondary data.
To discuss and give examples of the various internal and proprietary sources of secondary data.
To identify and give examples of various external sources of secondary data.
To describe the impact of single-source data and the globalisation of secondary data research.

Who is afraid of metadata?
Metadata, which is your electronic fingerprint of where you have been on the web, who you communicate with and, with mobile devices, where you have been and are, is information for governments and may be extremely valuable in preventing terrorism. The concern is that there is no longer any privacy from the state and that a citizen's movements, communications and, via the proxy of electronic payments, behaviour can be effectively monitored by government. Marketers are already using metadata to a large extent. The tagging of people in photographs, for example, provides organisations like Twitter and Facebook with valuable market research insights into social networks, which they onsell to marketers.1 Interestingly, consumers, especially New Zealanders, are more supportive of governments using metadata than of marketers using this information.2

SECONDARY DATA RESEARCH

secondary data
Data that have been previously collected for some purpose other than the one at hand.

Secondary data are gathered and recorded by someone else prior to (and for purposes other than) the current project. Secondary data usually are historical and already assembled. They require no access to respondents or subjects.

SURVEY THIS!
Secondary data are collected for a purpose other than the immediate research question at hand. When you participated in the survey as part of taking this course, you contributed to a database that your instructor can use to illustrate concepts and provide assignments through which you can analyse real-world data. However, taking part in the survey and collecting fresh data by having your class respond is closer to primary data collection. In most primary data collections, the researcher could perhaps find secondary data that may not provide the precise information needed to address a research question, but that is at least in the same general area as the research question. In our survey, the researcher had some interest in students' communication behaviours; thus, quite a few questions address text messaging, emailing and so on. Consider the accompanying screenshot from the survey (courtesy of Qualtrics.com). Can you find secondary data, aside from the database that goes with this questionnaire, that address similar issues among consumers? If so, what can you find? Do you think the results reveal similar patterns of behaviour to those exposed in the class survey? Discuss your results.

Advantages
The primary advantage of secondary data comes from their availability. Obtaining them is almost always faster and less expensive than acquiring primary data. This is particularly true when electronic retrieval is used to access data that are stored digitally. In many situations, collecting secondary data is instantaneous. Consider the money and time saved by researchers seeking updated population estimates for a town between the 2006 and 2016 censuses. Instead of doing the fieldwork themselves, researchers could acquire estimates from a firm dealing in demographic information or from sources such as the Australian Bureau of Statistics (ABS) or the World Bank. Many of the activities normally associated with primary data collection (for example, sampling and data processing) are eliminated by using secondary data. In some instances, data cannot be obtained using primary data collection procedures. For example, a manufacturer of farm implements could not duplicate the information in the US Census of Agriculture because much of the information there (for example, amount of taxes paid) might not be accessible to a private firm.


Disadvantages
An inherent disadvantage of secondary data is that they were not designed specifically to meet researchers' needs. Thus, researchers must ask how pertinent the data are to their particular project. To evaluate secondary data, researchers should ask questions such as these:
→ Is the subject matter consistent with our problem definition?
→ Do the data apply to the population of interest?
→ Do the data apply to the time period of interest?
→ Do the secondary data appear in the correct units of measurement?
→ Do the data cover the subject of interest in adequate detail?
Consider the following typical situations:
1 A researcher interested in forklift trucks finds that the secondary data on the subject are included in a broader, less pertinent category encompassing all industrial trucks and tractors. Furthermore, the data were collected five years earlier.
2 An investigator who wishes to study individuals earning more than $150 000 per year finds the top category in a secondary study reported at $90 000 or more per year.
3 A brewery that wishes to compare its per-barrel advertising expenditures with those of competitors finds that the units of measurement differ because some competitors include point-of-purchase expenditures with advertising, but others do not.
4 Data from a previous warranty card study show where consumers prefer to purchase the product, but provide no reasons.
Each of these situations shows that even when secondary information is available, it can be inadequate. The most common reasons why secondary data do not adequately satisfy research needs are: (1) outdated information, (2) variation in the definition of terms, (3) different units of measurement and (4) lack of information to verify the data's accuracy.
Information quickly becomes outdated in our rapidly changing environment. Because the purpose of most studies is to predict the future, secondary data must be timely to be useful.
Every primary researcher has the right to define the terms or concepts under investigation to satisfy the purpose of his or her primary investigation. This is little solace, however, to the investigator of the Chinese market in Australia who finds secondary data reported as 'per cent born outside Australia in Asia'. Variances in terms or variable classifications should be scrutinised to determine whether the differences are important.
The populations of interest must be described in comparable terms. Researchers frequently encounter secondary data that report on a population of interest that is similar, but not directly comparable, to their population of interest. For example, OzTAM, an Australian media research company, reports its television audience estimates by geographical areas known as regions. Each region is a geographic area (North, South East, West etc.) consisting of Statistical Local Areas3 (SLAs) in a major Australian city in which the home market commercial television stations receive a preponderance of total viewing hours. This unique population of interest is used exclusively to report television audiences. In its census of population, the ABS uses the same term (Statistical Local Areas)4 for its geographic areas. However, they are not directly comparable, because the OzTAM regions contain more densely populated suburbs, and fewer outlying or new suburbs with lower population densities. OzTAM has probably used this approach to give it a basis on which to draw a sample, at reasonable cost, that best reflects the viewing patterns of Australians in major cities, while the ABS conducts a census (interviewing all households) and therefore includes all suburbs or regions in its SLAs, regardless of population density.


Units of measurement may cause problems if they do not conform exactly to a researcher's needs. For example, timber shipments in millions of board metres are quite different from billions of tonne kilometres of timber transported by railways. Head-of-household income is not the same unit of measure as total family income. In comparing the value of international markets, a common currency such as US or Australian dollars should be used, so all figures will need to be converted to that currency. Even when comparing individual incomes from two sources there may be differences. Reported income estimates provided to the ABS, as part of its census, may differ from taxable income collected by the Australian Taxation Office (ATO). The income figure from the ABS is usually a before-tax figure, while the ATO figure represents income that is accepted as taxable by the Australian government. To complicate matters further, income as measured by the ATO can differ substantially from income that is accepted by the Child Support Agency (CSA) for the payment of child support, or by Centrelink for welfare purposes. All these organs of the Australian government prepare reports based on their own definitions of income for policy and public debate, and thus the results of reports from different parts of the same government may not be directly comparable.
Another complication is that the objective of the original primary study may dictate that the data are summarised, rounded or reported in such a way that, although the original units of measurement were comparable, the aggregated or adjusted units of measurement are not suitable for the secondary study. When secondary data are reported in a format that does not exactly meet the researcher's needs, data conversion may be necessary. Data conversion (also called data transformation) is the process of changing the original form of the data to a format suitable to achieve the research objective. For example, sales for food products may be reported in kilograms, cases or dollars. An estimate of dollars per kilogram may be used to convert dollar volume data to kilograms or another suitable format.
Another disadvantage of secondary data is that the user has no control over their accuracy. Although timely and pertinent secondary data may fit the researcher's requirements, the data could be inaccurate. Research conducted by other people may be biased to support the vested interest of the source. For example, media often publish data from surveys to identify the characteristics of their subscribers or viewers, but they will most likely exclude derogatory data from their reports. If the possibility of bias exists, the secondary data should not be used.
Investigators are naturally more prone to accept data from reliable sources, such as the Australian or New Zealand governments. Nevertheless, the researcher must assess the reputation of the organisation that gathers the data and critically assess the research design to determine whether the research was correctly implemented. Unfortunately, such evaluation may not be possible if the manager lacks information that explains how the original research was conducted. Researchers should verify the accuracy of the data whenever possible. Cross-checks of data from multiple sources – that is, comparison of the data from one source with data from another – should be made to determine the similarity of independent projects. When the data are not consistent, researchers should attempt to identify reasons for the differences or to determine which data are more likely to be correct. If the accuracy of the data cannot be established, the researcher must determine whether using the data is worth the risk. Exhibit 4.1 illustrates a series of questions that should be asked to evaluate secondary data before they are used.

data conversion
The process of changing the original form of the data to a format suitable to achieve the research objective; also called data transformation.

cross-check
The comparison of data from one source with data from another source to determine the similarity of independent projects.
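The data conversion step described above can be made concrete in a few lines of code. The sketch below is only an illustration, not an example from the text: the product names, dollar volumes and dollars-per-kilogram estimates are hypothetical, and the conversion simply divides reported dollar volume by the analyst's price-per-kilogram estimate, as the paragraph suggests.

```python
# Minimal sketch of data conversion (data transformation): reported dollar
# volumes for food products are converted to kilograms using an assumed
# estimate of dollars per kilogram. All figures are hypothetical.

dollar_sales = {            # secondary data reported in dollars ($)
    'breakfast cereal': 1_250_000,
    'frozen peas': 430_000,
}

dollars_per_kg = {          # analyst's own conversion estimates ($ per kg)
    'breakfast cereal': 6.25,
    'frozen peas': 3.40,
}

def convert_to_kilograms(sales, price_per_kg):
    """Convert dollar volume to kilograms for each product category."""
    return {product: value / price_per_kg[product]
            for product, value in sales.items()}

if __name__ == '__main__':
    kg_sales = convert_to_kilograms(dollar_sales, dollars_per_kg)
    for product, kg in kg_sales.items():
        print(f'{product}: {kg:,.0f} kg')
```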


EXHIBIT 4.1 → EVALUATING SECONDARY DATA5

Applicability to the current project:
→ Do the data help to answer questions set out in the problem definition? If no, stop.
→ Do the data apply to the time period of interest?
→ Do the data apply to the population of interest?
→ Do other terms and variable classifications presented apply to the current project?
→ Are the units of measurement comparable?
If the answer to any of these four questions is no, ask whether the data can be reworked (if possible, go to the original source of the data); if they cannot, stop.
→ Is the cost of data acquisition worth it? If no, stop.

Accuracy of the data:
→ Is there a possibility of bias? If yes, stop.
→ Can the accuracy of data collection be verified? If yes (accurate), use the data. If no (inaccurate or unsure), ask whether using the data is worth the risk; if it is not, stop.
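The decision sequence in Exhibit 4.1 can also be written as a simple screening routine. The sketch below is only one possible rendering of that flow: the answer labels are invented for the example, the yes/no values are hypothetical, and the function stops at the first failed test, mirroring the 'can the data be reworked?' branch.

```python
# A minimal sketch of the Exhibit 4.1 evaluation flow. Each answer is a
# boolean supplied by the researcher; the function returns a decision string.

def evaluate_secondary_data(answers):
    """Walk through the Exhibit 4.1 questions and return a decision."""
    if not answers['helps_problem_definition']:
        return 'Stop: data do not address the problem definition.'

    applicability = ['right_time_period', 'right_population',
                     'terms_and_classifications_apply', 'units_comparable']
    for check in applicability:
        if not answers[check] and not answers['can_be_reworked']:
            return f'Stop: fails {check} and cannot be reworked.'

    if not answers['cost_worth_it']:
        return 'Stop: cost of acquisition is not worth it.'
    if answers['possible_bias']:
        return 'Stop: possible bias in the source.'
    if answers['accuracy_verifiable']:
        return 'Use the data.'
    return ('Use the data (with caution).' if answers['worth_the_risk']
            else 'Stop: accuracy cannot be verified and risk is too high.')

if __name__ == '__main__':
    example = {                      # hypothetical answers for one dataset
        'helps_problem_definition': True, 'right_time_period': True,
        'right_population': True, 'terms_and_classifications_apply': False,
        'units_comparable': True, 'can_be_reworked': True,
        'cost_worth_it': True, 'possible_bias': False,
        'accuracy_verifiable': True, 'worth_the_risk': False,
    }
    print(evaluate_secondary_data(example))
```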


TYPICAL OBJECTIVES FOR SECONDARY DATA RESEARCH DESIGNS
It would be impossible to identify all possible purposes of marketing research using secondary data. However, it is useful to illustrate some common marketing problems that can be addressed with secondary research designs. Table 4.1 shows three general categories of research objectives: fact-finding, model building and database marketing.

TABLE 4.1 » COMMON RESEARCH OBJECTIVES FOR SECONDARY DATA STUDIES

Broad objective       Specific research example
Fact-finding          Identifying consumption patterns; tracking trends
Model building        Estimating market potential; forecasting sales; selecting trade areas and sites
Database marketing    Enhancing customer databases; developing prospect lists

Fact-finding
The simplest form of secondary data research is fact-finding. A marketer of computer games might be interested in knowing how large the video game market is in Australia. Secondary data available from IBISWorld showed that in 2011 total sales of the video games market in Australia reached $2.5 billion, with an annual growth rate from 2007 of 41 per cent per year. The largest market segment was for games on personal computers (57 per cent), followed by games on smartphones (45 per cent) and on the Nintendo DS (35 per cent). The company with the greatest market share was Nintendo Australia (18.6 per cent), followed by JB Hi-Fi Limited (13.4 per cent). The market, though, appears to operate on a small margin, with only an estimated $261 million being derived by companies from the $2.5 billion market. These simple facts would interest a researcher who was investigating the computer game market and whether it is worth entering the market or perhaps purchasing a competitor. Fact-finding can serve more complex purposes as well.

IDENTIFICATION OF CONSUMER BEHAVIOUR FOR A PRODUCT CATEGORY
A typical objective for a secondary research study might be to uncover all available information about consumption patterns for a particular product category or to identify demographic trends that affect an industry. Such information is available as secondary data provided by both government and private organisations. Exhibit 4.2 illustrates possible uses of secondary research from data provided by the New Zealand Film Commission on 2014 productions. One of the important findings in this report is that the film industry generated revenues in 2014 of NZD 3.1 billion and employs some 15 700 people.


EXHIBIT 4.2 → NEW ZEALAND FILM STATISTICS

Source: New Zealand Film Commission. Accessed at http://www.nzfilm.co.nz/ on 29 July 2015.

TREND ANALYSIS
market tracking
The observation and analysis of trends in industry volume and brand share over time.

Marketers watch for trends in the marketplace and the environment. Market tracking is the observation and analysis of trends in industry volume and brand share over time. Scanner research services and other organisations provide facts about sales volume to support this work. As shown in Exhibit 4.2, revenues from 2014-released New Zealand films were only NZD 6.5 million, with 18 films produced that year. This shows the challenge the New Zealand industry faces to repeat its successes of the past.6
Almost every large consumer goods company routinely investigates brand and product category sales volume using secondary data. This type of analysis typically involves comparisons with competitors' sales or with the company's own sales in comparable time periods. It also involves industry comparisons among different geographic areas. Table 4.2 shows the trend in domestic box office receipts in Australia. As can be seen, the Australian film industry struggles to find attendances in Australia and has, in fact, reduced its share of box office sales from 4 per cent in 2007 to 2.4 per cent in 2014. The South Korean film industry, however, is well represented at home, with around 43 per cent of box office in that country. Larger English-speaking countries like the UK have a volatile domestic share, varying between 17 and 35.7 per cent of box office. In the United States, on the other hand, virtually all films seen are American, with 91.5 per cent of box office being domestic. In India, there is also a large domestic market, with 92 per cent of box office going to local films.

TABLE 4.2 » DOMESTIC FILMS' SHARE OF BOX OFFICE IN SELECTED COUNTRIES 2007–14 (PERCENTAGE)7

Country          2007   2008   2009   2010   2011   2012   2013   2014
Australia         4      3.8    5      4.5    3.9    4.3    3.5    2.4
New Zealand       2      2      n.a.   6.3    2.3    2.2    2.6    n.a.
United Kingdom   28     31     17     24     35.7   32.1   22.1   n.a.
Japan            48     60     57     53.4   54.5   65.7   60.6   41.7
China            54.1   61     56.6   56.4   56.6   48.5   58.7   54.5
South Korea      45     39     49     46.5   48.8   n.a.   56     43
India            91     92     89     n.a.   92     n.a.   n.a.   n.a.
United States    97     97     n.a.   91.5   n.a.   n.a.   n.a.   n.a.

ENVIRONMENTAL SCANNING
In many instances, the purpose of fact-finding is simply to study the environment to identify trends. Environmental scanning entails information gathering and fact-finding designed to detect indications of environmental changes in their initial stages of development. As mentioned in Chapter 1, the Internet can be used for environmental scanning; however, there are other less recurrent means, such as periodic review of contemporary publications and reports. For example, environmental scanning for information about members of 'Generation Z', born after 1995, showed that they want a career plan and to be shown what they need to do to be successful. They also want control over their career, plus respect and validation. Access to mobile training apps and the Internet presence of the employer were also found to be important. Flexibility in workplace hours is important to them, as is being able to think independently, backed by technology, and to work things out for themselves.8 Managers can therefore keep informed about generational differences in attitudes by scanning for reports such as these and then adjusting their strategy or policies appropriately to the changes detected.

environmental scanning
Information gathering and fact-finding that is designed to detect indications of environmental changes in their initial stages of development.

push technology
Internet information technology that automatically delivers content to the researcher's or manager's desktop.

model building
The use of secondary data to help specify relationships between two or more variables. Model building can involve the development of descriptive or predictive equations.

A number of online information services, such as Dow-Jones News Retrieval, routinely collect news stories about industries, product lines and other topics of interest that have been specified by the researcher. Push technology is an Internet information technology that automatically delivers content to the researcher's or manager's desktop.9 Push technology uses 'electronic smart agents' to find information without the researcher having to do the searching. The smart agent, which is a custom software program, filters, sorts, prioritises and stores information for later viewing.10 The true value of push technology is that the researcher who is scanning the environment can specify the kinds of news and information he or she wants, have it quickly delivered to a device and view it at leisure.
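A 'smart agent' of the kind described here is essentially a filter over an incoming stream of items. The sketch below is a highly simplified, hypothetical illustration of the filter-sort-store idea; it is not the software of any actual push-technology service, and the keywords and news items are invented.

```python
# Minimal sketch of an 'electronic smart agent': filter incoming news items
# against the researcher's keywords, sort by relevance and recency, and keep
# the result for later viewing. The items and keywords are hypothetical.

from datetime import date

keywords = {'streaming', 'vinyl', 'box office'}   # topics the researcher wants

incoming_items = [                                # a pretend news feed
    {'headline': 'Vinyl sales climb again', 'date': date(2016, 3, 1)},
    {'headline': 'New stadium opens', 'date': date(2016, 3, 2)},
    {'headline': 'Streaming services overtake downloads', 'date': date(2016, 3, 3)},
]

def relevance(item):
    """Count how many of the researcher's keywords appear in the headline."""
    text = item['headline'].lower()
    return sum(1 for word in keywords if word in text)

def run_agent(items):
    """Filter out irrelevant items; sort the rest, most relevant and newest first."""
    matched = [item for item in items if relevance(item) > 0]
    return sorted(matched, key=lambda i: (relevance(i), i['date']), reverse=True)

if __name__ == '__main__':
    stored_for_later = run_agent(incoming_items)   # 'stored' for later viewing
    for item in stored_for_later:
        print(item['date'], '-', item['headline'])
```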

Model building
The second general objective for secondary research, model building, is more complicated than simple fact-finding. Model building involves specifying relationships between two or more variables, perhaps extending to the development of descriptive or predictive equations. Models need not include complicated mathematics, though. In fact, decision-makers often prefer simple models that everyone can readily understand over complex models that are difficult to comprehend. For example, market share is company sales divided by industry sales. Although some may not think of this simple calculation as a model, it represents a mathematical model of a basic relationship. We will illustrate model building by discussing three common objectives that can be satisfied with secondary research: estimating market potential, forecasting sales and selecting sites.

ESTIMATING MARKET POTENTIAL FOR GEOGRAPHIC AREAS
Marketers often estimate market potential using secondary data. In many cases exact figures may be published by a trade association or another source. However, when the desired information is unavailable, the researcher may estimate market potential by transforming secondary data from two or more sources. For example, managers may find secondary data about market potential for a country or other large geographic area, but this information may not be broken down into smaller geographical areas, such as by metropolitan area, or in terms unique to the company, such as sales territory. In this type of situation, researchers often need to make projections for the geographic area of interest.
An extended example will help explain how secondary data can be used to calculate market potential. A marketer of shoes is considering building a processing plant in the Asia Pacific region. Managers wish to estimate market potential for Australia, New Zealand and selected countries in the region. Secondary research uncovered data for per capita consumption of shoes and population projections for the year 2015. The data for the six Asia Pacific countries appear in Table 4.3. (The per capita expenditure on shoes was obtained from Euromonitor International, http://www.euromonitor.com.12 The population estimates are based on information from the CIA's The World Factbook database.)
The former director of MIT's Media Lab, Jayant Krishnamurthy, is designing and building commonsense online agents to enable the development of more innovative recommendation systems that are more interactive and user-friendly than traditional collaborative filtering systems. The research hopes to discover the characteristics of products that drive user ratings so that these can be built into intelligent recommendation agents and effective product exploration tools.11

TABLE 4.3 » MARKET POTENTIAL FOR SHOES IN AUSTRALIA, NEW ZEALAND AND SELECTED ASIA PACIFIC COUNTRIES

Country        Population for 2015    2011 per capita expenditure (US$)    Market potential estimate (US$)
Australia      23 852 183             175.6                                $4 188 443 334.80
New Zealand    4 509 700              158.7                                $715 689 390.00
Singapore      5 470 000              111.8                                $611 546 000.00
Malaysia       30 915 238             38.5                                 $1 190 236 663.00
China          1 375 000 000          29.6                                 $40 700 000 000.00
Indonesia      249 900 000            17.4                                 $4 348 260 000.00

To calculate market potential for Singapore in the year 2015, multiply that country's population in the year 2015 by its per capita expenditure on shoes in 2011, the nearest available estimate:
5 470 000 × 111.8 = $611 546 000.00


In New Zealand, which has a similar population to Singapore, the market potential for footwear is higher, at $715 689 390. As Table 4.3 reveals, although Malaysia's population is much higher than Australia's, it has a much lower market potential than Australia. Note that while China has the highest market potential, it has the second-lowest per capita expenditure on shoes. A marketer of luxury shoes may therefore prefer to examine markets with much higher per capita expenditure on shoes, such as Australia or New Zealand.
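Because the Table 4.3 calculation is simply population multiplied by per capita expenditure, it is easy to reproduce. The short sketch below recomputes the market potential estimates from the population and expenditure figures quoted in the table; it is an illustration only, not part of the original source material.

```python
# Recompute the market potential estimates in Table 4.3:
# market potential = population (2015) x per capita expenditure on shoes (US$, 2011).

countries = {
    # country: (population for 2015, 2011 per capita expenditure in US$)
    'Australia':   (23_852_183, 175.6),
    'New Zealand': (4_509_700, 158.7),
    'Singapore':   (5_470_000, 111.8),
    'Malaysia':    (30_915_238, 38.5),
    'China':       (1_375_000_000, 29.6),
    'Indonesia':   (249_900_000, 17.4),
}

def market_potential(population, per_capita_spend):
    """Return the market potential estimate in US dollars."""
    return population * per_capita_spend

if __name__ == '__main__':
    for country, (population, spend) in countries.items():
        estimate = market_potential(population, spend)
        print(f'{country}: US${estimate:,.2f}')
```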

REAL WORLD SNAPSHOT
NEW TRENDS: THE RETURN OF VINYL RECORDS AND THE MOVE TO ONLINE STREAMING OF MUSIC13

Until a few years ago, selling music involved recordings on CDs, and then it moved towards digital downloads. Recent research shows that consumers are returning to vinyl classics: a Nielsen report recorded a 52 per cent increase in the sales of vinyl records, the largest yearly growth since 1991, with some 9.2 million vinyl albums sold in the US in 2014. Interestingly, CDs still comprise the majority of sales – 140.8 million in 2014, compared to digital sales of 106.5 million albums. Audio streaming, based on services such as Pandora and Spotify, grew 60.5 per cent from 2013 to 2014, with some 78.6 billion songs streamed in the United States. Thus the marketplace for music appears to be changing yet again, which presents artists and music companies with significant challenges.
Similar trends are also occurring in the Australian market.14 In 2014, CD albums made up around $115 million in sales (down around 19 per cent from 2013). Digital album sales remained basically unchanged in 2014 at $67 million, while subscription income (paid-for streaming services) was worth $23 million, more than doubling from the previous year (up 111.26 per cent). Vinyl album sales, though small at $6.46 million, increased 127 per cent from the previous year.
According to online streamer Spotify, in 2015 the most streamed artists in Australia and New Zealand were:
Australia
1 Ed Sheeran
2 The Weeknd
3 Drake
4 Kanye West
5 Calvin Harris
New Zealand
1 Ed Sheeran
2 The Weeknd
3 Six60
4 Sam Smith
5 Eminem

Table 4.4 illustrates trend projection using a moving average projection of growth rates. The wholesale value of Australian CD sales is secondary data from the Australian Recording Industry Association (ARIA) (see http://www.aria.com.au). The moving average is the sum of growth rates for the past three years divided by three (number of years). The resulting number is a forecast of the percentage increase or decrease in sales for the coming year.



TABLE 4.4 » SALES FORECAST USING SECONDARY DATA AND THE MOVING AVERAGE METHOD

Year | Australian sales of CD singles by wholesale value ($000s) | Percentage rate of growth from the previous year | Three-year moving average rate of growth (percentage)
2007 | 611 | −44.1 | −36.3
2008 | 3 570 | −46.8 | −43.4
2009 | 1 314 | −63.1 | −51.3
2010 | 159 | −87.8 | −65.9
2011 | 151 | 5.0 | −48.6
2012 | 697 | 461.6 | 126.3
2013 | 363 | −47.9 | 106.0
2014 | 632 | 74.1 | 195.7

Using the three-year average growth rate of 195.7 per cent for the 2012, 2013 and 2014 sales periods, we can forecast CD single sales for 2015 (in $000s) as follows: 632 + (632 × 1.957) = 1868.824, or $1 868 824

Moving average forecasting is best suited to a static competitive environment; more dynamic situations call for other sales forecasting techniques. As the previous example suggests, changes in technology and consumer use patterns – downloading music, the use of streaming services and the return of vinyl records – now make forecasting CD sales more difficult. Statistical trend analysis using secondary data can be much more advanced than this simple example, and many statistical techniques build forecasting models using secondary data. This chapter emphasises secondary data research rather than statistical analysis. Chapter 14 (Bivariate statistical analysis: Tests of association) and Chapter 15 (Multivariate statistical analysis) explain more sophisticated statistical model-building techniques for forecasting sales.
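The projection step in the worked example above is easy to script. The following is a minimal sketch only, assuming the three-year average growth rate (195.7 per cent) has already been read from Table 4.4; the function name is purely illustrative.

```python
# Moving average forecast: average the three most recent yearly growth rates,
# then grow the most recent year's sales by that average rate.

def project_next_period(latest_value, avg_growth_pct):
    """Grow the most recent value by an average percentage growth rate."""
    return latest_value * (1 + avg_growth_pct / 100)

# Worked example from the text: 2014 CD single sales of 632 ($000s) and a
# three-year average growth rate of 195.7 per cent.
forecast = project_next_period(632, 195.7)
print(f"{forecast:,.3f} thousand dollars")  # 1,868.824, i.e. $1 868 824
```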

USE OF SECONDARY DATA WITH SURVEY DATA

Marketing managers often combine secondary data with survey results in order to determine important market parameters such as market potential, the size of an existing market, the extent of consumer savings from switching providers and the extent of bill shock, or overspend, in a market.15 Health and welfare policy research may combine survey findings with secondary data to estimate the costs of accidents or diseases, or the potential savings from quitting smoking, exercising and driving within the speed limit. An example of this approach is shown below for the mobile phone market, where survey results are combined with secondary data – in this case from the Australian Bureau of Statistics – to produce an estimate of the total amount of bill shock (overcharging) in the mobile phone industry. The approach combines survey measures with larger census data to estimate the extent to which consumers save by switching mobile phone providers in Australia.

Estimating how much consumers save by switching mobile phone providers in Australia

Online survey results from 1600 mobile phone users, representing a sample of the Australian telecommunications market, showed that 16 per cent of consumers had switched mobile phone providers, with 79 per cent of those who switched reporting that they experienced cost savings. The median amount reported in the


survey was $26.52 a month, which means that by switching providers consumers can save $318.24 a year. Table 4.5 shows that when these figures are extrapolated across the Australian mobile service provider market, consumers can collectively save $623 million a year by switching providers.

TABLE 4.5 » ESTIMATE OF HOW MUCH CONSUMERS SAVED BY SWITCHING MOBILE PHONE PROVIDERS IN AUSTRALIA

Item | Source | Unit | Assumptions | Estimate
Population 18+ March 2011 | ABS | million | 17.2 | 17.2
Population 18+ with mobile phone | survey | million | 90% | 15.5
Population who switched | survey | million | 16% | 2.5
Switched who saved money | survey | million | 79% | 2.0
Median savings per month | survey | $ | | 26.52
Estimate of saving by switching | estimate | $ million | 26.52 | 623

Note: Australian Bureau of Statistics (ABS) 31010DO002_201103 Australian Demographic Statistics, Mar 2011. Other estimates based on the Macquarie University survey data. (Two million consumers who switched and reported they saved money × median amount saved per month × 12) = $623 million.

Note that these savings would be even larger if we also included those consumers who reported in the same survey that they were seriously considering switching but had not yet done so. Extrapolating to the 41 per cent of the population who were considering switching represents a market value of $2.75 billion and potential additional switching savings.
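The extrapolation in Table 4.5 is a chain of simple multiplications, so it can be checked with a short script. This is a minimal sketch, not the researchers' own code; the proportions and the ABS population base are taken from the table, and the variable names are illustrative.

```python
# Extrapolating survey proportions across the adult population (Table 4.5).
adults_millions = 17.2            # ABS: population aged 18+, March 2011
share_with_mobile = 0.90          # survey
share_switched = 0.16             # survey
share_switched_and_saved = 0.79   # survey
median_saving_per_month = 26.52   # survey, dollars

switched_and_saved_millions = (adults_millions
                               * share_with_mobile
                               * share_switched
                               * share_switched_and_saved)

annual_saving_millions = switched_and_saved_millions * median_saving_per_month * 12
print(f"{switched_and_saved_millions:.1f} million consumers saved money by switching")
print(f"Estimated collective saving: ${annual_saving_millions:.0f} million a year")
```

The script prints roughly 2.0 million switchers who saved money and an estimated $623 million in collective annual savings, matching the table.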

Data mining

Large corporations' decision support systems often contain millions or even hundreds of millions of records of data. These complex data volumes are too large to be understood by managers. Consider these examples:
→→ More than 500 hours of video footage is uploaded to YouTube every minute.
→→ Google reported that in 2013 there were some 30 trillion webpages, a 30-fold increase from 2008.
→→ There are 350 billion text or SMS messages exchanged worldwide every month. Our smartphones generate mobile network data regardless of whether we send an SMS.
→→ And perhaps more impressively, 90 per cent of the world's data was created in the past two years alone, according to Michael Howard, vice-president of marketing, EMC division Greenplum.16

Two points about data volume are important to keep in mind. First, relevant marketing data are often held in independent and unrelated files. Second, the number of distinct pieces of information each data record contains is often large. When the number of distinct pieces of information contained in each data record and the data volume are too large, end users do not have the capacity to make sense of it all. Data mining helps clarify the underlying meaning of the data. The term data mining refers to the use of powerful computers to dig through volumes of data to discover patterns about an organisation's customers and products. It is a broad term that applies to many different forms of analysis.

data mining The use of powerful computers to dig through volumes of data to discover patterns about an organisation's customers and products. It is a broad term that applies to many different forms of analysis.
neural network A form of artificial intelligence in which a computer is programmed to mimic the way that the human brain processes information.


For example, neural networks are a form of artificial intelligence in which a computer is programmed to mimic the way that the human brain processes information. One computer expert put it this way:

A neural network learns pretty much the way a human being does. Suppose you say 'big' and show a child an elephant, and then you say 'small' and show her a poodle. You repeat this process with a house and a giraffe as examples of 'big' and then a grain of sand and an ant as examples of 'small'. Pretty soon she will figure it out and tell you that a truck is 'big' and a needle is 'small'. Neural networks can similarly generalise by looking at examples.17

REAL WORLD SNAPSHOT

market-basket analysis A form of data mining that analyses anonymous point-ofsale transaction databases to identify coinciding purchases or relationships between products purchased and other retail shopping information. customer discovery Involves mining data to look for patterns identifying who is likely to be a valuable customer.

SOCIAL LISTENING AND TAGGING WITH METADATA18

The wealth of information in social media sites such as Facebook and Twitter has provided market researchers with emerging challenges and opportunities. Many organisations now employ 'social listening' consultants to monitor and analyse the sentiment expressed in social media. Foxtel in Australia, for example, used software to identify fans who posted about its latest series The Walking Dead; the analysis found that 5 per cent of the show's fans were responsible for all the referrals about the show. The National Australia Bank has a team of seven social media staff who cover everything from customer service to campaign production. The team receives some 5000 comments and resolves 600 customer service requests through social media every month. The metadata that we all provide on webpages – through tags on photographs in Facebook, our likes and dislikes – also has value to marketers. Using software called TagMan, Virgin Atlantic is able to determine which visits to which websites ultimately lead to purchases. Travel agent Thomas Cook used the same approach and saved some 25 per cent of its affiliate marketing budget.19

Wal-Mart, the largest retailer in the USA and possibly the world, uses data mining. Wal-Mart's information system houses more than seven terabytes of data on point of sale, inventory, products in transit, market statistics, customer demographics, finance, product returns and supplier performance. The data are mined to develop 'personality traits' for each of Wal-Mart's 3 000-plus outlets, which Wal-Mart managers use to determine product mix and presentation for each store. Wal-Mart's data-mining software looks at individual items for individual stores to decide the seasonal sales profile of each item. The data-mining system keeps a year's worth of data on the sales of 100 000 products, and


predicts which items will be needed in each store.20

Market-basket analysis is a form of data mining that analyses anonymous point-of-sale transaction databases to identify coinciding purchases or relationships between products purchased and other retail shopping information.21 Consider this example about patterns in customer purchases: Osco Drugs in the USA mined its databases provided by checkout scanners and found that when men go to its stores to buy nappies for babies in the evening between 6 p.m. and 8 p.m., they sometimes walk out with a six-pack of beer as well. Knowing this behavioural pattern, it's possible for store managers in supermarket chains to lay out their stores so that these items are closer together.22

An advertisement for one data-mining system puts it this way: 'To our data mining system, they're twins. Because both order milk with their hamburgers' – another example of market-basket analysis.

The example of a credit card company with large volumes of data illustrates a data-mining application known as customer discovery. The credit card company will probably track information about each customer: age, gender, number of children, job status, income level, past credit history and


so on. Very often the data about these factors will be mined to find the patterns that make a particular individual a good or bad credit risk.23 When a company knows the identity of the customer who makes repeated purchases from the same organisation, an analysis can be made of sequences of purchases. Sequence discovery, the use of data mining to detect sequence patterns, is a popular application among direct marketers such as


database marketing The use of customer databases to promote one-to-one relationships with customers and create precisely targeted promotions.

catalogue retailers. A catalogue merchant has information for each customer, revealing the sets of products that the customer buys in every purchase order. A sequence discovery function can then be used to discover the set of purchases that frequently precedes the purchase of, say, a microwave oven. As another example, sequence discovery used on a set of insurance claims could lead to the identification of frequently occurring medical procedures performed on patients, which in turn could be used to detect cases of medical fraud. Data mining requires sophisticated computer resources and it is expensive. That is why companies such as DataMind, IBM, Oracle, Information Builders and Acxiom Corporation offer data-mining services. Customers send the databases they want analysed and let the data-mining company do the ‘number crunching’.
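A toy example may help make market-basket analysis concrete. The sketch below is purely illustrative – the transactions are invented, and real analyses run over millions of anonymised point-of-sale records with specialised software – but it shows the underlying idea of counting how often items are bought together and expressing the result as support and confidence.

```python
# Toy market-basket analysis: count item pairs that occur in the same basket,
# then report support (share of baskets with both items) and confidence
# (probability of the second item given the first).
from itertools import combinations
from collections import Counter

transactions = [          # invented baskets for illustration
    {"nappies", "beer", "milk"},
    {"nappies", "beer"},
    {"bread", "milk"},
    {"nappies", "milk"},
    {"beer", "crisps"},
]

item_counts = Counter()
pair_counts = Counter()
for basket in transactions:
    item_counts.update(basket)
    pair_counts.update(combinations(sorted(basket), 2))

n = len(transactions)
for (a, b), count in pair_counts.most_common(3):
    support = count / n
    confidence = count / item_counts[a]
    print(f"{a} & {b}: support {support:.2f}, confidence {confidence:.2f}")
```

The same counting idea, extended to the order in which purchases occur, underlies sequence discovery.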

Database marketing and customer relationship management

A CRM (customer relationship management) system is a decision support system that manages the interactions between an organisation and its customers. A CRM system maintains customer databases containing customers' names, addresses, phone numbers, past purchases, responses to past promotional offers and other relevant data, such as demographic and financial data. Database marketing is the practice of using CRM databases to develop one-to-one relationships and precisely targeted promotional efforts with individual customers. For example, a fruit catalogue company's CRM contains a database of previous customers, including what purchases they made during the Christmas holidays. Each year the company sends last year's gift list to customers to help them send the same gifts to their friends and relatives.
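At its simplest, database marketing of this kind is a query against the CRM database. The sketch below is a hypothetical illustration only – the records and field names are invented – showing how last year's Christmas gift buyers might be pulled out for this year's mailing.

```python
# Selecting customers for a targeted mailing from a made-up CRM extract:
# everyone who bought a gift basket last December gets this year's reminder.
from datetime import date

purchases = [  # illustrative records: (customer, product, purchase date)
    ("A. Chen",  "Summer Fruit Basket", date(2015, 12, 14)),
    ("B. Singh", "Citrus Sampler",      date(2015, 6, 2)),
    ("C. Lopez", "Deluxe Gift Basket",  date(2015, 12, 20)),
]

reminder_list = [customer for customer, product, when in purchases
                 if when.month == 12 and "Basket" in product]
print(reminder_list)   # ['A. Chen', 'C. Lopez']
```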

REAL WORLD SNAPSHOT
BIG DATA AND LOVE AT FIRST SIGHT24

Big data. It’s the biggest buzzword in the technology business. Rick Smolan, the man who co-founded the ‘America 24/7’ and ‘Day in the Life’ photography series, has kick-started an ambitious project called ‘The Human Face of Big Data’. It’s a grand attempt to tap into human consciousness and create a snapshot of global sentiment using the web, volunteers on smartphones and more than 150 photojournalists around the world. The project has already collected more than two million individual responses to questions posed by an Android app, downloadable from the web. A forthcoming version for iPhones is expected to further boost responses. The results, available at http://hfobd.metalayer.com, are curious. Answers to the question ‘Have you experienced love at first sight?’ can be blended with a range of sub-filters including gender, location and other question such as, ‘Are you more like your mother or father?’ A random sample revealed that 78 per cent of men in Singapore who say that they are more like their mothers have experienced love at first sight. Only 60 per cent of men who are more like their father have had the same experience. Because database marketing requires vast amounts of CRM data compiled from numerous sources, secondary data are often acquired for the exclusive purpose of developing or enhancing databases. The transaction record – which often lists the item purchased, its value, customer name, address and postcode – is the building block for many databases. This may be supplemented with data that customers




provide directly, such as data on a warranty card, and by secondary data purchased from third parties. For example, credit services may sell databases about applications for loans, credit card payment history and other financial data. Several companies, such as Dun & Bradstreet, Euromonitor, Donnelley Marketing (with its BusinessContentFile and ConsumerContentFile services) and Claritas (with PRIZM), collect primary data and then sell demographic data that can be related to small geographic areas, such as those with a certain postcode. (Remember that when the vendor collects the data, they are primary data, but when the database marketer incorporates the data into his or her database, they are secondary data.) Now that some of the purposes of secondary data analysis have been addressed, it is appropriate to discuss sources of secondary data.

TIPS OF THE TRADE

MAKING BIG DATA WORK FOR YOU25

An article in the Harvard Business Review suggests organisations should consider the following steps when trying to capitalise on big data – the myriad of information available from customers' online visits, credit card transactions and account data:
1 Choose the right data. Focus on useable data you already have.
2 Source data creatively. Often the data may be available internally and may be combined with information from social media.
3 Get the necessary IT support. It is important to quickly connect the most powerful data to managerial analytical models.
4 Build models that predict and optimise business outcomes.
5 Develop business-relevant analytics that can be put to use.
6 Embed analytics into simple tools for the front lines.

SOURCES OF SECONDARY DATA

Chapter 1 classified secondary data as either internal to the organisation or external. Modern information technology makes this distinction seem somewhat simplistic. Some accounting documents are indisputably internal records of the organisation; researchers in another organisation cannot have access to them. Clearly, a book published by the federal government and located at a public library is external to the company. However, in today's world of electronic data interchange, the data that appear

in a book published by the federal government may also be purchased from an online information vendor for instantaneous access and subsequently stored in a company's decision support system. Internal data should be defined as data that originated in the organisation – that is, data created, recorded or generated by the organisation. Internal and proprietary data is perhaps a more descriptive term.

internal and proprietary data Secondary data that originate inside the organisation.

WHAT WENT RIGHT?

Bees aren't the only creatures that buzz. Consumers do, too, and more and more they create that buzz online. Just think about it: the Internet is filled with billions of consumer conversations. Obviously, these billions of data points contain a lot of useful information. But a lot of it is useless, too. How can a firm make sense of this?

One solution: data-mining software designed for the blogosphere. Buzzmetrics, a part of Nielsen Online, serves firms by monitoring Internet conversation and letting them know whether conversation about their brand is up or down in any given time period. Want to know if an AFL ad had any impact? The buzz


an ad creates from the time it becomes public is a good indicator. If there’s no buzz, there’s probably not much sizzle in terms of market effectiveness. Is Dancing with the Stars still popular? If people aren’t talking about it online then that show too may be losing its sizzle. For large brands, companies like Buzzmetrics monitor thousands of websites for brand mentions and to ascertain whether those mentions are positive or negative. Thus, secondary data can provide a buzz that can come with or without a sting based on whether the conversations spread good or bad news about the brand.26


Internal and proprietary data sources

Most organisations routinely gather, record and store internal data to help them solve future problems. An organisation's accounting system can usually provide a wealth of information. Routine documents such as sales invoices allow external financial reporting, which in turn can be a source of data for further analysis. If the data are properly coded into a modular database in the accounting system, the researcher may be able to conduct more detailed analysis using the decision support system. Sales information can be broken down by account or by product and region; information related to orders received, back orders and unfilled orders can be identified; and sales can be forecast on the basis of past data. Researchers frequently aggregate or disaggregate internal data. Other useful sources of internal data include salespeople's call reports, customer complaints, service records, warranty card returns and other records. For example, a computer service firm used internal secondary data to analyse sales over the previous three years, categorising business by industry, product, purchase level and so on. The company discovered that 60 per cent of its customers represented only 2 per cent of its business and that nearly all of these customers came through telephone directory advertising. This simple investigation of internal records showed that, in effect, the firm was paying to attract customers it did not want.
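Aggregating and disaggregating internal records of this kind is routine spreadsheet or database work. The short sketch below is illustrative only, using invented invoice figures and the pandas library, to show sales broken down first by region and then by region and product.

```python
# Aggregating internal sales records (invented figures) by region and product.
import pandas as pd

invoices = pd.DataFrame({
    "region":  ["NSW", "NSW", "VIC", "VIC", "QLD"],
    "product": ["A",   "B",   "A",   "A",   "B"],
    "sales":   [12000, 8000, 15000, 7000, 5000],
})

by_region = invoices.groupby("region")["sales"].sum()            # aggregate
by_region_product = invoices.groupby(["region", "product"])["sales"].sum()  # disaggregate further
print(by_region)
print(by_region_product)
```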

External data: the distribution system

External data are generated or recorded by an entity other than the researcher's organisation. The government, newspapers and journals, trade associations and other organisations create or produce information. Traditionally, this information has been in published form, perhaps available from a public library, trade association or government agency. Today, however, computerised data archives and electronic data interchange make external data as accessible as internal data. Exhibit 4.3 illustrates some traditional and some modern ways of distributing information.

Information as a product and its distribution channels

Because secondary data have value, they can be bought and sold like other products. And just as bottles of perfume or plumbers' wrenches may be distributed in many ways, secondary data also flow through various channels of distribution. Many users, such as Standard & Poor's top

external data Data created, recorded or generated by an entity other than the researcher’s organisation.


EXHIBIT 4.3 → INFORMATION AS A PRODUCT AND ITS DISTRIBUTION CHANNELS

Traditional distribution of secondary data
→→ Indirect channel using an intermediary: information producer (federal government) → library (storage of government documents and books) → company user
→→ Direct channel: information producer (federal government) → company user

Modern distribution of secondary data
→→ Indirect computerised distribution using an intermediary: information producer A (federal government – census data), information producer B (grocery store – retail scanner data) and information producer C (audience research company – television viewing data) → vendor/external distributor (computerised database integrating all three data sources for any geographic area) → company user
→→ Direct, computerised distribution: information producer's (just-in-time inventory partner) computerised database → company user


500 corporations, purchase documents and computerised census data directly from the government. However, many small companies get census data from a library or another intermediary or vendor of secondary information. Information of various kinds – even entire websites complete with images and links – can be gathered according to the researcher’s specifications and delivered by email or continuously through an Internet connection. For example, BackWeb (http://www.backweb.com) empowers companies to adapt quickly to changing market conditions through direct interaction with their employees, partners and customers. BackWeb’s Internet software allows businesses to efficiently gather, target and deliver sizable digital data in any format – audio, video, software files, html and others – to user desktops across their extended enterprise. Users browse the information on their own computers at their convenience.

LIBRARIES

Traditionally, the vast storehouses of information in libraries have served as a bridge between users and producers of secondary data. The library staff deals directly with the creators of information, such as the federal government, and intermediate distributors of information, such as abstracting and indexing services. The user need only locate the appropriate secondary data on the library shelves. Libraries provide collections of books, journals, newspapers and so on for reading and reference. They also stock many bibliographies, abstracts, guides, directories and indexes, as well as offering access to basic databases. The word 'library' typically connotes a public or university facility. However, many major corporations and government agencies also have libraries. A corporate librarian's advice on sources of industry information or the United Nations librarian's help in finding statistics about international markets can be invaluable.

THE INTERNET

The Internet is, of course, a major distribution channel for secondary data. Its creation has added an international dimension to the acquisition of secondary data. For example, http://www.libraryspot.com provides links to online libraries, including law libraries, medical libraries and music libraries. Its reference desk features links to calendars, dictionaries, encyclopaedias, maps and other sources typically found at a traditional library's reference desk.

A careful evaluation needs to be made of online sources, as anyone can publish anything on the Web. Information on the Web may not be objective; many websites have commercial sponsors or vested interests (for example, viewpoints on politics are often expressed on the Web). Information on the Web can also date very quickly, and this may be difficult to detect if the source of the primary information (or the date when the information was posted online) is not provided. The 'Tips of the trade' box provides a checklist for assessing the quality of information provided by websites. Table 4.6 lists some of the more popular Internet addresses where secondary data may be found.



TABLE 4.6 » SELECTED INTERNET SITES FOR SECONDARY DATA

Name | Description | URL
Google | Search engine that serves as a gateway to all kinds of sites on the Web. | http://www.google.com
Media Federation of Australia | Provides links to all major media research companies in Australia. | http://www.mediafederation.org.au/media_research.htm
Libraries Australia Search | A portal that links the catalogues of all major Australian libraries. | http://librariesaustralia.nla.gov.au/apps/kss
Australian Bureau of Statistics | Demographic, economic and social information on Australia. Wide ranging and usually available free of charge. | http://www.abs.gov.au
Factiva | Access to worldwide business information from a number of newspaper and magazine sources; also provides company financial and share-market information. A fee-subscription service. | http://www.factiva.com
AdNews | Provides content on marketing media, advertising and public relations in Australia and New Zealand. | http://www.adnews.com.au
Australian Market and Social Research Society | Provides codes of conduct, industry news, and training and development opportunities in the Australian market research industry. | http://www.amsrs.com.au
Datamonitor | Compiles business intelligence reports for a number of industries and country-wide reports. The website provides summaries of reports available for purchase. | http://www.datamonitor.com
IBISWorld Australia | Analytical industry reports on 500 Australian industries as well as 17 industry sector overviews. Also reports available for the top 2 000 companies in Australia, based on total revenue. | http://www.ibisworld.com.au
Snapshots (Asia Pacific) via Proquest | Provides information on market size, market segmentation, market share, market distribution, market forecasts, socioeconomic data and further sources for selected industries in the Asia Pacific. | http://www.proquest.com
Wall Street Journal–Asia | Provides a continually updated view of business news around Asia. | http://online.wsj.com/public/asia
Euromonitor International | Provides information on consumer trends and industries from around the world. | http://www.euromonitor.com
Nerac | Database of technology, science and patents. US-based, fee-subscription service. | http://www.nerac.com
Brint.com: The BizTech Network | Business and technology portal and global network for e-business, information technology and knowledge management. | http://www.brint.com


TIPS OF THE TRADE
THE GOOD, THE BAD AND THE UGLY – OR WHY IT'S A GOOD IDEA TO EVALUATE WEBSITES27

Authority
»» Is there an author? Is the page signed?
»» Is the author qualified? An expert?
»» Who is the sponsor?
»» Is the sponsor of the page reputable? How reputable?
»» Is there a link to information about the author or the sponsor?
»» If the page includes neither a signature nor indicates a sponsor, is there any other way to determine its origin?
»» Look for a header or footer showing affiliation.
»» Look at the URL: www.abs.gov.au
»» Look at the domain: .gov, .edu, .com, .ac.uk, .org, .net etc.
Rationale
Anyone can publish anything on the Web. It is often hard to determine a Web page's authorship. Even if a page is signed, qualifications are not usually provided. Sponsorship is not always indicated.

Accuracy
»» Is the information reliable and error-free?
»» Is there an editor or someone who verifies/checks the information?
Rationale
See 'Authority' above. Unlike traditional print resources, Web resources rarely have editors or fact-checkers. Currently, no Web standards exist to ensure accuracy.

Objectivity
»» Does the information show a minimum of bias?
»» Is the page designed to sway opinion?
»» Is there any advertising on the page?
Rationale
Frequently the goals of the sponsors/authors are not clearly stated. Often the Web serves as a virtual 'Hyde Park Corner' – a soapbox.

Currency
»» Is the page dated?
»» If so, when was the last update?
»» How current are the links? Have some expired or moved?
Rationale
Publication or revision dates are not always provided. If a date is provided, it may have various meanings. For example, it may indicate when the material was first written, when it was first placed on the Web [or] when it was last revised.

Coverage
»» What topics are covered?
»» What does this page offer that is not found elsewhere?
»» What is its intrinsic value?
»» How in-depth is the material?
Rationale
Web coverage often differs from print coverage. Frequently, it's difficult to determine the extent of coverage of a topic from a Web page. The page may or may not include links to other Web pages or print references. Sometimes Web information is 'just for fun', a hoax, someone's personal expression that may be of interest to no one, or even outright silliness.


VENDORS

The information age offers many channels, besides libraries, through which to access data. Many external producers make secondary data available either directly from the organisations that produce the data or through intermediaries, which are often called vendors. Vendors such as the Dow Jones News Retrieval Service now allow managers to access thousands of external databases via computers and telecommunications systems. Dun & Bradstreet (http://www.dnb.com.au) specialises in providing information about thousands of companies' financial situations and operations, marketing databases and credit risk. Other market research organisations in Australia and New Zealand, such as Roy Morgan Research (http://www.roymorgan.com.au) and Nielsen Australia (http://www.nielsen.com/au/en.html), produce summary reports on subjects as diverse as politics, social trends, newspaper readership, and television and radio ratings.

PRODUCERS

Classifying external secondary data by the nature of the producer of information yields five basic sources:
→→ publishers of books and periodicals
→→ government sources
→→ media sources
→→ trade association sources
→→ commercial sources.
The following section discusses each type of secondary data source.

EXPLORING RESEARCH ETHICS

PRIVACY ISSUES IN SECONDARY RESEARCH

According to the Australian government, companies that collect information about an individual must ensure that the individual understands:
»» the purpose for which they are collecting your personal information
»» how they are going to use it
»» who they are going to give it to
»» how you can access and correct the information they hold about you.
They must also make sure that they collect your personal information in a fair and lawful way, and that the personal information they hold on you is accurate, up to date and secure (see http://www.privacy.gov.au/individuals/business). Similar privacy principles also apply in New Zealand (http://www.privacy.org.nz/the-privacy-act-and-codes/privacy-principles).

BOOKS AND PERIODICALS

Books and periodicals found in a library are considered by some researchers to be the quintessential secondary data source. A researcher who finds books on a topic of interest obviously is off to a good start. Professional journals, such as the Journal of Marketing, Journal of Marketing Research, Journal of the Academy of Marketing Science, Marketing Research: A Magazine of Management and Application and Public Opinion Quarterly, as well as commercial business periodicals such as the Wall Street Journal, Fortune and Business Week, contain much useful material. Sales and Marketing Management's Survey of Buying Power is a particularly useful source of information about markets. To locate data in periodicals, indexing services such as Proquest and Business Periodicals Index and the Wall Street Journal Asia Index are


very useful. Guides to data sources are also helpful. For example, American Statistical Index and Business Information Sources are very valuable sources.

GOVERNMENT SOURCES

Government agencies produce data prolifically. Most of the data published by national governments can be counted on for accuracy and quality of investigation. The ABS, for example, provides detailed reports on the economy, population, demographic changes and industry sectors, as well as information on education, health, the environment and social issues of concern to Australians (more numerous and detailed categories can be seen on the organisation's website at http://www.abs.gov.au). Information is collected via a census of the Australian population every five years, or by surveys of samples on specialised topics. Other important government information is produced via reports from ministries and departments at both the national and state level. Many cities and states publish information on the Internet. Most search engines have directory entries that allow easy navigation to a particular state's website. A researcher using Yahoo!, for example, needs only to click 'Local' to find numerous paths to information about states.

MEDIA SOURCES

Information on a broad range of subjects is available from broadcast and print media. The Australian Financial Review and Business Review Weekly are valuable sources for information on the economy and many industries. Other media, such as The Australian daily broadsheet, frequently commission research studies about various aspects of Australians' lives, such as financial affairs, and make reports of survey findings available to potential advertisers free of charge. Data about the readers of magazines and the audiences for broadcast media are typically profiled in media kits and advertisements. Information about special-interest topics may also be available. For example, The Australian Woodworker magazine reports that household expenditure on woodworking products and machines is A$70 million to A$90 million a year.28 The Australian Woodworker has more than 5000 subscribers and a readership estimated at 60 000 per issue. Its customers spend 60 per cent of their woodworking budget on tools and machines and 40 per cent on consumables. Data such as these are plentiful because the media like to show that their vehicles are viewed or heard by advertisers' target markets. These types of data should be evaluated carefully, however, because they often cover only limited aspects of a topic. Nevertheless, they can be quite valuable for research, and they are generally free of charge.

TRADE ASSOCIATION SOURCES

Trade associations, such as the Australian Marketing Institute (AMI) or the Australian Market and Social Research Society (AMSRS), serve the information needs of a particular industry. The trade association collects data on a number of topics of specific interest to firms, especially data on market size and market trends. Association members have a source of information that is particularly germane to their industry questions. For example, Retail World (see www.retailworld.com.au) provides data and reports on major grocery product categories and industry trends in retailing.

COMMERCIAL SOURCES

Numerous firms specialise in selling and/or publishing information. For example, IBISWorld Australia (http://www.ibisworld.com.au) provides reports on industry market research, company research, the business environment and industry risk ratings. The following discussion of several of these firms provides a sampling of the diverse data that are available.



Market share and consumption and purchase behaviour data

Nielsen29 offers a scanner-based marketing and sales information service called Scan Track. This service gathers sales and marketing data from a representative sample of major retailers in Australia and New Zealand. As part of Nielsen's Retail Measurement Service, auditors visit the stores at regular intervals to track promotions to customers, retail inventories, displays, brand distribution, out-of-stock conditions and other retail marketing activity. Scanner data allow researchers to monitor sales data before, during and after changes in advertising frequency, price changes, distribution of free samples and similar marketing tactics. Many primary data investigations use scanner data to measure the results of experimental manipulations such as altering advertising copy. For example, scanning systems combined with consumer panels are used to create electronic test markets. Systems based on scanner data, or product codes and similar technology, have been implemented


in factories, warehouses and transport companies to research inventory levels, shipments and the like.

Demographic and census updates

The ABS offers computerised census files and updates of these data broken down by small geographic areas, such as postcodes. These are termed Confidential Unit Record Files (CURFs) and are of interest to researchers and marketers wishing to combine census data with other information collected on a geographic basis; for example, retail sales of shoes in a particular area. This service also extends to special reports collected by the ABS. Note that individual entities such as households, companies or individuals are not identified from information provided by the ABS to other parties.

Consumer attitude and public opinion research

Many research firms offer specialised syndicated services that report findings from attitude research and opinion polls. Roy Morgan Research Australia conducts research on values, political preferences and social beliefs, as well as gauging the level of consumer confidence and the public's view on advertising (see http://www.roymorgan.com).

Advertising research

Advertisers can purchase readership and audience data from a number of firms. W. R. Simmons and Associates measures magazine audiences, Nielsen measures radio audiences, Roy Morgan Australia measures readership of newspapers and magazines, and OzTAM estimates television audience ratings. By specialising in collecting and selling audience information on a continuing basis, these commercial sources provide a valuable service to their subscribers. Assistance in measuring advertising effectiveness is another syndicated service. For example, Starch Roper Worldwide measures the impact of advertising in magazines. Readership information can be obtained for competitors' advertisements or the client's own advertisements. Respondents are classified as noted readers, associated readers or read-most readers. Burke Marketing Research provides a service that measures the extent to which respondents recall television advertisements aired the night before. It provides product category norms, or

average DAR (day-after recall) scores, and DAR scores for other products. An individual advertiser would be unable to monitor every minute of every television program before deciding on the appropriate ones in which to place advertising. However, numerous clients, agencies, television networks and advertisers can purchase the Nielsen television ratings service.


SINGLE-SOURCE DATA – INTEGRATED INFORMATION

Nielsen combines data from television meters and scanner operations. The integration of these two types of data helps marketers investigate the impact of television advertising on retail sales. In other ways as well, users of data find that merging two or more diverse types of data into a single database offers many advantages. PRIZM by Claritas Corporation, CACI, ClusterPlus by SMI, Mediamark Research Inc. and many other syndicated databases report product purchase behaviour, media usage, demographic characteristics, lifestyle variables and business activity by geographic area, such as postcode. Although such data are often called geodemographic, they cover such a broad range of phenomena that no one name is a good description. These data use small geographic areas as the unit of analysis. The marketing research industry uses the term single-source data for diverse types of data offered by a single company. Table 4.7 identifies several major marketers of single-source data.

single-source data Diverse types of data offered by a single company. The data are usually integrated on the basis of a common variable such as geographic area or store.

TABLE 4.7 » EXAMPLES OF SINGLE-SOURCE DATABASES

CACI Marketing Systems (http://www.caci.com) | Provides industry-specific marketing services, such as customer profiling and segmentation, custom target analysis, demographic data reports and maps, and site evaluation and selection. CACI offers demographics and data on businesses, lifestyles, consumer spending, purchase potential, shopping centres, traffic volumes and other statistics.
PRIZM by Claritas Corporation (http://www.claritas.com) | PRIZM, which stands for Potential Rating Index for Zip Markets, is based on the 'birds-of-a-feather' assumption that people live near others who are like themselves. PRIZM combines census data, consumer surveys about shopping and lifestyle, and purchase data to identify market segments. Colourful names such as 'Young Suburbia' and 'Shot Guns and Pickups' describe 40 segments that can be identified by postcode. Claritas also has a lifestyle census in the United Kingdom (https://segmentationsolutions.nielsen.com/mybestsegments/Default.jsp).
MRI Cable Report – Mediamark (http://www.mediamark.com) | Integrates information on cable research, including television viewing, with demographic and product usage information.
WARC (http://www.warc.com) | Warc describes itself as a comprehensive marketing information service for the global marketing, advertising, media, research and academic communities. It provides authoritative forecasts of advertising expenditure for all major economies, and also provides case studies, best practice guides, marketing intelligence, consumer insight, industry trends, and news from around the world.

SOURCES FOR GLOBAL RESEARCH

As business has become more global, so has the secondary data industry. Austrade, part of the Department of Foreign Affairs and Trade (DFAT), maintains a series of offices that help to collect data on important Australian export markets. Austrade also provides information to investors and importers wishing to do business with Australia. Secondary data compiled internationally have the same limitations as domestic secondary data. However, international researchers should watch for certain pitfalls that frequently are associated with foreign data and cross-cultural research. First, data may simply be unavailable in certain countries. Second, the accuracy of some data may be called into question. This is especially likely with official statistics that may be adjusted for the political purposes of foreign governments. Third, although economic terminology may be standardised, various countries use different definitions and accounting and recording practices for many economic concepts. For example, different countries may measure disposable personal income in radically different ways. International researchers should take extra care to investigate the comparability of data among countries.


As already mentioned, the Australian government and other organisations compile databases that may aid international marketers. The Association of Southeast Asian Nations or ASEAN (http://www.asean.org) reports on historical and current activity in South-East Asia. Its website is a comprehensive reference guide that provides information about laws and regulations, a detailed profile of each member state and other information about business resources. The US government offers a wealth of data about foreign countries, which may be useful to researchers based overseas. The CIA’s The World Factbook and the National Trade Data Bank are especially useful. Both can be accessed using the Internet.

SUMMARY


DISCUSS THE ADVANTAGES AND DISADVANTAGES OF SECONDARY DATA

Secondary data are data that have been gathered and recorded previously by someone else for purposes other than those of the current researcher. Secondary data usually are historical and do not require access to respondents or subjects. Primary data are those data gathered for the specific purpose of the current researcher. The chief advantage of secondary data is that they are almost always less expensive to obtain than primary data. Generally they can be obtained rapidly and may provide information not otherwise available to the researcher. The disadvantage of secondary data is that they were not intended specifically to meet the researcher's needs. The researcher must examine secondary data for accuracy, bias and soundness. One way to do this is to cross-check various available sources.

DISCUSS AND GIVE EXAMPLES OF THE VARIOUS INTERNAL AND PROPRIETARY SOURCES OF SECONDARY DATA

Managers often get data from internal proprietary sources such as accounting records. Scanner-based data and behavioural data (see Chapter 6) such as traffic counters may also be useful for managers. Data mining is the use of powerful computers to dig through volumes of data to discover patterns about an organisation's customers and products. It is a broad term that applies to many different forms of analysis.

UNDERSTAND THE TYPES OF OBJECTIVES THAT CAN BE ACHIEVED USING SECONDARY DATA

Secondary research designs address many common marketing problems. There are three general categories of secondary research objectives: fact-finding, model building and database marketing. A typical fact-finding study might seek to uncover all available information about consumption patterns for a particular product category or to identify business trends that affect an industry. Model building is more complicated: it involves specifying relationships between two or more variables. Model building need not involve a complicated mathematical process, but it can help marketers to estimate market potential, forecast sales, select sites and accomplish many other objectives. The practice of database marketing – which involves maintaining customer databases with customers’ names, addresses, phone numbers, past purchases, responses to past promotional offers, and other relevant data such as demographic and financial data – is increasingly being supported by marketing research efforts.

TO IDENTIFY AND GIVE EXAMPLES OF VARIOUS EXTERNAL SOURCES OF SECONDARY DATA

External secondary data are generated or recorded by an entity other than the researcher's organisation. The government, newspaper and journal publishers, trade associations and other organisations create or produce information. Traditionally this information has been distributed in published form, either directly from producer to researcher or indirectly through intermediaries such as public libraries. Modern computerised data archives, electronic data interchange and the Internet have changed the distribution of external data, making them almost as accessible as internal data. Push technology is an Internet information technology that automatically delivers content to the researcher's or manager's desktop, which helps in environmental scanning.

DESCRIBE THE IMPACT OF SINGLE-SOURCE DATA AND THE GLOBALISATION OF SECONDARY DATA RESEARCH

The marketing of multiple types of related data by single-source suppliers has radically changed the nature of secondary data research. As business has become more global, so has the secondary data industry. International researchers should watch for certain pitfalls that can be associated with foreign data and cross-cultural research.


KEY TERMS AND CONCEPTS

cross-check, customer discovery, data conversion, data mining, database marketing, environmental scanning, external data, internal and proprietary data, market-basket analysis, market tracking, model building, neural network, push technology, secondary data, single-source data

QUESTIONS FOR REVIEW AND CRITICAL THINKING

1 Read the first example at the start of this chapter about metadata. What can metadata tell someone about a consumer? Are there any ethical issues involved? (Hint: examine some of the endnotes about metadata in this chapter.)
2 Why are secondary data so important to market researchers?
3 Suppose you wish to learn about the size of the music market, particularly CD sales, growth patterns of streaming services and vinyl record sales. Where would you find probable sources for these secondary data?
4 Over the past five years, a manager has noted a steady decline in sales and profits for her division's product line. Does she need to use any secondary data to further evaluate her division's condition?
5 Identify some typical research objectives for secondary data studies.
6 How might a retailer such as eBay use data mining?
7 What would be the best source for the following data?
a population, average income and employment rates for Sydney, Australia
b maps of New Zealand suburbs and cities
c trends in automobile ownership in New Zealand
d divorce trends in Australia
e median weekly earnings of full-time, salaried workers for the previous five years in New Zealand
f annual sales of the top 10 fast-food companies in Australia
g top 10 websites ranked by number of unique visitors in the world
h attendance at professional sports events in New Zealand for 2015.
8 Suppose you are a marketing research consultant and a client comes to your office and says: 'I must have the latest information on the supply of and demand for bricks within the next 24 hours.' What would you do?
9 Find the following data from the Australian Bureau of Statistics:
a Australian gross domestic product for the first quarter of 2016
b exports of goods and services for December 2016
c imports of goods and services for December 2016.
10 Use the most recent census of your country to find the total population, median age, family types, Internet use and median income for: (a) your town (or state); and (b) the town (or state) in which your university is located. What are the differences? Why might they be important for a marketer of university education?
11 Use secondary data to learn the size of the New Zealand soccer market and to profile who plays soccer.
12 A newspaper reporter reads a study that surveyed US children and then reports that a high percentage of children recognise Joe Camel (US cigarette spokesperson), but fails to report that the study also found that a much higher percentage of children indicated very negative attitudes towards smoking. Is this a proper use of secondary data?

ONGOING PROJECT
DOING A SECONDARY RESEARCH STUDY? CONSULT THE CHAPTER 4 PROJECT WORKSHEET FOR HELP

If you are doing a secondary research study, download the Chapter 4 project worksheet from the CourseMate website. The worksheet lists all the steps that should be followed in using secondary data in a research study.


COURSEMATE ONLINE STUDY TOOLS

Flip to the start of your textbook and use the tear-out card to log in to CourseMate for Marketing Research. There you can test your understanding and revise chapter concepts with:
☑ interactive quizzes
☑ flashcards
☑ case projects
☑ Internet research activities
☑ videos.

WRITTEN CASE STUDY 4.1
TWITTER: TRUTH OR DARE?30

Commentators have suggested that social media, in particular tweets, provide an incredible insight into the moods, thoughts and activities of society at large. Tweets can also be surprisingly accurate in predicting behaviour. A researcher from a US university found that tracking a few flu-related keywords in 500 million messages from September 2009 to May 2010 allowed him to predict the extent of flu outbreaks in the USA with 84 per cent accuracy. Other researchers are doing what is called 'sentiment analysis' of tweets, which codes words in tweets in terms of their emotional impact. There are limitations, though, as some words may have different meanings for different subcultures: 'sick' may in some instances be a good thing for Gen Y tweeters. Some segments of the population also use Twitter more than others, suggesting a possible sample bias. It has been suggested that the analysis of tweets can be used to supplement political polling; however, some political parties in the US have used

'Twitter bombs', which send out a series of automated tweets to influence the public social media mood. In one US state senate election, some 60 000 tweet bombs were sent out before they were shut down as spam. To combat this practice, other researchers in the US have developed filters that screen out organised partisan spamming. One tool developed for this purpose is 'Truthy', which helps researchers identify re-sent tweets emanating from websites rather than from people.

QUESTIONS

1 What are the advantages and disadvantages of analysing Twitter data?
2 What are the ethical considerations of using Twitter data?
3 For what kind of market research studies would Twitter data be useful?

WRITTEN CASE STUDY 4.2
THE ELECTRIC CAR IN AUSTRALIA

Western Australia is the first Australian state to conduct an Electric Vehicle Trial – actually two trials, one on the vehicle side and one on the recharging side, both launched in early 2010. The advantages of electric cars seem obvious to the consumer. They are emission-free if charged from renewable energy sources. They create very little driving noise at low speeds and require far less servicing than a petrol/diesel car. Electric cars have much cheaper running costs than a petrol/diesel car and are not dependent on imported oil. There is no initial infrastructure required (electric vehicle owners can recharge their cars at home). Despite a driving range of 300–400 km (at the time of writing), a big disadvantage is the relatively long recharging time.31 Another possible issue is that the recharging of electric cars may place additional pressures

on electricity grids in Australia, especially if recharging is done after a typical work day between 4.00 and 6.00 p.m. when there is already peak demand for electricity.32

QUESTIONS

1 Identify possible sources of information that show the market attractiveness or not of the electric car in Australia.
2 Find evidence that the introduction of the electric car may overload Australia's electrical grid.
3 Which sources of information for 1 and 2 do you find more trustworthy and accurate? Why?


ONGOING CASE STUDY
MOBILE PHONE SWITCHING AND BILL SHOCK

While having coffee after discussing the results of the qualitative research, Steve asked David whether he knew how big the mobile phone service provider market was in Australia, and what percentage of the household budget was spent on this. Leanne suggested it might also be useful to examine whether there is any record of consumer complaints in this industry and on what basis, as this might help with the next stage of research. David said he would look at this over the next two weeks.

QUESTIONS

1 What kind of secondary data sources should David use to estimate the size of the mobile service provider market in Australia and how much consumers spend on this?
2 Using these sources, how big is the market and what amount do households spend on this category?
3 What are possible sources of data on the number of consumer complaints in this industry, and how does this compare with other service industries?

NOTES

1 Joseph, S. (2014) ‘Twitter’s Facebookification well and truly under way’, Marketing Week (Online Edition), 2–2.
2 Paredes, D. (2015) ‘Kiwis comfortable with mining social media to identify terrorist activity, but not for marketing’, Unisys. CIO (13284045), 1–1.
3 OzTAM Australian Television Audience Measurement (2012) Metropolitan coverage maps, accessed at http://www.oztam.com.au/documents/other/Coverage%20Maps%20-%20Updated%20151011.pdf on 25 October 2012.
4 Australian Bureau of Statistics (2011) Australian Standard Geographical Classification, Cat. no. 1216.0, accessed at http://www.abs.gov.au/ausstats/[email protected]/mf/1216.0 on 25 October 2012.
5 The idea for Exhibit 4.1 came from Joselyn, Robert W. (1977) Designing the Marketing Research Project, New York: Petrocelli/Charter.
6 New Zealand Film Commission (2015), accessed at http://www.nzfilm.co.nz/ on 29 July 2015.
7 Sources: Screen Australia (2012) ‘Domestic films’ share in Australia and selected countries 2001–2009’, accessed at http://www.screenaustralia.gov.au/research/statistics/acompboxofficeozshare.asp on 25 October 2012; New Zealand Film Commission (2015), accessed at http://www.nzfilm.co.nz/; Motion Picture Producers Association of Japan (2015), accessed at http://www.eiren.org/statistics_e/; Statista (2015) Share of box office revenue in China, accessed at http://www.statista.com/statistics/260212/share-of-box-office-revenue-in-china-by-domestic-and-imported-movies/ on 3 August 2015; UNESCO Office of Statistics (2013) Emerging markets and the digitisation of the film industry.
8 Kingston, A. (2014) ‘Get ready for generation Z’, Maclean’s, 127(28), 42–45.
9 Fleming, Lee (1997) Digital delivery: Pushing content to the desktop, Digital Information Group, 31 January.
10 MIT Media Lab (2012) Software Agents, accessed at http://agents.media.mit.edu on 25 October 2012.
11 MIT Media Lab (2009) Software Agents: Commonsense Recommendations, accessed at www.media.mit.edu/research/groups/software-agents on 19 July 2009.
12 Euromonitor International (2009) Selected Country Statistics, accessed at http://www.euromonitor.com on 25 October 2012.
13 Based on Nielsen (2014) 2015 US Music Report, accessed at http://www.nielsen.com/us/en/press-room/2015/2014-nielsen-music-report.html on 3 August 2015; Top downloaded ringtones: week of 27 October 2012, accessed at http://www.billboard.com/charts/ringtones on 25 October 2012.
14 Australian Recording Industry Association (2015) 2014 ARIA Yearly Statistics, accessed at www.aria.com.au on 3 August 2015.
15 Gray, David, D’Alessandro, Steven & Carter, Leanne (2012) State of the Mobile Nation, Macquarie University, Sydney.
16 Sources: http://www.reelseo.com/hours-minute-uploaded-youtube/, http://venturebeat.com/2013/03/01/how-google-searches-30-trillion-web-pages-100-billion-times-a-month/ and http://www.grabstats.com/statmain.aspx?StatID=402, accessed on 3 March 2016; Jones, Mark (2012) ‘Big data raises hopes and fears’, Australian Financial Review, 8 October, p. 28.

17 Rao, Srikumar S. (1996) ‘Technology: The hot zone’, Forbes, 18 November.
18 Sources: Based on Majewski, N. (2014) ‘Social gets serious’, B&T Magazine, 64(2805), 054–060; Marketing Week (2013) ‘Hi-tech listening tools’, Marketing Week, 21–21; Powers, T., Advincula, D., Austin, M. S., Graiko, S. & Snyder, J. (2012) ‘Digital and social media in the purchase decision process: A special report from the Advertising Research Foundation’, Journal of Advertising Research, 52(4), 479–489.
19 New Media Age (2009) ‘Virgin Atlantic uses TagMan system to find top value sites’, New Media Age, 08–08.
20 Foley, John (1996) ‘Squeezing more value from data’, Information Week, 9 December.
21 Demiriz, A., Ertek, G., Atan, T. & Kula, U. (2011) ‘Re-mining item associations: Methodology and a case study in apparel retailing’, Decision Support Systems, 52(1), 284–293.
22 Sources: Wasserman, Todd, Khermouch, Gerry & Green, Jeff (2000) ‘Mining everyone’s business’, Brandweek, 28 February, p. 34; Sager, Ira (1996) ‘Big blue wants to mine your data’, Business Week, 3 June.
23 Excerpt reprinted with permission from Novack, Janet (1996) ‘The data miners’, Forbes, 12 February. Reprinted by permission of Forbes Magazine, © 2002 Forbes Inc.
24 Jones, Mark (2012) ‘Big data raises hopes and fears’, Australian Financial Review, 9 October, p. 28.
25 Barton, Dominic & Court, David (2012) ‘Making advanced analytics work for you’, Harvard Business Review, October, pp. 79–83.
26 Sources: Hargrave, S. (2008) ‘Ears to the ground’, New Media Age, 17 January, 21; Notarantonio, E.M. (2009) ‘The effectiveness of a buzz marketing approach compared to traditional advertising: An exploration’, Journal of Promotion Management, 15 (October–December), pp. 455–64; Alahnah, M. & Khazanchi, D. (2010) ‘The importance of buzz’, Marketing Research, 22 (Summer), pp. 20–25.
27 Beck, Susan (2005) ‘The good, the bad and the ugly or why it’s a good idea to evaluate websites’, accessed at lib.nmsu.edu/instruction/evalcrit.html on 16 February 2006.
28 The Australian Woodworker Advertising Statistics and Rates, accessed at http://www.skillspublish.com.au/Skills%20Ad%20Rates.htm on 25 January 2013.
29 ACNielsen, Scan Track, accessed at nz.acnielsen.com/products/rms_market.shtml on 17 March 2006.
30 Savage, Neil (2011) ‘Twitter as the medium and the message’, Communications of the ACM, 56(3), pp. 18–20.
31 Bräunl, Thomas (2011) ‘Why electric cars are in pole position’, The Conversation, Conversation Media Group, accessed at http://www.fativa.com on 12 November 2012.
32 Hepworth, Annabel & Guest, Debbie (2011) ‘Fears electric cars will short circuit the grid’, The Australian, 10 November, p. 2.

05 » SURVEY RESEARCH

WHAT YOU WILL LEARN IN THIS CHAPTER

To define surveys and describe the type of information that should be gathered in a survey.
To explain the advantages and disadvantages of surveys.
To identify sources of error in survey research.
To distinguish among the various categories of surveys.
To summarise the different ways researchers implement surveys.
To know the advantages and disadvantages of distributing a questionnaire via different means.
To appreciate the importance of pretesting questionnaires.
To describe ethical issues that arise in survey research.

Is life still good in Sydney?

Although the economic conditions for NSW and Sydney seemed buoyant during the 12 months from 2014 to 2015, research undertaken with 1000 Sydney residents suggests that Sydneysiders thought that living conditions were tougher, and that they were more cynical about the future.1 The survey research, conducted by McCrindle Research, shows that two-thirds of participants thought the city was worse than it was five years ago, and more than half thought it would be worse in five years’ time.

The main issue confronting the participants was the general pace and stress of life (29 per cent), while 47 per cent claimed that planned infrastructure had failed to keep up with a growing population. This glum picture is in contrast to Sydney being ranked in the top 10 of liveable cities by the Economist Intelligence Unit and it being a centre for jobs, higher wages and good educational opportunities. It seems that the transformation of Sydney into a global city has not pleased everyone, though in the same survey 92.5 per cent of respondents reported a great or moderate sense of belonging to the city.

THE NATURE OF SURVEYS


Surveys require asking people – called respondents – for information using either verbal or written questions. Questionnaires or interviews collect data through the mail, on the telephone or face-to-face. The more formal term, sample survey, emphasises that the purpose of contacting respondents is to obtain a representative sample of the target population. Thus, a survey is defined as a method of collecting primary data based on communication with a representative sample of individuals.

Survey objectives: Type of information gathered

The type of information gathered in a survey varies considerably depending on its objectives. Typically, surveys attempt to describe what is happening or to learn the reasons for a particular marketing activity.

Advantages of surveys

Surveys provide a quick, inexpensive, efficient and accurate means of assessing information about a population. The examples given earlier illustrate that surveys are quite flexible and, when properly conducted, extremely valuable to the manager.

SURVEY THIS!

Take a look at the questions from this section of the student survey. In particular, examine the responses to the questions about how much time students spend blogging, how many social networking sites students subscribe to, and their involvement with various social and online activities. You can analyse the results just by taking frequencies (your instructor may do this for you). Are these types of questions appropriate for survey research? What sources of error might be present in these particular survey questions? Explain your response.

respondent: The person who verbally answers an interviewer’s questions or provides answers to written questions.

Marketing research has proliferated since the general adoption of the marketing concept. The growth of survey research is related to the simple idea that in order to find out what consumers think, you need to ask them. Over the last 50 years, and particularly during the last two decades, survey research techniques and standards have become quite scientific and accurate. When properly conducted, surveys offer managers many advantages. However, they can also be used poorly.

sample survey: A more formal term for a survey.
survey: A method of collecting primary data in which information is gathered by communicating with a representative sample of people.


REAL WORLD SNAPSHOT

THE POPULARITY OF THE SILVER FERN IN PROPOSED NEW ZEALAND FLAG DESIGNS2

According to research by Aardwolf Research Consulting, adding the silver fern looked to be the most popular of the proposed changes to the New Zealand flag. The survey was conducted online. Each respondent was shown 10 sets of randomly selected flag designs from the 1794 choices on the government’s website, and participants were asked to pick their favourite flag from each set. The results showed that of the 50 most popular flags, 40 include the silver fern, all but two have the Southern Cross and 17 retain the Union Jack. Interestingly, of the 450 participants who took part in the survey, 69 per cent were against any change to the New Zealand flag.

It may be no exaggeration to say that most surveys conducted today are a waste of time and money. Many are simply bad surveys. Samples may be biased; questions poorly phrased; interviewers not properly instructed and supervised; and results misinterpreted. Such surveys are worse than none at all because the sponsor may be misled into a costly error. Even well-planned and neatly executed surveys may be useless if, as often happens, the results come too late to be of value or are converted into a bulky report which no one has time to read.3 The disadvantages of specific forms of survey data collection (personal interview, telephone, mail, Internet and other self-administered formats) are discussed later in this chapter. However, errors are common to all forms of surveys, so it is appropriate to describe them generally.


ERRORS IN SURVEY RESEARCH

A manager who is evaluating the quality of a survey must estimate its accuracy. Exhibit 5.1 outlines the various forms of survey error. The two major sources of survey error are random sampling error and systematic error.

random sampling error: A statistical fluctuation that occurs because of chance variation in the elements selected for a sample.
systematic error: Error resulting from some imperfect aspect of the research design that causes respondent error or from a mistake in the execution of the research.
sample bias: A persistent tendency for the results of a sample to deviate in one direction from the true value of the population parameter.

Random sampling error

Most surveys try to portray a representative cross-section of a particular target population. However, even with technically proper random probability samples, statistical errors will occur because of chance variation in the elements selected for the sample. Without increasing sample size, these statistical problems are unavoidable. However, such random sampling errors can be estimated; Chapter 10 discusses these in greater detail.
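As a minimal sketch of how chance variation of this kind is quantified (the formal treatment appears in Chapter 10), the code below computes the standard error and an approximate 95 per cent confidence interval for a sample proportion under simple random sampling; the sample size and result are hypothetical.

```python
import math

# Hypothetical result: 520 of 1000 randomly sampled respondents prefer brand A.
n = 1000
p = 520 / n

# Standard error of a proportion under simple random sampling.
se = math.sqrt(p * (1 - p) / n)

# Approximate 95 per cent confidence interval (normal approximation).
lower, upper = p - 1.96 * se, p + 1.96 * se
print(f"p = {p:.3f}, SE = {se:.4f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

Because the standard error shrinks with the square root of the sample size, quadrupling the sample only halves the random sampling error.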

Systematic error

Systematic error results from some imperfect aspect of the research design or from a mistake in the execution of the research. Because these errors include all sources of error other than those introduced by the random sampling procedure, they are also called nonsampling errors. A sample bias exists when the results of a sample show a persistent tendency to deviate in one direction from the true value of the population parameter. An example of this is the way in which the data were collected for the Real World Snapshot on the popularity of new designs for the New Zealand flag. These responses were collected by advertisements on Google AdWords and Facebook Ads. While this is convenient, it may mean that only certain parts of the population – generally young and well-educated people, who are more likely to be online – responded to the survey invitation. The many sources of error that in some way systematically influence answers can be divided into two general categories: respondent error and administrative error. These are discussed in the following sections.

EXHIBIT 5.1 CATEGORIES OF SURVEY ERRORS

Total error
    Random sampling error
    Systematic error (bias)
        Respondent error
            Nonresponse error
            Response bias (arising from deliberate falsification or unconscious misrepresentation)
                Acquiescence bias
                Extremity bias
                Interviewer bias
                Auspices bias
                Social desirability bias
        Administrative error
            Data-processing error
            Sample-selection error
            Interviewer error
            Interviewer cheating

respondent error: A category of sample bias resulting from some respondent action or inaction such as nonresponse or response bias.

RESPONDENT ERROR

Surveys ask people for answers. If people cooperate and give truthful answers, a survey will likely accomplish its goal. If these conditions are not met, nonresponse error or response bias, the two major categories of respondent error, may cause a sample bias.

Nonresponse error

Few surveys have 100 per cent response rates. A researcher who obtains an 11 per cent response to a five-page questionnaire concerning various brands of spark plugs may face a serious problem. To use the results, the researcher must be sure that those who did respond to the questionnaire were representative of those who did not.

One problem with web-based surveys is that there is no way of knowing who exactly responded to the questionnaire.


The statistical differences between a survey that includes only those who responded and a perfect survey (that would also include those who failed to respond) are referred to as nonresponse error. This problem is especially acute in mail and Internet surveys, but it also threatens telephone and face-to-face interviews.

REAL WORLD SNAPSHOT

CULTURAL INFLUENCES ON NONRESPONSE ERROR

Nonresponse error can be high when conducting international marketing research. People in many cultures do not share Australians’ and New Zealanders’ views about providing information. In China, citizens are reluctant to provide information over the phone to strangers. In many Islamic countries in Asia, such as Malaysia and Indonesia, some women may refuse to be interviewed by a male interviewer.


People who are not contacted or who refuse to cooperate are called nonrespondents. A nonresponse occurs if someone is not at home at the time of both the initial call and a subsequent callback. The number of no contacts in survey research has been increasing because of the proliferation of answering machines and the growing use of caller ID to screen telephone calls.4 A parent who must juggle the telephone and a baby and refuses to participate in the survey because he or she is too busy is also a nonresponse. Refusals occur when people are unwilling to participate in the research. No contacts and refusals can seriously bias survey data. Arbitron, a research firm, had problems getting people to record their radio-listening habits in diaries every day; in one period only 28 per cent of a sample filled in the diary. Arbitron conducted a survey to find differences between people who were willing and unwilling to keep diaries. The diary keepers were found to favour middle-of-the-road, ‘beautiful’ music and news/talk stations; nonrespondents favoured contemporary and rap stations.

nonresponse error: The statistical differences between a survey that includes only those who responded and a perfect survey that would also include those who failed to respond.
nonrespondent: A person who is not contacted or who refuses to cooperate in the research.
no contact: A person who is not at home or who is otherwise inaccessible on the first and second contact.
refusal: A person who is unwilling to participate in a research project.
self-selection bias: A bias that occurs because people who feel strongly about a subject are more likely to respond to survey questions than people who feel indifferent about it.

Comparing the demographics of the sample with the demographics of the target population is one means of inspecting for possible biases in response patterns. If a particular group, such as older citizens, is under-represented or if any potential biases appear in a response pattern, additional efforts should be made to obtain data from the under-represented segments of the population. For example, personal interviews may be used instead of telephone interviews for the under-represented segments. After receiving a refusal from a potential respondent, an interviewer can do nothing other than be polite. The respondent who is not at home when called or visited should be scheduled to be interviewed at a different time of day or on a different day of the week. With a mail survey the researcher never really knows whether a nonrespondent has refused to participate or is just indifferent. Researchers know that those who are most involved in an issue are more likely to respond to a mail survey.

Self-selection bias is a problem that frequently plagues self-administered questionnaires. In a restaurant, for example, a customer on whom a waiter spilled soup, a person who was treated to a surprise dinner or others who feel strongly about the service are more likely to complete a self-administered questionnaire left at the table than individuals who are indifferent about the restaurant. Self-selection biases distort surveys because they over-represent extreme positions while under-representing responses from those who are indifferent. Several techniques are discussed later for encouraging respondents to reply to mail and Internet surveys. Self-selection bias can also occur when people complete self-administered surveys online, as was the case with the Real World Snapshot example of New Zealand flag designs, because it is highly likely that the respondents who answered questions on this issue feel strongly for or against changing the design of the New Zealand flag.
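The demographic comparison described above can be sketched as follows. The sample and population shares are hypothetical census-style figures; the ratio of population share to sample share gives a rough indication of which groups are under-represented (and a crude post-stratification weight).

```python
# Hypothetical age profile of survey respondents versus the target population.
sample_share     = {"18-34": 0.45, "35-54": 0.35, "55+": 0.20}
population_share = {"18-34": 0.30, "35-54": 0.35, "55+": 0.35}

for group in population_share:
    weight = population_share[group] / sample_share[group]
    status = "under-represented" if weight > 1.1 else "adequately represented"
    print(f"{group}: sample {sample_share[group]:.0%} vs population "
          f"{population_share[group]:.0%} -> weight {weight:.2f} ({status})")
```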

REAL WORLD SNAPSHOT

READERSHIP AND THE CIRCULATION OF NEWSPAPERS: ARE THEY THE SAME THING?5

Survey respondents saying one thing and doing another is a response bias problem. A good example of this is the difference between circulation and readership figures of major newspapers. Circulation figures usually reflect sales, while readership measures whether respondents have read and remember print media content. In a letter to The Australian on 19 November 2005, Roy Morgan Research6 argued that readership and circulation may diverge because the readership of two major newspapers, the Herald Sun and the Daily Telegraph, rose during the 2004 Olympics in Greece and then declined afterwards, suggesting that during the Olympics more people interested in the Games may have read the same copy of the newspaper. On the other hand, promotions used to drive newspaper sales may mean people buy more newspapers to get tokens and enter competitions, which may increase circulation but not readership per newspaper sold (as they may not actually read the newspaper). Circulation figures also do not tell you who actually reads the newspaper or magazine. Even when people are questioned about which print media they actually read, there is always a possibility of a response bias. Business executives and wealthy individuals in Australia may be more inclined to state that they have read the Australian Financial Review (AFR) to show that they are knowledgeable and up to date on business issues. When business executives were mailed questionnaires at their offices by Ipsos, 49 per cent stated that they read the AFR. In the Roy Morgan readership survey, in which executives were interviewed personally as part of a household survey and were not directly identified as an important group, this figure was closer to 23 per cent. The difference between the two surveys – besides the lower response rates from mail compared with personal interviews – is that the Ipsos survey identifies respondents as important people at the beginning, while the Roy Morgan survey does not do this and records occupational and income information towards the end of the survey.

Response bias

A response bias occurs when respondents tend to answer questions with a certain slant. People may consciously or unconsciously misrepresent the truth. If a distortion of measurement occurs because respondents’ answers are falsified or misrepresented, either intentionally or inadvertently, the resulting sample bias will be a response bias. A possible example of response bias has been discussed with respect to the readership survey results. A response bias may be deliberate (the respondent intends to provide incorrect information, or provides the best information they can even though it may not be accurate) or unconscious (the respondent does not consciously intend to deceive the interviewer or to provide the wrong information).

response bias: A bias that occurs when respondents either consciously or unconsciously tend to answer questions with a certain slant that misrepresents the truth.

DELIBERATE FALSIFICATION

Occasionally people deliberately give false answers. It is difficult to assess why people knowingly misrepresent answers. A response bias may occur when people misrepresent answers to appear intelligent, to conceal personal information, to avoid embarrassment and so on. For example, respondents may be able to remember the total amount of money spent grocery shopping, but they may forget the exact prices of individual items that they purchased. Rather than appear ignorant or unconcerned about prices, they may provide their best estimate and not tell the truth – namely, that they cannot remember. Sometimes respondents become bored with the interview and provide answers just to get rid of the interviewer. At other times respondents provide the answers they think are expected of them to appear well informed. On still other occasions, they give answers simply to please the interviewer. Children who are interviewed may want to please the adult (usually the interviewer), or they may lack the verbal or emotional maturity to provide insightful responses. When interviewing children the following techniques may be useful:
1 Allow for non-verbal descriptions (such as projective techniques).
2 Gain trust (build familiarity).
3 Keep the interview short.
4 Use concrete questions; avoid hypothetical questions.7

REAL WORLD SNAPSHOT

ILLEGAL DOWNLOADING IN AUSTRALIA8

According to survey research conducted by market research firm TNS, around 43 per cent of Australian digital consumers accessed at least one file illegally between March and May of 2015, compared to 20 per cent of UK consumers. Movies were the most popular form of digital content downloaded, comprising 48 per cent of those who downloaded illegal material in Australia. Music was also popular, with around 37 per cent of those who downloaded illegal material choosing this format in Australia. The Australian survey found that people would stop infringing if the content were cheaper (37 per cent), more available (38 per cent) and had the same release date as other countries (36 per cent). Only 21 per cent of infringers said they would stop if they received a letter from their Internet service provider saying that their account would be suspended. These results are useful in the framing of government policy and in the responses of content providers looking to encourage the legal purchasing of digital entertainment.

UNCONSCIOUS MISREPRESENTATION

Even when a respondent is consciously trying to be truthful and cooperative, response bias can arise from the question format, the question content or some other stimulus. For example, bias can be introduced by the situation in which the survey is administered. The results of two in-flight surveys concerning aircraft preference illustrate this point. Passengers flying on B-747s preferred B-747s to L-1011s (74 per cent versus 19 per cent), while passengers flying on L-1011s preferred L-1011s to B-747s (56 per cent versus 38 per cent). The difference in preferences appears to have been largely a function of the aircraft the respondents were flying on when the survey was conducted, although sample differences may have been a factor. A likely influence was the respondent’s satisfaction with the plane on which he or she was flying when surveyed. In other words, in the absence of any strong preference, the respondent may simply have identified the aircraft travelled on and indicated that as his or her preference.9 Respondents who misunderstand questions may unconsciously provide biased answers – or they may be willing to answer but unable to do so because they have forgotten the exact details. Asking ‘When was the last time you attended a concert?’ may result in a best-guess estimate because the respondent has forgotten the exact date.

TYPES OF RESPONSE BIAS

There are five specific categories of response bias: acquiescence bias, extremity bias, interviewer bias, auspices bias and social desirability bias. These categories overlap and are not mutually exclusive. A single biased answer may be distorted for many complex reasons, some distortions being deliberate and some being unconscious misrepresentations.


Acquiescence bias

Some respondents are very agreeable; these yea-sayers accept all statements they are asked about. This tendency to agree with all or most questions is known as acquiescence bias, and it is particularly prominent in new product research. Questions about a new product idea generally elicit some acquiescence bias because respondents give positive connotations to most new ideas. Another form of acquiescence is evident in some people’s tendency to disagree with all questions. Thus, acquiescence bias is a response bias due to the respondents’ tendency to concur with a particular position.10

acquiescence bias: A category of response bias that results because some individuals tend to agree with all questions or to concur with a particular position.
extremity bias: A category of response bias that results because some individuals tend to use extremes when responding to questions.
interviewer bias: A response bias that occurs because the presence of the interviewer influences respondents’ answers.
auspices bias: Bias in the responses of subjects caused by their being influenced by the organisation conducting the study.

Extremity bias

Some individuals tend to use extremes when responding to questions; others always avoid extreme positions and tend to respond more neutrally. Response styles vary from person to person, and extreme responses may cause an extremity bias in the data.11 This issue is dealt with in Chapter 8 on measurement. Another form of extremity bias is the tendency of respondents who do not want to be seen as controversial, or do not want to offend the interviewer, to use the midpoint to indicate a neutral view on an issue. This may be particularly so in Asian cultures that embrace Confucian philosophy (for example, China/Hong Kong, Singapore and South Korea), which teaches moderation and avoids extremes. Respondents from such cultures may not wish to stand out in the crowd and may seek a non-offensive middle way when answering direct questions.12 A means of reducing such a bias may be to include a calibration question, which will assess the degree of midpoint bias.13 The response to this question can then be used to weight further questions to take into account any midpoint or other response bias issues.
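There is no single agreed formula for using a calibration question; the sketch below shows one possible (and deliberately simple) approach with hypothetical 5-point ratings, in which respondents who choose the midpoint on an uncontroversial calibration item are given less weight when a later attitude item is averaged.

```python
# Hypothetical 5-point ratings (1-5, 3 = midpoint) for ten respondents.
calibration = [3, 3, 4, 2, 3, 5, 3, 1, 3, 4]   # neutral 'calibration' item
attitude    = [3, 4, 5, 2, 3, 5, 3, 2, 3, 4]   # substantive attitude item

# Illustrative choice: halve the weight of respondents who sit on the midpoint
# of the calibration item, on the assumption that they may be midpoint-biased.
weights = [0.5 if rating == 3 else 1.0 for rating in calibration]

raw_mean = sum(attitude) / len(attitude)
weighted_mean = sum(w * x for w, x in zip(weights, attitude)) / sum(weights)
print(f"raw mean = {raw_mean:.2f}, weighted mean = {weighted_mean:.2f}")
```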

Interviewer bias

Response bias may arise from the interplay between interviewer and respondent. If the interviewer’s presence influences respondents to give untrue or modified answers, the survey will be marred by interviewer bias. Many homemakers and retired people welcome an interviewer’s visit as a break in routine activities. Other respondents may give answers they believe will please the interviewer rather than truthful responses. Respondents may wish to appear intelligent and wealthy – of course they read Time rather than Playboy. The interviewer’s age, gender, style of dress, tone of voice, facial expressions or other nonverbal characteristics may have some influence on a respondent’s answers. If an interviewer smiles and makes a positive statement after a respondent’s answers, the respondent will be more likely to give similar responses. In a research study on sexual harassment against saleswomen, male interviewers might not yield as candid responses as female interviewers would. Many interviewers, contrary to instructions, shorten or rephrase questions to suit their needs. This potential influence on responses can be avoided to some extent if interviewers receive training and supervision that emphasise the necessity of appearing neutral. If interviews go on too long, respondents may feel that time is being wasted and may answer as abruptly as possible with little forethought.

Auspices bias

When respondents are interviewed by people or by organisations that they hold in high esteem (for example, the United Nations, OECD, Greenpeace, Red Cross, Salvation Army or a local church), their answers to the survey may be deliberately or subconsciously misrepresented because respondents wish to assist the organisation or individual conducting the study. This is known as auspices bias.

In Asia, cultural values about survey research differ from those in Australia and New Zealand. Asian people generally have less patience with the abstract and rational question wording commonly used in Australia and New Zealand. Researchers must be alert for culture-bound sources of response bias in international marketing research. For example, the Japanese do not wish to contradict others, leading to a bias towards acquiescence and yea-saying.


Social desirability bias

A social desirability bias (SDB) may occur either consciously or unconsciously because the respondent wishes to create a favourable impression or save face in the presence of an interviewer. Answering that one’s income is $35 000 a year might be difficult for someone whose self-concept is that of an upper-middle-class person ‘about to make it big’. Incomes may be inflated, education overstated or perceived respectable answers given to gain prestige. In contrast, answers to questions that seek factual information or responses about matters of public knowledge (postcode, number of children and so on) usually are quite accurate. An interviewer’s presence may increase a respondent’s tendency to give inaccurate answers to sensitive questions such as: ‘Did you vote in the last election?’, ‘Do you have termites or cockroaches in your home?’ or ‘Do you colour your hair?’

Social desirability bias appears to be influenced by the degree of national development and culture. Research suggests that respondents from Asian countries such as Malaysia exhibit higher social desirability bias than Western countries such as the USA and France.14 It appears that social desirability bias in Malaysia was influenced by the reactions of family and friends, while in the USA it appeared to be influenced by society and the media. The French respondents, on the other hand, were not greatly influenced by factors other than the overall reaction from society. Therefore, research in international contexts needs to take into account not only the degree but also the causes of social desirability bias. Research also shows that the degree of reported consumer materialism and compulsive buying is negatively affected by the degree of SDB;15 in other words, respondents exhibiting SDB report being less materialistic and engaging in less compulsive buying.

When asked in surveys, parents with children in nappies gave socially desirable answers, saying that they did not want the nappies they used to stereotype their children in male or female roles. However, when one company introduced gender-specific nappies, an Australian cultural tradition was reaffirmed: parents bought pink nappies for girls and would not put pink nappies on boys.

Administrative error

The results of improper administration or execution of the research task are administrative errors. They are caused by carelessness, confusion, neglect, omission or some other blunder. Four types of administrative error are data-processing error, sample-selection error, interviewer error and interviewer cheating.

social desirability bias: Bias in responses caused by respondents’ desire, either conscious or unconscious, to gain prestige or appear in a different social role.
administrative error: An error caused by the improper administration or execution of the research task.
data-processing error: A category of administrative error that occurs because of incorrect data entry, incorrect computer programming or other procedural errors during data analysis.
sample-selection error: An administrative error caused by improper sample design or sampling procedure execution.

DATA-PROCESSING ERROR

Processing data by computer, like any arithmetic or procedural process, is subject to error because data must be edited, coded and entered into the computer by people. The accuracy of data processed by computer depends on correct data entry and programming. Data-processing errors can be minimised by establishing careful procedures for verifying each step in the data-processing stage.

SAMPLE-SELECTION ERROR

Sample-selection error is systematic error that results in an unrepresentative sample because of an error in either the sample design or the execution of the sampling procedure. Executing a sampling plan free of procedural error is difficult. A firm that selects its sample from the phone book will have some systematic error, because unlisted numbers are not included. Stopping female respondents during daytime hours in shopping centres excludes working people who shop by mail, Internet or telephone. In other cases, the wrong person may be interviewed. Consider a political pollster who uses random-digit dialling to select a sample rather than a list of registered voters. Unregistered 17-year-olds may be willing to give their opinions, but they are the wrong people to ask because they cannot vote.


INTERVIEWER ERROR

Interviewers’ abilities vary considerably. Interviewer error is introduced when interviewers record answers but check the wrong response or are unable to write quickly enough to record answers verbatim. Alternatively, selective perception may cause interviewers to misrecord data that do not support their own attitudes and opinions.

INTERVIEWER CHEATING

Interviewer cheating occurs when an interviewer falsifies entire questionnaires or fills in answers to questions that have been intentionally skipped. Some interviewers cheat to finish an interview as quickly as possible or to avoid questions about sensitive topics. If interviewers are suspected of faking questionnaires, they should be told that a small percentage of respondents will be called back to confirm whether the initial interview was actually conducted. This should discourage interviewers from cheating.

Rule-of-thumb estimates for systematic error

Sampling error due to random or chance fluctuations may be estimated by calculating confidence intervals, which is discussed in Chapter 12. The techniques for estimating systematic (or nonsampling) error are less precise. Many researchers have established conservative rules of thumb based on experience to estimate systematic error. They have found it useful to have some benchmark figures or standards of comparison to understand how much error can be expected. For example, according to some researchers in the consumer packaged-goods field, approximately one-half of those who say they ‘definitely will buy’ or ‘probably will buy’ within the next three months actually do make a purchase.16 For consumer durables, however, the figures are considerably lower: only about one-third of those who say they definitely will buy a certain durable within the next three months will actually do so. Among those who say they probably will buy, the number who actually purchase durables is so much lower that it is scarcely worth including in the early purchase estimates for new durables. Thus, researchers often present actual survey findings and their interpretations of estimated purchase response based on estimates of nonsampling error. For example, one pay-per-view pay television company surveys geographic areas it plans to enter and estimates the number of people who indicate they will subscribe to its service. The company knocks down the percentage by a ‘ballpark’ 10 per cent because experience in other geographic areas has indicated that there is a systematic upward bias of 10 per cent on this intentions question.
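The adjustments described above are simple arithmetic; the figures in the sketch below are hypothetical, and the ‘ballpark 10 per cent’ knock-down is interpreted here as a 10 per cent relative reduction in the stated intention figure.

```python
# Hypothetical stated purchase intentions from a survey.
definitely_will_buy = 0.20   # 20 per cent of respondents
probably_will_buy   = 0.30   # 30 per cent of respondents

# Packaged-goods rule of thumb: roughly half of 'definite' and 'probable'
# intenders actually purchase within three months.
expected_trial = 0.5 * (definitely_will_buy + probably_will_buy)
print(f"Expected packaged-goods trial rate: {expected_trial:.1%}")

# Pay-TV example: knock the stated subscription intention down by about 10 per cent.
stated_subscription = 0.40
adjusted_subscription = stated_subscription * (1 - 0.10)
print(f"Adjusted subscription estimate: {adjusted_subscription:.1%}")
```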

WHAT CAN BE DONE TO REDUCE SURVEY ERROR?

Now that we have examined the sources of error in surveys, you may have lost some of your optimism about survey research. Don’t be discouraged! The discussion emphasised the bad news because it is important for marketing managers to realise that surveys are not a panacea. There are, however, ways to handle and reduce survey errors. For example, Chapter 8 on measurement and Chapter 9 on questionnaire design discuss the reduction of response bias. Chapter 10 discusses the reduction of sample selection and random sampling error. Indeed, much of the remainder of this book discusses various techniques for reducing bias in marketing research. The good news lies ahead!

interviewer error: Mistakes made by interviewers who fail to record survey responses correctly.
interviewer cheating: The practice of filling in fake answers or falsifying questionnaires while working as an interviewer.


CLASSIFYING SURVEY RESEARCH METHODS

Now that we have discussed some advantages and disadvantages of surveys in general, it is appropriate to classify surveys according to several criteria. Surveys may be classified based on the method of communication, the degrees of structure and disguise in the questionnaire, and the time frame in which the data are gathered (temporal classification). Classification by method of communicating with the respondent includes personal interviews, telephone interviews, mail surveys and Internet surveys. Classification according to structure and disguise, and according to time frame, is discussed next.

Structured and disguised questions

In designing a questionnaire17 (or an interview schedule), the researcher must decide how much structure or standardisation is needed. A structured question limits the number of allowable responses. For example, the respondent may be instructed to choose one alternative response such as ‘under 18’, ‘18 to 35’ or ‘over 35’ to indicate his or her age. Unstructured questions do not restrict the respondent’s answers. An open-ended, unstructured question such as ‘Why do you shop at Coles?’ allows the respondent considerable freedom in answering. The researcher must also decide whether to use undisguised questions or disguised questions. A straightforward, or undisguised, question such as ‘Do you have dandruff problems?’ assumes that the respondent is willing to reveal the information. However, researchers know that some questions are threatening to a person’s ego, prestige or self-concept. Therefore, they have designed a number of indirect techniques of questioning to disguise the purpose of the study.

Questionnaires can be categorised by their degree of structure and degree of disguise. For example, interviews in exploratory research might use unstructured-disguised questionnaires. The projective techniques discussed in Chapter 3 fall into this category. Other classifications are structured-undisguised, unstructured-undisguised and structured-disguised. These classifications have two limitations. First, the degree of structure and the degree of disguise vary: they are not clear-cut categories. Second, most surveys are hybrids, asking both structured and unstructured questions. Recognising the degrees of structure and disguise necessary to meet survey objectives will help in the selection of the appropriate communication medium for conducting the survey.

structured question: A question that imposes a limit on the number of allowable responses.
unstructured question: A question that does not restrict the respondents’ answers.

undisguised question: A straightforward question that assumes the respondent is willing to answer.
disguised question: An indirect question that assumes the purpose of the study must be hidden from the respondent.

Temporal classification

Although most surveys are for individual research projects conducted only once over a short time period, other projects require multiple surveys over a long period. Thus, surveys can be classified on a temporal basis.

CROSS-SECTIONAL STUDIES

Most marketing research surveys are cross-sectional studies and aim to find out descriptive information about a market or relevant stakeholder group at a point in time. The typical method of analysing a cross-sectional survey is to divide the sample into appropriate subgroups. For example, if a winery expects income levels to influence attitudes towards wines, the data are broken down into subgroups based on income and analysed to reveal similarities or differences among the income subgroups. Cross-sectional surveys are also frequently used because the results are quickly obtained and they cost less than other designs (which are discussed next). The disadvantage of cross-sectional designs is that they measure behaviour and attitudes at a single point in time, so the predictability of findings is questionable.

cross-sectional study: A study in which various segments of a population are sampled and data are collected at a single moment in time.
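A minimal sketch of the subgroup analysis described above, using hypothetical winery data and the pandas library; the income groups and attitude scores are invented for illustration.

```python
import pandas as pd

# Hypothetical cross-sectional survey data: income group and attitude towards wine (1-5).
df = pd.DataFrame({
    "income_group": ["low", "low", "middle", "middle", "middle", "high", "high", "high"],
    "wine_attitude": [2, 3, 3, 4, 3, 4, 5, 4],
})

# Break the sample into income subgroups and compare counts and mean attitudes.
print(df.groupby("income_group")["wine_attitude"].agg(["count", "mean"]))
```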


LONGITUDINAL STUDIES

In a longitudinal study respondents are questioned at two or more different times. The purpose of longitudinal studies is to examine continuity of responses and to observe changes that occur over time. The Australian Bureau of Statistics (ABS), for example, has found that the proportion of Internet users who purchased something online in the previous year increased from 31 per cent in 2004 to 74 per cent in 2012–13.18 Roy Morgan, a syndicated research company, found that the use of tablets or mobile phones to do banking by Australians had tripled in three years, from 1.9 per cent in 2012 to 5.85 per cent, or an estimated 1.1 million consumers.19 The ABS has also published a number of reports on social trends that combine results from a number of different studies. One report found that household debt had grown at a rate of around 10 per cent per year between 2001 and 2007; growth slowed after the global financial crisis, and by the end of 2013 household debt stood at around $79 000 for each person in Australia, or about $1.84 trillion in total. The rise in household debt has only been partially matched by the increase in the value of household assets: expressed as a percentage, household debt in Australia had increased from 11 per cent of household assets in 1988 to a little below 20 per cent in 2013.20 Such information is vital to the government and the Reserve Bank when they consider fiscal and monetary policy, as well as to other policy-makers and business groups.

Longitudinal studies of this type are sometimes called cohort studies, because similar groups of people who share a certain experience during the same time interval (cohorts) are expected to be included in each sample. In applied marketing research, a longitudinal study that uses successive samples is called a tracking study because successive waves are designed to compare trends and identify changes in variables such as consumer satisfaction, brand image or advertising awareness. These studies are useful for assessing aggregate trends, but do not allow for tracking changes in individuals over time. Conducting surveys in waves with two or more sample groups avoids the problem of response bias resulting from a prior interview. A respondent who was interviewed in an earlier survey about a certain brand may become more aware of the brand or pay more attention to its advertising after being interviewed. Using different samples eliminates this problem. However, researchers can never be sure whether the changes in the variable being measured are due to a different sample or to an actual change in the variable over time.

longitudinal study: A survey of respondents at different times, thus allowing analysis of response continuity and changes over time.
tracking study: A type of longitudinal study that uses successive samples to compare trends and identify changes in variables, such as consumer satisfaction, brand image or advertising awareness.
consumer panel: A longitudinal survey of the same sample of individuals or households to record their attitudes, behaviour or purchasing habits over time.

Consumer panels

A longitudinal study that gathers data from the same sample of individuals or households over time is called a consumer panel. Consider the packaged-goods marketer who wishes to learn about brand-switching behaviour. A consumer panel that consists of a group of people who record their purchasing habits in a diary over time will provide the manager with a continuous stream of information about the brand and product class. Diary data that are recorded regularly over an extended period enable the researcher to track repeat-purchase behaviour and changes in purchasing habits that occur in response to changes in price, special promotions or other aspects of marketing strategy. Panel members may be contacted by telephone, in a personal interview, by mail questionnaire or by email. Typically, respondents complete media exposure or purchase diaries and mail them back to the survey organisation. If the panel members have agreed to field test new products, face-to-face or telephone interviews may be required. The nature of the problem dictates which communication method to use.

Consumer panels provide longitudinal data. Most established commercial panels allow researchers to break panel data down by demographics. For example, researchers interested in Australian dual-income families can track how this demographic group’s purchasing behaviour changes over time.
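As a minimal sketch of how diary data from a consumer panel might be analysed for brand switching, the code below computes the share of repeat-purchase occasions on which a household switched brands; the households and purchase sequences are hypothetical.

```python
# Hypothetical panel diary data: each household's brand choice on successive purchases.
panel = {
    "HH01": ["A", "A", "B", "B"],
    "HH02": ["B", "B", "B", "B"],
    "HH03": ["A", "C", "A", "A"],
}

switches = 0
occasions = 0
for household, purchases in panel.items():
    for previous, current in zip(purchases, purchases[1:]):
        occasions += 1
        if current != previous:
            switches += 1

print(f"Brand-switching rate: {switches / occasions:.0%} of repeat-purchase occasions")
```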


Because establishing and maintaining a panel is expensive, panels often are managed by contractors who offer their services to many organisations. A number of commercial firms, such as Nielsen and Roy Morgan Research, specialise in maintaining consumer panels. In recent years, Internet panels have grown in popularity. Because clients of these firms need to share the expenses with other clients to acquire longitudinal data at a reasonable cost, panel members may be asked questions about a number of product classes. The first questionnaire a panel member is asked to complete typically includes questions about product ownership, product usage, pets, family members and demographic data. The purpose of such a questionnaire is to gather the behavioural and demographic data that will be used to identify heavy buyers, difficult-to-reach customers and so on for future surveys. Individuals who serve as members of consumer panels usually are compensated with cash, attractive gifts or the chance to win a sweepstakes. Marketers whose products are purchased by few households find panels an economical means of reaching respondents who own their products. A two-stage process typically is used. A panel composed of around 15 000 households can be screened with a single question attached to another project. For example, a question in a questionnaire screens for ownership of certain uncommon products, such as LCD televisions and motorcycles. This information is stored in a database. Then households with the unusual item can be sampled again with a longer questionnaire.

TIPS OF THE TRADE

Standard survey research works best for implementing a descriptive research design. In particular, survey research is useful for describing:
»» consumer beliefs
»» consumer attitudes
»» consumer opinions
»» characteristics of respondents such as demographics and lifestyles.
Avoid studying sensitive topics with survey research approaches. The extent to which respondents will actually behave as they describe in survey research is overstated. Some researchers discount behavioural intentions as stated in surveys by as much as 10 per cent.

DIFFERENT WAYS THAT MARKETING RESEARCHERS CONDUCT SURVEYS

During most of the 20th century, survey data were obtained when individuals responded to questions asked by human interviewers (interviews) or to questions they read (questionnaires). Interviewers communicated with respondents face to face or over the telephone, or respondents filled out self-administered paper questionnaires, which were typically distributed by mail. These media for conducting surveys remain popular with marketing researchers. However, as we mentioned in Chapters 1 and 4, in the 21st century digital technology is having a profound impact on society in general and on marketing research in particular. Its greatest impact is in the creation of new forms of communications media.

Human interactive media and electronic interactive media

When two people engage in a conversation, human interaction takes place. Human interactive media are a personal form of communication. One human being directs a message to and interacts with another individual (or a small group). When they think of interviewing, most people envision two people engaged in a face-to-face dialogue or a conversation on the telephone. Electronic interactive media allow marketers to reach a large audience, to personalise individual messages and to interact using digital technology. To a large extent electronic interactive media are controlled by the users themselves. No other human need be present. In the context of surveys, respondents are not passive audience members; they are actively involved in a two-way communication when electronic interactive media are used. The Internet, a medium that is radically altering many organisations’ research strategies, provides a prominent example of electronic interactive media. Consumers determine what information they will be exposed to, and for how long they will view or hear it. Electronic interactive media also include CD-ROM and DVD materials, touch-tone telephone systems, touch-screen interactive kiosks in shops, and other forms of digital technology.

Noninteractive media

The traditional questionnaire received by mail and completed by the respondent does not allow a dialogue or an exchange of information providing immediate feedback. Hence, from our perspective, self-administered questionnaires printed on paper are noninteractive. This does not mean that they are without merit. It only means that this type of survey is not as flexible as surveys using interactive communication media. Each technique for conducting surveys has its merits and shortcomings. The purpose of this section is to explain when different types of surveys should be used. The section begins with a discussion of surveys that use live interviews. It next turns to noninteractive, self-administered questionnaires. It then explains how the Internet and digital technology are dramatically changing survey research.

USING INTERVIEWS TO COMMUNICATE WITH RESPONDENTS

Interviews can be categorised based on the medium the researcher uses in communicating with individuals and recording data. For example, interviews may be conducted door to door, in shopping malls or on the telephone. Traditionally interview results have been recorded using paper and pencil, but computers increasingly are supporting survey research. The discussion about interviews begins by examining the general characteristics of face-to-face personal interviews. It then looks at the unique characteristics of door-to-door personal interviews, personal interviews conducted in shopping malls, and telephone interviews.

Personal interviews

Although the history of marketing research is sketchy, the gathering of information through face-to-face contact with individuals has a long history. Periodic censuses were used to set tax rates and aid military conscription in the ancient empires of Egypt and Rome.21 During the Middle Ages, the merchant families of Fugger and Rothschild prospered in part because their far-flung organisations enabled them to get information before their competitors could.22 Today, it is common for survey researchers to present themselves on doorsteps throughout Australia, New Zealand and the Asia-Pacific region and announce: ‘Good afternoon, my name is __________________. I am __________________ with Marketing Research Company and we are conducting a survey on __________________.’


SURVEY THIS!

How would you classify the survey you participated in as part of this class? Which approach did it use? What media type was involved? What do you think the response rate for this survey is? Email the survey link to 10 of your friends and simply tell them it is a survey about everyday things and you would like them to respond. Find out how many actually did respond. What is the click rate and response rate? Take a look at the screenshot of this section. What other survey media could be used to effectively collect this specific information?

A personal interview is a form of direct communication in which an interviewer asks respondents questions face to face. This versatile and flexible method is a two-way conversation between interviewer and respondent.

personal interview: Face-to-face communication in which an interviewer asks a respondent to answer questions.

WHAT WENT WRONG?

Does gamification mean better results for surveys with children and adolescents?23

One way of increasing young respondents’ interest in online surveys is thought to be through the use of greater visuals and gamification (where respondents receive points for answers and interact with an avatar). An online experiment with 737 respondents aged 7–15 showed that while gamification increased respondent interest and response rates, there was no increase in the accuracy of the results. This suggests, as will be discussed later in this book, that the design of measurements and questions is still crucial to researchers gaining accurate information, even in this age of interactive technology.

ADVANTAGES OF PERSONAL INTERVIEWS
Marketing researchers find that personal interviews offer many unique advantages. One of the most important is the opportunity for feedback.

WHAT WENT RIGHT?
HOME AWAY FROM HOME
Marketing managers at Marriott Corporation know that people want a hotel room to feel residential because they see a hotel room as a home away from home. Customers prefer a clearly marked place to sit down, straight furniture legs, light walls, big bathrooms and spacious desk areas. How did Marriott Corporation, one of the largest operators of hotels in the USA, learn about its customers' preferences? Simple! Marketing researchers built an assortment of fake hotel rooms modelled on the competition and conducted surveys to test consumers' reactions. Customers walked through and rated the rooms.


The opportunity for feedback
Personal interviews provide the opportunity to give feedback to the respondent. For example, in a personal interview a consumer who is reluctant to provide sensitive information may be reassured by the interviewer that his or her answers will be strictly confidential. Personal interviews also offer the lowest chance of misinterpretation of questions because the interviewer can clarify any questions respondents have about the instructions or questions. Circumstances may dictate that at the conclusion of the interview, the respondent be given additional information concerning the purpose of the study. This is easily accomplished with the personal interview.

Probing complex answers


probing A method used in personal interviews in which the interviewer asks the respondent for clarification of answers to standardised questions.
item nonresponse The failure of a respondent to provide an answer to a survey question; the technical term for an unanswered question on an otherwise complete questionnaire.

An important characteristic of personal interviews is the opportunity to follow up by probing. If a respondent's answer is too brief or unclear, the researcher may probe for a more comprehensive or clearer explanation. In probing, the interviewer asks for clarification of answers to standardised questions such as: 'Can you tell me more about what you had in mind?' (See Chapter 11 on fieldwork and editing for an expanded discussion of probing.) Although interviewers are expected to ask questions exactly as they appear on the questionnaire, probing allows them some flexibility. Depending on the research purpose, personal interviews vary in the degree to which questions are structured and in the amount of probing required. The personal interview is especially useful for obtaining unstructured information. Skilled interviewers can handle complex questions that cannot easily be asked in telephone or mail surveys.

Length of interview
If the research objective requires an extremely lengthy questionnaire, personal interviews may be the only option. Generally telephone interviews last less than 10 minutes, whereas a personal interview can be much longer, perhaps 1½ hours. A general rule of thumb on mail surveys is that they should not exceed six pages.

Completeness of questionnaire
The social interaction between a well-trained interviewer and a respondent in a personal interview increases the likelihood that the respondent will answer all the items on the questionnaire. The respondent who grows bored with a telephone interview may stop it at his or her discretion simply by terminating the call. Self-administration of a mail questionnaire, however, requires more effort by the respondent. Rather than write lengthy responses, the respondent may fail to complete some of the questions. Item nonresponse – the failure to provide an answer to a question – is least likely to occur when an experienced interviewer asks questions directly.

Props and visual aids
Interviewing respondents face to face allows the investigator to show them new product samples, sketches of proposed advertising or other visual aids. In a survey to determine whether a superlightweight chainsaw should be manufactured, visual props were necessary because the concept of weight is difficult to imagine. Two small chainsaws currently on the market, plus a third wooden prototype disguised and weighted to look and feel like the proposed model, were put in the back of a station wagon. Respondents were asked to go to the car, pick up each chainsaw and compare them. This research could not have been done in a telephone interview or mail survey.

Entryware data collection software from Techneos Systems (www.techneos.com) can be used on mobile devices to automate personal interviews. Using easy-to-carry handheld computers increases the time efficiency of interviewers and eliminates the need for later data entry.


Marketing research that uses visual aids has become increasingly popular with researchers who investigate film concepts, advertising problems and filmgoers’ awareness of performers. Research for films often begins by showing respondents recordings of the prospective cast. After the film has been produced, film clips are shown and interviews conducted to evaluate its appeal, and especially which scenes to emphasise in advertisements.

High participation
Although some people are reluctant to participate in a survey, the presence of an interviewer generally increases the percentage of people willing to complete the interview. Respondents typically are required to do no reading or writing – all they have to do is talk. Many people enjoy sharing information and insights with friendly and sympathetic interviewers.

DISADVANTAGES OF PERSONAL INTERVIEWS
Personal interviews also have some disadvantages. Respondents are not anonymous and therefore may be reluctant to provide confidential information to another person. Suppose a survey asked top executives: 'Do you see any major internal instabilities or threats (people, money, material and so on) to the achievement of your marketing objectives?' Many managers may be reluctant to answer this sensitive question honestly in a personal interview in which their identities are known.

REAL WORLD SNAPSHOT

MATTERS OF TASTE24

Asking an opinion is easy to do over the phone or online, but not if you want people's reactions to a new food, wine or perfume. For those opinions, you probably want people to sample the new product first, so a face-to-face interview seems to be the only option. However, researchers have some creative options. Sometimes researchers simply want to know whether consumers like the product, but in other situations they are trying to meet objectives such as maintaining the same taste after substituting a new ingredient. Sartori Foods uses a chart it calls the Italian Cheese Flavor Wheel to ask consumers to describe various cheeses. The chart matches consumer-friendly terms such as nutty, buttery and creamy with terms useful in the industry (for example, aromatic amino acids and sulphur compounds). As an alternative to face-to-face interviews, researchers can send potential respondents a sample of the product along with a colour wheel. A phone interview can then have the respondent use the wheel to describe the taste. Better yet, scientists have developed a virtual tongue. That's right! A synthetic surface absorbs foods and identifies certain tastes by feeding information into a computer. This may eliminate the need for a respondent altogether. For instance, the virtual tongue has quite a wine palate. It can tell Chardonnay from Pinot Grigio and it can tell a 2009 vintage from a 2011 vintage. Well, this may be reliable, but many consumers are willing to stand in line to verify the wine's characteristics by tasting it for themselves!

Interviewer influence
Some evidence suggests that demographic characteristics of the interviewer influence respondents' answers. For example, one research study revealed that male interviewers produced larger amounts of interviewer variance than female interviewers in a survey in which 85 per cent of the respondents were female. Older interviewers who interviewed older respondents produced more variance than other age combinations, whereas younger interviewers who interviewed younger respondents produced the least variance. Differential interviewer techniques may be a source of bias. The rephrasing of a question, the interviewer's tone of voice and the interviewer's appearance may influence the respondent's answer. Consider the interviewer who has conducted 100 personal interviews. During the next one, he or she may lose concentration and either selectively perceive or anticipate the respondent's answer. The interpretation of the response may differ somewhat from what the respondent intended. Typically, the public thinks of the person who does marketing research as a dedicated scientist. Unfortunately, some interviewers do not fit that ideal. Considerable interviewer variability exists. Cheating is possible; interviewers may cut corners to save time and energy, faking parts of their reports by dummying up part or all of the questionnaire. Control over interviewers is important to ensure that difficult, embarrassing or time-consuming questions are handled in the proper manner.

Lack of anonymity of respondent
Because a respondent in a personal interview is not anonymous and may be reluctant to provide confidential information to another person, researchers often spend considerable time and effort in the phrasing of sensitive questions to avoid social desirability bias. For example, the interviewer may show the respondent a card that lists possible answers and ask the respondent to read a category number, rather than be required to verbalise sensitive answers.

Cost
Personal interviews are expensive – generally substantially more costly than mail, Internet or telephone surveys. The geographic proximity of respondents, the length and complexity of the questionnaire, and the number of people who are nonrespondents because they could not be contacted (not-at-homes) will all influence the cost of the personal interview.

DOOR-TO-DOOR INTERVIEWS AND SHOPPING MALL INTERCEPTS
Personal interviews may be conducted at the respondents' homes or offices, or in many other places. Increasingly, personal interviews are being conducted in shopping malls. Mall intercept interviews allow many interviews to be conducted quickly. Often respondents are intercepted in public areas of shopping malls and then asked to come to a permanent research facility to taste new food items or to view advertisements. The locale for the interview generally influences the participation rate, and thus the degree to which the sample represents the general population.

Door-to-door interviews
The presence of an interviewer at the door generally increases the likelihood that a person will be willing to complete an interview. Because door-to-door interviews increase the participation rate, they provide a more representative sample of the population than mail questionnaires. People who do not have telephones, who have unlisted telephone numbers, or who are otherwise difficult to contact may be reached using door-to-door interviews. Such interviews can help solve the nonresponse problem; however, they may under-represent some groups and over-represent others. Door-to-door interviews may exclude individuals who live in multiple-dwelling units with security systems, such as high-rise apartment dwellers, or executives who are too busy to grant personal interviews during business hours. Telephoning an individual in one of these subgroups to make an appointment may make the total sample more representative; however, obtaining a representative sample of this security-conscious subgroup based on a listing in the telephone directory may be difficult. People who are at home and willing to participate, especially if interviewing is conducted in the daytime, are somewhat more likely to be homemakers or retired people. These and other variables related to respondents' tendencies to stay at home may affect participation.

door-to-door interview A personal interview conducted at respondents’ doorsteps in an effort to increase the participation rate in a survey.


Callbacks
When a person selected to be in the sample cannot be contacted on the first visit, a systematic procedure is normally initiated to call back at another time. A callback, or an attempt to recontact an individual selected for the sample, is the major means of reducing nonresponse error. Calling back a sampling unit is more expensive than interviewing the person the first time round, because subjects who initially were not at home generally are more widely dispersed geographically than the original sample units. Callbacks in door-to-door interviews are important because not-at-home individuals (for example, working parents) may systematically vary from those who are at home (nonworking parents, retired people and the like).

Mall intercept interviews
A personal interview conducted in shopping malls is referred to as a mall intercept interview, or shopping centre sampling. Interviewers typically intercept shoppers at a central point within the mall or at an entrance. The main reason mall intercept interviews are conducted is because their costs are lower. No travel is required to the respondent's home; instead, the respondent comes to the interviewer, and many interviews can be conducted quickly in this way. A major problem with mall intercept interviews is that individuals usually are in a hurry to shop, so the incidence of refusal is high – typically around 50 per cent. Nevertheless, the commercial marketing research industry conducts more personal interviews in shopping malls than it conducts door to door. In a mall interview, the researcher must recognise that he or she should not be looking for a representative sample of the total population. Each mall will have its own target market's characteristics, and there is likely to be a larger bias than with careful household probability sampling. However, personal interviews in shopping malls are appropriate when the target group is a special market segment such as the parents of children of bike-riding age. If the respondent indicates that he or she has a child of this age, the parent can then be brought into a rented space and shown several bikes. The mall intercept interview allows the researcher to show large, heavy or immobile visual materials, such as a television advertisement. A mall interviewer can give an individual a product to take home to use and obtain a commitment that the respondent will cooperate when recontacted later by telephone. Mall intercept interviews are also valuable when activities such as cooking and tasting of food must be closely coordinated and timed to follow each other. They may also be appropriate when a consumer durable product must be demonstrated. For example, when videocassette recorders and DVD players were innovations in the prototype stage, the effort and space required to set up and properly display these units ruled out in-home testing.

callback An attempt to recontact an individual selected for a sample who was not available initially.
mall intercept interview A personal interview conducted in a shopping mall.

GLOBAL CONSIDERATIONS
Willingness to participate in a personal interview varies dramatically around the world. For example, in many Middle Eastern or Islamic countries women would rarely consent to be interviewed by a man. And in many countries the idea of discussing personal habits and personal-care products with a stranger would be highly offensive. Few people would consent to be interviewed on such topics. The norms about appropriate business conduct also influence business people's willingness to provide information to interviewers. For example, conducting business-to-business interviews in Japan during business hours is difficult, because managers are strongly loyal to their firm and believe that they have an absolute responsibility to oversee their employees while on the job. In some cultures when a business person is reluctant to be interviewed, it may be possible to get a reputable third party to intervene so an interview may take place.


Telephone interviews
Q: Good evening, I'm with a nationwide marketing research company. Are you watching television tonight?
A: Yes.
Q: Did you see the made-for-television film on SBS?

For several decades, telephone interviews have been the mainstay of commercial survey research. The quality of data obtained by telephone may be comparable to the quality of the data collected in personal interviews. Respondents are more willing to provide detailed and reliable information on a variety of personal topics over the telephone than with personal interviews. Telephone surveys can provide representative samples of the general population of Australia or New Zealand, but may be a problem in less developed countries. However, willingness to cooperate with telephone surveys has declined in recent years. In addition, the widespread use of answering machines and caller ID systems makes it increasingly difficult to contact individuals.

THE CHARACTERISTICS OF TELEPHONE INTERVIEWS

telephone interview A personal interview conducted by telephone; the mainstay of commercial survey research.

Telephone interviews have several distinctive characteristics that set them apart from other survey techniques. The advantages and disadvantages of these characteristics are discussed in this section.

Speed
One advantage of telephone interviewing is the speed of data collection. Whereas data collection with mail or personal interviews can take several weeks, hundreds of telephone interviews can be conducted literally overnight. When the interviewer enters the respondents' answers directly into a computerised system, the data processing speeds up even more.

Cost
As the cost of personal interviews continues to increase, telephone interviews are becoming relatively inexpensive. It is estimated that the cost of telephone interviews is less than 25 per cent of the cost of door-to-door personal interviews. Travel time and costs are eliminated. However, the typical Internet survey is less expensive than a telephone survey.

EXPLORING RESEARCH ETHICS
PRIVACY LEGISLATION IN AUSTRALIA AND THE DO-NOT-CALL REGISTER25
In Australia, respondents may be interviewed by companies pretending to be market research organisations. One way of reducing nuisance calls is to register with the Australian Direct Marketing Association (ADMA; http://www.adma.com.au). The do-not-call register allows consumers to opt out of receiving any form of electronic communication from telemarketers and market research firms (the exception is political polling and political marketing). This follows similar initiatives by governments in the United Kingdom, USA and Canada. There are now significant penalties for companies who continue to contact people on this register – up to $200 000 per day per offence. The Association of Market and Social Research Organisations (AMSRO) supported a do-not-call register for telemarketing, but lobbied successfully for an exemption for legitimate market and social research. This means that members of the public who have registered with ADMA may still be contacted by the market research agencies. While it is possible to check the bona fides of any caller claiming to be from a market research firm (by telephoning Surveyline on 1300 364 830), many in the public think that being on a do-not-call register should also exclude calls from telephone interviewers. AMSRO responded by running a national campaign called 'Making your views count', which aimed to lift response rates and show the importance of market research for better decision-making by business and organisations. Some of the campaign materials are listed on the website at http://www.amsrs.com.au/about/information-for-the-general-public/your-views-count. Many market research firms are now faced with a dilemma. By requesting market research information from people on a do-not-call register, they may face unfavourable reactions from the public and perhaps more regulation from government. Given the importance of market research for not-for-profit organisations and business, what is your view? How should professional bodies and market research agencies cope with this change?

Absence of face-to-face contact
Telephone interviews are more impersonal than face-to-face interviews. Respondents may answer embarrassing or confidential questions more willingly in a telephone interview than in a personal interview. However, mail and Internet surveys, although not perfect, are better media for gathering extremely sensitive information because they can be completely anonymous. There is some evidence that people provide information on income and other financial matters only reluctantly, even in telephone interviews. Such questions may be personally threatening for a variety of reasons, and high refusal rates for this type of question occur with each form of survey research. Although telephone calls may be less threatening because the interviewer is not physically present, the absence of face-to-face contact can also be a liability. The respondent cannot see that the interviewer is still writing down the previous comment and may continue to elaborate on an answer. If the respondent pauses to think about an answer, the interviewer may not realise this and may go on to the next question. Hence, there is a greater tendency for interviewers to record no answers and incomplete answers in telephone interviews than in personal interviews.

Cooperation
In some suburbs, people are reluctant to allow a stranger to come inside the house or even stop on the doorstep. The same people, however, may be perfectly willing to cooperate with a telephone survey request. Likewise, interviewers may be somewhat reluctant to conduct face-to-face interviews in certain suburbs, especially during evening hours. Telephone interviewing avoids these problems. However, some individuals refuse to participate in telephone interviews, and the researcher should be aware of potential nonresponse bias. Finally, there is some evidence that the likelihood that a call will go unanswered because a respondent is not at home varies by the time of day, the day of the week and the month. One trend is very clear: in the last decade, telephone response rates have dropped from 40 per cent to as low as 15 per cent.26 It is estimated that the average response rate in Australia is 27 per cent27 and that response rates are declining at about 3 per cent per year.28 In addition, it is increasingly difficult to establish contact with potential respondents for three major reasons: (1) the proliferation of telephone numbers dedicated exclusively to fax machines and/or computers; (2) the widespread use of a non-dedicated phone line to access the Internet; and (3) the use of call-screening devices to avoid unwanted calls.29 The universal acceptance of mobile phones in Australia further reduces the likely response rate. Mobile phone subscribers have a much lower response rate of around 8 per cent, and a high number of non-usable numbers has also been found when conducting mobile surveys.30


Many people who own telephone answering machines will not return a call to help someone conduct a survey. Some researchers argue that leaving the proper message on an answering machine will produce return calls. The message left on the machine should explicitly state that the purpose of the call is not sales related.31 Others believe no message should be left because researchers will reach respondents when they call back. The logic is based on the fact that answering machines are usually not turned on 100 per cent of the time. Thus, if enough callbacks are made at different times and on different days, most respondents will be reached.32 Caller ID services can have the same effect as answering machines if respondents do not pick up the phone when the display reads ‘out of area’ or when an unfamiliar survey organisation’s name and number appear on the display. Refusal to cooperate with interviews is directly related to interview length. A major study of survey research found that interviews of five minutes or less had a refusal rate of 21 per cent; interviews of between six and 12 minutes had 41 per cent refusal rates; and interviews of 13 minutes or more had 47 per cent refusal rates. In unusual cases a few highly interested respondents will put up with longer interviews. A good rule of thumb is to keep telephone interviews approximately 10 to 15 minutes long. In general, 30 minutes is the maximum amount of time most respondents will spend unless they are highly interested in the survey subject.

Representative samples
Practical difficulties complicate obtaining representative samples based on listings in the telephone book. Unlisted phone numbers and numbers too new to be printed in the directory or available on the White Pages online can be problematic. People have unlisted phone numbers for two reasons: because of mobility and by choice. Individuals whose phone numbers are unlisted because of a recent move differ slightly from those with unpublished numbers. The unlisted group tends to be younger, more urban and less likely to own a single-family dwelling. Households that maintain unlisted phone numbers by choice tend to have higher incomes. It is also possible that some low-income households may be unlisted by circumstance. In Australia it is estimated that around 15 per cent of phone numbers are unlisted.33 Also complicating matters is that there are more mobile subscriptions in Australia than people – around 131 per 100 people34 – and that the number of landlines has been declining. In 2014 there were approximately 9 190 000 landlines.35 So, the increasing use by households of multiple telephone numbers for mobiles and Internet connections means that obtaining a representative sample in telephone interviewing is much more difficult than it used to be. Table 5.1 shows some of the differences in surveying respondents by landline or mobile phones. The problem of unlisted phone numbers can be partially resolved through the use of random digit dialling. Random digit dialling eliminates the counting of names in a list (for example, calling every fiftieth name in a column) and subjectively determining whether a directory listing is a business, institution or legitimate household. In the simplest form of random digit dialling, telephone exchanges (prefixes) for the geographic areas in the sample are obtained. Using a table of random numbers, the last four digits of the telephone number are selected. Telephone directories can be ignored entirely or used in combination with the assignment of one or several random digits. Random digit dialling also helps overcome the problem of new listings and recent changes in numbers. Unfortunately, the refusal rate in commercial random digit dialling studies (approximately 40 per cent) is higher than the 25 per cent refusal rate for telephone surveys that use only listed telephone numbers.

random digit dialling The use of telephone exchanges and a table of random numbers to contact respondents with unlisted phone numbers.
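To make the mechanics of the simplest approach concrete, the short Python sketch below combines known exchanges (prefixes) with four randomly selected final digits, so listed and unlisted numbers have the same chance of selection. It is an illustration only: the exchange values are invented placeholders rather than real prefixes, and commercial RDD systems add screening for business, fax and disconnected numbers.

```python
import random

def random_digit_sample(exchanges, n, seed=None):
    """Generate n telephone numbers for random digit dialling.

    Each number combines a known exchange (prefix) for the sampled
    geographic area with four randomly chosen final digits.
    """
    rng = random.Random(seed)
    numbers = []
    for _ in range(n):
        prefix = rng.choice(exchanges)          # exchange for the target area
        suffix = f"{rng.randrange(10000):04d}"  # four random final digits
        numbers.append(f"{prefix} {suffix}")
    return numbers

# Hypothetical exchanges, for illustration only
print(random_digit_sample(["9731", "9732", "9745"], n=5, seed=1))
```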


TABLE 5.1 » LANDLINE AND MOBILE PHONES – PROS AND CONS

Survey characteristic | Landline phones | Mobile phones
Ability to respond | Respond in privacy of their home or office. | More likely to be distracted or involved in some activity.
Sampling regionally | Region identified by area code. | Area code does not indicate region due to mobility.
Standardisation | Landline appliances relatively standardised. | Mobile phone technologies vary from one appliance to another.
Response | About one in four calls is answered. The best time to get an answer is on the weekend. | About one in three calls is answered. The best time to get an answer is during working hours or weekdays.
Refusals | Expect one third of people to refuse. | Expect half of people to refuse.
Compensation | Answering does not usually cost respondent money. | Listening to messages left by research company will cost respondent money.

Callbacks
An unanswered call, a busy signal or a respondent who is not at home requires a callback. Telephone callbacks are much easier to make than callbacks in personal interviews. However, as mentioned, the use of voicemail is growing, and its effect on callbacks needs to be studied.

Limited duration
Respondents who run out of patience with the interview can merely end the call. To encourage participation, interviews should be relatively short. The length of the telephone interview is definitely limited.

Lack of visual medium
Because visual aids cannot be used in telephone interviews, packaging research, copy testing of television and print advertising, and concept tests that require visual materials cannot be conducted by phone. Certain attitude scales and measuring instruments, such as the semantic differential (see Chapter 8), cannot be used easily because they require the respondent to see a graphic scale.

CENTRAL LOCATION INTERVIEWING

central location interviewing Telephone interviews conducted from a central location using lines at fixed charges.

Research agencies or interviewing services typically conduct all telephone interviews from a central location. They contract for 1300 or 1800 lines from long-distance telephone services at fixed rates, which allow them to make unlimited telephone calls throughout the entire country or within specific geographic areas. Such central location interviewing allows firms to hire a staff of professional interviewers and to supervise and control the quality of interviewing more effectively. When telephone interviews are centralised and computerised, an agency or business can benefit from additional cost economies. It is also possible to use central location interviewing in international marketing research, given the fall in international telephone costs and the establishment of call centres in low labour cost countries such as India and the Philippines.

COMPUTER-ASSISTED TELEPHONE INTERVIEWING

computer-assisted telephone interview (CATI) Technology that allows answers to telephone interviews to be entered directly into a computer for processing.

Advances in computer technology allow responses to telephone interviews to be entered directly into the computer in a process known as computer-assisted telephone interviewing (CATI). Telephone interviewers are seated at computer terminals. Monitors display the questionnaires, one question at a time, along with precoded possible responses to each question. The interviewer reads each question as it appears on the screen. When the respondent answers, the interviewer enters the response directly into the computer and it is automatically stored in the computer's memory. The computer then displays the next question on the screen. Computer-assisted telephone interviewing requires that answers to the questionnaire be highly structured. If a respondent gives an unacceptable answer (that is, one not precoded and programmed), the computer will reject it. Computer-assisted telephone interviewing systems include telephone management systems that select phone numbers, dial the numbers automatically and perform other labour-saving functions. These systems can automatically control sample selection by randomly generating names or fulfilling a sample quota. A computer can generate an automatic callback schedule. A typical call management system might schedule recontact attempts to recall no answers after two hours and busy numbers after 10 minutes, and allow the interviewer to enter a more favourable time slot (day and hour) when a respondent indicates that he or she is too busy to be interviewed. Software systems also allow researchers to request daily status reports on the number of completed interviews relative to quotas.
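A hedged sketch of the two pieces of logic just described – rejecting answers that are not precoded, and scheduling recontact attempts – is given below in Python. The two-hour and 10-minute delays come from the example in the text; everything else (the function names, the 24-hour fallback) is an assumption for illustration and does not describe the behaviour of any actual CATI package.

```python
from datetime import datetime, timedelta

# Recontact delays drawn from the example in the text; the 24-hour
# fallback for other outcomes is an assumption for illustration.
RECALL_DELAYS = {
    "no_answer": timedelta(hours=2),
    "busy": timedelta(minutes=10),
}

def validate_response(answer, precoded_options):
    """Mimic a CATI screen: accept only answers that are precoded."""
    return answer in precoded_options

def schedule_callback(outcome, now, preferred_slot=None):
    """Return when a number should be recontacted.

    A time slot nominated by the respondent takes priority; otherwise
    the standard delay for the call outcome is applied.
    """
    if preferred_slot is not None:
        return preferred_slot
    return now + RECALL_DELAYS.get(outcome, timedelta(hours=24))

now = datetime(2017, 5, 1, 10, 0)
print(validate_response("3", {"1", "2", "3", "4"}))  # True
print(schedule_callback("busy", now))                # ten minutes later
```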

COMPUTERISED VOICE-ACTIVATED TELEPHONE INTERVIEW
Technological advances have combined computerised telephone dialling and voice-activated computer messages to allow researchers to conduct telephone interviews without human interviewers. However, researchers have found that computerised voice-activated telephone interviewing works best with very short, simple questionnaires. One system includes a voice-synthesised module controlled by a microprocessor. With it the sponsor is able to register a caller's single response such as 'true/false', 'yes/no', 'like/dislike' or 'for/against'. This type of system has been used by television and radio stations to register callers' responses to certain issues. One system, Telsol, begins with an announcement that the respondent is listening to a recorded message. Many people are intrigued with the idea of talking to a robot or a computer, so they stay on the line. The computer then asks questions, leaving blank tape in between to record the answers. If respondents do not answer the first two questions, the computer disconnects and goes to the next call.

REAL WORLD SNAPSHOT
AUTOMATING PHONE SURVEYS OF TEENS36
Automatic telephone surveys are a good way to reach all members of the family, not just the head of the household. What if you wanted to ask questions about holiday shopping, what's for dinner or what kind of holiday the family would like? A short telephone survey may be the answer. One advantage is that no 'real person' has to hear the answers to potentially sensitive questions. Computer-assisted telephone interviewing (CATI) and computerised self-interviewing, in which the subjects listened to pre-recorded questions and then responded by entering answers with the telephone's keypad, have been used to ask the 'teen' in the house about smoking. The researchers predicted that the young people would be more likely to say they smoke in the self-administered survey than in response to a live interviewer, because pressing keys on the keypad would feel more confidential. The researchers were right. In the self-administered survey, teens were more likely to say they had smoked in the past 30 days or, if they had not smoked, to lack a firm commitment not to smoke in the future. Many of them indicated a parent was present while they answered the questions; when they did, their responses were less likely to indicate smoking desire or susceptibility. This pattern suggests that they might be under-reporting their smoking behaviour. These findings encourage researchers to be attentive to confidentiality when working with teenage subjects.


GLOBAL CONSIDERATIONS
The effectiveness of telephone interviewing varies by country due to economic development and the degree of acceptance of this form of interviewing. In Papua New Guinea, for example, telephone penetration rates are only around 40 per cent of the population.37 Different cultures often have different norms about proper telephone behaviour. For example, researchers dealing with Chinese respondents have learned that they will not open up to strangers on the telephone. Hence, researchers there usually find personal interviews more suitable than telephone surveys. In Japan, respondents consider it ill-mannered if telephone interviews last more than 20 minutes.

SELF-ADMINISTERED QUESTIONNAIRES
Many surveys do not require an interviewer's presence. Marketing researchers distribute questionnaires to consumers through the mail and in many other ways (see Exhibit 5.2). They insert questionnaires in packages and magazines. They may place questionnaires at points of purchase or in high-traffic locations in shops or malls. They may even fax or email questionnaires to individuals. Questionnaires are usually printed on paper, but they may be posted on the Internet as well as being sent via email. No matter how the self-administered questionnaires are distributed, they are different from interviews because the respondent takes responsibility for reading and answering the questions. Self-administered questionnaires present a challenge to the marketing researcher because they rely on the clarity of the written word rather than on the skills of the interviewer. The nature of self-administered questionnaires is best illustrated by explaining mail questionnaires.

Mail questionnaires

self-administered questionnaire A survey in which the respondent takes the responsibility for reading and answering the questions.
mail survey A self-administered questionnaire sent to respondents through the mail.

A mail survey is a self-administered questionnaire sent to respondents through the mail. This paper-and-pencil method has several advantages and disadvantages.

GEOGRAPHIC FLEXIBILITY
Mail questionnaires can reach a geographically dispersed sample simultaneously because interviewers are not required. Respondents (such as farmers) who are located in isolated areas or those (such as executives) who are otherwise difficult to reach can easily be contacted by mail. For example, a pharmaceutical firm may find that doctors are not available for personal or telephone interviews. However, a mail survey can reach both rural and urban doctors who practise in widely dispersed geographic areas.

COST
Mail questionnaires are relatively inexpensive compared with personal interviews and telephone surveys, though they are not cheap. Most include follow-up mailings, which require additional postage and printing costs. And it usually isn't cost-effective to try to cut costs on printing – questionnaires photocopied on poor-grade paper have a greater likelihood of being thrown in the wastebasket than those prepared with more expensive, high-quality printing.

RESPONDENT CONVENIENCE
Mail surveys and other self-administered questionnaires can be filled out when the respondents have time; thus, there is a better chance that respondents will take time to think about their replies. Many hard-to-reach respondents place a high value on convenience and thus are best contacted by mail. In some situations, particularly in business-to-business marketing research, mail questionnaires allow respondents to collect facts, such as sales statistics, that they may not be able to recall without checking. Being able to check information by verifying records or, in household surveys, by consulting with other family members should provide more valid, factual information than either personal or telephone interviews would allow. A catalogue retailer may use mail surveys to estimate sales volume for catalogue items by sending a mock catalogue as part of the questionnaire. Respondents would be asked to indicate how likely they would be to order selected items. Using the mail allows respondents to consult other family members and to make their decisions within a reasonable time span.

[EXHIBIT 5.2 Self-administered questionnaires can be either printed or electronic. Electronic questionnaires may be distributed via email, website invitations, social networking, interactive kiosks and smartphones/tablets; printed (paper) questionnaires via snail mail, courier, inserts, drop-offs and point of sale.]

ANONYMITY OF RESPONDENT
In the cover letter that accompanies a mail or self-administered questionnaire, marketing researchers almost always state that the respondents' answers will be confidential. Respondents are more likely to provide sensitive or embarrassing information when they can remain anonymous. For example, personal interviews and a mail survey conducted simultaneously asked the question: 'Have you borrowed money at a regular bank?' Researchers noted a 17 per cent response rate for the personal interviews and a 42 per cent response rate for the mail survey. Although random sampling error may have accounted for part of this difference, the results suggest that for research on personal and sensitive financial issues mail surveys are more confidential than personal interviews. Anonymity can also reduce social desirability bias. People are more likely to agree with controversial issues, such as extreme political candidates, when completing self-administered questionnaires than when speaking to interviewers on the phone or at their doorsteps.


ABSENCE OF INTERVIEWER
Although the absence of an interviewer can induce respondents to reveal sensitive or socially undesirable information, it can also be a disadvantage. Once the respondent receives the questionnaire, the questioning process is beyond the researcher's control. Although the printed stimulus is the same, each respondent will attach a different personal meaning to each question. Selective perception operates in research as well as in advertising. The respondent does not have the opportunity to question the interviewer. Problems that might be clarified in a personal or telephone interview can remain misunderstandings in a mail survey. There is no interviewer to probe for additional information or clarification of an answer, and the recorded answers must be assumed to be complete. Respondents have the opportunity to read the entire questionnaire before they answer individual questions. Often the text of a later question will provide information that affects responses to earlier questions.

STANDARDISED QUESTIONS
Mail questionnaires typically are highly standardised, and the questions are quite structured. Questions and instructions must be clear-cut and straightforward; if they are difficult to comprehend, the respondents will make their own interpretations, which may be wrong. Interviewing allows for feedback from the interviewer regarding the respondent's comprehension of the questionnaire. An interviewer who notices that the first 50 respondents are having some difficulty understanding a question can report this to the research analyst so that revisions can be made. With a mail survey, however, once the questionnaires are mailed, it is difficult to change the format or the questions.

TIME IS MONEY
If time is a factor in management's interest in the research results, or if attitudes are rapidly changing (for example, towards a political event), mail surveys may not be the best communication medium. A minimum of two or three weeks is necessary for receiving the majority of the responses. Follow-up mailings, which usually are sent when the returns begin to trickle in, require an additional two or three weeks. The time between the first mailing and the cut-off date (when questionnaires will no longer be accepted) normally is six to eight weeks. In a regional or local study, personal interviews can be conducted more quickly. However, conducting a national study by mail might be substantially faster than conducting personal interviews across the nation.

LENGTH OF MAIL QUESTIONNAIRE
Mail questionnaires vary considerably in length, ranging from extremely short, postcard questionnaires to lengthy, multipage booklets that require respondents to fill in thousands of answers. A general rule of thumb is that a mail questionnaire should not exceed six pages in length. When a questionnaire requires a respondent to expend a great deal of effort, an incentive is generally required to induce the respondent to return the questionnaire. The following sections discuss several ways to obtain high response rates even when questionnaires are longer than average.

RESPONSE RATES

response rate The number of questionnaires returned or completed divided by the number of eligible people who were asked to participate in the survey.

Questionnaires that are boring, unclear or too complex get thrown in the wastebasket. A poorly designed mail questionnaire may be returned by only 15 per cent of those sampled; thus, it will have a 15 per cent response rate. The basic calculation for obtaining a response rate is to count the number of questionnaires returned or completed, then divide the total by the number of eligible people who were contacted or requested to participate in the survey. Typically, the number in the denominator is adjusted for faulty addresses and similar problems that reduce the number of eligible participants.
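As a minimal sketch of that calculation (in Python, with hypothetical figures), completed questionnaires are divided by the contacted sample after removing ineligible cases such as faulty addresses.

```python
def response_rate(completed, contacted, ineligible=0):
    """Response rate = completed questionnaires / eligible people contacted.

    The denominator is reduced by faulty addresses and other cases
    that make a sampled person ineligible.
    """
    eligible = contacted - ineligible
    if eligible <= 0:
        raise ValueError("no eligible participants in the sample")
    return completed / eligible

# Hypothetical mail-out: 1000 questionnaires mailed, 40 undeliverable,
# 144 completed and returned
print(f"{response_rate(144, 1000, ineligible=40):.1%}")  # 15.0%
```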


The major limitations of mail questionnaires relate to response problems. Respondents who complete the questionnaire may not be typical of all people in the sample. Individuals with a special interest in the topic are more likely to respond to a mail survey than those who are indifferent. A researcher has no assurance that the intended subject will be the person who fills out the questionnaire. The wrong person answering the questions may be a problem when surveying corporate executives, doctors and other professionals, who may pass questionnaires on to subordinates to complete. There is some evidence that cooperation and response rates rise as home value increases. Also, if the sample has a high proportion of retired and well-off householders, response rates will be lower. Mail survey respondents tend to be better educated than nonrespondents. If they return the questionnaire at all, poorly educated respondents who cannot read and write well may skip open-ended questions to which they are required to write out their answers. Rarely will a mail survey have the 80 to 90 per cent response rate that can be achieved with personal interviews. However, the use of follow-up mailings and other techniques may increase the response rate to an acceptable percentage. If a mail survey has a low response rate, it should not be considered reliable unless it can be demonstrated with some form of verification that the nonrespondents are similar to the respondents.

INCREASING RESPONSE RATES FOR MAIL SURVEYS
Nonresponse error is always a potential problem with mail surveys. Individuals who are interested in the general subject of the survey are more likely to respond than those with less interest or little experience. Thus, people who hold extreme positions on an issue are more likely to respond than individuals who are largely indifferent to the topic. To minimise this bias, researchers have developed a number of techniques to increase the response rate to mail surveys. For example, almost all surveys include postage-paid return envelopes. Forcing respondents to pay their own postage can substantially reduce the response rate. Using a stamped return envelope instead of a business reply envelope increases response rates even more.38 Designing and formatting attractive questionnaires and wording questions so that they are easy to understand also help ensure a good response rate. However, special efforts may be required even with a sound questionnaire. Several of these are discussed in the following subsections.

Cover letter
The cover letter that accompanies a questionnaire or is printed on the first page of the questionnaire booklet is an important means of inducing a reader to complete and return the questionnaire. Exhibit 5.3 illustrates a cover letter and some of the points considered by a marketing research professional to be important in gaining respondents' attention and cooperation. The first paragraph of the letter explains why the study is important. The basic appeal alludes to the social usefulness of responding. Two other frequently used appeals are asking for help ('Will you do us a favour?') and the egotistical appeal ('Your opinions are important!'). Most cover letters promise confidentiality, invite the recipient to use an enclosed postage-paid reply envelope, describe any incentive or reward for participation, explain that answering the questionnaire will not be difficult and will take only a short time, and describe how the person was scientifically selected for participation. The use of various appeals to social usefulness or ego has been shown to increase response rates.39 A personalised letter addressed to a specific individual shows the respondent that he or she is important. Including an individually typed cover letter on letterhead rather than a printed form is an important element in increasing the response rate in mail surveys.

cover letter A letter that accompanies a questionnaire to induce the reader to complete and return the questionnaire.

Money helps
The respondent's motivation for returning a questionnaire may be increased by offering monetary incentives or premiums. Although pens, lottery tickets and a variety of premiums have been used, monetary incentives appear to be the most effective and least biasing incentive. Although money may be useful to all respondents, its primary advantage may be that it attracts attention and creates a sense of obligation. It is perhaps for this reason that monetary incentives work for all income categories. Often cover letters try to boost response rates with messages such as: 'We know that the attached dollar [or coin] cannot compensate you for your time. It is just a token of our appreciation.' Response rates increase dramatically when the monetary incentive is to be sent to a charity of the respondent's choice rather than directly to the respondent.

Interesting questions
The topic of the research and thus the point of the questions cannot be manipulated without changing the definition of the marketing problem. However, certain interesting questions can be added to the questionnaire, perhaps at the beginning, to stimulate respondents' interest and to induce cooperation. Questions that are of little concern to the researchers, but which the respondents want to answer, may provide respondents who are indifferent to the major portion of the questionnaire with a reason for responding.

EXHIBIT 5.3 → EXAMPLE OF COVER LETTER FOR HOUSEHOLD SURVEY

Date

Dear

Last week I wrote to you about a survey about to commence in your local area that seeks to identify ways of improving incentive programs that seek to support landholders in managing their properties. The information collected will be used by your local Catchment Management Authority (CMA) and other groups to improve the design and delivery of incentive programs. The survey will provide information on:
• which types of programs landholders would like to see offered
• landholders' preferences for important design features
• how best to communicate information about programs to interested landholders.

By participating in the survey you will play a vital part in helping to improve incentive programs delivered in your area. We have no other way of obtaining this property level information about how landholders would like to see incentive programs changed. We are partnering with your local Catchment Management Authority (CMA) in completing this project. The project is being managed by [Institution], and other research partners include [Organisation], and [Organisation].

We hope that you will be willing to participate in this survey as your opinions are of great value for understanding how these programs can be improved. To show our thanks, you will be sent a book on farm forestry when your completed questionnaire has been received. The survey should take about 30 to 45 minutes to complete. You can be assured of complete confidentiality and anonymity.

If you would prefer someone to go through the survey with you over the phone, call the free call number 1800 XXX XXX. If you have any questions, please contact [Name] on [landline number] or [email address] or me on [landline number] [email address].

Thank you very much for your help.

Associate Professor [Name]

---------------------------------------------------------------------------------------------------------+
Would you like a copy of the survey results? Please return this slip with your questionnaire.
Name:
Email:

Follow-ups
Exhibit 5.4 shows graphic plots of cumulative response rates for two mail surveys. The curves are typical of most mail surveys: the response rates start relatively high for the first two weeks (as indicated by the steepness of each curve), then gradually taper off. After responses from the first wave of mailings begin to trickle in, most studies use a follow-up letter or postcard reminder. These request that the questionnaire be returned because a 100 per cent return rate is important. A follow-up may include a duplicate questionnaire or may merely be a reminder to return the original questionnaire. Multiple contacts almost always increase response rates. The more attempts made to reach people, the greater the chances of their responding. Both of the studies in Exhibit 5.4 used follow-ups. Notice how the cumulative response rates picked up around week four.

[EXHIBIT 5.4 Plots of actual response patterns for two commercial surveys (a survey of research firms and a survey of purchasing departments): cumulative response proportion (0 to 0.30) plotted against weeks after mailing (1 to 8).]
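The curves in Exhibit 5.4 are simply running totals of weekly returns expressed as a proportion of the questionnaires mailed. The Python sketch below shows that arithmetic using invented weekly counts (with a small second bump after a week-four follow-up, mirroring the pattern described above); it is an illustration only and is not the data behind the exhibit.

```python
def cumulative_response(weekly_returns, mailed):
    """Convert weekly questionnaire returns into a cumulative
    response proportion of the number mailed."""
    running_total = 0
    proportions = []
    for returned in weekly_returns:
        running_total += returned
        proportions.append(running_total / mailed)
    return proportions

# Invented weekly returns for a mailing of 1000 questionnaires, with a
# follow-up reminder producing a second bump of returns around week four
weekly = [90, 60, 25, 45, 30, 15, 8, 4]
for week, p in enumerate(cumulative_response(weekly, mailed=1000), start=1):
    print(f"week {week}: {p:.3f}")
```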

Advance notification
Advance notification, by either letter or telephone, that a questionnaire will arrive has been successful in increasing response rates in some situations. For example, Nielsen has used this technique to ensure a high cooperation rate in filling out diaries of television watching. Advance notices that go out closer to the questionnaire mailing time produce better results than those sent too far in advance. The optimal lead time for advance notification is three days before the mail survey is to arrive.

Survey sponsorship
Auspices bias may result from the sponsorship of a survey. One business-to-business marketer wished to conduct a survey of its wholesalers to learn their stocking policies and their attitudes concerning competing manufacturers. A mail questionnaire sent on the corporate letterhead very likely would have received a much lower response rate than the questionnaire actually sent, which used the letterhead of a commercial marketing research firm. Sponsorship by well-known and prestigious organisations such as universities or government agencies may also significantly influence response rates. Response rates in mail surveys in field experimental research have been found to be higher for non-profit organisations, such as a university (44 per cent), than for a commercial sponsor such as a theme park (17 per cent).40 A mail survey sent to members of a consumer panel will receive an exceptionally high response rate because panel members have already agreed to cooperate with surveys.

Other techniques
Numerous other devices have been used for increasing response rates. For example, the type of postage (commemorative versus regular stamp), envelope size, colour of the questionnaire paper and many other factors have been varied in efforts to increase response rates. Each has had at least limited success in certain situations; unfortunately, under other conditions each has failed to increase response rates significantly. The researcher should consider his or her particular situation. For example, the researcher who is investigating consumers faces one situation, but the researcher who is surveying corporate executives faces quite another.

Samsonite inserts this product registration questionnaire into all luggage and business case products. The chance to win a sweepstakes prize encourages consumers to respond to the questionnaire. The results of the questionnaire become key elements of Samsonite's consumer database and its direct-marketing programs.

KEYING MAIL QUESTIONNAIRES WITH CODES
A marketing researcher planning a follow-up letter or postcard should not disturb respondents who already have returned the questionnaires. The expense of mailing questionnaires to those who already have responded is usually avoidable. One device for eliminating those who have already responded from the follow-up mailing list is to mark the questionnaires so that they may be keyed to identify members of the sampling frame who are nonrespondents. Blind keying of questionnaires on a return envelope (systematically varying the job number or room number of the marketing research department, for example) or a visible code number on the questionnaire have been used for this purpose. Visible keying is indicated with statements such as: 'The sole purpose of the number on the last page is to avoid sending a second questionnaire to people who complete and return the first one.' Ethical researchers key questionnaires only to increase response rates, thereby preserving respondents' anonymity.
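The bookkeeping behind keying is simple and is sketched below in Python with invented code numbers and addresses: each sampling-frame member is assigned a code, returns are logged against the codes, and only nonrespondents appear on the follow-up mailing list. This is an illustration of the idea, not a procedure prescribed by the text.

```python
def followup_mailing_list(keyed_sample, returned_codes):
    """Addresses of sampling-frame members whose keyed questionnaires
    have not come back; only these receive the follow-up mailing."""
    return [address for code, address in keyed_sample.items()
            if code not in returned_codes]

# Invented keyed sample: code number -> mailing address
keyed_sample = {
    101: "12 Wattle St, Dubbo NSW",
    102: "4 Harbour Rd, Nelson NZ",
    103: "88 Ocean Dr, Cairns QLD",
}
returned = {102}  # codes noted as questionnaires arrive
print(followup_mailing_list(keyed_sample, returned))
```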

GLOBAL CONSIDERATIONS Researchers conducting surveys in more than one country must recognise that postal services and cultural circumstances differ around the world. Care also needs to be taken with incentives, especially if they are sweepstakes or lottery tickets, which may be seen as gambling, particularly in Muslim countries. Literacy rates also vary across nations. For this reason, market researchers conducting surveys in less developed countries may rely on personal interviews rather than mail surveys.

SELF-ADMINISTERED QUESTIONNAIRES THAT USE OTHER FORMS OF DISTRIBUTION Many forms of self-administered, printed questionnaires are very similar to mail questionnaires. Airlines frequently pass out questionnaires to passengers during flights. Restaurants, hotels and other service establishments print short questionnaires on cards so that customers can evaluate the service. Tennis Magazine, Advertising Age, Wired and many other publications have used inserted questionnaires to survey current readers inexpensively, and often the results provide material for a magazine article. Many manufacturers use their warranty or owner registration cards to collect demographic information and data about where and why products were purchased. Using owner registration cards is an extremely economical technique for tracing trends in consumer habits. Again, problems may arise because people who fill out these self-administered questionnaires differ from those who do not. Extremely long questionnaires may be dropped off by an interviewer and then picked up later. The drop-off method sacrifices some cost savings because it requires travelling to each respondent's location.

Email surveys Questionnaires can be distributed via email. Email is a relatively new method of communication, and some individuals cannot be reached this way. However, certain projects lend themselves to email surveys, such as internal surveys of employees or satisfaction surveys of retail buyers who regularly deal with an organisation via email. The benefits of incorporating a questionnaire in an email include the speed of distribution, lower distribution and processing costs, faster turnaround time, more flexibility and less handling of paper questionnaires. The speed of email distribution and the quick response time can be major advantages for surveys dealing with time-sensitive issues.

drop-off method A survey method that requires the interviewer to travel to the respondent's location to drop off questionnaires that will be picked up later.
email surveys Surveys distributed through electronic mail.


It has been argued that many respondents feel they can be more candid in email than in person or on the telephone, for the same reasons they are candid on other self-administered questionnaires. However, maintaining respondents' anonymity is difficult because a reply to an email message typically includes the sender's address. Researchers designing email surveys should assure respondents that their answers will be confidential. Response rates for email and online surveys (discussed below) have been found to be lower than those for traditional methods,41 and are often comparable to mail surveys – around 15 per cent. Using email or Internet surveys to represent the views of the population may still be questionable. This is because in many countries Internet penetration (the percentage of individuals with access to an Internet connection), although growing, is yet to reach levels where the vast majority (that is, over 75 per cent) are connected to the Internet (see Table 5.2). In the Asia Pacific region this level varies considerably, from around 91.5 per cent of the population in South Korea and New Zealand to about 40 per cent in Malaysia. Australia and New Zealand, with Internet penetration levels of 89.6 per cent and 91.5 per cent respectively, are approaching levels where email and Internet surveys may become a more representative means of gaining market research information. Care should be taken in the use of Internet surveys in China, the world's second biggest economy. The low level of Internet penetration in that country means that, as of 2015 when these data were collected, surveys conducted online were not yet representative of the population.

TABLE 5.2 » INTERNET PENETRATION IN THE ASIA PACIFIC REGION42

| Nation | Population | Internet users | Internet penetration |
| New Zealand | 4 383 393 | 4 000 000 | 91.50% |
| South Korea | 49 115 196 | 44 900 000 | 91.50% |
| Australia | 22 751 014 | 20 200 000 | 89.60% |
| Japan | 126 919 659 | 109 300 000 | 86.00% |
| Singapore | 5 674 472 | 4 500 000 | 80.70% |
| Taiwan | 23 415 126 | 16 100 000 | 70.00% |
| China | 1 367 485 383 | 626 600 000 | 46.00% |
| Malaysia | 30 513 848 | 12 100 000 | 40.30% |

Other difficulties with email questionnaires include the use of anti-virus and firewall software, which make it harder to recruit respondents.43 The growing incidence of unsolicited email or 'spam', together with legislation enacted in Australia in 2003 that makes it illegal to contact people without a pre-existing commercial relationship, means that implementing email surveys is becoming increasingly difficult. Respondents also vary in their technical proficiency, and may be concerned about the risk of virus and worm infection of their computers from unsolicited emails. To partially overcome this, some researchers give respondents the option of printing out the questionnaire, completing it in writing and returning it via regular mail. Unless the research is an internal organisational survey, this, of course, requires the respondent to pay postage. Another important key to gaining respondent cooperation may be to ask their permission by other means (personal or telephone interviews) before an email survey is sent to them. Two types of email surveys may be used: the email message itself may form the questionnaire, with the respondent providing answers in the body of the reply, or the questionnaire may be sent as an attachment that the respondent opens, completes electronically and emails back to the researcher. The latter method is gaining in popularity, especially since many popular software programs (Word, Excel and Access) allow users to complete checkboxes or response categories electronically.


In general, the guidelines for printed mail surveys apply to email surveys. However, there are some differences because the cover letter and the questionnaire appear in a single email message. A potential respondent who is not immediately motivated to respond, especially one who considers an unsolicited email survey to be 'spam', can quickly hit the delete button to remove the email. This suggests that email cover letters should be brief and the questionnaires relatively short. The cover letter should explain how the company obtained the recipient's name. It should include a valid return email address in the 'from' box and reveal who is conducting the survey. Also, if the email lists more than one address in the 'to' or 'cc' field, all recipients will see the entire list of names. This has the potential to cause response bias and nonresponse error. When possible, the email should be addressed to a single person. (The blind carbon copy, or bcc, field can be used if the same message must be sent to an entire sample.)44 Email has another important role in survey research: email messages can be used as cover letters asking respondents to participate in an Internet survey. Such emails typically provide a password and a link to a unique website location that requires a password for access.
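As an illustration of the guidelines just described – one recipient per message, disclosure of how the address was obtained, and a unique password with a link – here is a minimal sketch. The survey URL, sender address and recipient list are hypothetical, and actual delivery would require access to a mail server (for example via smtplib); the sketch only builds and prints the messages.

```python
# Minimal sketch: building one personalised email invitation per respondent,
# each with its own access code and survey link. All addresses and URLs are
# hypothetical examples.
import secrets
from email.message import EmailMessage

respondents = ["jan@example.com", "lee@example.com"]
SURVEY_URL = "https://survey.example.com/q5"   # hypothetical survey location

invitations = []
for address in respondents:                    # one recipient per message (no shared 'to' list)
    password = secrets.token_urlsafe(6)        # unique PIN-style access code
    msg = EmailMessage()
    msg["From"] = "research@example.com"
    msg["To"] = address
    msg["Subject"] = "Invitation to a short customer survey"
    msg.set_content(
        "We obtained your address from our customer panel.\n"
        f"Please complete the survey at {SURVEY_URL} using password {password}.\n"
        "Your answers will be kept confidential."
    )
    invitations.append((address, password, msg))

for address, password, _ in invitations:
    print(address, password)                   # store each code to track completions later
```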

Internet surveys

An Internet survey is a self-administered questionnaire posted on a website. Respondents provide answers to questions displayed onscreen by highlighting a phrase, clicking an icon or keying in an answer. Many in the survey research community believe Internet surveys are the wave of the future. Like every other type of survey, Internet surveys have both advantages and disadvantages.

Internet survey A self-administered questionnaire posted on a website.

This page from the Comscore website (http://www.comscore.com) shows part of an Internet survey to assess consumers’ attitudes towards banner advertising. The Internet is an excellent medium for survey research on visual materials containing animation. Go to this site to take a demonstration Internet survey.

SPEED AND COST-EFFECTIVENESS Internet surveys allow marketers to reach a large audience (possibly a global one), to personalise individual messages and to secure confidential answers quickly and cost-effectively. These computer-to-computer self-administered questionnaires eliminate the costs of paper, postage and data entry, as well as other administrative costs. Data entry, for example, occurs automatically as the results from each online survey are sent electronically to data files on a computer server. Once an Internet questionnaire has been developed, the incremental cost of reaching additional respondents is marginal. Hence, samples can be larger than with interviews or other types of self-administered questionnaires. Even with larger samples, surveys that used to take many weeks can be conducted in a week or less.

VISUAL APPEAL AND INTERACTIVITY Surveys conducted on the Internet can be interactive. The researcher can use more sophisticated lines of questioning based on the respondents’ prior answers. Many of these interactive surveys use colour, sound and animation, which may help to increase respondents’ cooperation and willingness to spend time answering the questionnaires. The Internet is an excellent medium for the presentation of visual materials, such as photographs or drawings of product prototypes, advertisements and film trailers. Innovative measuring instruments that take advantage of the ability to adjust backgrounds, fonts, colour and other features have been designed and applied with considerable success.



Digital Marketing Services’ Video E-Val is a proprietary technique combining a CD-ROM that is mailed to potential respondents and Internet software that controls the playing of high-quality video clips from the disc.45 This technique allows researchers to evaluate television advertisements, television programs and other large video files without being restricted by the small percentage of potentially qualified respondents who have access to broadband communications.

RESPONDENT PARTICIPATION AND COOPERATION Participation in some Internet surveys occurs because computer users intentionally navigate to a particular website where questions are displayed. For example, a survey of over 10 000 visitors to the Ticketmaster website helped Ticketmaster better understand its customer purchase patterns and evaluate visitor satisfaction with the site. In some cases individuals expect to encounter a survey at a website; in other cases it is totally unexpected. In some instances the visitor cannot venture beyond the survey page without providing information for the organisation's 'registration' questionnaire. When the computer user does not expect a survey on a website and participation is voluntary, response rates are low. And, as with other questionnaires that rely on voluntary self-selection, participants tend to be more interested in or involved with the subject of the research than the average person. In order to increase response rates in Internet surveys, many companies now use online panels, whereby a respondent becomes a member of a survey group and is paid for completing surveys. While this may greatly increase response rates – up to 80 per cent, as is claimed in some cases – there are concerns about response bias and whether people interviewed repeatedly by market research companies represent the views of the public.46 Ideally, the welcome screen contains the name of the research company and information about how to contact the organisation if the respondent has a problem or concern. A typical statement might be: 'If you have any concerns or questions about this survey or if you experience any technical difficulties, please contact [name of research organisation].'

REAL WORLD SNAPSHOT

A TALE OF TWO PANELS – WHICH ONE DO YOU THINK IS BETTER?

In 2005 Newspoll launched its Online Omnibus survey, a questionnaire that asks respondents a range of questions from a number of clients and is usually used in advertising tracking studies (see 'company profile' at http://www.newspoll.com.au). The Omnibus survey is based on a national sample of 1200 adults aged 18 to 64. Respondents are sourced from a dedicated market research panel managed by Lightspeed Research, which specialises in online panel management globally. In the same year, UK-based panel specialist Research Now announced the launch of its online panel with access to 60 000 household members. The panel is managed locally from Research Now offices in Sydney and Brisbane under the website Valued Opinions (see http://www.valuedopinions.com.au). Visit both sites and, based on what you have read so far, judge which you think is the better company to collect market research information. Some important panel management practices concern the identity of respondents and ensuring that questionnaires are only completed once per respondent. This is seen as an issue when incentives are used.47

For many other Internet surveys, respondents are initially contacted via email. Often they are members of consumer panels who have previously indicated their willingness to cooperate. When individuals receive an email invitation to participate, they are given a password or a personal identification number (PIN). The email invitation also provides a link to a URL or instructs a user to visit a certain website that contains a welcome screen. Like a cover letter in a mail survey, the welcome screen on an Internet survey serves as a means to gain respondents' cooperation and provides brief instructions. Experienced researchers require a respondent to provide a password or PIN to move from the welcome page to the first question. This prevents access by individuals who are not part of the scientifically selected sample.

welcome screen The first Web page in an Internet survey, which introduces the survey and requests that the respondent enter a password or PIN.


Assigning a unique password code also allows the researchers to track the responses of each respondent, thereby identifying any respondent who makes an effort to answer the questionnaire more than once. Other ways that this problem can be reduced are through the use of online cookies and by allowing only one access per Internet protocol (IP) address.48
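The password and cookie/IP checks just described can be sketched as follows. This is a minimal illustration, assuming the issued passwords and previously seen identifiers are held in simple in-memory sets; all identifiers are hypothetical.

```python
# Minimal sketch of duplicate-completion checks: a unique password per invitee,
# plus a cookie/IP guard. All identifiers below are hypothetical.

issued_passwords = {"k3P9xQ", "m2Lr7w"}      # codes sent with invitations
completed_passwords = set()                  # codes already used to submit
seen_cookies_or_ips = set()                  # crude secondary guard

def may_take_survey(password: str, cookie_id: str, ip_address: str) -> bool:
    """Allow access only to invited respondents who have not already completed."""
    if password not in issued_passwords:
        return False                         # not part of the selected sample
    if password in completed_passwords:
        return False                         # questionnaire already submitted
    if cookie_id in seen_cookies_or_ips or ip_address in seen_cookies_or_ips:
        return False                         # same browser or address seen before
    return True

def record_completion(password: str, cookie_id: str, ip_address: str) -> None:
    completed_passwords.add(password)
    seen_cookies_or_ips.update({cookie_id, ip_address})

if may_take_survey("k3P9xQ", "cookie-abc", "203.0.113.7"):
    record_completion("k3P9xQ", "cookie-abc", "203.0.113.7")
print(may_take_survey("k3P9xQ", "cookie-abc", "203.0.113.7"))  # False - already completed
```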

REPRESENTATIVE SAMPLES The population to be studied, the purpose of the research and the sampling methods determine the quality of Internet samples, which varies substantially. If the sample consists merely of those who visit a Web page and voluntarily fill out a questionnaire, it is not likely to be representative of the entire population, because of self-selection error. However, if the purpose of the research is to evaluate how visitors feel about a website, randomly selecting every 100th visitor may accomplish the study's purpose. Of course, a major disadvantage of Internet surveys is that some individuals in the general population cannot access the Internet – and not all people with Internet access have the same level of technology. Many people with low-speed Internet connections (low bandwidth) cannot quickly download high-resolution graphic files. Many lack powerful computers or software that is compatible with the advanced features programmed into many Internet questionnaires. Some individuals have minimal computer skills. They may not know how to navigate through and provide answers to an Internet questionnaire. For example, the advanced audio and video streaming technology of RealPlayer or Windows Media Player software can be used to incorporate a television advertisement, plus questions about its effectiveness, into an Internet survey. However, some respondents might find downloading the file too slow or even impossible; others might not have the RealPlayer or Windows Media Player software; and still others might not know how to use the streaming media software to view the advertisement. It appears that for the foreseeable future, Internet surveys sampling the general public should be designed with the recognition that problems may arise for the reasons just described. Thus, photographs, animation or other cutting-edge technological features created on the researcher's (or Web designer's) powerful computer may have to be simplified or eliminated, so that all respondents can interact at the same level of technological sophistication. Because Internet surveys can be accessed at any time from anywhere, they can reach certain hard-to-reach respondents, such as doctors. Chapter 10 discusses sampling techniques for Internet surveys.

ACCURATE REAL-TIME DATA CAPTURE The computer-to-computer nature of Internet surveys means that each respondent's answers are entered directly into the researcher's computer as soon as the questionnaire is submitted. In addition, the questionnaire software may be programmed to reject improper data entry. For example, on a paper questionnaire a respondent might incorrectly check two responses, even though the instructions call for a single answer. In an Internet survey, this mistake can be interactively corrected as the survey is taking place. Thus, the data capture is more accurate than when humans are involved. Real-time data capture allows for real-time data analysis. A researcher can review up-to-the-minute sample size counts and tabulation data from an Internet survey in real time.
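A minimal sketch of the interactive validation just described – rejecting an entry that selects more than one answer to a single-response question – might look like this; the answer codes are hypothetical.

```python
# Minimal sketch of interactive data validation: a single-answer question that
# rejects multiple or out-of-range responses before the answer is stored.
# The answer codes are hypothetical.
from typing import Optional

VALID_CODES = {"1", "2", "3", "4", "5"}   # e.g. a five-point rating scale

def capture_single_answer(raw_input: str) -> Optional[str]:
    """Return a clean answer code, or None if the entry must be corrected."""
    codes = raw_input.replace(",", " ").split()
    if len(codes) != 1:          # two answers ticked: on paper this would slip through,
        return None              # online the respondent is prompted to correct it
    return codes[0] if codes[0] in VALID_CODES else None

print(capture_single_answer("2"))      # '2'  - accepted and written to the data file
print(capture_single_answer("2 4"))    # None - respondent asked to choose one answer only
```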

CALLBACKS When the sample for an Internet survey is drawn from a consumer panel, it is easy to recontact those who have not completed the survey questionnaire. It is often a simple matter of having the computer software automatically send email reminders to panel members who did not visit the welcome page. Computer software also can identify the passwords of respondents who completed only a portion of the questionnaire, and send those people customised messages. Sometimes such emails offer additional incentives to individuals who terminated the questionnaire when only a few questions remained, so that they are motivated to comply with the request to finish it.
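A minimal sketch of such automated callbacks, assuming hypothetical panel records that store each member's progress, is shown below; the PINs and addresses are invented for illustration.

```python
# Minimal sketch of automated callbacks: panel members who never started get a
# reminder, and those who stopped part-way get a customised message.
# The panel records below are hypothetical.

panel = {
    "PIN1001": {"email": "amy@example.com", "questions_answered": 0,  "total": 20},
    "PIN1002": {"email": "raj@example.com", "questions_answered": 17, "total": 20},
    "PIN1003": {"email": "kim@example.com", "questions_answered": 20, "total": 20},
}

for pin, record in panel.items():
    answered, total = record["questions_answered"], record["total"]
    if answered == total:
        continue                              # completed - do not disturb
    if answered == 0:
        message = "Friendly reminder: our survey closes soon."
    else:
        remaining = total - answered
        message = f"Only {remaining} questions left - a small incentive awaits on completion."
    print(f"Email to {record['email']} ({pin}): {message}")
```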

PERSONALISED AND FLEXIBLE QUESTIONING Computer-interactive Internet surveys are programmed in much the same way as computer-assisted telephone interviews. That is, the software allows questioning to branch off into two or more different lines, depending on a respondent's answer to a filter question. The difference is that there is no interviewer. The respondent interacts directly with software on a website. In other words, the computer program asks questions in a sequence determined by the respondent's previous answers. The questions appear on the computer screen, and answers are recorded by simply pressing a key or clicking an icon, thus immediately entering the data into the computer's memory. Of course, these methods avoid the labour costs associated with data collection and processing of paper-and-pencil questionnaires. This ability to sequence questions based on previous responses is a major advantage of computer-assisted surveys. For example, the computer can be programmed to skip from question six to question nine if the answer to question six is 'No'. Furthermore, responses to previous questions can lead to questions that can be personalised for individual respondents (for example, 'When you cannot buy your favourite brand, Revlon, what brand of lipstick do you prefer?'). Often the respondent's name appears in questions to personalise the questionnaire. Fewer and more relevant questions speed up the response process and increase the respondent's involvement with the survey.

dialogue box A window that opens on a computer screen to prompt the user to enter information.

Use of a variety of dialogue boxes (windows that prompt the respondent to enter information) allows designers of Internet questionnaires to be creative and flexible in the presentation of questions. Chapter 9 discusses software issues, the design of questions, and questionnaire layouts for Internet surveys.
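The branching and personalisation described above can be sketched as a short script: the follow-up questions are asked only when the filter question is answered 'Yes', and an earlier answer is piped into the wording of a later question. The questions and scripted answers are hypothetical illustrations, not material from the chapter.

```python
# Minimal sketch of computer-interactive branching and answer piping.
# Questions, numbering and scripted answers are hypothetical.

def run_questionnaire(ask):
    answers = {}
    answers["q6"] = ask("Q6. Do you ever buy lipstick? (Yes/No) ")
    if answers["q6"].strip().lower() == "yes":     # branch only for relevant respondents
        answers["q7"] = ask("Q7. Which brand do you usually buy? ")
        answers["q8"] = ask(
            f"Q8. When you cannot buy your favourite brand, {answers['q7']}, "
            "what brand do you prefer? "
        )
    answers["q9"] = ask("Q9. How old are you? ")   # everyone reaches question 9
    return answers

# Scripted answers stand in for a live respondent.
scripted = iter(["Yes", "Revlon", "Maybelline", "34"])
print(run_questionnaire(lambda prompt: next(scripted)))
```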

RESPONDENT ANONYMITY Respondents are more likely to provide sensitive or embarrassing information when they can remain anonymous. The anonymity of the Internet encourages respondents to provide honest answers to sensitive questions. This is particularly useful in health research and studies dealing with illegal behaviour, such as estimating the amount of illegal downloads, as was discussed earlier in this chapter.

RESPONSE RATES As mentioned earlier, with a password system, people who have not participated in a survey in a predetermined period of time can be sent a friendly email reminder asking them to participate before the study ends. This kind of follow-up – along with preliminary notification, interesting early questions and variations of most other techniques for increasing response rates to mail questionnaires – is recommended for Internet surveys. A rule of thumb for many online surveys, though, is that response rates may be as low as 10 per cent.
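The 10 per cent rule of thumb translates into simple planning arithmetic: the number of invitations required for a target number of completed questionnaires. A minimal sketch, with illustrative figures only:

```python
# Minimal sketch of planning arithmetic: invitations needed for a target number
# of completed Internet questionnaires at an assumed response rate.
import math

target_completes = 400
assumed_response_rate = 0.10        # the 10 per cent rule of thumb noted above

invitations_needed = math.ceil(target_completes / assumed_response_rate)
print(invitations_needed)           # 4000 invitations
```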

SECURITY CONCERNS Many organisations worry that hackers or competitors may access websites in order to discover new product concepts, new advertising campaigns and other top-secret ideas. Respondents may worry whether personal information will remain private. No system can be 100 per cent secure. However, many research service suppliers specialising in Internet surveying have developed password-protected systems that are very secure. One important feature of these systems restricts access and prevents individuals from filling out a questionnaire over and over again.


TIPS OF THE TRADE

ASSESSING MARKET RESEARCH PANELS
In order to validate information gained from a market research panel, the following questions need to be addressed:
1 How was the panel recruited?
2 What panel management practices were in place?
3 How large is the panel and what proportion of the panel is active?
4 How many new panellists are being recruited?

Kiosk interactive surveys A computer with a touch screen may be installed in a kiosk at a trade show, at a professional conference, in an airport or in any other high-traffic location to administer an interactive survey. Because the respondent chooses to interact with an on-site computer, self-selection often is a problem with this type of survey. Computer-literate individuals are most likely to complete these interactive questionnaires. At temporary locations such as conventions, these surveys often require a fieldworker to be at the location to explain how to use the computer system. This is an obvious disadvantage.

Survey research that mixes modes For many surveys, research objectives dictate the use of some combination of telephone, mail, email, Internet and personal interview. For example, the researcher may conduct a short telephone screening interview to determine whether respondents are eligible for recontact in a more extensive personal interview. Such a mixed-mode survey combines the advantages of the telephone survey (such as fast screening) with those of the personal interview. A mixed-mode survey can employ any combination of two or more survey methods. Conducting a research study in two or more waves, however, creates the possibility that some respondents will no longer cooperate or will be unavailable in the second wave of the survey.

mixed-mode survey A study that employs any combination of survey methods.

Several variations of survey research use pay television channels. For example, a telephone interviewer calls a subscriber and asks him or her to tune in to a particular channel at a certain time. An appointment is made to interview the respondent shortly after the program or visual material is displayed. Foxtel uses this type of mixed-mode survey to test the concepts for many proposed new programs.

SELECTING THE APPROPRIATE SURVEY RESEARCH DESIGN Earlier discussions of research design and problem definition emphasised that many research tasks may lead to similar decision-making information. There is no best form of survey: each has advantages and disadvantages. A researcher who must ask highly confidential questions may use a mail survey, thus sacrificing speed of data collection to avoid interviewer bias. If a researcher must have considerable control over question phrasing, central location telephone interviewing may be appropriate. To determine the appropriate technique, the researcher must ask several questions: Is the assistance of an interviewer necessary? Are respondents interested in the issues being investigated? Will cooperation be easily attained? How quickly is the information needed? Will the study require a long and complex questionnaire? How large is the budget? The criteria – cost, speed, anonymity and so forth – may differ for each project.
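One informal way to weigh these criteria is to score each candidate survey mode against the project's priorities. The sketch below is illustrative only: the weights and scores are hypothetical analyst judgements, not values drawn from this chapter or from Table 5.3.

```python
# Minimal sketch of an informal mode comparison: score each survey mode against
# the project's criteria and weight the criteria by importance.
# Weights and scores are hypothetical judgements for one project.

weights = {"cost": 0.4, "speed": 0.3, "anonymity": 0.3}   # sums to 1 for this project

mode_scores = {   # 1 = poor ... 5 = excellent on each criterion
    "telephone interview": {"cost": 3, "speed": 5, "anonymity": 3},
    "mail survey":         {"cost": 5, "speed": 1, "anonymity": 5},
    "Internet survey":     {"cost": 5, "speed": 5, "anonymity": 4},
}

for mode, scores in mode_scores.items():
    total = sum(weights[criterion] * scores[criterion] for criterion in weights)
    print(f"{mode}: {total:.2f}")
```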



Table 5.3 summarises the major advantages and disadvantages of typical door-to-door, mall intercept, telephone, mail and Internet surveys. Note that the table describes typical surveys of each type. For example, a creative researcher might be able to design highly versatile and flexible mail questionnaires, but most researchers use standardised questions. An elaborate mail survey may be far more expensive than a short personal interview, but generally this is not the case.

TABLE 5.3 » ADVANTAGES AND DISADVANTAGES OF TYPICAL SURVEY METHODS

| | Door-to-door personal interview | Mall intercept personal interview | Telephone interview | Mail survey | Internet survey |
| Speed of data collection | Moderate to fast | Fast | Very fast | Slow; researcher has no control over return of questionnaire | Instantaneous; 24/7 |
| Geographic flexibility | Limited to moderate | Confined, possible urban bias | High | High | High (worldwide) |
| Respondent cooperation | Excellent | Moderate to low | Good | Moderate; poorly designed questionnaire will have low response rate | Varies depending on website; high from consumer panels |
| Versatility of questioning | Quite versatile | Extremely versatile | Moderate | Not versatile; requires highly standardised format | Extremely versatile |
| Questionnaire length | Long | Moderate to long | Moderate | Varies depending on incentive | Moderate; length customised based on answers |
| Item nonresponse rate | Low | Medium | Medium | High | Software can assure none |
| Possibility for respondent misunderstanding | Low | Low | Average | High; no interviewer present for clarification | High |
| Degree of interviewer influence on answers | High | High | Moderate | None; interviewer absent | None |
| Supervision of interviewers | Moderate | Moderate to high | High, especially with central-location interviewing | Not applicable | Not applicable |
| Anonymity of respondent | Low | Low | Moderate | High | Respondent can be either anonymous or known |
| Ease of callback or follow-up | Difficult | Difficult | Easy | Easy, but takes time | Difficult, unless email address is known |
| Cost | Highest | Moderate to high | Low to moderate | Lowest | Low |
| Special features | Visual materials may be shown or demonstrated; extended probing possible | Taste tests and viewing of television advertisements possible | Fieldwork and supervision of data collection are simplified; quite adaptable to computer technology | Respondent may answer questions at own convenience; has time to reflect on answers | Streaming media software allows use of graphics and animation |


PRETESTING A researcher who is surveying 3000 consumers does not want to find out after the questionnaires have been completed or returned that most respondents misunderstood a particular question, skipped a series of questions or misinterpreted the instructions for filling out the questionnaire. To avoid problems such as these, screening procedures, or pretests, are often used. Pretesting involves a trial run with a group of respondents to iron out fundamental problems in the instructions or design of a questionnaire. The researcher looks for such things as the point at which respondent fatigue sets in and whether there are any particular places in the questionnaire where respondents tend to terminate. Unfortunately, this stage of research may be eliminated because of costs or time pressures.

pretesting A screening procedure that involves a trial run with a group of respondents to iron out fundamental problems in the survey design.

Broadly speaking, there are three basic ways to pretest. The first two involve screening the questionnaire – with other research professionals or with the client – and the third, the one most often called pretesting, is a trial run with a group of respondents. When screening the questionnaire with other research professionals, the investigator asks them to look for such things as difficulties with question wording, problems with leading questions, and bias due to question order. An alternative type of screening might involve a client or the research manager who ordered the research. Often managers ask researchers to collect information, but when they see the questionnaire they find that it does not really meet their needs. Only by checking with the individual who has requested the questionnaire does the researcher know for sure that the information needed will be provided. Once the researcher has decided on the final questionnaire, data should be collected with a small number of respondents (perhaps 100) to determine whether the questionnaire needs refinement.
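A minimal sketch of one way to summarise the trial-run results – counting the point at which each pretest respondent terminated, to spot fatigue or problem questions – is shown below; the pretest records are hypothetical.

```python
# Minimal sketch of a pretest drop-off analysis: for each question, count how
# many trial respondents stopped at that point. The records are hypothetical.
from collections import Counter

# Number of the last question each pretest respondent answered (None = finished).
last_question_answered = [12, None, 12, 25, 12, None, 18, None, 12, 25]

drop_off_points = Counter(q for q in last_question_answered if q is not None)
for question, count in drop_off_points.most_common():
    print(f"Question {question}: {count} respondents terminated here")
# A cluster of terminations at one question suggests fatigue or a problem question.
```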

ETHICAL ISSUES IN SURVEY RESEARCH According to the Australian Market and Social Research Society's (AMSRS) code of ethics, respondents should be informed about the nature of the survey (this is usually done at the end of the interview), and surveys should not be used as a means of telemarketing or promoting a company's products or services. Information about respondents should be kept confidential, and they should not (without their permission) be identified in any reports from the survey. This may become more difficult when newer Internet or mobile surveys also use GPS or IP addresses to locate a respondent. Respondents should not be interrupted or inconvenienced by survey research. Calling respondents by telephone around dinner time or sending businesspeople email surveys that clog up their in-boxes makes the task of gaining respondent cooperation more difficult and can be seen as bordering on unethical behaviour. With newer technology, in which surveys are downloaded at the respondent's cost (such as mobile phone, Internet and email surveys), care should be taken to obtain respondents' approval before sending surveys, and compensation for respondents' costs and time needs to be carefully considered. There are also issues in panel research concerning how panels are conducted so that clients receive accurate and representative information.

TIPS OF THE TRADE

» Qualitative or interpretative research is best handled by personal interviews.
» The longer the questionnaire, the lower the response rate.
» When long questionnaires are absolutely necessary, the researcher should:
  • look for respondents who are essentially a captive audience, like students in a class
  • offer a non-trivial incentive to respond
  • try to target the survey towards individuals who are highly involved in the topic
  • use a survey research panel.
» Mobile phone survey calls are slightly more likely to be answered on weekdays, but mobile phone users are more likely to refuse to participate than are landline phone users.
» Email and Internet surveys can be quick and cheap, but:
  • care must be taken that the type of respondents to be interviewed use these communication media
  • the effectiveness of email and Internet surveys is influenced by the Internet penetration of the country they are conducted in
  • when a panel or special interest group provides responses, the researcher should be extra vigilant for bogus response patterns
  • response rates are between 10 and 15 per cent.
» A short pretest is better than no pretest at all.

SUMMARY

DEFINE SURVEYS AND DESCRIBE THE TYPE OF INFORMATION THAT SHOULD BE GATHERED IN A SURVEY
The survey is a common tool for asking respondents questions. Surveys can provide quick, inexpensive and accurate information for a variety of objectives. The typical survey is a descriptive research study with the objective of measuring awareness, product knowledge, brand usage behaviour, opinions and so on. The term 'sample survey' is often used because a survey is expected to obtain a representative sample of the target population.

EXPLAIN THE ADVANTAGES AND DISADVANTAGES OF SURVEYS
Surveys provide a quick, inexpensive, efficient and accurate means of assessing information about a population. Surveys ask people for answers. If people cooperate and give truthful answers, a survey will likely accomplish its goal. If these conditions are not met, nonresponse errors or response biases can easily occur.

IDENTIFY SOURCES OF ERROR IN SURVEY RESEARCH
Two major forms of error are common in survey research. The first, random sampling error, is caused by chance variation and results in a sample that is not absolutely representative of the target population. Such errors are inevitable, but they can be predicted using the statistical methods discussed in later chapters on sampling. The second major category of error, systematic error, takes several forms. Nonresponse error is caused by subjects failing to respond to a survey. This type of error can be reduced by comparing the demographics of the sample population with those of the target population and making a special effort to contact under-represented groups. Response bias occurs when a response to a questionnaire is falsified or misrepresented, either intentionally or inadvertently. There are five specific categories of response bias: acquiescence bias, extremity bias, interviewer bias, auspices bias and social desirability bias. An additional source of survey error comes from administrative problems such as inconsistencies in interviewers' abilities, interviewer cheating, coding mistakes and so forth.

DISTINGUISH AMONG THE VARIOUS CATEGORIES OF SURVEYS
Surveys may be classified according to methods of communication, by the degrees of structure and disguise in the questionnaires, and on a temporal basis. Questionnaires may be structured with limited choices of responses, or unstructured to allow open-ended responses. Disguised questions may be used to probe sensitive subjects. Surveys may consider the population at a given moment or follow trends over a period of time. The first approach, the cross-sectional study, usually is intended to separate the population into meaningful subgroups. The second type of study, the longitudinal study, can reveal important population changes over time. Longitudinal studies may involve contacting different sets of respondents or the same ones repeatedly. One form of longitudinal study is the consumer panel. Consumer panels are expensive to conduct, so firms often hire contractors who provide services to many companies, thus spreading costs over many clients.

SUMMARISE THE DIFFERENT WAYS RESEARCHERS IMPLEMENT SURVEYS
Interviews and self-administered questionnaires are used to collect survey data. Interviews can be categorised based on the medium used to communicate with respondents. A primary classification of survey approaches can be developed based on interactivity. Personal interviews are the most interactive, followed by telephone interviews. Self-administered surveys via mail, fax, email and the Internet have little interactivity with the respondent, as there is no immediate two-way communication. Both approaches have disadvantages and advantages.

KNOW THE ADVANTAGES AND DISADVANTAGES OF DISTRIBUTING A QUESTIONNAIRE VIA DIFFERENT MEANS
Personal interviewing is a flexible method that allows researchers to use visual aids and various kinds of props. Door-to-door personal interviews can get high response rates, but they are also more costly to administer than other types of surveys. The presence of an interviewer may also influence subjects' responses. When a sample does not need to represent the entire country, mall intercept interviews may reduce costs. Telephone interviewing has the advantage of providing data quickly and at a lower cost per interview. However, not all households have telephones, and not all telephone numbers are listed in directories. This causes problems in obtaining a representative sample. The absence of face-to-face contact and an inability to use visual materials also limit telephone interviewing. Computer-assisted telephone interviewing from central locations can improve the efficiency of certain kinds of telephone surveys. Traditionally, self-administered questionnaires have been distributed by mail. Today, however, self-administered questionnaires may be dropped off to individual respondents, distributed from central locations or administered via computer. Mail questionnaires generally are less expensive than telephone or personal interviews, but they also introduce a much larger chance of nonresponse error. Several methods can be used to encourage higher response rates. Mail questionnaires must be more structured than other types of surveys and cannot be changed if problems are discovered in the course of data collection. The Internet and other interactive media provide convenient ways for organisations to conduct surveys. Internet surveys are quick and cost-effective, but not everyone has Internet access. Because the surveys are computerised and interactive, questionnaires can be personalised and data can be captured in real time. There are some privacy and security concerns, but the future of Internet surveys looks promising.

APPRECIATE THE IMPORTANCE OF PRETESTING QUESTIONNAIRES
Pretesting a questionnaire on a small sample of respondents is a useful way to discover problems while they still can be corrected. Pretests may involve screening of the questionnaire with other research professionals or conducting a trial run with a set of respondents.

DESCRIBE ETHICAL ISSUES THAT ARISE IN SURVEY RESEARCH
Researchers must protect the public from misrepresentation and exploitation. This obligation includes honesty about the purpose of a research project and protection of subjects' right to refuse to participate or to answer questions. Researchers also need to protect the confidentiality of the participants and record responses honestly. Lastly, as technology evolves, researchers should be mindful of not contributing to unwanted electronic or personal communications. Thus, both the reason for and the method of contact should be carefully scrutinised.

KEY TERMS AND CONCEPTS
acquiescence bias, administrative error, auspices bias, callback, central location interviewing, computer-assisted telephone interview (CATI), consumer panel, cover letter, cross-sectional study, data-processing error, dialogue box, disguised question, door-to-door interview, drop-off method, email surveys, extremity bias, Internet surveys, interviewer bias, interviewer cheating, interviewer error, item nonresponse, longitudinal study, mail survey, mall intercept interview, mixed-mode survey, no contact, nonrespondent, nonresponse error, personal interview, pretesting, probing, random digit dialling, random sampling error, refusal, respondent, respondent error, response bias, response rate, sample bias, sample survey, sample-selection error, self-administered questionnaire, self-selection bias, social desirability bias, structured question, survey, systematic error, telephone interview, tracking study, undisguised question, unstructured question, welcome screen


QUESTIONS FOR REVIEW AND CRITICAL THINKING
1 How do you explain the conflicting survey findings in the opening vignette of this chapter?
2 Do surveys tend to gather qualitative or quantitative data? What kinds of information are commonly measured in surveys?
3 Why is a response bias a concern for survey researchers? Are there other issues that should be of greater concern?
4 What survey research objectives might an online retailer develop so as to learn about consumers?
5 Give an example of each type of error listed in Exhibit 5.1.
6 In a survey, chief executive officers (CEOs) indicated that they would prefer to relocate their businesses in Perth (first choice), Wellington (New Zealand), Auckland (New Zealand), Singapore or Kuala Lumpur (Malaysia). The CEOs who said they were going to build the required office space in the following year were asked where they were going to build. They indicated they were going to build in Hong Kong, Sydney or Ho Chi Minh City. Explain the difference.
7 What potential sources of error might be associated with the following situations?
  a In a survey of frequent flyers aged 50 and older, researchers concluded that price does not play a significant role in airline travel because only 25 per cent of the respondents check off price as the most important consideration in determining where and how they travel, while 35 per cent rate price as being unimportant.
  b A survey of voters finds that most respondents do not like negative political ads – that is, advertising by one political candidate that criticises or exposes secrets about the opponent's 'dirty laundry'.
  c A survey of Facebook users ranks MacBooks as far superior to other personal computers for business applications.
  d A company's sales representatives are asked what percentage of time they spend making presentations to prospects, travelling, talking on the telephone, participating in meetings, working on the computer and engaging in other on-the-job activities.
  e An insurance company obtains a 75 per cent response rate from a sample of university students contacted by mobile phone in a study of attitudes towards health insurance.
  f Researchers who must conduct a 45-minute personal interview decide to offer $50 to each respondent because they believe that people who will sell their opinions are more typical than someone who will talk to a stranger for 45 minutes.
8 What topics about illegal downloads might be extremely sensitive issues about which to directly question respondents?

9 A survey conducted by the Australia Council for the Arts asked: 'Have you read a book within the last year?' What response bias might arise from this question?
10 How might survey results for buying intentions be adjusted to account for consumer pessimism?
11 Name some common objectives of cross-sectional surveys.
12 Give an example of a political situation in which longitudinal research might be useful. Name some common objectives for a longitudinal study in a business situation.
13 What are the advantages and disadvantages of using online consumer panels?
14 Go through your local newspaper to find some stories derived from survey research results. Was the study's methodology appropriate? Could the research have been termed advocacy research?
15 Suppose you are the marketing research director for your state's or regional tourism bureau. Assess the state's information needs and identify the information you will collect in a survey of tourists who visit your state.
16 A researcher sends out 200 questionnaires, but 50 are returned because the addresses are inaccurate. Of the 150 delivered questionnaires, 50 are completed and mailed back. However, 10 of these respondents wrote that they did not want to participate in the survey. The researcher indicates the response rate was 33.3 per cent. Is this the right thing to do?
17 What type of communication medium would you use to conduct the following surveys? Why?
  a buying motives of fashion buyers for retail chains
  b satisfaction levels of hotel customers
  c television advertising awareness
  d top corporate executives' exercise habits.
18 A publisher offers a university lecturer one of four best-selling mass-market books as an incentive for filling out a 10-page mail questionnaire about a new textbook. What advantages and disadvantages does this incentive have?
19 'Individuals are less willing to cooperate with surveys today than they were 15 years ago.' Comment on this statement.
20 Do most surveys use a single communication mode (for example, the telephone), as most textbooks suggest?
21 Evaluate the following survey designs:
  a A text message survey asks potential respondents to indicate yes or no whether they are driving or not, whether they are alone, and whether they believe the roads in the area can adequately handle traffic, whether more money should be spent on better roadways, whether or not traffic is effectively policed, and whether or not automatic cameras should be used to issue speeding tickets. The sample is drawn from people who have agreed to be contacted via mobile phone regarding road traffic conditions.
  b A researcher suggests mailing a small safe (a metal file box with a built-in lock) without the lock combination to respondents, with a note explaining that respondents will be called in a few days for a telephone interview. During the telephone interview, the respondent is given the combination and the safe may be opened.
  c A shopping mall that wishes to evaluate its image places packets (including a questionnaire, cover letter and stamped return envelope) in the mall where customers can pick them up if they wish.
  d An email message is sent to individuals who own computers, asking them to complete a questionnaire on a website. Respondents answer the questions and then have the opportunity to play a poker-machine game on the website. Each respondent is guaranteed a monetary incentive, but has the option to increase it by playing the poker-machine game.
  e National Geographic magazine opts to conduct a mobile phone survey rather than a mail survey for a study to determine the demographic characteristics and purchasing behaviour of its subscribers in Australia and Malaysia.
22 What type of research studies lend themselves to the use of email for survey research? What are the advantages and disadvantages of using email?
23 Comment on the ethics of the following situations:
  a A researcher plans to use invisible ink to code questionnaires to identify respondents in a distributor survey.
  b A political action committee conducts a survey about its cause. At the end of the questionnaire, it includes a request for a donation.
  c A telephone interviewer calls at 1 p.m. on Sunday and asks the person who answers the phone to take part in an interview.
  d A B2B marketer wishes to survey its own distributors. It invents the name 'Eastern States Marketing Research' and sends out a mail questionnaire under this name.
  e A questionnaire is printed on the back of a warranty card included inside the package of a coffee machine. The questionnaire includes a number of questions about shopping behaviour, demographics and customer lifestyles. At the bottom of the warranty card is a short note in small print that says: 'Thank you for completing this questionnaire. Your answers will be used for marketing studies and to help us serve you better in the future. You will also benefit by receiving important mailings and special offers from a number of organisations whose products and services relate directly to the activities, interests and hobbies in which you enjoy participating on a regular basis. Please indicate if there is some reason you would prefer not to receive this information.'
  f A research company in the Netherlands offers a free computer to a sample of citizens who agree to answer questions downloaded every week in exchange for the computer.
24 How might the marketing research industry take action to ensure that the public believes that telephone surveys and door-to-door interviews are legitimate activities, and that firms that misrepresent and deceive the public using marketing research as a sales ploy are not true marketing researchers?

ONGOING PROJECT DOING A SURVEY RESEARCH PROJECT? CONSULT THE CHAPTER 5 PROJECT WORKSHEET FOR HELP

Choosing the right type of survey design is very important, as this will not only affect the data collected but also whether you meet your research objectives. Download the Chapter 5 project worksheet from the CourseMate website. It outlines the steps to be considered in choosing a survey design method.

COURSEMATE ONLINE STUDY TOOLS Flip to the start of your textbook and use the tear-out card to log in to CourseMate for Marketing Research. There you can test your understanding and revise chapter concepts with: ☑ interactive quizzes ☑ flashcards

☑ crosswords on key concepts ☑ research activities ☑ videos.


WRITTEN CASE STUDY 5.1 GOOGLE CONSUMER SURVEYS Google has recently developed its own online survey service, called Google Consumer Surveys (see http://www.google.com/insights/consumersurveys/how). For as little as 10 cents a respondent, Google can collect survey data across different markets and consumers around the world. Unlike more established consumer panels, Google recruits respondents by allowing them to access premium online content in return for answering its surveys. The publishers of other websites also make money as a result of people answering surveys from links to their sites. The survey method used by Google is to ask one question at a time and then 'aggregate' these responses when dealing with multi-item questionnaires. It is claimed that by using information such as IP address, the page visited and cookie data, Google can infer the age, gender and location of respondents. The company claims that the sampling error of its surveys is as low as 3.5 per cent, below the 4.5 per cent of telephone surveys.

QUESTIONS

1 What are the advantages of Google Consumer Surveys over online and telephone research?
2 What are some limitations of Google Consumer Surveys as a survey methodology?
3 For what kind of research questions would Google Consumer Surveys be useful?
4 Describe any possible ethical issues involved in the use of Google Consumer Surveys.

WRITTEN CASE STUDY 5.2 PEDAL POWER IN AUCKLAND49 Cycling is gaining in popularity in Auckland, New Zealand, as shown by a survey showing a 35 per cent annual increase in pedalling numbers. Auckland Transport says 27 per cent of 1615 city residents who took part in an online survey in May 2015 reported cycling at least occasionally. That was up from 20 per cent surveyed in the previous year. Although only 11 per cent said they cycled at least once a week, that was well up from 6 per cent in that category in 2014. On the other hand, the proportion of those describing themselves as regular walkers fell to 42 per cent, compared with 46 per cent in 2014.

QUESTIONS

1 Comment on the use of online surveys to collect the information in this case. Is there a better approach?
2 What are some possible sample biases that may have occurred from the way the data were collected?
3 Are there any possible response biases that may have influenced the results? If so, what are they, and how could they be controlled?

ONGOING CASE STUDY MOBILE PHONE SWITCHING AND BILL SHOCK In the next stage of research, David, Leanne and Steve were asked to develop a survey in order to examine the factors affecting mobile phone switching and bill shock in Australia. They only have $20 000 of their budget remaining and less than a month to collect the data. After an extensive review of the secondary literature in the area, David noted that this would be quite a long survey – some 25 minutes. The three of them met for dinner to discuss what to do next.

QUESTIONS

1 What kind of survey research do you think is suitable here and why?
2 If David, Leanne and Steve decide to use a consumer panel, what advice would you give them?
3 Go online and examine some consumer panels. Which one do you think is most suitable for them?


NOTES 1


06 » WHAT YOU WILL LEARN IN THIS CHAPTER

To discuss the role of observation as a marketing research method.
To describe the use of direct observation and contrived observation.
To consider ethical issues of observation.
To explain the observation of physical objects and message content.
To describe the major types of mechanical observations and observations of physiological reactions.

OBSERVATION
Marketing to doctors in New Zealand1

Research published in 2015 in The New Zealand Medical Journal examined, by observation, advertisements in the New Zealand Doctor and Pharmacy Today. The research found that one-third of all advertisements examined had no supporting evidence and only 35 per cent had supporting evidence. Of concern was that, of those which did have supporting evidence, 76 per cent of results were heavily weighted towards industry sponsorship. New Zealand is one of only two countries in the world that allow direct pharmaceutical marketing to consumers, but the concern here is that doctors may also be marketed to without all the necessary information about a new drug's effectiveness being presented fairly to them. These concerns have led to some health experts in New Zealand calling for greater monitoring and regulation of pharmaceutical advertising. This research also shows the importance of observation as a research tool, as it deals directly with presented phenomena, rather than the recollection of respondents. As will be discussed in this chapter, observation is a very flexible research method, which is becoming increasingly important in market and social research.

WHAT IS OBSERVATION?
In marketing research, observation is the systematic process of recording the behavioural patterns of people, objects and occurrences as they are witnessed. The researcher who uses the observation method of data collection witnesses and records information as events occur, or compiles evidence from records of past events.

observation The systematic process of recording the behavioural patterns of people, objects and occurrences as they are witnessed.

WHEN IS OBSERVATION SCIENTIFIC?
Observation becomes a tool for scientific inquiry when it:
→→ serves a formulated research purpose
→→ is planned systematically
→→ is recorded systematically and relates to general propositions rather than simply reflecting a set of interesting curiosities
→→ is subjected to checks or controls on validity and reliability.2

SURVEY THIS!

Perhaps you recall answering these questions about opinions and preferences for technological products. Take a look at the results from this section of the survey. In what way has behavioural observation been used to collect additional data – if at all? How might this information be useful to companies that sell small electronic appliances? Are there other places in the survey where behavioural observation has been combined with this traditional survey approach?


WHAT CAN BE OBSERVED?
A wide variety of information about the behaviour of people and objects can be observed. Table 6.1 outlines seven kinds of observable phenomena: physical actions, such as shopping patterns or television viewing; verbal behaviour, such as sales conversations; expressive behaviour, such as tone of voice or facial expressions; spatial relations and locations, such as traffic patterns; temporal patterns, such as amount of time spent shopping or driving; physical objects, such as the amount of newspapers recycled; and verbal and pictorial records, such as the content of advertisements, as was discussed in the opening vignette of this chapter. (Although investigation of secondary data uses observation – see Chapter 4 – it is not extensively discussed in this chapter.)


TABLE 6.1 » WHAT CAN BE OBSERVED

Phenomenon | Example
Physical action | A shopper's movement pattern in a store
Verbal behaviour | Statements made to call centre staff about bank service
Expressive behaviour | Facial expressions, tone of voice and other forms of body language
Spatial relations and locations | How close visitors at an art museum stand to paintings
Temporal patterns | How long fast-food customers wait for their orders to be served
Physical objects | What brand-name items are stored in consumers' pantries
Verbal and pictorial records | Bar codes on product packages

The observation method may be used to describe a wide variety of behaviour, but cognitive phenomena such as attitudes, motivations and preferences cannot be observed. Thus, observation research cannot provide an explanation of why a behaviour occurred or what actions were intended. Another limitation is that the observation period generally is of short duration. Behaviour patterns that occur over a period of several days or weeks generally are either too costly or impossible to observe.

ONGOING PROJECT

THE USE OF OBSERVATIONAL RESEARCH BY AUSTRALIAN COMPANIES3

When developing food products, companies such as Sanitarium, Campbell Arnott's and Heinz often use at-home research. Sanitarium brand managers may arrange to visit half a dozen homes at 7 a.m. to observe consumers' breakfast routines, and Heinz has used at-home research for its steam-fresh vegetable bags. A Heinz spokeswoman says: 'It was important to see how people used the product in their home situation when it wasn't precisely prepared in a [company] kitchen and observe the sensory aspect of how they found the product.'



THE NATURE OF OBSERVATION STUDIES
Marketing researchers can observe people, objects, events or other phenomena using either human observers or machines designed for specific observation tasks. Human observation best suits a situation or behaviour that is not easily predictable in advance of the research. Mechanical observation, as performed by supermarket scanners or traffic counters, can very accurately record situations or types of behaviour that are routine, repetitive or programmatic. Human or mechanical observation may be unobtrusive; that is, it may not require communication with a respondent. For example, rather than asking customers how much time they spend shopping in the store, a supermarket manager might observe and record the intervals between when shoppers enter and leave the store. The unobtrusive or nonreactive nature of the observation method often generates data without a subject's knowledge. A situation in which an observer's presence is known to the subject involves visible observation; a situation in which a subject is unaware that observation is taking place is hidden observation. Hidden, unobtrusive observation minimises respondent error. Asking subjects to participate in the research is not required when they are unaware that they are being observed. The major advantage of observation studies over surveys (which obtain self-reported data from respondents) is that the data do not have distortions, inaccuracies or other response biases due to memory error, social desirability bias and so on. The data are recorded when the actual behaviour takes place.

visible observation Observation in which the observer's presence is known to the subject.
hidden observation Observation in which the subject is unaware that observation is taking place.
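To make the store-timing example concrete, here is a minimal sketch (with hypothetical shopper tags and timestamps) of how entry and exit times recorded by an unobtrusive observer could be turned into shopping durations; a real study would simply capture more records and more fields.

```python
from datetime import datetime

# Hypothetical observation log: (shopper tag, entry time, exit time),
# recorded by an observer at the door without speaking to any shopper.
observations = [
    ("S001", "2016-06-20 10:02", "2016-06-20 10:41"),
    ("S002", "2016-06-20 10:05", "2016-06-20 10:17"),
    ("S003", "2016-06-20 10:09", "2016-06-20 11:03"),
]

FMT = "%Y-%m-%d %H:%M"
durations = []
for tag, entered, left in observations:
    minutes = (datetime.strptime(left, FMT) - datetime.strptime(entered, FMT)).total_seconds() / 60
    durations.append(minutes)
    print(f"{tag}: {minutes:.0f} minutes in store")

print(f"Average time in store: {sum(durations) / len(durations):.1f} minutes")
```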

WHAT WENT WRONG? PROBLEMS WITH ‘BIG DATA’ 4

As discussed in Chapter 4, many Australian and New Zealand companies now have access to vast amounts of consumer behavioural data, including transaction records, social media and scanner data. Yet for many organisations the issue now seems to be too much data and how to make sense of it. According to the head of IT consulting firm Accenture's Australian and New Zealand marketing and analytics practice, Jason Juma Ross, many Australian organisations are headed down the wrong road. 'What organisations need to do is take that next step and figure out what's useful from an analytics perspective and where they should be focusing their investment,' Ross says. One major issue, according to Ross, is that executives often spend more time perfecting techniques and diagnosing current problems rather than getting analytics to achieve something in the real world. 'If an executive can address the analytics investment against a tangible customer or a business need then they are on the right track,' Ross says. 'If it's just for the sake of having a beautiful model, then there is a big question mark.' According to Ross, analytics isn't always about making the most accurate decision but rather about making the best possible decision faster than the competition. 'If you take 12 months to make a decision, it might be perfectly right, but it might also be too late.'

OBSERVATION OF HUMAN BEHAVIOUR
Surveys emphasise verbal responses, while observation studies emphasise and allow for the systematic recording of nonverbal behaviour. Toy manufacturers such as Fisher-Price use the observation technique because children often cannot express their reactions to products. By observing children at play with a proposed toy, doll or game, marketing researchers may be able to identify the elements of a potentially successful product. Toy marketing researchers might observe play to answer the following questions: How long does the child's attention stay with the product? Does the child put the toy down after two minutes or 20 minutes? Are the child's peers equally interested in the toy? Behavioural scientists have recognised that nonverbal behaviour can be a communication process by which meanings are exchanged among individuals. Head nods, smiles, raised eyebrows and other facial expressions or body movements have been recognised as communication symbols. Observation of nonverbal communication may hold considerable promise for the marketing researcher. For example, with regard to customer–salesperson interactions it has been hypothesised that in low-importance transactions, where potential customers are plentiful and easily replaced (for example, a shoe shop), the salesperson may show definite nonverbal signs of higher status than the customer. When customers are scarce, as in big-ticket purchase situations (for example, real estate sales), the opposite should be true: the salesperson might show many nonverbal indicators of deference. An observation study using the nonverbal communication measures shown in Table 6.2 could test this hypothesis; a simple coding sketch of such a study follows the table.


TABLE 6.2 » NONVERBAL COMMUNICATION: STATUS AND POWER GESTURES5

Behaviour | Equal status, intimate | Equal status, nonintimate | Unequal status, used by superior | Unequal status, used by subordinate | Used by men | Used by women
Posture | Relaxed | Tense (less relaxed) | Relaxed | Tense | Relaxed | Tense
Personal space | Closeness | Distance | Closeness (optional) | Distance | Closeness | Distance
Touching | Touch | Don't touch | Touch (optional) | Don't touch | Touch | Don't touch
Eye gaze | Establish | Avoid | Stare, ignore | Avert eyes, watch | Stare, ignore | Avert eyes
Demeanour | Informal | Circumspect | Informal | Circumspect | Informal | Circumspect
Emotional expression | Show | Hide | Hide | Show | Hide | Show
Facial expression | Smile | Don't smile | Don't smile | Smile | Don't smile | Smile
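To illustrate how observer codes based on Table 6.2 might be tallied to test the status hypothesis, the sketch below uses an invented coding scheme and invented data (the behaviour codes, settings and counts are not from any actual study) and simply compares the share of deference gestures shown by salespeople in two settings.

```python
from collections import Counter

# Hypothetical observer codes per salesperson-customer interaction.
# 'status' = gestures signalling higher status (stare, relaxed posture);
# 'deference' = gestures signalling deference (averted eyes, smiling).
coded_observations = {
    "shoe shop (low importance)": ["status", "status", "deference", "status", "status"],
    "real estate (big ticket)": ["deference", "deference", "status", "deference", "deference"],
}

for setting, codes in coded_observations.items():
    counts = Counter(codes)
    deference_share = counts["deference"] / len(codes)
    print(f"{setting}: {deference_share:.0%} of coded gestures were deferential {dict(counts)}")
```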

Of course, verbal behaviour is not ignored – indeed, in certain observation studies it is very important.

Complementary evidence
The results of observation studies may amplify the results of other forms of research by providing complementary evidence concerning individuals' 'true' feelings. Focus group interviews often are conducted behind one-way mirrors from which marketing executives observe as well as listen to what is occurring. This allows for interpretation of nonverbal behaviour such as facial expressions or head nods to supplement information from interviews. For example, in one focus group session concerning hand lotion, researchers observed that all the women's hands were above the table while they were casually waiting for the session to begin. Seconds after the women were told that the topic was to be hand lotion, all hands were placed out of sight. This observation, along with the group discussion, revealed the women's anger, guilt and shame about the condition of their hands. Although they felt they were expected to have soft, pretty hands, their housework obligations required them to wash dishes, clean floors and do other chores that were hard on their hands. When focus group behaviour is filmed, observation of the nonverbal communication symbols can add even more to marketers' knowledge of the situation.

Focus groups observed behind mirrors are often recorded. The ability to replay the recordings allows researchers to perform detailed analysis of physical actions.

REAL WORLD SNAPSHOT

HOW MATURE WOMEN BUY CLOTHES: INSIGHTS FROM OBSERVATIONAL RESEARCH6

Insights into consumer decision-making can often come from observational research. Research of 10 Finnish women and their observed shopping behaviour for clothes revealed some important differences between those shopping alone and those shopping in pairs. Women shopping alone stopped here and there to study some items more carefully, and picked up different types of clothes of different brands before heading to the fitting rooms. Thus, they seemed to be fulfilling specific needs. The women in pairs moved around very close to one another, and had eye contact with each other all the time, commenting on and continuously discussing colours, details and the quality of the apparel. They also discussed clothing experiences, prices and a third person's clothing, and made suggestions on what would suit their friends. They tried on clothes simultaneously, and seemed to be just looking for ideas as much as buying for specific needs.

DIRECT OBSERVATION

ONGOING PROJECT
Direct observation of consumers of juices and juice beverages revealed that many of them poured their beverages from large bottles they had purchased into smaller empty water bottles: the kind with a push-up top. This led to the conclusion that juices packaged in smaller, convenient-to-transport bottles would find a market.7

Direct observation can produce a detailed record of events that occur, or what people actually do. The observer plays a passive role. That is, there is no attempt to control or manipulate a situation – the observer merely records what occurs. Many types of data can be obtained more accurately through direct observation than by questioning. For example, recording traffic counts and/or observing the direction of traffic flows within a supermarket can help managers to design store layouts that maximise the exposure of departments selling impulse goods. A manufacturer can determine the number of facings, shelf locations, display maintenance and other characteristics that make for better store conditions. If directly questioned in a survey, most shoppers would be unable to accurately portray the time they spent in each department. The observation method could, however, determine this without difficulty. With the direct observation method, the data consist of records of events made as they occur. An observation form often helps keep the observations consistent and ensures that all relevant information is recorded. A respondent is not required to recall – perhaps inaccurately – an event after it has occurred; instead, the observation is instantaneous. In many cases, direct observation is the most straightforward form of data collection (or the only form possible). The produce manager at a Woolworths supermarket may periodically gather competitive price information at Coles and IGA stores in the suburb. In other situations, observation is the most economical technique. In a common type of observation study, a manager of a tourist winery may observe the licence plate numbers on cars in its car park. These data, along with motor vehicle registration information, provide an inexpensive means of determining where customers live.

Certain data may be obtained more quickly or easily using direct observation than by other methods. Gender, race and other respondent characteristics can simply be observed. Researchers investigating a diet product may use observation when selecting respondents in a shopping mall. Overweight people may be pre-screened by observing pedestrians, thus eliminating a number of screening interviews. In a quality-of-life survey, respondents were asked a series of questions that were compiled into an index of wellbeing. Direct observation was also used by the interviewers because the researchers wanted to investigate the effect of weather conditions on people's answers. The researchers quickly and easily observed and recorded outside weather conditions on the day of the interviews, as well as the temperature and humidity in the building in which the interviews were conducted.8

Recording the decision time necessary to make a choice between two alternatives is a relatively simple, unobtrusive task that can be done through direct observation. Response latency refers to the choice time recorded as a measure of the strength of the preference between alternatives. It is hypothesised that the longer a decision-maker takes to choose between two alternatives, the closer the two alternatives are in terms of preference. A quick decision is assumed to indicate that the psychological distance between alternatives is considerable. The response latency measure is gaining popularity now that computer-assisted data collection methods are becoming more common (because the computer can record decision times).

direct observation A straightforward attempt to observe and record what naturally occurs; the investigator does not create an artificial situation.
response latency The amount of time it takes to make a choice between two alternatives; used as a measure of the strength of preference.
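Because computer-assisted data collection can time each choice automatically, recording the response latency described above can be as simple as the sketch below; it is illustrative only, the brand names and prompt wording are hypothetical, and a production survey system would log these timestamps for you.

```python
import time

def timed_choice(prompt, options):
    """Present two alternatives and return the chosen key plus the decision time."""
    start = time.monotonic()
    print(prompt)
    for key, label in options.items():
        print(f"  [{key}] {label}")
    choice = input("Your choice: ").strip()
    latency = time.monotonic() - start   # response latency, in seconds
    return choice, latency

choice, latency = timed_choice("Which brand would you buy?", {"a": "Brand A", "b": "Brand B"})

# A long latency is read as the alternatives being close in preference;
# a quick decision suggests considerable psychological distance between them.
print(f"Chose {choice!r} after {latency:.2f} seconds")
```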

Errors associated with direct observation
Although no interaction with the subject occurs in direct observation, the method is not error-free: the observer may record events subjectively. The same visual cues that may influence the interplay between interviewer and respondent (e.g., the subject's age or gender) may come into play in some types of direct observation settings. For example, the observer may subjectively attribute a particular economic status or educational background to a subject. A distortion of measurement resulting from the cognitive behaviour or actions of the witnessing observer is called observer bias. For instance, in a research project using observers to evaluate whether sales clerks are rude or courteous, fieldworkers may be required to rely on their own interpretations of people or situations during the observation process. If the observer does not record every detail that describes the people, objects and events in a given situation, accuracy may suffer. As a general guideline, the observer should record as much detail as possible. However, the pace of events, the observer's memory and writing speed, and other factors will limit the amount of detail that can be recorded. Interpretation of observation data is another major source of potential error. Facial expressions and other nonverbal communication may have several meanings. Does a smile always mean happiness? Does the fact that someone is standing or seated in close proximity to the president of a company necessarily indicate the person's status?

Scientifically contrived observation
Most observation takes place in a natural setting. Observation in which the investigator intervenes to create an artificial environment in order to test a hypothesis is called contrived observation. Contrived observation can increase the frequency of occurrence of certain behaviour patterns. For example, an airline passenger complaining about a meal or service from the flight attendant may actually be a researcher recording that person's reactions. If situations were not contrived, the research time spent waiting and observing a situation would expand considerably. A number of retailers use observers called mystery shoppers to come into a store and pretend to be interested in a particular product or service; after leaving the store, the 'shopper' evaluates the salesperson's performance.

observer bias A distortion of measurement resulting from the cognitive behaviour or actions of a witnessing observer.
contrived observation Observation in which the investigator creates an artificial environment in order to test a hypothesis.

COMBINING DIRECT OBSERVATION AND INTERVIEWING
Some research studies combine visible observation with personal interviews. Researchers in these ethnographic studies, as they are called in anthropology, closely observe individual consumers' behaviour in everyday situations. During or after the in-depth observations, the individuals are asked to explain the meaning of their actions.9 For example, direct observation of women applying hand and body lotion identified two kinds of users. Some women slapped on the lotion, rubbing it briskly into their skin. Others caressed their skin as they applied the lotion. When the women were questioned about their behaviour, the researchers discovered that women who slapped the lotion on were using the lotion as a remedy for dry skin. Those who caressed their skin were more interested in making their skin smell nice and feel soft.

EXPLORING RESEARCH ETHICS

ETHICS QUESTIONS OF OBSERVATIONAL RESEARCH

1 Is the behaviour being observed commonly performed in public, where it is expected that others can observe the behaviour?
2 Is the behaviour performed in a setting in which the anonymity of the person being observed is assured (meaning that there is no way of identifying individuals)?
3 Has the person agreed to be observed?

ETHICAL ISSUES IN THE OBSERVATION OF HUMANS
Observation methods introduce a number of ethical issues. Hidden observation raises the issue of the respondent's right to privacy. For example, a firm interested in acquiring information about how women put on their bras might persuade some retailers to place two-way mirrors in dressing rooms so that this behaviour may be observed unobtrusively. Obviously, there is an ethical question to be resolved in such a situation. Other observation methods, especially contrived observation, raise the possibility of deception of subjects. Some people might see contrived observation as entrapment. To entrap means to deceive or trick into difficulty, which clearly is an abusive action. The problem is one of balancing values. If the researcher obtains permission to observe someone, the subject may not act in a typical manner. Thus, the researcher must determine his or her own view of the ethics involved and decide whether the usefulness of the information is worth telling a lie.

Even if fashion companies could learn a lot about the types of problems consumers typically have when purchasing and wearing clothes, would observation through two-way mirrors be appropriate?

OBSERVATION OF PHYSICAL OBJECTS
Physical phenomena may be the subject of observation study. Physical trace evidence is a visible mark of some past event or occurrence. For example, the wear on library books indirectly indicates which books are actually read (handled most) when checked out. As we mentioned in Chapter 1, a classic example of physical trace evidence in a non-profit setting was erosion traces on the floor tiles around the hatching-chick exhibit at Chicago's Museum of Science and Industry. These tiles had to be replaced every six weeks; tiles in other parts of the museum did not need to be replaced for years. The selective erosion of tiles, indexed by the replacement rate, was a measure of the relative popularity of exhibits. Clearly, a creative marketing researcher has many options available for determining the solution to a problem.

The story about Charles Coolidge Parlin (generally recognised as one of the founders of commercial marketing research) counting garbage cans at the turn of the 20th century illustrates another study of physical traces. Parlin designed an observation study to persuade Campbell's Soup Company to advertise in the Saturday Evening Post. Campbell's was reluctant to advertise because they believed that the Post was read primarily by working people who would prefer to make soup from scratch, peeling the potatoes and scraping the carrots, rather than paying 10c for a can of soup. To demonstrate that rich people weren't the target market, Parlin selected a sample of Philadelphia garbage routes. Garbage from each specific area of the city that was selected was dumped on the floor of a local National Guard Armory. Parlin had the number of Campbell's soup cans in each pile counted. The results indicated that the garbage from the rich people's homes didn't contain many cans of Campbell's soup. Although they didn't make soup from scratch themselves, their servants did. The garbage piles from the blue-collar area showed a large number of Campbell's soup cans. This observation study was enough evidence for Campbell's. They advertised in the Saturday Evening Post.10

The method used in this study has been used in a scientific project at the University of Arizona in which aspiring archaeologists sifted through modern rubbish, examining soggy cigarette butts, empty milk cartons and half-eaten Big Macs. What is most interesting about this rubbish project was the comparison between the results of surveys about food consumption and the contents of respondents' rubbish – rubbish does not lie. The University of Arizona project indicates that people consistently under-report the quantity of junk food they eat and over-report the amount of fruit and diet soft drink they consume. Most dramatically, however, studies show that alcohol consumption is under-reported by 40 to 60 per cent.11 Rubbish is even more revealing in Buenos Aires, Argentina. The research company, Garbage Data Dynamics, analyses discarded containers, newspapers and other garbage in that city. Because garbage is collected daily in Buenos Aires and people typically dispose of garbage in small bags with supermarket names printed on them, certain types of data that cannot be collected in Western countries can be obtained. The results are so specific that they can show what brand of soft drink was consumed with a certain meal.

Picking through the garbage on the side of the road can reveal behaviours of fast-food customers.

Counting and recording physical inventories by means of retail or wholesale audits allows researchers to investigate brand sales on regional and national levels, market shares, seasonal purchasing patterns and so on. Marketing research suppliers offer audit data at both retail and wholesale levels. An observer can record physical trace data to discover things that a respondent could not recall accurately. For example, actually measuring the number of grams of liquid bleach used during a test provides precise physical trace evidence without relying on the respondent's memory. The accuracy of respondents' memories is not a problem for the firm that conducts a pantry audit. The pantry audit requires an inventory of the brands, quantities and package sizes in a consumer's home, rather than responses from individuals. The problem of untruthfulness or some other form of response bias is avoided. For example, the pantry audit prevents the possible problem of respondents erroneously claiming to have purchased prestige brands. However, gaining permission to physically check consumers' pantries is not easy, and the fieldwork is expensive. Furthermore, the brand in the pantry may not reflect the brand purchased most often if it was substituted because of a discount coupon or sale, because the subject's normal brand was out of stock, or for another reason.

CONTENT ANALYSIS

content analysis The systematic observation and quantitative description of the manifest content of communication.

Content analysis obtains data by observing and analysing the contents or messages of advertisements, newspaper articles, television programs, letters and the like. It involves systematic analysis as well as observation to identify the specific information content and other characteristics of the messages.


Content analysis studies the message itself: it involves the design of a systematic observation and recording procedure for quantitative description of the manifest content of communication. This technique measures the extent of emphasis or omission of a given analytical category. For example, the content of advertisements might be investigated to evaluate their use of words, themes, characters, or space and time relationships. The frequency of appearance of women, Asians or other minorities in mass media has been a topic of content analysis.


Content analysis may be used to investigate questions such as whether some advertisers use certain types of themes, appeals, claims or deceptive practices more than others. A pay–television programmer might do a content analysis of network programming to evaluate its competition. For example, every year researchers might analyse the Australian Football League grand final to see how much of the visual material is live-action play and how much is replay, or how many shots focus on cheer squads and how many on spectators. The information content of television advertisements directed at children can be investigated, as can company images portrayed in advertising and numerous other aspects of sponsorship. Study of the content of communications is more sophisticated than simply counting the items: it requires a system of analysis to secure relevant data. After one employee role-playing session involving leaders and subordinates, recordings were analysed to identify categories of verbal behaviours (for example, positive reward statements, positive comparison statements and self-evaluation requests). Trained coders, using a set of specific instructions, then recorded and coded the leaders’ behaviour into specific verbal categories.
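As a minimal illustration of the counting step in content analysis, the sketch below tallies how often invented analytical categories appear across a few invented advertisement texts; a real study would rely on trained coders working from a formal coding frame rather than simple keyword matching.

```python
from collections import Counter

# Hypothetical coding frame: analytical category -> indicative keywords.
coding_frame = {
    "price appeal": ["save", "discount", "price"],
    "quality appeal": ["premium", "quality", "crafted"],
    "family theme": ["family", "kids", "together"],
}

# Hypothetical manifest content of three advertisements.
advertisements = [
    "Save big this week: premium quality at a low price",
    "Crafted for your family, quality you can trust",
    "Bring the kids together for less, discount ends Sunday",
]

category_counts = Counter()
for ad in advertisements:
    text = ad.lower()
    for category, keywords in coding_frame.items():
        if any(word in text for word in keywords):
            category_counts[category] += 1   # the ad contains this category at least once

for category, count in category_counts.most_common():
    print(f"{category}: appears in {count} of {len(advertisements)} advertisements")
```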

REAL WORLD SNAPSHOT

MEDIA MONITORS: TRACKING THE ISSUES OF THE DAY12

With more than 5000 clients and more than 750 staff in cities and regional centres in the Asia Pacific region, Media Monitors tracks everything that is written in the media or broadcast on radio or television every second of every hour of every day, which includes some 21 000 print media articles and 1500 radio and television programs daily. The organisation also monitors web activity, such as what is discussed in blogs, and provides comprehensive reports of the main issues being discussed in the media. Media Monitors also tracks some six million blogs and 25 000 online news sources from over 200 countries. This is of interest to companies and governments, and shows the importance of content analysis in marketing research.

MECHANICAL OBSERVATION
In many situations the primary – and sometimes the only – means of observation is mechanical rather than human. Video cameras, traffic counters and other machines help to observe and record behaviour. Some unusual observation studies have used motion picture cameras and time-lapse photography. An early application of this observation technique photographed train passengers and determined their levels of comfort by observing how they sat and moved in their seats. Another time-lapse study filmed traffic flows in an urban square and resulted in a redesign of the streets. Similar techniques may help managers design shop layouts and resolve problems in moving people or objects through spaces over time.

Television monitoring
Perhaps the best-known marketing research project involving mechanical observation and computerised data collection is the OzTAM television monitoring system for estimating national television audiences. This uses a consumer panel and a sophisticated monitoring device called a People Meter to obtain ratings for television programs; a similar approach is used by Nielsen in 22 other countries.13 Electronic boxes are hooked up to video monitors to capture important information about program choices, the length of viewing time and the identity of the viewer. Knowing who in the family is watching allows executives to match television programs with demographic profiles. When the panel household's television is turned on, a question mark appears on the screen to remind viewers to indicate who is watching. The viewer then uses a handheld electronic device that resembles a remote control to record who is watching. A device attached to the television automatically sends the observed data – the viewer's age and gender and what programs are being watched – by phone to Nielsen's computers: 3500 households of television-only watchers and 1413 homes with national satellite television are used to scientifically represent the Australian television viewing audience. These people have agreed to become members of the panel and have meters placed in their homes. Critics of the People Meter argue that subjects in Nielsen's panel grow bored over time and do not always record when they begin or stop watching television. Nielsen Media Research has a unique technology that will allow its People Meters to scan the room, recognise each family member by his or her facial characteristics, and record when the person enters or leaves the room.

television monitoring Computerised mechanical observation used to obtain television ratings.
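The arithmetic behind a simple program rating can be sketched as follows; the household identifiers, programs and tiny panel are hypothetical, and real people-meter ratings are computed minute by minute with demographic weighting.

```python
# Hypothetical people-meter records for one time slot: (household, program watched).
panel_records = [
    ("HH001", "News at 6"),
    ("HH002", "News at 6"),
    ("HH003", "Quiz Night"),
    ("HH004", None),            # television switched off
    ("HH005", "News at 6"),
]

panel_size = len(panel_records)
watching = [program for _, program in panel_records if program is not None]

rating = watching.count("News at 6") / panel_size     # share of all panel households
share = watching.count("News at 6") / len(watching)   # share of households with the TV on

print(f"Rating for 'News at 6': {rating:.0%} of panel households")
print(f"Share for 'News at 6': {share:.0%} of viewing households")
```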

REAL WORLD SNAPSHOT

HOW OBSERVATION INCREASES THE SUCCESS OF PRODUCT INNOVATIONS14

Management consultancy firm Booz & Co surveyed nearly 700 innovation leaders from companies worldwide to determine which they see as the most innovative companies in the world. The study confirmed that the most innovative companies are seldom the biggest spenders. Among successful firms, the most common mechanism for developing new ideas, by a substantial margin, was direct observation of customers, which was ranked first by 42 per cent of all respondents. Traditional market research was ranked second, with 31 per cent of respondents ranking it among their top five mechanisms.

Change in the top 10 most innovative companies, 2010–2012

Rank | 2010 | 2011 | 2012
1st | Apple | Apple | Apple
2nd | Google | Google | Google
3rd | 3M | 3M | 3M
4th | GE | GE | Samsung
5th | Toyota | Microsoft | GE
6th | Microsoft | IBM | Microsoft
7th | P&G | Samsung | Toyota
8th | IBM | P&G | P&G
9th | Samsung | Toyota | IBM
10th | Intel | Facebook | Amazon

Source: Booz & Co

In 2010, OzTAM introduced the time-shift viewing (TSV) service, recognising the increasing use of personal video recorders (PVRs). This was the most significant change to Australian television audience measurement since People Meters were introduced in 1991. OzTAM's TSV service measures and reports the viewing of recorded broadcast television content that is played back within seven days of the original broadcast, in addition to live viewing of broadcasts.


Monitoring website traffic
Most organisations record how many people visit their websites. A hit occurs when a user clicks on a single page of a website. If the visitor clicks on many places to access graphics or the like, that page receives multiple hits.15 Organisations with websites consisting of multiple pages find it useful to track page views – single, discrete clicks on individual pages. Page views more conservatively indicate how many users visit each individual page on the website and may also be used to track the path or sequence of pages that each visitor follows. A variety of information technologies are used to measure Web traffic and to maintain access logs. Hitwise and Nielsen Online are marketing research companies that specialise in monitoring Internet activity. The typical Internet monitoring company installs a special tracking program on the personal computers of a sample of Internet users who agree to participate in the research effort. Nielsen Online has its software installed in 225 000 computers in homes and workplaces in 26 countries. Internet monitoring enables these companies to identify the popularity of websites (http://Google.com and http://ebay.com are among the most popular), measure the effectiveness of advertising banners and provide other audience information. For example, a study by a market research company indicated that 63 per cent of online shoppers stop short of completing their purchases after shipping charges are computed at the last step, known as the Checkout Line. Such companies can provide myriad Web metrics that are of interest to an online marketer; some selected examples are shown in Table 6.3.

TABLE 6.3 » SELECTED WEB ADVISERS' OBJECTIVES AND THE METRICS THAT ADDRESS THESE OBJECTIVES16

Objectives and metrics | What does it measure? | How is it used? | What is it used for?
Visits | Number of user sessions | Measure of site exposure | Evaluation of message exposure
Average time per unique visitor | Usefulness of site (time spent on site) | Measure of usefulness and comparison over time of sites | Evaluation of usefulness and interest
Stickiness | Composite of the number of users, frequency, recency and average time per visit/visitor | Composite measure of stickiness | Evaluation of advertising appeals
Clicks | Number of clicks from originating buttons/links | Measure of communality with other sites (shared users) | To evaluate co-marketing pattern partners; improve co-marketing programs
Path analysis | Paths taken through site by visitors | Indicates most popular paths to site | To review and change content or site navigation
Global geographic | Visitor's country | Assess exposure by country | To evaluate and improve targeting of messages by country
Observed profiling | Visitor's previous site behaviour | Understanding what visitors do on the site | To improve targeting of messages by studying each visitor's behavioural patterns
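The sketch below shows, for a hypothetical access log (visitor IDs, pages and times on page are all invented), how a few of the measures in Table 6.3, namely page views, unique visitors, average time per unique visitor and a crude path analysis, can be derived; commercial services such as Nielsen Online compute far richer versions of these metrics.

```python
from collections import Counter, defaultdict

# Hypothetical access log: (visitor_id, page, seconds spent on the page).
access_log = [
    ("v1", "/home", 30), ("v1", "/products", 95), ("v1", "/checkout", 40),
    ("v2", "/home", 20), ("v2", "/products", 60),
    ("v3", "/home", 15),
]

page_views = len(access_log)                           # every discrete page click
visitors = {visitor for visitor, _, _ in access_log}   # unique visitors (a proxy for visits)

time_per_visitor = defaultdict(int)
paths = defaultdict(list)
for visitor, page, seconds in access_log:
    time_per_visitor[visitor] += seconds
    paths[visitor].append(page)                        # path analysis: sequence of pages viewed

average_time = sum(time_per_visitor.values()) / len(visitors)
entry_pages = Counter(path[0] for path in paths.values())

print(f"Page views: {page_views}, unique visitors: {len(visitors)}")
print(f"Average time per unique visitor: {average_time:.0f} seconds")
print(f"Most common entry page: {entry_pages.most_common(1)[0][0]}")
```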

Scanner-based research
Lasers performing optical character recognition and bar code technology – such as the universal product code (UPC) – have accelerated the use of mechanical observation in marketing research.


Chapter 4 noted that a number of syndicated services offer secondary data about product category movement generated from retail stores using scanner technology. This technology now allows researchers to investigate more demographically or promotionally specific questions. For example, scanner research has investigated the different ways consumers respond to price promotions and how those differences affect a promotion's profitability. One of the primary means of implementing this type of research is through the establishment of a scanner-based consumer panel to replace consumer purchase diaries. In a typical scanner panel, each household is assigned a bar-coded card that members present to the clerk at the register. The household's code number is coupled with the purchase information recorded by the scanner. Furthermore, as with other consumer panels, background information about the household (obtained through answers to a battery of demographic and psychographic survey questions) also can be coupled with the household code number. Aggregate data, such as actual shop sales as measured by scanners, are also available. These data parallel data provided by a standard mail diary panel, with some important improvements:
1 The data measure observed (actual) purchase behaviour rather than reported behaviour (recorded later in a diary).
2 Substituting mechanical for human record-keeping improves accuracy.
3 Measures are unobtrusive, eliminating interviewing and the possibility of social desirability or other bias on the part of respondents.
4 More extensive purchase data can be collected, because all UPC categories are measured. In a mail diary respondents could not possibly reliably record all items they purchased. Because all UPC-coded items are measured in the panel, users can investigate many product categories to determine loyalty, switching rates and so on for their own brands as well as for other companies' products, and locate product categories for possible market entry.
5 The data collected from computerised checkout scanners can be combined with data about advertising, price changes, displays and special sales promotions. Researchers can scrutinise them with powerful analytical software provided by the scanner data providers.

Scanner data can show a marketer week by week how a product is doing, even in a single shop, and track sales in response to local ads or promotions. Furthermore, several organisations, such as Information Resources Inc. with its BehaviourScan system, have developed scanner panels and expanded them into electronic test market systems. These are discussed in greater detail in Chapter 7. Advances in bar-code technology have led to at-home scanning systems that use hand-held wands to read UPC symbols. Consumer panellists perform their own scanning after they have taken home the products. This advance makes it possible to investigate purchases made at shops that do not have in-store scanning equipment.

scanner-based consumer panel A type of consumer panel in which participants' purchasing habits are recorded with a laser scanner rather than a purchase diary.
at-home scanning system A system that allows consumer panellists to perform their own scanning after taking home products, using hand-held wands that read UPC symbols.
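A minimal sketch of how scanner purchase records can be coupled with household background information through the household code number, and then summarised into a crude loyalty figure, appears below; the household codes, demographics and transactions are invented, and real panels hold millions of UPC-level records.

```python
from collections import Counter, defaultdict

# Hypothetical panel background data, keyed by household code number.
households = {
    "H1001": {"household_size": 4, "income_band": "middle"},
    "H1002": {"household_size": 1, "income_band": "high"},
}

# Hypothetical scanner transactions: (household code, product category, brand).
transactions = [
    ("H1001", "laundry detergent", "Brand A"),
    ("H1001", "laundry detergent", "Brand A"),
    ("H1001", "laundry detergent", "Brand B"),   # a brand switch
    ("H1002", "laundry detergent", "Brand B"),
]

# Couple each purchase with the household's code number and compute, per household,
# the share of category purchases going to its most-bought brand.
brand_counts = defaultdict(Counter)
for code, category, brand in transactions:
    brand_counts[code][brand] += 1

for code, counts in brand_counts.items():
    loyalty = counts.most_common(1)[0][1] / sum(counts.values())
    profile = households[code]
    print(f"{code} ({profile['income_band']} income, {profile['household_size']} people): "
          f"{loyalty:.0%} of purchases on the favourite brand")
```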

WHAT WENT WRONG? PROJECT APOLLO AND THE DEATH OF SINGLE-SOURCE DATA17

Five multinational companies – PepsiCo, Unilever, Procter & Gamble, Kraft and SC Johnson – funded a research project called Project Apollo, which was managed by market research companies Arbitron and Nielsen. Project Apollo used a panel of 5000 households who used portable People Meters to record every advertisement they encountered. This information was then matched with the purchase data people recorded with personal scanners, creating single-source data that should enable marketers to determine what factors will influence consumer behaviour. The pilot of this project was cancelled in 2008 because of the difficulty in finding advertisers to fund it. Other difficulties were that respondents were asked to wear portable People Meters around their necks. The project also focused too much on television, radio and print media, and not enough on online and social media.


Measuring physiological reactions
Marketing researchers have used a number of other mechanical devices to evaluate consumers' physical and physiological reactions to advertising copy, packaging and other stimuli. This new area of research is often called neuromarketing. Researchers use such means when they believe that consumers are unaware of their own reactions to stimuli such as advertising, or that consumers will not provide honest responses. There are four major categories of mechanical devices used to measure physiological reactions:
1 eye-tracking monitors
2 pupilometers
3 psychogalvanometers
4 voice pitch analysers.
A magazine or newspaper advertiser may wish to grab readers' attention with a visual scene and then direct it to a package or coupon, or a television advertiser may wish to identify which selling points to emphasise. Eye-tracking equipment records how the subject reads a print advertisement or views a television advertisement, and how much time is spent looking at various parts of the stimulus. In physiological terms, the gaze movement of a viewer's eye is measured with an eye-tracking monitor, which measures unconscious eye movements. Originally developed to measure astronauts' eye fatigue, modern eye-tracking systems need not keep a viewer's head in a stationary position. The devices track eye movements through invisible infrared light beams that lock onto a subject's eyes. The light reflects off the eye, and eye-movement data are recorded while another tiny video camera monitors which magazine page is being perused. The data are analysed by computer to determine which components in an ad (or other stimuli) were seen and which were overlooked. The other physiological observation techniques are based on a common principle:

Physiological research depends on the fact that adrenalin is produced when the body is aroused. When adrenalin goes to work, the heart beats faster and more strongly, and even enlarges. Blood flows to the extremities and increases capillary dilation at the fingertips and earlobes. Skin temperature increases, hair follicles stand up, skin pores emit perspiration, and the electrical conductivity of skin surfaces is affected. Eye pupils dilate, electrical waves in the brain increase in frequency, breathing is faster and deeper, and the chemical composition of expired air is altered. This process offers a choice of about 50 different measures – the question of which measure to use is to some extent irrelevant since they are all measuring arousal.18

A pupilometer observes and records changes in the diameter of a subject's pupils. A subject is instructed to look at a screen on which an advertisement or other stimulus is projected. When the brightness and distance of the stimulus from the subject's eyes are held constant, changes in pupil size may be interpreted as changes in cognitive activity that result from the stimulus (rather than from eye dilation and constriction in response to light intensity, distance from the object or other physiological reactions to the conditions of observation). This method of research is based on the assumption that increased pupil size reflects positive attitudes towards and interest in advertisements. A psychogalvanometer measures galvanic skin response (GSR), a measure of involuntary changes in the electrical resistance of the skin. This device is based on the assumption that physiological changes, such as increased perspiration, accompany emotional reactions to advertisements, packages and slogans. Excitement increases the body's perspiration rate, which lowers the electrical resistance of the skin. The test is an indicator of emotional arousal or tension.

eye-tracking monitor A mechanical device used to observe eye movements. Some eye monitors use infrared light beams to measure unconscious eye movements. pupilometer A mechanical device used to observe and record changes in the diameter of a subject’s pupils. psychogalvanometer A device that measures galvanic skin response, a measure of involuntary changes in the electrical resistance of the skin.


REAL WORLD SNAPSHOT

EYE-TRACKING TECHNOLOGY INCREASES WEB EFFECTIVENESS

Measurements with an eye-tracking monitor of subjects' responses are available on a number of websites, including https://www.quicksprout.com/2014/04/16/8-powerful-takeaways-from-eye-tracking-studies. The eye-tracked image from Gateway's website shows how the viewer missed the information about Gateway's convertible notebook (red arrow) because it looked too much like an advertisement.
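One way gaze data of this kind are summarised is by totalling fixation time within areas of interest on the page; the sketch below does this for hypothetical coordinates, region names and fixation samples, none of which come from an actual eye-tracking system.

```python
# Hypothetical areas of interest (AOIs) on a page layout, as (x1, y1, x2, y2) boxes in pixels.
aois = {
    "banner ad": (0, 0, 800, 100),
    "product info": (0, 100, 500, 600),
    "navigation": (500, 100, 800, 600),
}

# Hypothetical fixation samples from an eye tracker: (x, y, duration in milliseconds).
fixations = [(120, 300, 220), (430, 250, 400), (650, 150, 180), (90, 40, 60)]

dwell_time = {name: 0 for name in aois}
for x, y, duration in fixations:
    for name, (x1, y1, x2, y2) in aois.items():
        if x1 <= x <= x2 and y1 <= y <= y2:
            dwell_time[name] += duration     # total fixation time inside this AOI
            break

for name, ms in sorted(dwell_time.items(), key=lambda item: -item[1]):
    print(f"{name}: {ms} ms of fixation time")
```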

Voice pitch analysis is a relatively new physiological measurement technique that measures emotional reactions as reflected in physiological changes in a person's voice. Abnormal frequencies in the voice caused by changes in the autonomic nervous system are measured with sophisticated, audio-adapted computer equipment. Computerised analysis compares the respondent's voice pitch during warm-up conversations (normal range) with verbal responses to questions about his or her evaluative reaction to television advertisements or other stimuli. This technique, unlike other physiological devices, does not require the researcher to surround subjects with mazes of wires or equipment.

There are a number of new and exciting neuroscience approaches to consumer research which rely on brain scanning, either via electrical activity or via changes in blood flow. There are also significant ethical issues here of informed consent and whether such scarce medical technology should be applied to commercial applications. These observational approaches generally use small samples in controlled laboratory conditions, so care needs to be taken that results from this research are triangulated with results from other qualitative and quantitative approaches. Electroencephalography (EEG) uses electrodes applied to the scalp. It detects brief neuronal events and measures changes in the electrical field in the brain region underneath. Magnetoencephalography (MEG) measures changes in the magnetic fields induced by neuronal activity. It has better spatial resolution than EEG. Transcranial Magnetic Stimulation (TMS) uses an iron core, often in the shape of a toroid wrapped in electrical wire, to create a magnetic field strong enough to induce electrical currents in underlying neurons when placed on the head.
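As a rough illustration of the baseline comparison used in voice pitch analysis, the sketch below compares hypothetical pitch estimates (in hertz) from a warm-up conversation with those recorded while the respondent discusses an advertisement; extracting pitch from the audio signal itself is the job of specialised software and is not shown here.

```python
from statistics import mean, stdev

# Hypothetical fundamental-frequency estimates (Hz) for one respondent.
baseline_pitch = [118, 121, 119, 122, 120, 117, 123]   # warm-up conversation (normal range)
response_pitch = [131, 128, 134, 130, 129]             # talking about the advertisement

baseline_mean = mean(baseline_pitch)
baseline_sd = stdev(baseline_pitch)
deviation = (mean(response_pitch) - baseline_mean) / baseline_sd   # in baseline SD units

print(f"Baseline pitch: {baseline_mean:.1f} Hz (sd {baseline_sd:.1f} Hz)")
print(f"Response pitch is {deviation:.1f} standard deviations above baseline")
# A large positive deviation would be read as heightened emotional arousal.
```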

voice pitch analysis A physiological measurement technique that records abnormal frequencies in the voice that are supposed to reflect emotional reactions to various stimuli.
functional magnetic resonance imaging (fMRI) A magnetic scan that reveals in different colours which parts of the brain are active in real time.

Functional magnetic resonance imaging (fMRI)

Functional magnetic resonance imaging (fMRI) is the latest innovation in observation research. Respondents’ brains are scanned while they receive marketing messages or other stimuli. The resultant real-time video pictures show in colour which parts of the brain are activated (sources of thought), which can signify emotional as well as cognitive reactions to advertisements and other marketing communications. In a US study, the brain scans of 67 people given a blind taste test of Coca-Cola and Pepsi were recorded. The results were evenly split as to preference. Told what they were drinking, the ‘brand loyalty’ or emotional section of the brain then overrode the original preferences. Three out of four then indicated they preferred Coke.19 Findings of another breakthrough study published in the Journal of Advertising Research20 reveal how consumers perceive advertising messages and how those perceptions trigger emotional responses. Again, contrary to longstanding theories, the relationship between entertaining television advertisements and positive emotions was surprisingly low, while there was a strong association between negative reactions and what are known in the industry as ‘cut-through’ ads.


All of the devices discussed assume that physiological reactions are associated with persuasiveness or predict some cognitive response. However, this has not yet been clearly demonstrated. There is no strong theoretical evidence to support the argument that a physiological change is a valid measure of future sales, attitude change or emotional response. Another major problem with physiological research is the calibration, or sensitivity, of measuring devices. Identifying arousal is one thing, but precisely measuring levels of arousal is another. In addition, most of these devices are expensive. However, as a prominent researcher points out, physiological measurement is coincidental: 'Physiological measurement isn't an exit interview. It's not dependent on what was remembered later on. It's a live blood, sweat and tears, moment-by-moment response, synchronous with the stimulus.'21 Table 6.4 shows some of the estimated costs and applications of each of these types of neuroscience approaches.

TABLE 6.4 » SELECTED NEUROSCIENCE APPROACHES IN MARKETING: COSTS AND APPLICATIONS22

Neuroscience tool | Cost | Market research applications
Electroencephalography (EEG) | Equipment costs under USD $10 000 | Used to measure momentary fluctuations in emotion in response to advertisements.
Transcranial Magnetic Stimulation (TMS) | Costs from USD $1500 and above, depending on setup | Used to study the causal role of specific brain regions in particular tasks by temporarily taking them 'off-line'.
Functional MRI (fMRI) | MRI scanners cost approximately USD $1 million per tesla and have annual operating costs of USD $100 000 to $300 000 | Used in refining product attributes before releasing them to the market; temporal resolution of 1–3 seconds.

Each of these mechanical devices has another limitation, in that the subjects are usually placed in artificial settings (watching television in a laboratory, being transported through a noisy brain scanner rather than at home) and they know that they are being observed.

TIPS OF THE TRADE

»» Generally, observations have the most validity when they are performed unobtrusively. The reason for this is that social influences (such as those arising from an interviewer's presence or the knowledge that one is being observed) are eliminated.
»» Marketing research often involves information processing. Researchers should strongly consider using measures of response latency when studying information processing. Computer-aided survey technology makes observing response latency easy and accurate (see the sketch following this list).
»» Artefacts provide a physical trace of human activities and help the researcher understand the value of those objects to the individuals involved.
»» To avoid ethical issues, the anonymity of people whose behaviour is captured using observational data collection should be protected at all times, unless consent has been obtained to identify the person.
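To make the response latency tip concrete, here is a minimal sketch, not tied to any particular survey package, of how a computer-aided survey can record latency: the script simply timestamps the moment a question is presented and the moment an answer arrives. The question wording is invented for illustration.

```python
# Minimal sketch of capturing response latency in a computer-aided survey.
import time

def ask_with_latency(question):
    """Present a question and return the answer together with the seconds taken to respond."""
    start = time.perf_counter()          # timestamp when the question is shown
    answer = input(question + " ")       # respondent types an answer
    latency = time.perf_counter() - start
    return answer, latency

if __name__ == "__main__":
    answer, seconds = ask_with_latency("Which brand do you prefer, A or B?")
    print(f"Answer: {answer} (response latency: {seconds:.2f} seconds)")
```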


SUMMARY

DISCUSS THE ROLE OF OBSERVATION AS A MARKETING RESEARCH METHOD

Observation is a powerful tool for the marketing researcher. Scientific observation is the systematic process of recording the behavioural patterns of people, objects and occurrences as they are witnessed. Questioning or otherwise communicating with subjects does not occur. A wide variety of information about the behaviour of people and objects can be observed. Seven kinds of phenomena are observable: physical actions, verbal behaviour, expressive behaviour, spatial relations and locations, temporal patterns, physical objects, and verbal and pictorial records. Thus, both verbal and nonverbal behaviour may be observed. A major disadvantage of the observation technique is that cognitive phenomena such as attitudes, motivations, expectations, intentions and preferences are not observable. Furthermore, only overt behaviour of a short duration can be observed. Nevertheless, many types of data can be obtained more accurately through direct observation than by questioning respondents. Observation is often the most direct or the only method for collecting certain data.

DESCRIBE THE USE OF DIRECT OBSERVATION AND CONTRIVED OBSERVATION

Marketing researchers employ both human observers and machines designed for specific observation tasks. Human observation is commonly used when the situation or behaviour to be recorded is not easily predictable in advance of the research. Mechanical observation can be used when the situation or behaviour to be recorded is routine, repetitive or programmatic. Human or mechanical observation may be unobtrusive. Human observation brings the possibility of observer bias, even though the observer does not interact with the subject. Observation can sometimes be contrived by creating the situations to be observed. This can reduce the time and expense of obtaining reactions to certain circumstances.

ETHICAL ISSUES OF OBSERVATION


Contrived observation, hidden observation and other observation research designs that might use deception often raise ethical concerns about subjects' right to privacy and right to be informed. This chapter provides a checklist of ethical issues that need to be addressed if observational research is to be used.

EXPLAIN THE OBSERVATION OF PHYSICAL OBJECTS AND MESSAGE CONTENT

Physical trace evidence serves as a visible record of past events. Researchers may examine whether there is physical evidence of consumer behaviour via garbage analysis or items in a consumer's pantry. Content analysis obtains data by observing and analysing the contents of the messages in written and/or spoken communications. It is often used to assess political, economic and social trends and the way they are depicted in the media.

DESCRIBE THE MAJOR TYPES OF MECHANICAL OBSERVATIONS AND OBSERVATIONS OF PHYSIOLOGICAL REACTIONS

Mechanical observation uses a variety of devices to record behaviour directly. Mechanical observation takes many forms. National television audience ratings are based on mechanical observation and computerised data collection. There are also a number of web-based behavioural measures that can assess the effectiveness of online marketing campaigns, such as website hits and unique site visits. Scanner-based research provides product category sales data recorded by laser scanners in retail stores. Many syndicated services offer secondary data collected through scanner systems. Physiological reactions, such as arousal or eye movement patterns, may be observed using a number of mechanical devices. The major problem for many of these techniques is in the interpretation of physiological reactions.

KEY TERMS AND CONCEPTS

at-home scanning system
content analysis
contrived observation
direct observation
eye-tracking monitor
functional magnetic resonance imaging (fMRI)
hidden observation
observation
observer bias
psychogalvanometer
pupilometer
response latency
scanner-based consumer panel
television monitoring
visible observation
voice pitch analysis


QUESTIONS FOR REVIEW AND CRITICAL THINKING

1 Read the opening vignette on the use of observational research and the nature of advertising to New Zealand doctors. What flaws do you think there could be in the use of observational research in this case? How could it be improved?
2 What are the ethical issues of using hidden or unobtrusive observation?
3 Click-through rates for advertisements placed on websites are very low (1 per cent or less). What types of error might exist using click-through rate data as a measure of an advertisement's success?
4 What are some problems faced by firms in using neuroscience approaches?
5 A multinational fast-food corporation plans to locate a restaurant in Jakarta, Indonesia. Secondary data for this city are outdated. How might you determine the best location using observation?
6 How might an observation study be combined with qualitative in-depth interviews?
7 What is a scanner-based consumer panel?
8 How might a smartphone serve to provide data to a company that provides consulting to restaurants?
9 Outline a research design using observation for each of the following situations:
a A telecommunication provider wishes to collect data on the number of customer services and the frequency of customer use of these services.
b A state government wishes to determine the driving public's use of texting whilst driving.
c A researcher wishes to know how many women have been featured on The Australian covers over the last 10 years.
d A fast-food franchise wishes to determine how long a customer entering a store has to wait for his or her order.
e An online magazine publisher wishes to determine exactly what people see and what they pass over while reading one of her magazines.
f A health food manufacturer wishes to determine how people use snack foods in their homes.
g An overnight package delivery service wishes to observe delivery workers, beginning at the moment when they stop the truck, continuing through the delivery of the package, and ending when they return to the truck.
h A motivational researcher wants to know if people wear sunglasses to protect their eyes and/or for reasons of fashion and style.
10 Watch the nightly news on a major network for one week. Observe how much time is devoted to national news, advertisements and other activities. (Hint: think carefully about how you will record the contents of the programs.)
11 Comment on the ethics of the following situations:
a During the course of telephone calls with customers, a telecommunication provider records respondents' voices when they have a bill complaint and then conducts a voice pitch analysis. The respondents do not know that their voices are being recorded.
b A researcher plans to invite consumers to be test users in a simulated kitchen located in a shopping mall and then to record their reactions to a new microwave dinner from behind a two-way mirror.
c A marketing researcher arranges to purchase the garbage from the headquarters of a major competitor. The purpose is to sift through discarded documents to determine the company's strategic plans.
12 What is an fMRI device and how can it be used to research advertising effectiveness?

ONGOING PROJECT

DOING AN OBSERVATIONAL STUDY? CONSULT THE CHAPTER 6 PROJECT WORKSHEET FOR HELP

Observation research is one of the most flexible research methods. The research problem needs to be considered before using it.

Download the Chapter 6 project worksheet from the CourseMate website. It outlines the steps to be considered in choosing an observational design method. Make sure you meet any ethical considerations discussed in this chapter.


COURSEMATE ONLINE STUDY TOOLS

Flip to the start of your textbook and use the tear-out card to log in to CourseMate for Marketing Research. There you can test your understanding and revise chapter concepts with:
☑ interactive quizzes
☑ flashcards
☑ crosswords on key concepts
☑ online research activities
☑ online video activities.

WRITTEN CASE STUDY 6.1 THE PEPSI / COKE CHALLENGE AND NEUROSCIENCE23

Can marketers really trust customers to act on their intentions? Although they'd like to think so, marketing research suggests that our thinking processes often push us in one direction while our emotions tug us in another. It is claimed by researchers in neuromarketing, who use brain scanning techniques such as functional magnetic resonance imaging (fMRI) and electroencephalography (EEG), that most of our decisions are made subconsciously or due to a subconscious reaction. Neuroscientist Read Montague discovered a vivid illustration of this phenomenon. In 2003, Montague recreated the classic Pepsi Challenge experiment, in which participants are given sips of both Coke and Pepsi and asked which they prefer. But he also monitored the sippers' brain activity. His results matched those of the original challenge: more than half preferred Pepsi. Participants' brains, meanwhile, showed bursts of excitement in the region that is activated by appealing tastes. Then Montague did the test again, but told participants which product they were drinking. This time, 75 per cent voted for Coke. More significantly, their brains also registered activity in an area linked to higher thinking and discernment. The reason, according to neuromarketing guru Martin Lindstrom, who described the research in his book Buyology, is that our brains encode products with values. We experience a drink such as Coke not just through our senses but also through our emotional associations with the brand. We may expect that we will like – and want to buy – an unfamiliar item, but subconscious reactions will always come into play – all of which helps explain that oft-cited failure rate: 85 per cent of new products fail not because market research showed that customers did not want the product, but because it suggested they did.

QUESTIONS

1 What are the advantages of using brain scanning technology for marketing research?
2 What are some potential pitfalls?
3 Do a Google search on the use of brain-scanning technology in marketing research. Are any of these pitfalls and problems addressed by marketers?
4 What are some ethical issues involved in the use of brain scanning technology in marketing research?

WRITTEN CASE STUDY 6.2 MAZDA AND SYZYGY

When Mazda Motor Europe set out to improve its website, the company wanted details about how consumers were using the site and whether finding information on the site was easy. Mazda hired a research firm called Syzygy to answer those questions with observational research.24 Syzygy's methods include the use of an eye-tracking device that uses infrared light rays to record which areas of a computer screen a user is viewing. For instance, the device measured the process computer users followed in order to look for a local dealer or arrange a test drive. Whenever a process seemed confusing or difficult, the company looked for ways to make the website easier to navigate. To conduct this observational study, Syzygy arranged for 16 subjects in Germany and the UK to be observed as they used the website. The subjects in Germany were observed with the eye-tracking equipment. As the equipment measured each subject's gaze, software recorded the location on the screen and graphed the data. Syzygy's results included three-dimensional contour maps highlighting the 'peak' areas where most of the computer users' attention was directed.

QUESTIONS

1 What could Mazda learn from eye-tracking software that would be difficult to learn from other observational methods?
2 What are the shortcomings of this method?
3 Along with the eye-tracking research, what other research methods could help Mazda assess the usability of its website? Summarise your advice for how Mazda could use complementary methods to obtain a complete understanding of its website usability.


ONGOING CASE STUDY MOBILE PHONE SWITCHING AND BILL SHOCK In the next stage of research, David, Leanne and Steve were asked to examine evidence for bill shock. David thought that the complexity of mobile phone bills may be contributing to this. Steve suggested they sample a number of mobile phone bills and see if this was the case. Through a research company the three collected 50 mobile phone bills across a number of carriers. David, tasked with collating the data, asked for help. ‘These are quite complex and varied,’ he said, ‘and I am not sure how we can collate all this information.’

QUESTIONS

1 What are the advantages and disadvantages of David's use of observational research here?
2 Develop a spreadsheet which could be used to collate this information. Start with your own mobile phone bill.
3 What are some ethical issues for David, Leanne and Steve to consider in using this research?

NOTES

1 Goodwin, Ellen (2015) 'More curbs on drug advertising urged', Otago Daily Times, 4 September, accessed at http://www.fativa.com on 26 November 2015.
2 Selltiz, Claire, Wrightsman, Lawrence & Cook, Stuart (1976) Research methods in social relations, New York: Holt, Rinehart and Winston, p. 251.
3 Searle, Jane (2008) 'Knowledge is money', Business Review Weekly, 19 January, p. 84.
4 Adhikari, Supratim (2012) 'Time for a Big Data Diet', Business Spectator, 12 July, accessed at http://www.businessspectator.com.au/article/2012/7/11/technology/time-bigdata-diet, March 2016.
5 Henley, Nancy (1977) Body Politics: Power, sex and nonverbal communication, New York: Simon & Schuster, p. 181.
6 Maria, H., Anne, H. & Pia, P. (2011) 'An exploration of how mature women buy clothing: Empirical insights and a model', Journal of Fashion Marketing & Management, 15(1), 108–122.
7 Abrams, Bill (2000) The observational research handbook, Chicago: NTC Business Books, p. 14.
8 Campbell, Angus, Converse, Philip & Rodgers, Willard (1976) The quality of American life, New York: Russell Sage Foundation, p. 112. Although weather conditions did not correlate with perceived quality of life, the comfort variable did show a relationship with the index of wellbeing. This association might be confounded by the fact that ventilation and/or air-conditioning equipment is less common in less affluent homes. Income was previously found to correlate with quality of life.
9 Abrams, Bill (2000) The observational research handbook, Chicago: NTC Business Books, p. 1105.
10 Adapted with permission from the 30 April 1980 issue of Advertising Age. Copyright © 1980 by Crain Communications, Inc.
11 Rybczynski, Witold (1992) 'We are what we throw away', New York Times Book Review, 5 July, pp. 5–6.
12 Media Monitors (2013) Promotional material from website, accessed at http://www.isentia.com/about/our-story, March 2016.

13 http://www.agbnielsen.net/products/peoplemeter.asp, accessed on 25 January 2013. 14 Yuanyuan Hu (2012) ‘Innovation from Rising R&D’, China Daily, Hong-Kong Edition, 19 November, p. 14. 15 www.mra-net.org/docs/resources/technology/buzzwords.cfm, downloaded on 24 April 2001. 16 Bhat, Subodh, Bevans, Michael & Sengupta, Sanjit (2002) ‘Measuring users’ Web activity to evaluate and enhance advertising effectiveness’, Journal of Advertising, 31(3), Fall, pp. 98–9. 17 Shoebridge, Neil (2006) ‘Revealing why people buy’, Australian Financial Review, 10 April, p. 49. 18 Marketing News (1981) ‘Live, simultaneous study of stimulus, response is physiological measurement’s great virtue’, 15 May, pp. 1, 20. 19 Walton, Chris (2004) ‘The brave new world of neuromarketing is here’, B&T, 2 December, accessed at www.bandt.com.au/news/0e/0c02910e.asp on 19 April 2006. 20 Rossiter, John Silberstein, Harris, R. & Neild, G. (2001) ‘Brain imaging detection of visual scene encoding in long term memory for TV commercials’, Journal of Advertising Research, 41(2), pp. 9–26. 21 Herbert B. Krugman’s statement as quoted in ‘Live, simultaneous study of stimulus, response is physiological measurement’s greatest virtue’, p. 1. 22 Samuel Babu, S., & Prasanth Vidyasagar, T. (2012) ‘Neuromarketing: Is Campbell in soup?’ IUP Journal of Marketing Management, 11(1): 76–100. 23 Lindstrom, M., & Underhill, P. (2010) Buyology: Truth and lies about why we buy, New York: Broadway Business. 24 Based on ‘Mazda turns to eye-tracking to assist revamp of European site’ (2005) New Media Age (3 November), p. 8; Revolution (2006) ‘Persuasion is the new focus’ (21 February), downloaded from http://www.syzygy.co.uk

07 » EXPERIMENTAL RESEARCH AND TEST MARKETING

WHAT YOU WILL LEARN IN THIS CHAPTER

To understand the steps in experimental research.
To decide on a field or laboratory experimental design.
To decide on the choice of independent and dependent variable(s).
To select and design the test units.
To address issues of validity in experiments.
To select and implement an experimental design.
To address issues of ethics in experimentation.
To understand the steps in test marketing.
To decide whether to test market or not.
To work out the function of the test market.
To decide on the type of test market.
To decide the length of the test market.
To decide where to conduct the test market.
To estimate and project the results of the test market.

'Playing' with test markets1

As will be discussed later in this chapter, test markets can be expensive and time consuming. One Australian company, Play, which has been product test marketing for 15 years, has come up with a novel approach. It costs clients less than $5000 to test market their products – less than 5 per cent of the cost of a traditional test market, which can normally run into millions of dollars. Using the Pop UP Shopper Online Community, Australian grocery shoppers sign up to provide feedback on new and existing supermarket products. Members are profiled to find out how frequently they buy from different product categories, and the research includes all standard demographic information. Each month, at least 100 shoppers are shown a range of relevant products and are asked for opinions before receiving their product-testing parcel of up to 10 products on which to provide further feedback. Testing processes include everything from sensory testing and observational recordings to key performance indicators to measure the likelihood of purchasing.

This chapter explores the nature of experimentation in marketing, of which test markets are the ultimate example. It also examines how companies use a number of experimental designs in order to provide definitive answers on issues such as pricing, taste testing and packaging, to name a few.

THE NATURE OF EXPERIMENTS

Most students are familiar with the concept of experimentation in the physical sciences. The term experiment can conjure up an image of a chemist surrounded by bubbling test tubes and Bunsen burners, but we are talking about a rather different type of experimentation. Behavioural and physical scientists have been far ahead of marketing researchers in its use. Nevertheless, the purpose of experimental research is the same. Experimental research allows the investigator to control the research situation so that causal relationships among variables may be evaluated. The marketing experimenter manipulates a single variable in an investigation and holds constant all other relevant, extraneous variables. Events may be controlled in an experiment to a degree not possible in a survey. The researcher's goal in conducting an experiment is to determine whether the experimental treatment is the cause of the effect being measured.

experiment A research method in which conditions are controlled so that one or more independent variables can be manipulated to test a hypothesis about a dependent variable. Experimentation allows evaluation of causal relationships among variables while all other variables are eliminated or controlled.

If a new marketing strategy (for example, new advertising) is used in a test market, and sales subsequently increase in that market but not in markets where the new strategy is not employed, the experimenter can feel confident that the new strategy caused the increase in sales. For example, one marketing experiment investigated the influence of brand name identification on consumers' taste perceptions.2 The experimenter manipulated whether consumers tasted beer in labelled or unlabelled bottles. In week one, respondents were given a six-pack containing bottles labelled only with letters. The following week, respondents received another six-pack with brand labels. Thus, respondents did not actually purchase beer from a shop, but they drank the beer at home at their leisure. The experimenter measured reactions to the beers after each tasting. The beer itself was the same in each case, so differences in taste perception were attributed to label (brand) influence. This example illustrates that once an experimenter manipulates the independent variable, he or she measures changes in the dependent variable. The essence of an experiment is to do something to an individual and observe the reaction under conditions that allow this reaction to be measured against a known baseline.
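The logic described above can be illustrated with a small, purely hypothetical simulation in Python: participants are randomly assigned to a 'branded label' or 'blank label' condition, a taste rating (the dependent variable) is recorded for each, and the group means are compared. The ratings are simulated numbers, not data from the beer study cited above.

```python
# Hypothetical sketch of a two-condition experiment: manipulate one variable,
# hold everything else constant, and compare the dependent variable across groups.
import random
import statistics

random.seed(42)

participants = [f"P{i:02d}" for i in range(40)]
random.shuffle(participants)                        # random assignment to conditions
groups = {"branded label": participants[:20],       # treatment condition
          "blank label": participants[20:]}         # control condition

# Taste ratings (1-10) would come from respondents; they are simulated here.
ratings = {
    "branded label": [random.gauss(7.0, 1.0) for _ in groups["branded label"]],
    "blank label": [random.gauss(6.2, 1.0) for _ in groups["blank label"]],
}

for condition, scores in ratings.items():
    print(f"{condition}: mean rating = {statistics.mean(scores):.2f}")

effect = statistics.mean(ratings["branded label"]) - statistics.mean(ratings["blank label"])
print(f"Observed treatment effect (difference in means): {effect:.2f}")
```

In a real experiment the ratings would be collected from respondents; the analysis step, comparing the dependent variable across the manipulated conditions, is the same.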

SURVEY THIS!

By now, perhaps you've had an opportunity to explore the editing features of the Qualtrics survey platform. As the name implies, the tool edits a 'survey'. Typically, surveys are thought of in association with descriptive research designs. Consider the following points in trying to understand the role such tools may play in causal designs:
1 What types of variables can be measured using survey items created with Qualtrics?
2 Would it be possible to implement an experimental manipulation within or in conjunction with a Qualtrics survey application?
3 Is it possible to create a manipulation check with a Qualtrics survey item?
4 How might computer technology assist in randomly assigning subjects to experimental conditions?


This chapter deals with experimental design and a more applied form of experimental research in marketing called test markets. The steps involved in experimental design are discussed later in this chapter. In order to conduct experimental research, the following steps should be followed:
1 Decide on a field or laboratory experimental design.
2 Decide on the choice of independent and dependent variable(s).
3 Select and design the test units.
4 Address issues of validity in experiments.
5 Select and implement an experimental design.
6 Address issues of ethics in experimentation.

ONGOING PROJECT

Step 1: Field and laboratory experiments

Experiments differ from other research methods in degree of control over the research situation. In an experiment, one variable (the independent variable) is manipulated and its effect on another variable (the dependent variable) is measured, while all other variables that may confound the relationship are eliminated or controlled. The experimenter either creates an artificial situation that limits the influence of outside factors, or deliberately manipulates a real-life situation, which allows the influence of outside factors but makes it harder to determine cause-and-effect relationships. Research done by P!nk on the success of her songs in music clubs is realistic – and therefore it has high external validity – although she cannot be absolutely sure that people danced only because her music was played, since other factors (time of night, previous music played, alcohol and drug consumption, number of available partners etc.) may have influenced the results. Thus her field experiment may have problems in internal validity (knowing for sure that people danced more only because of P!nk's music). It is important to remember that market researchers trade off external and internal validity. The greater the degree of control of factors and conditions of the experiment (high internal validity), the less realistic are the results (low external validity). Very realistic field experiments – for example, test markets – which are high in external validity, have low internal validity or may suffer from problems in producing reliable results, as other factors beyond the researcher's control (such as actions by competitors aware of the test market) may have influenced results. Marketing experiments conducted in a natural setting are called field experiments; those conducted in an artificial setting – one contrived for a specific purpose – are called laboratory experiments.

In a laboratory experiment the researcher has almost complete control over the research setting. For example, subjects are recruited and brought to an advertising agency's office, a research agency's office or perhaps a mobile unit designed for research purposes. They are exposed to a television advertisement within the context of a program that includes competitors' ads, and they are not interrupted as they view the advertisements. They are then allowed to purchase either the advertised product or one of several competing products in a simulated shop environment. Trial purchase measures are thus obtained. A few weeks later, subjects are contacted again to measure their satisfaction and determine repeat purchasing intention. This typical laboratory experiment gives the consumer an opportunity to 'buy' and 'invest'. In a short time span, the marketer is able to collect information on decision-making.

laboratory experiment An experiment conducted in a laboratory or other artificial setting to obtain almost complete control over the research setting.
field experiment An experiment conducted in a natural setting, where complete control of extraneous variables is not possible.

Another variation of a simulated shopping experiment involves using a representative panel of homemakers, who receive a weekly visit at home from a salesperson in a mobile shopping van. This allows the researcher to measure trial, repeat purchase and buying rates. Prior to the salesperson's visit, the subjects are mailed a sales catalogue and an order form that features the products being tested, along with all the leading brands and any promotional support that is either current or being tested.

Field experiments generally are used to fine-tune marketing strategies and to determine sales volume. McDonald's conducted a field experiment to test market Triple Ripple, a three-flavour ice-cream product. The product was dropped because the experiment revealed that distribution problems reduced product quality and limited customer acceptance. In the distribution system the product would freeze, defrost and refreeze. Solving the problem would have required each McDonald's city to have a local ice-cream plant with special equipment to roll the three flavours into one. A naturalistic setting for the experiment helped McDonald's executives realise the product was impractical.

REAL WORLD SNAPSHOT

SMALLER PLATE SIZES: A BENEFIT TO CUSTOMERS AND RESTAURANTS?3

Field experiments suggest that using smaller plates rather than the traditional, larger plates results in consumers eating less, and in reduced waste for restaurants. Using smaller plates may also assist with weight loss, as there seems to be a natural bias towards filling a plate. Field research published in the Journal of Applied Experimental Psychology showed that Chinese buffet diners with large plates served themselves 52 per cent more, ate 45 per cent more and wasted 135 per cent more food than diners with smaller plates. Moreover, education does not appear effective in reducing such biases. Even a 60-minute, interactive multimedia warning on the dangers of using large plates had seemingly no impact on 209 health conference attendees, who subsequently served themselves nearly twice as much food when given a large buffet plate two hours later.

Experiments vary in their degree of artificiality. Exhibit 7.1 shows that as experiments increase in naturalism, they begin to approach the pure field experiment, and as they become more artificial, they approach the laboratory type. The degree of artificiality in experiments refers to the amount of manipulation and control of the situation that the experimenter creates to ensure that the subjects will be exposed to the exact conditions desired. In a field experiment, the researcher manipulates some variables but cannot control all the extraneous ones.

EXHIBIT 7.1 » THE ARTIFICIALITY OF LABORATORY VERSUS FIELD EXPERIMENTS (a continuum running from laboratory experiments in an artificial environmental setting to field experiments in a natural environmental setting)

controlled store test A hybrid between a laboratory experiment and a test market; test products are sold in a small number of selected stores to actual customers.

Generally, subjects know when they are participating in a laboratory experiment. Performance of certain tasks, responses to questions or some other form of active involvement is characteristic of laboratory experiments. Subjects in laboratory experiments are commonly debriefed to explain the purpose of the research. In some situations only field studies are usable because it is not feasible to simulate such things as reactions to a new product by a retailer or a company's sales force.

One common hybrid of a laboratory experiment, which simulates a controlled purchasing environment, and a test market, which provides a natural testing of consumers' reactions, is the controlled store test. The products are put into stores in a number of small cities or into selected supermarket chains. Product deliveries are made not through the traditional warehouse but by the research agency, so product information remains confidential. Later, a test market in a larger geographic area is used to confirm expectations and to fine-tune the marketing strategy. Controlled store tests offer secrecy, and sales movement and market share can be measured weekly – even daily if desired. However, national sales projections cannot be made; only benchmark sales data can be obtained because of the relatively small sample of stores and the limitations on the type of outlet where the product is tested.

The naturally occurring noise that exists in the field can interfere with experimental manipulations.

Decisions must be made about several basic elements of an experiment. These issues are:
1 manipulation of the independent variable
2 selection and measurement of the dependent variable
3 selection and assignment of subjects
4 control over extraneous variables.4

REAL WORLD SNAPSHOT

MOBILE PHONE EXPERIMENTS5

We all know mobile phones sell. But what kinds of effects can mobile phones cause? Many people are calling for experiments to examine the potential health effects of heavy mobile phone usage. Additionally, ancillary technologies such as Bluetooth devices, which many people hook to their ear for hours on end each day, constantly bombard us with radio waves. While most people do not believe these to be serious threats to our health, others are not so sure. One of the challenges in studying this issue is the difficulty of implementing an experimental design on human subjects – much as in the case of smoking. One thing is certain, though: more and more marketing researchers are finding ways to conduct experiments with mobile phone technology. For instance, advertising appeals can be delivered via text message or voice mail. Advertisers can manipulate the size of a discount offered for a brief period of time and then track whether the subject takes advantage of the discount. For instance, advertisers in Hong Kong, where consumers are more receptive to advertising via mobile phones, can send a short text blast to all consumers near a Starbucks. They can then have the consumer send back a reply to activate a discount at that store. In this way, they might be able to test whether a free cookie or a half-price latte is the better incentive and results in more patronage. However, conducting experiments in this manner threatens internal validity in several ways. Although the large number of mobile phones in use has made them more practical for undertaking research, and their increased flexibility in delivering messages has provided more capability for that research, it is nearly impossible to control for extraneous variables. Is the subject in a car, on a train, in a meeting, alone or with others? Many such factors might interfere with experimental results. Despite the weaknesses, the convenience and technological advantages will likely lead to more rather than fewer 'mobile' experiments.
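As a hypothetical illustration of how the mobile incentive test described above might be analysed, the sketch below compares redemption rates for two invented treatments (a free cookie versus a half-price latte) using a standard two-proportion z test. All counts are made up for the example.

```python
# Hypothetical analysis of a mobile coupon experiment: two incentive treatments
# sent by text message, with redemption as the dependent variable.
from math import sqrt

sent = {"free cookie": 500, "half-price latte": 500}      # messages sent per treatment
redeemed = {"free cookie": 62, "half-price latte": 88}    # offers redeemed (invented)

p1 = redeemed["free cookie"] / sent["free cookie"]
p2 = redeemed["half-price latte"] / sent["half-price latte"]

# Pooled proportion and standard error for a two-proportion z test
pooled = sum(redeemed.values()) / sum(sent.values())
se = sqrt(pooled * (1 - pooled) * (1 / sent["free cookie"] + 1 / sent["half-price latte"]))
z = (p2 - p1) / se

print(f"Redemption rates: cookie {p1:.1%}, latte {p2:.1%}")
print(f"Two-proportion z statistic: {z:.2f}")  # |z| > 1.96 suggests a real difference at the 5% level
```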

ONGOING PROJECT

Step 2: Decide on the choice of independent and dependent variable(s)

MANIPULATION OF THE INDEPENDENT VARIABLE

independent variable A variable that is expected to influence a dependent variable.
experimental treatments Alternative manipulations of the independent variable being investigated.

The experimenter has some degree of control over the independent variable. The variable is independent because its value can be manipulated by the experimenter to whatever he or she wishes it to be. Its value may be changed or altered independently of any other variable. The independent variable is hypothesised to be the causal influence. Experimental treatments are the alternative manipulations of the independent variable being investigated. For example, prices of $1.29, $1.69 and $1.99 might be the treatments in a pricing experiment. Price changes, advertising strategy changes, taste formulation and so on are typical treatments.


An experiment can capture whether or not colour can cause differences in consumer preference for products.

In marketing research the independent variable often is a categorical or classificatory variable that represents some classifiable or qualitative aspect of marketing strategy. To determine the effects of point-of-purchase displays, for example, the experimental treatments that represent the independent variable are themselves the varying displays. Alternative advertising copy is another example of a categorical or classificatory variable. In other situations the independent variable is a continuous variable. The researcher must select the appropriate levels of that variable as experimental treatments. For example, the number of dollars that can be spent on advertising may be any number of different values.

Experimental and control groups In the simplest type of experiment, only two values of the independent variable are manipulated. For example, consider measuring the influence of advertising on sales. In the experimental condition (treatment administered to the experimental group), the advertising budget may be set at $200 000. In the control condition (treatment administered to the control group), advertising may remain at zero or without change. By holding conditions constant in the control group, the researcher controls for potential sources of error in the experiment. Sales (the dependent variable) in the two treatment groups are compared at the end of the experiment to determine whether the level of advertising (the independent variable) had any effect. Note that often in field experiments and test markets – for reasons of additional cost, time and ethics – it may not be possible to have a control group. For this reason, quasi-experimental or compromise designs are sometimes used.

experimental group The group of subjects exposed to the experimental treatment.
control group The group of subjects exposed to the control condition in an experiment – that is, not exposed to the experimental treatment.

WHAT WENT WRONG? DOES ‘GREENWASHING’ COFFEE WORK? 6

Coffee labelled as 'eco-friendly' can attract a premium price, with consumers led to believe it tastes better, according to university research from Sweden. The researchers, from the University of Gävle and the University of Chicago, asked study participants to taste and rate two types of coffee, after telling them that one was 'eco-friendly'. In reality, both coffees were the same. The study, published in PLOS ONE, found participants preferred the taste of, and were willing to pay more for, the 'eco-friendly' coffee. In a second experiment, participants were asked to taste coffee from two different cups, but this time they were not told which of the two cups contained eco-friendly coffee until after they made the preference decision. Those consumers who identified as 'high sustainability' felt the strongest about the label, even when they were told, after their decision, that they preferred the nonlabelled alternative. Low sustainability consumers appeared to be willing to pay more for the eco-friendly alternative as long as they preferred the taste of the product. But Robin Canniford, researcher in Melbourne University's department of marketing, said that these types of experiments don't tell us enough about consumers' eco-friendly intentions because people perform differently in lab settings compared to their daily lives.


dependent variable A criterion or variable to be predicted or explained. The criterion or standard by which the results of an experiment are judged; a variable expected to be dependent on the experimenter’s manipulation of the independent variable.

Several experimental treatment levels

The advertising–sales experiment with one experimental and one control group may not tell the advertiser everything he or she wishes to know about the advertising–sales relationship. If the advertiser wished to understand the functional nature of the relationship between sales and advertising at several treatment levels, additional experimental groups with advertising expenditures of $200 000, $500 000 and $1 million might be studied. This type of design would allow the experimenter to form a better idea of an optimal advertising budget. Extreme levels of treatments (high and low price, well-known versus unknown brands) tend also to be used in experiments to provide a critical test as to whether these factors or variables have any influence on dependent variables, such as perceptions of quality, value and purchase intent.
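As a hypothetical illustration of examining the functional nature of the advertising–sales relationship across several treatment levels, the sketch below fits a simple least-squares line to invented sales figures for the three budgets mentioned above; a real experiment would have many markets per treatment level rather than a single observation each.

```python
# Hypothetical sketch: estimating the sales response across several advertising
# treatment levels with a simple least-squares line. All figures are invented.
ad_spend = [200_000, 500_000, 1_000_000]        # treatment levels (dollars)
sales = [1_150_000, 1_600_000, 2_050_000]       # observed sales per group (invented)

n = len(ad_spend)
mean_x = sum(ad_spend) / n
mean_y = sum(sales) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(ad_spend, sales))
         / sum((x - mean_x) ** 2 for x in ad_spend))
intercept = mean_y - slope * mean_x

print(f"Estimated response: {slope:.2f} sales dollars per advertising dollar")
print(f"Predicted sales at $750 000 of advertising: {intercept + slope * 750_000:,.0f}")
```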


More than one independent variable

It is possible to assess the effects of more than one independent variable. For example, a restaurant chain might investigate the combined effects of increased advertising and a change in prices on sales volume. The purpose of most marketing research experimentation is to measure and compare the effects of experimental treatments on the dependent variable. Experimental designs that investigate the influence of other factors are called factorial designs and are discussed later in this chapter.

SELECTION AND MEASUREMENT OF THE DEPENDENT VARIABLE

The dependent variable is so named because its value is expected to be dependent on the experimenter's manipulation of the independent variable; it is the criterion or standard by which the results are judged. Changes in the dependent variable are presumed to be a consequence of changes in the independent variable. Selection of the dependent variable is a crucial decision in the design of an experiment. If researchers introduce a new pink grapefruit tea mix in a test market, sales volume is most likely to be the dependent variable. However, if researchers are experimenting with different forms of advertising copy appeals, defining the dependent variable may be more difficult. For example, measures of advertising awareness, recall, changes in brand preference or sales might be used as the dependent variable, depending on the purpose of the ads. In the unit pricing experiment, the dependent variable was the average price paid per unit. However, the dependent variable might have been the preference for format of pricing information (a cognitive variable), brand-switching behaviour expressed as a percentage of consumers, or attitudes towards the store. Often the dependent variable selection process, like the problem definition process, is considered less carefully than it should be. The experimenter's choice of a dependent variable determines what type of answer is given to the research question.

Experimental treatments are alternative manipulations of the independent variable. Variations of advertising copy, graphic designs and levels of prices charged are typical independent variables in marketing experiments.

In a test market the amount of time needed for the effects to become evident should be considered in choosing the dependent variable. Sales may be measured several months after the experiment to determine if there were any carry-over effects.


Changes that are relatively permanent or longer lasting than changes generated only during the period of the experiment should be considered; repeat purchase behaviour may be important. Consumers may try a 'loser' once, but they might not re-buy. The introduction of the Kraft Vegemite iSnack 2.0 illustrates the need to think beyond consumers' initial reactions. When this combination cheese and Vegemite spread was introduced, its name was based on online suggestions from 48 000 people, collected from a public competition. After sales of only 3 million jars, the company admitted that the new name did not resonate with the Australian public and recalled the thousands of jars of excess stock.7 The brand thus never achieved high repeat sales within a sufficiently large market segment. Changing the brand name of Vegemite was not carefully considered as an important independent variable – the dependent variables of interest included not only sales but also the reputation of what many in Australia consider to be a national icon. Brand awareness, trial purchase and repeat purchase are all possible dependent variables in an experiment. The dependent variable, therefore, should be considered carefully. Thorough problem definition will help the researcher select the most important dependent variable(s).

Step 3: Select and assign test units

Test units are the subjects or entities whose responses to the experimental treatment are measured or observed. Individuals, organisational units, sales territories or other entities may be the test units. People are the most common test units in most marketing and consumer behaviour experiments.

SAMPLE SELECTION AND RANDOM SAMPLING ERRORS

As in other forms of marketing research, random sampling errors and sample selection errors may occur in experimentation. In the case of an interactive campaign by the Commonwealth Bank of Australia, the households included in the experiment were not randomly selected. They selected themselves. This is often called a self-selection bias. People who participate in this way may be more interested in interactive television and/or home loans than those that may have been randomly selected, so that it might not be possible to generalise the results to the larger population. Another example may be the use of online panels for test markets. People in panels are often paid, and select themselves to take part, or may be more interested in product categories studied and so may not represent the views of the population. Sample selection error may also occur because of the procedure used to assign subjects or test units to either the experimental or the control group. Random sampling error may occur if repetitions of the basic experiment sometimes favour one experimental condition and sometimes the other on a chance basis. An experiment dealing with charcoal briquettes may require that the people in both the experimental and the control groups be identical with regard to level of product usage and barbecuing habits. However, if subjects are randomly assigned to conditions without knowledge of their product usage, errors resulting from differences in that usage will be random sampling errors. Suppose a potato chip manufacturer that wishes to experiment with new advertising appeals wants the groups to be identical with respect to advertising awareness, media exposure and so on. The experimenter must decide how to place subjects in each group and which group should receive treatment. Researchers generally agree that the random assignment of participants to groups and experimental treatments to groups is the best procedure.

RANDOMISATION

test units Subjects or entities whose responses to experimental treatments are observed or measured.
random sampling error A statistical fluctuation that occurs because of chance variation in the elements selected for a sample.
randomisation A procedure in which the assignment of subjects and treatments to groups is based on chance.

Randomisation – the random assignment of subjects and treatments to groups – is one device for equally distributing or scattering the effects of extraneous variables to all conditions. Thus, the chance of unknown nuisance effects piling up in particular experimental groups can be identified. The effects of the nuisance variables will not be eliminated, but they will be controlled. Randomisation assures the researcher that overall repetitions of the experiment under the same conditions will show the true effects, if those effects exist. Random assignment of conditions provides 'control by chance'.8 Random assignment of subjects allows the researcher to assume that the groups are identical with respect to all variables except the experimental treatment.
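A minimal sketch of 'control by chance': subjects are shuffled and split into two groups without reference to any of their characteristics, yet an extraneous variable such as age tends to end up roughly balanced across conditions. The subjects and ages below are simulated for illustration.

```python
# Minimal sketch of randomisation: assignment is based purely on chance, so an
# extraneous characteristic (age) is scattered roughly evenly across conditions
# even though age is never consulted during assignment.
import random
import statistics

random.seed(7)
subjects = [{"id": i, "age": random.randint(18, 70)} for i in range(60)]

random.shuffle(subjects)                       # control by chance
experimental, control = subjects[:30], subjects[30:]

for name, group in [("experimental", experimental), ("control", control)]:
    ages = [s["age"] for s in group]
    print(f"{name}: n = {len(group)}, mean age = {statistics.mean(ages):.1f}")
```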

MATCHING

Random assignment of subjects to the various experimental groups is the most common technique used to prevent test units from differing from each other on key variables; it assumes that all characteristics of the subjects have been likewise randomised. If the experimenter believes that certain extraneous variables may affect the dependent variable, he or she can make sure that the subjects in each group are matched on these characteristics. Matching the respondents on the basis of pertinent background information is another technique for controlling assignment errors. For example, in a taste test experiment for a dog food, it might be important to match the dogs in various experimental groups on the basis of age or breed. Similarly, if age is expected to influence savings behaviour, a bank conducting an experiment may have greater assurance that there are no differences among subjects if subjects in all experimental conditions are matched according to age. Although matching assures the researcher that the subjects in each group are similar on the matched characteristics, the researcher cannot be certain that subjects have been matched on all characteristics that could be important to the experiment.
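Matching, by contrast, uses the pertinent background variable directly. In this hypothetical sketch, subjects are sorted by age, formed into age-adjacent pairs, and chance then decides which member of each pair receives the treatment, so the groups are balanced on age by construction.

```python
# Hypothetical sketch of matching: pair subjects on a pertinent characteristic
# (age), then randomly assign one member of each pair to each condition.
import random

random.seed(7)
subjects = [{"id": i, "age": random.randint(18, 70)} for i in range(20)]

subjects.sort(key=lambda s: s["age"])          # order by the matching variable
experimental, control = [], []
for j in range(0, len(subjects), 2):           # take age-adjacent pairs
    pair = subjects[j:j + 2]
    random.shuffle(pair)                       # chance decides who gets the treatment
    experimental.append(pair[0])
    control.append(pair[1])

print("Experimental ages:", sorted(s["age"] for s in experimental))
print("Control ages:     ", sorted(s["age"] for s in control))
```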

ONGOING PROJECT

Step 4: Address issues of validity in experiments

The major difference between experimental research and other research is an experimenter's ability to hold conditions constant and to manipulate the treatment. To conclude that A causes B, a brewery experimenting with a new clear beer's influence on beer drinkers' taste perceptions must determine the possible extraneous variables (other than the treatment) that may affect the results and attempt to eliminate or control those variables. We know that brand image and packaging are important factors in beer drinkers' reactions, so the experimenter may wish to eliminate the effects associated with them. She may eliminate these two extraneous variables by packaging the test beers in plain packages without brand identification. When extraneous variables cannot be eliminated, experimenters may strive for constancy of conditions; that is, they make efforts to expose all subjects in each experimental group to situations that are exactly alike, except for the differing conditions of the independent variable.

matching A procedure for the assignment of subjects to groups that ensures each group of respondents is matched on the basis of pertinent characteristics.
constancy of conditions A situation in which subjects in experimental groups and control groups are exposed to situations identical except for differing conditions of the independent variable.

For example, experiments to measure consumers' evaluations of tissue paper softness have indicated that variation in humidity influences reactions. In this situation, holding extraneous variables constant might require that all experimental sessions be conducted in the same room at the same time of day.

A supermarket experiment involving four test products shows the care that must be taken to hold all factors constant. The experiment required that all factors other than shelf space be kept constant throughout the testing period. In all stores the shelf level that had existed before the tests began was maintained throughout the test period; only the amount of shelf space (the treatment) was changed. One problem involved supermarket personnel accidentally changing the shelf level when stocking the test products. This deviation from the constancy of conditions was minimised by auditing each supermarket four times a week. In this way, any change could be detected in a minimum amount of time. The experimenter personally stocked as many of the products as possible, and the cooperation of supermarket staff also helped reduce treatment deviations.


Another example involves the confounding effect of television program content on advertising copy effectiveness. An advertisement that on one television show produces moderate brand recall may give a higher or lower recall result on another program of the same type. To eliminate the confounding effects of program content, copy-testing experiments should be conducted in a constant environment. Some researchers believe that the only safe course is to test an advertisement on several different television shows and then focus on the average across all the shows used. If the experimental method requires that the same subjects be exposed to two or more experimental treatments, an error may occur due to the order of presentation. If a soft-drink company plans to test consumers’ comparison of a high-caffeine, extra-sugar version of its cola to its regular cola, one of the drinks must be tasted before the other. Consumers might tend to prefer the first drink they taste if they cannot tell any difference between the drinks. Another example is an electronic games manufacturer that has subjects perform an experimental task requiring some skill (that is, playing a game). Subjects might perform better on the second task simply because they have had some experience with the first task. Counterbalancing attempts to eliminate the confounding effects of order of presentation by requiring that half the subjects be exposed to treatment A first and then to treatment B, while the other half receive treatment B first and then treatment A. Blinding is used to control subjects’ knowledge of whether or not they have been given a particular experimental treatment. The cola taste test mentioned above might have used two groups of subjects: one exposed to the new formulation and the other exposed to the regular cola. If the subjects were blinded, all may have been told they had not been given the new formulation (or all may have been told they had been given the new formulation). This technique frequently is used in medical research when subjects are given chemically inert pills (placebos) rather than medication. It may also be used in marketing experiments. For example, if the researchers themselves do not know which toothpastes are in the tubes marked with triangles, circles or squares, they will not unconsciously influence the subjects. In these circumstances, neither the subjects nor the experimenter knows which are the experimental and which are the controlled conditions. Both parties are blinded; hence, such experiments are called double-blind designs. The random assignment of subjects and experimental treatments to groups is an attempt to control extraneous variations that result from chance. If certain extraneous variations cannot be controlled, the researcher must assume that the confounding effects will be present in all experimental conditions with approximately the same influence. (This assumption may not hold true if the assignments were not random.) In many experiments, especially laboratory experiments, interpersonal contact between members of the various experimental groups and/or the control group must be eliminated or minimised. After the subjects have been assigned to groups, the various individuals should be kept separated so that discussion about what occurs in a given treatment situation will not become an extraneous variable that contaminates the experiment.
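Counterbalancing and blinding can both be generated mechanically. The hypothetical sketch below assigns half the subjects the order 'A then B' and the other half 'B then A', and refers to the two drinks only by coded symbols so that tasters never see brand names; the subject identifiers and symbols are invented for illustration.

```python
# Hypothetical sketch of counterbalancing order of presentation with blinded labels.
import random

random.seed(1)
subjects = [f"S{i:02d}" for i in range(12)]
random.shuffle(subjects)

half = len(subjects) // 2
orders = {s: "A then B" for s in subjects[:half]}         # first half tastes A first
orders.update({s: "B then A" for s in subjects[half:]})   # second half tastes B first

blind_labels = {"A": "triangle", "B": "circle"}            # subjects never see brand names
for subject in sorted(orders):
    coded = orders[subject].replace("A", blind_labels["A"]).replace("B", blind_labels["B"])
    print(subject, "->", coded)
```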

CONSTANT EXPERIMENTAL ERROR

blinding A technique used to control subjects' knowledge of whether or not they have been given a particular experimental treatment.
double-blind design A technique in which neither the subject nor the experimenter knows which are the experimental and which the controlled conditions.
constant error An error that occurs in the same experimental condition every time the basic experiment is repeated; a systemic bias.

We have already discussed random error in the context of experimental selection and assignment of test units. Constant error (bias) occurs when the extraneous variables or the conditions of administering the experiment are allowed to influence the dependent variables every time the experiment is repeated. When this occurs, the results will be confounded because the extraneous variables have not been controlled or eliminated. For example, if subjects in an experimental group are always administered treatment in the morning and subjects in the control group always receive the treatment in the afternoon, a constant, systemic error will occur. In such a situation, the time of day (an uncontrolled extraneous variable) is a cause of constant error. In a training experiment the sources of constant error might be the people who do the training (line or external specialists) or whether the training is conducted on the employees' own time or on company time. These and other characteristics of the training may have an impact on the dependent variable and will have to be taken into account:

The effect of a constant error is to distort the results in a particular direction, so that an erroneous difference masks the true state of affairs. The effect of a random error is not to distort the results in any particular direction, but to obscure them. Constant error is like a distorting mirror in a fun house; it produces a picture that is clear but incorrect. Random error is like a mirror that has become cloudy with age; it produces a picture that is essentially correct but unclear.9
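The contrast between constant and random error can be illustrated with a small, hypothetical simulation: random error leaves the average estimate close to the true effect but noisy, whereas a constant bias shifts every repetition of the experiment in the same direction.

```python
# Hypothetical simulation: random error obscures the true value without shifting it,
# while constant error distorts every repetition in the same direction.
import random
import statistics

random.seed(0)
TRUE_EFFECT = 5.0
REPETITIONS = 1_000

random_error_only = [TRUE_EFFECT + random.gauss(0, 2) for _ in range(REPETITIONS)]
with_constant_error = [TRUE_EFFECT + 3.0 + random.gauss(0, 2) for _ in range(REPETITIONS)]

print(f"True effect:                     {TRUE_EFFECT:.2f}")
print(f"Mean with random error only:     {statistics.mean(random_error_only):.2f}")    # close to 5
print(f"Mean with constant (bias) error: {statistics.mean(with_constant_error):.2f}")  # shifted by about 3
```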

Experiments are judged by two measures. The first, internal validity, indicates whether the independent variable was the sole cause of the change in the dependent variable. The other, external validity, indicates the extent to which the results of the experiment are applicable to the real world.10

INTERNAL VALIDITY
When choosing or evaluating experimental research designs, researchers must determine whether they have internal validity and external validity. The first has to do with the interpretation of the cause-and-effect relationship in the experiment. Internal validity refers to the question of whether the experimental treatment was the sole cause of observed changes in the dependent variable. If the observed results were influenced or confounded by extraneous factors, the researcher will have problems making valid conclusions about the relationship between the experimental treatment and the dependent variable. If the observed results can be unhesitatingly attributed to the experimental treatment, the experiment will be internally valid. It is helpful to classify several types of extraneous variables that may jeopardise internal validity. The six major ones are history, maturation, testing, instrumentation, selection and mortality.

internal validity Validity determined by whether an experimental treatment was the sole cause of changes in a dependent variable or whether the experimental manipulation did what it was supposed to do.
history effect The loss of internal validity caused by specific events in the external environment, occurring between the first and second measurements, that are beyond the control of the experimenter.
cohort effect A change in the dependent variable that occurs because members of one experimental group experience different historical situations from members of other experimental groups.
maturation effect An effect on the results of an experiment caused by experimental subjects maturing or changing over time.

History
Suppose a before-and-after experiment is being conducted to test a new packaging strategy for an imported Indonesian toy. If some Indonesian people engage in an anti-Australian political action that attracts considerable media coverage, this action may jeopardise the validity of the experiment because many Australians may boycott this brand of toy. This is an example of a history effect, which refers to specific events in the external environment between the first and second measurements that are beyond the experimenter's control. A common history effect occurs when competitors change their marketing strategies during a test marketing experiment. A special case of the history effect is the cohort effect, which refers to a change in the dependent variable that occurs because members of one experimental group experienced historical situations that were different from members of other experimental groups. For example, two groups of managers used as subjects may be in different cohorts because one group experienced a different history and therefore might behave differently in a workplace experiment.

Maturation People change over time; that is, they undergo a process of maturation. During the course of an experiment, subjects may mature or change in some way that will have an impact on the results. Maturation effects are effects on the results of an experiment caused by changes in experimental subjects over time. They are a function of time rather than of a specific event. For example, during a day-long experiment subjects may grow hungry, tired or bored. In an experiment over a longer time span, maturation may influence internal validity because subjects grow older or more experienced, or change in other ways that may influence the results. For example, suppose an experiment was designed to test the impact of a new compensation program on sales productivity.


If this program were tested over a year-long period, some of the salespeople probably would mature as a result of more selling experience or perhaps increased knowledge. Their sales productivity might improve because of their knowledge and experience rather than the compensation program.

Testing Testing effects are also called pretesting effects because the initial measurement or test alerts respondents to the nature of the experiment, causing a change in the validity of the experiment. Respondents may act differently from how they would if no pretest measures were taken. In a before-and-after study, taking a pretest before the independent variable is manipulated may sensitise respondents when they are taking the test the second time. For example, students taking standardised achievement and intelligence tests for the second time usually do better than those taking the tests for the first time. The effect of testing may increase awareness of socially approved answers, increase attention to experimental conditions (that is, the subject may watch more closely) or make the subject more conscious than usual of the dimensions of a problem.

Instrumentation Measuring the dependent variable in an experiment requires the use of a questionnaire or other form of measuring instrument. If the identical instrument is used more than once, a testing effect may occur. To avoid the effects of testing, an alternative form of the measuring instrument (for example, a questionnaire or test) may be given as the post-measurement. Although this may reduce the testing effect, it might result in an instrumentation effect because of the change in the measuring instrument. A change in the wording of questions, a change in interviewers or a change in other procedures used to measure the dependent variable causes an instrumentation effect, which may jeopardise internal validity. For example, if the same interviewers are used to ask questions for both before and after measurement, some problems may arise. With practice, interviewers may acquire increased skill in interviewing, or they may become bored and decide to reword the questionnaire in their own terms. To avoid this problem, new interviewers are hired, but different individuals are also a source of extraneous variation due to instrumentation variation. There are numerous other sources of instrument decay or variation.

Selection The selection effect is a sample bias that results from differential selection of respondents for the comparison groups, or sample selection error (discussed earlier).

Mortality If an experiment is conducted over a period of a few weeks or more, some sample bias may occur due to mortality, or sample attrition. Sample attrition occurs when some subjects withdraw from the experiment before it is completed. Mortality effects may occur if many subjects drop from one experimental treatment group and not from other treatment or control groups. Consider a sales training experiment investigating the effects of close supervision of salespeople (high pressure) versus low supervision (low pressure). The high-pressure condition may misleadingly appear superior if those subjects who completed the experiment did very well. If, however, the high-pressure condition caused more subjects to drop out than the other conditions, this apparent superiority may be due to a self-selection bias – perhaps only very determined and/or talented salespeople made it through to the end of the experiment.

testing effect In a before-and-after study, the effect of pretesting may sensitise subjects when taking a test for the second time, thus affecting internal validity.
instrumentation effect An effect on the results of an experiment caused by a change in the wording of questions, a change in interviewers or other changes in procedures used to measure the dependent variable.
selection effect A sampling bias that results from differential selection of respondents for the comparison groups.
mortality (sample attrition) effect A sample bias that results from the withdrawal of some subjects from the experiment before it is completed.
demand characteristics Experimental design procedures or situational aspects of an experiment that provide unintentional hints about the experimenter's hypothesis to subjects.

Demand characteristics
The term demand characteristics refers to experimental design procedures that unintentionally hint to subjects about the experimenter's hypothesis. Demand characteristics are situational aspects of the experiment that demand that the participants respond in a particular way; hence, they are a source of constant error. If participants recognise the experimenter's expectation or demand, they are likely to act in a manner consistent with the experimental treatment; even slight nonverbal cues may influence their reactions.

In most experiments the most prominent demand characteristic is the person who actually administers the experimental procedures. If an experimenter's presence, actions or comments influence the subjects' behaviour or sway the subjects to slant their answers to cooperate with the experimenter, the experiment has experimenter bias. An experimenter can unintentionally create a demand effect simply by smiling, nodding or frowning at the wrong time. When subjects slant their answers to cooperate with the experimenter, they in effect are acting as guinea pigs and tend to exhibit behaviours that might not represent their behaviour in the marketplace. For example, if subjects in an advertising experiment understand that the experimenter is interested in whether they changed their attitudes in accord with a given advertisement, they may answer in the desired direction to please him or her. This attitude change reflects a guinea pig effect rather than a true experimental treatment effect.

A famous management experiment illustrates a common demand characteristic. Researchers were attempting to study the effects on productivity of various working conditions – such as hours of work, rest periods, lighting and methods of pay – at a Western Electric Hawthorne plant in Illinois. The researchers found that workers' productivity increased whether the work hours were lengthened or shortened, whether lighting was very bright or very dim, and so on. The surprised investigators realised that the workers' morale was higher because they were aware of being part of a special experimental group. This totally unintended effect is now known as the Hawthorne effect because researchers realise that people will perform differently when they know they are experimental subjects.11

guinea pig effect An effect on the results of an experiment caused by subjects changing their normal behaviour or attitudes in order to cooperate with an experimenter.
Hawthorne effect An unintended effect on the results of a research experiment caused by the subjects knowing that they are participants.

If subjects in a laboratory experiment interact (that is, if they are not relatively isolated), their conversations may produce joint decisions rather than a desired individual decision. For this reason, social interaction generally is restricted in laboratory experiments. To reduce demand characteristics, researchers typically take steps to make it difficult for subjects to know what the researchers are trying to establish. Experimenters are trained and experimental situations are designed to reduce cues that might serve as demand characteristics. For example, the subjects may be told that the purpose of the experiment is one thing when it is actually something else. If the purpose of the experiment is disguised, the participant does not know how to be a 'good' subject to help confirm the hypothesis. Of course, the use of deception (for example, telling a lie to the subject) presents an ethical question that must be resolved by the researcher. (See the discussion later in the chapter.)

TIPS OF THE TRADE?
REDUCING DEMAND CHARACTERISTICS
»» Use the placebo technique, where a false experimental treatment disguises the fact that no experimental treatment is used. It's useful in testing food additives, price levels and so on.
»» Isolate experimental subjects. This reduces the chance of respondent discussion on the likely nature of the experiment.
»» Use a 'blind' experimental administrator. Where possible use administrators of the experiment who do not know the nature of the hypotheses being investigated.
»» Administer only one experimental condition per subject. As respondents are exposed to more experimental treatments, as in a within-subjects design, they may learn more about the true nature of the experiment. This is less likely in between-subjects designs, where respondents encounter only one set of experimental conditions.


EXTERNAL VALIDITY
The second type of validity involves the researcher's ability to generalise the results of an experiment to the marketplace or the external environment. External validity is the ability of an experiment to generalise beyond the data of the experiment to other subjects or groups in the population under study. In essence, determining external validity involves a sampling question. For example: to what extent can the results of a simulated shopping experiment be transferred to real-world supermarket shopping? Will a test market in Fremantle, Western Australia, be representative of a nationwide introduction of the product under study? Can one extrapolate the results from a tachistoscope to an in-store shopping situation? Problems of external validity generally are related to the threat that a specific but limited set of experimental conditions will not deal with the interactions of untested variables in the real world. In other words, the experimental situation may be artificial and may not represent the true setting and conditions in which the investigated behaviour takes place. If the study lacks external validity, the researcher will have difficulty repeating the experiment with different subjects, settings or time intervals.

external validity The ability of an experiment to generalise beyond the experiment data to other subjects or groups in the population under study.
tachistoscope A device that controls the amount of time a subject is exposed to a visual image.

If subjects in a shopping mall view a recording that simulates an actual television program with a test advertisement inserted along with other advertisements, will the subjects view the advertisement just as they would if it were being shown during a regular program? There probably will be some contamination, but the experiment may still be externally valid if the researcher knows how to adjust results from an artificial setting to the marketplace. Comparative norms may be established based on similar, previous studies so that the results can be projected beyond the experiment. If an experiment lacks internal validity, projecting results is not possible. Thus, threats to internal validity may jeopardise external validity.

STUDENT SURROGATES One issue relating to external validity concerns the use of university students as experimental subjects. Time, money and a host of other practical considerations often necessitate the use of student surrogates as research subjects. This practice is widespread in academic studies. Some evidence shows that students are quite similar to non-student household consumers, but other evidence indicates that they do not provide accurate predictions of other populations. This is particularly true when students are used as substitutes or surrogates for businesspeople. Any researcher who uses student surrogates should take care to ensure that the student subjects resemble the populations they are to portray. This may not be easy, unless the literacy, alertness and rationality of the population under study parallel those of the student surrogates. The issue of external validity should be seriously considered because the student population is likely to be atypical. Students are easily accessible, but they often are not representative of the total population.

EXTRANEOUS VARIABLES
Most students of marketing realise that the marketing mix variables – price, product, promotion and distribution – interact with uncontrollable forces in the market, such as competitors' activities, to influence consumer behaviour and ultimately sales. The experiments discussed so far (indeed, most experiments) concern the identification of a single independent variable and the measurement of its effects on the dependent variable. A number of extraneous variables may affect the dependent variable, thereby distorting the experiment.

Sports Illustrated conducts experiments to test its creative strategies and renewal materials in order to reveal the promotions that generate the highest response rates. For each renewal cycle, Sports Illustrated researchers include several control variables – source, price, amount paid, etc. – that are used to test the efficacy of renewal methods. (For example, the payment methods offered – bill me later, credit card, instalment – will each elicit different responses.)


The following example shows how extraneous variables may affect results. Suppose a television advertisement for brand Z of petrol shows two cars on a highway. The announcer states that one car has used brand Z without the special additive and the other has used it with the additive. The car without the special additive comes to a stop first, and the car with it comes to a stop 10 to 15 metres further down the road. (We will assume that both cars used the same quantity of petrol.) The implication of this advertisement is that the special additive (the independent variable) results in extra mileage (the dependent variable). As experimenters who are concerned with extraneous variables that could affect the result, we can raise the following questions:
1 Were the engines of the same size and type? Were the conditions of the engines the same (tuning and so on)?
2 Were the cars of the same condition (gear ratios, fuel injector settings, weight, wear and tear, and so on)?
3 Were the drivers different? Were there differences in acceleration? Were there differences in the drivers' weights?
Because an experimenter does not want extraneous variables to affect the results, he or she must control or eliminate such variables.

PROBLEMS CONTROLLING EXTRANEOUS VARIABLES
In marketing experiments it is not always possible to control everything that should be controlled in order to have the perfect experiment. For example, competitors may bring out a product during the course of an experiment. A competitor who learns of a test market experiment may knowingly change its prices or increase advertising to confound the test results; disrupting the test in this way also buys the competitor more time to investigate a similar new-product possibility.

ONGOING PROJECT
Step 5: Select and implement an experimental design
The next decision is the type of experimental design. There are two broad choices:
1 A basic experimental design for a single independent variable, or a factorial design to consider a number of causal factors. (Within each there are also a number of specific experimental designs, discussed below.)
2 Repeated measures or not.

BASIC VERSUS FACTORIAL EXPERIMENTAL DESIGNS In basic experimental designs a single independent variable is manipulated to observe its effect on a single dependent variable. However, we know that complex marketing-dependent variables such as sales, product usage and preference are influenced by several factors. The simultaneous change in independent variables such as price and advertising may have a greater influence on sales than if either variable is changed alone. Factorial experimental designs are more sophisticated than basic experimental designs; they allow for an investigation of the interaction of two or more independent variables.

REPEATED MEASURES OR NOT
repeated measures An experimental technique in which the same subjects are exposed to all experimental treatments to eliminate any problems due to subject differences.

Experiments in which the same subjects are exposed to all experimental treatments are said to have repeated measures. This technique eliminates any problems due to subject differences, but it causes some other problems, such as demand characteristics. Repeated measures designs are cheaper and more efficient than other experimental designs. It is also argued that repeated measures designs are realistic as they present an array of alternatives that may be similar to those seen by respondents when they are shopping or considering competing brands in the marketplace.


BASIC EXPERIMENTAL DESIGNS The design of an experiment may be compared to an architect’s plans for a structure, whether a giant skyscraper or a modest home. The basic requirements for the structure are given to the architect by the prospective owner. It is the architect’s task to fill these basic requirements; yet the architect has ample room for exercising ingenuity. Several different plans may be drawn up to meet all the basic requirements. Some may be more costly than others; of two plans with the same cost, one may offer potential advantages that the second does not. There are various types of experimental designs. If only one variable is manipulated, the experiment has a basic experimental design. If the experimenter wishes to investigate several levels of the independent variable (for example, four price levels) or to investigate the interaction effects of two or more independent variables, the experiment requires a complex, or statistical, experimental design.

SYMBOLISM FOR DIAGRAMMING EXPERIMENTAL DESIGNS
The work of Campbell and Stanley has helped many students master the subject of basic experimental designs.12 The following symbols will be used in describing the various experimental designs:
X = exposure of a group to an experimental treatment
O = observation or measurement of the dependent variable; if more than one observation or measurement is taken, subscripts (that is, O1, O2 etc.) indicate temporal order
R = random assignment of test units; R symbolises that individuals selected as subjects for the experiment are randomly assigned to the experimental groups
The diagrams of experimental designs that follow assume a time flow from left to right.

THREE EXAMPLES OF QUASI-EXPERIMENTAL DESIGNS Quasi-experimental designs do not qualify as true experimental designs because they do not adequately control for the problems associated with loss of external or internal validity.

One-shot design The one-shot design, or after-only design, is diagrammed as follows: X  O1

Suppose that during a very cold winter a motor vehicle dealer finds himself with a large inventory of cars. He decides to experiment with a promotional scheme, offering a free trip to the Gold Coast with every car sold. He experiments with the promotion (X = experimental treatment) and measures sales (O1 = measurement of sales after the treatment is administered). The dealer is not really conducting a formal experiment; he is just 'trying something out'. This one-shot design is a case study of a research project fraught with problems. Subjects or test units participate because of voluntary self-selection or arbitrary assignment, not because of random assignment. The study lacks any kind of comparison or any means of controlling extraneous influences. A measure of what happens when the test units have not been exposed to X is needed for comparison with the measure taken when subjects have been exposed to X. Nevertheless, under certain circumstances, even though this design lacks internal validity, it is the only viable choice. The nature of taste tests or product usage tests may dictate the use of this design. In one taste test experiment, novice Australian wine drinkers asked whether they preferred Australian or French wine consistently chose the French wine, whether the wines tasted were of high or low quality.13

basic experimental design An experimental design in which a single independent variable is manipulated to measure its effect on another single dependent variable.
quasi-experimental design A research design that cannot be classified as a true experiment because it lacks adequate control of extraneous variables.
one-shot design An after-only design in which a single measure is recorded after the treatment is administered.


REAL WORLD SNAPSHOT

DID YOU HEAR ABOUT THE IMPORTANT INTERACTIONS IN MARKETING STRATEGY?14

Marketing research has shown that many parts of the marketing mix interact to produce consumer reactions. An example is what combinations of media and advertising messages are effective. Researchers in the United States using a field study of 43 344 participants with hearing disorders found that consistent combinations of media (both private or both public) were more effective than mixed media. The two private media (telemarketing combined with direct marketing) outperformed any two mixed media (telemarketing and print, telemarketing and TV, direct marketing and print, or direct marketing and television). In addition, the private media combination outperformed the public media, which makes sense for this product category. If there’s a stigma or embarrassment about a purchase, even learning more about the product before purchase might require some delicacy and privacy. The researchers also found that the effectiveness of the type of media depended on the nature of appeal used (warm emotional or wedge of doubt). The integrated private media (telemarketing and direct marketing) performed best, first with the wedge of doubt (getting people concerned about their lack of hearing) and then with the warm and emotional ad content. Note that two private exposures were not always superior – the combination with the educational advertising message performed poorly.

One-group pretest–posttest design Suppose a real estate franchiser wishes to provide a training program for franchisees. If it measures subjects’ knowledge of real estate selling before (O1) they are exposed to the experimental treatment (X) and then measures real estate selling knowledge after (O2) they are exposed to the treatment, the design will be as follows: O1  X  O2

In this example, the trainer is likely to conclude that the difference between O2 and O1 (O2 − O1) is the measure of the influence of the experimental treatment. This one-group pretest–posttest design offers a comparison of the same individuals before and after training. Although this is an improvement over the one-shot design, this research still has several weaknesses that may jeopardise internal validity. For example, if the time lapse between O1 and O2 was a period of several months, the trainees may have matured as a result of experience on the job (maturation effect). History effects may also influence this design. Perhaps some subjects dropped out of the training program (mortality effect). The effect of testing may also have confounded the experiment. For example, taking a test on real estate selling may have made subjects more aware of their lack of specific knowledge; either during the training sessions or on their own, they may have sought to learn subject material about which they realised they were ignorant. If the second observation or measure (O2) of salespersons' knowledge was not an identical test, the research may suffer from the instrumentation effect. If the researcher gave an identical test but had different graders for the before and after measurements, the data may not be directly comparable. Although this design has a number of weaknesses, it is used frequently in marketing research. Remember, the cost of the research is a consideration in most business situations. While there will be some problems of internal validity, the researcher must always take into account questions of time and cost.

one-group pretest–posttest design A quasi-experimental design in which the subjects in the experimental group are measured before and after the treatment is administered, but there is no control group.
static group design An after-only design in which subjects in the experimental group are measured after being exposed to the experimental treatment and the control group is measured without having been exposed to the experimental treatment; no premeasure is taken.

Static group design
In the static group design each subject is identified as a member of either an experimental group or a control group (for example, exposed or not exposed to an advertisement). The experimental group is measured after being exposed to the experimental treatment, and the control group is measured without having been exposed to the experimental treatment:
Experimental group:  X  O1
Control group:          O2


The results of the static group design are computed by subtracting the observed results in the control group from those in the experimental group (O1 − O2). A major weakness of this design is its lack of assurance that the groups were equal on variables of interest before the experimental group received the treatment. If the groups were selected arbitrarily by the investigator, or if entry into either group was voluntary, systematic differences between the groups could invalidate the conclusions about the effect of the treatment. For example, suppose a company that manufactures rubbish compactors wishes to compare the attitudes of subjects who have used a rubbish compactor for the first time with those of subjects who have not. If entry into the groups is voluntary, we might find that the group that receives the use of a rubbish compactor might have had some reason for choosing that option (for example, atypical amounts of garbage or poor garbage-collection service). Sample attrition of experimental group members who do not like rubbish compactors might also be a source of error.

Random assignment of subjects may minimise problems with group differences. If groups are established by the experimenter rather than existing as a function of some other causation, the static group design is referred to as an after-only design with control group. On many occasions, an after-only design is the only possible option. This is particularly true when conducting use tests for new products or brands. Cautious interpretation and recognition of the design’s shortcomings may make this necessary evil quite valuable. For example, Airwick Industries conducted use tests of Carpet Fresh, a rug cleaner and room deodoriser, in the USA. Experiments with Carpet Fresh, which originally was conceived as a granular product to be sprinkled on the floor before vacuuming, indicated that people were afraid the granules would lodge under furniture. This research led to changing the texture of the product to a powdery form.

THREE BETTER EXPERIMENTAL DESIGNS

The worst job in marketing research: without a pretest it is impossible to say how effective an experimental treatment (a deodorant) has been.

In a formal, scientific sense, the three designs just discussed are not true experimental designs. Subjects for the experiments were not selected from a common pool of subjects and randomly assigned to one group or another. In the following discussion of three basic experimental designs, the symbol R to the left of the diagram indicates that the first step in a true experimental design is the randomisation of subject assignment.

Pretest–posttest control group design (before–after with control)
The pretest–posttest control group design, or before–after with control group design, is the classic experimental design:
Experimental group:  R  O1  X  O2
Control group:       R  O3     O4

As the diagram indicates, the subjects in the experimental group are tested before and after being exposed to the treatment. The control group is tested at the same two times as the experimental group, but subjects are not exposed to the experimental treatment. This design has the advantages of the before–after design with the additional advantages gained by its having a control group. The effect of the experimental treatment equals: (O2 − O1) − (O4 − O3)

pretest–posttest control group design A true experimental design in which the experimental group is tested before and after exposure to the treatment, and the control group is tested at the same two times without being exposed to the experimental treatment.


If there is brand awareness among 20 per cent of the subjects (O1 = 20 per cent, O3 = 20 per cent) before an advertising treatment and then 35 per cent awareness in the experimental group (O2 = 35 per cent) and 22 per cent awareness in the control group (O4 = 22 per cent) after exposure to the treatment, the treatment effect equals 13 per cent: (0.35 − 0.20) − (0.22 − 0.20) = (0.15) − (0.02) = 0.13 or 13 per cent
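The arithmetic of the classic design can be restated as a tiny function. The Python sketch below is simply a worked restatement of the (O2 − O1) − (O4 − O3) formula using the brand-awareness percentages from the example above; the function name is ours, not the text's.

```python
def treatment_effect(o1, o2, o3, o4):
    """Pretest-posttest control group design:
    effect = (O2 - O1) - (O4 - O3),
    i.e. the change in the experimental group minus the change in the control group."""
    return (o2 - o1) - (o4 - o3)

# Brand awareness example from the text, expressed as proportions
effect = treatment_effect(o1=0.20, o2=0.35, o3=0.20, o4=0.22)
print(f"Treatment effect: {effect:.2f}")   # 0.13, i.e. 13 per cent
```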

The effect of all extraneous variables is assumed to be the same on both the experimental and the control groups. For instance, since both groups receive the pretest, no difference between them is expected for the pretest effect. This assumption is also made for effects of other events between the before and after measurements (history), changes within the subjects that occur with the passage of time (maturation), testing effects and instrumentation effects. In reality there will be some differences in the sources of extraneous variation. Nevertheless, in most cases assuming that the effect is approximately equal for both groups is reasonable. However, a testing effect is possible when subjects are sensitised to the subject of the research. This is analogous to what occurs when people learn a new vocabulary word. Soon they discover that they notice it much more frequently in their reading. In an experiment the combination of being interviewed on a subject and receiving the experimental treatment might be a potential source of error. For example, a subject exposed to a certain advertising message in a split-cable experiment might say: ‘Ah, there is an ad about the product I was interviewed about yesterday!’ The subject may pay more attention than normal to the advertisement and be more prone to change his or her attitude than in a situation with no interactive testing effects. This weakness in the before–after with control group design can be corrected (see the next two designs).

WHAT WENT RIGHT?

MUSIC TO THEIR EARS: EXPERIMENTAL RESEARCH ON THE EFFECT OF MUSIC IN RETAILING15

Chant and Link, a market research company in Australia, was approached by a retailer who suggested that introducing music into its shops would improve customer outcomes. The client's idea was to survey customers about their reactions to the concept. The research company suggested that an experimental design would deliver far more useful insights. The research project involved pretesting customers before music was introduced, testing them after music was introduced, and testing a control group in one retail outlet where no music was introduced. The researchers looked at emotional, cognitive and behavioural responses. The researchers believed that they would be able to draw conclusions about the unconscious effect of music on customers – for example, that the presence of music appeared to result in perceptions of shorter queue waiting times.

Posttest-only control group design (after-only with control)
posttest-only control group design An after-only design in which the experimental group is tested after exposure to the treatment, and the control group is tested at the same time without having been exposed to the treatment; no premeasure is taken. Random assignment of subjects and treatment occurs.

In some situations, pretest measurements are impossible. In other situations, selection error is not anticipated to be a problem because the groups are known to be equal. The posttest-only control group design, or after-only with control group design, is diagrammed as follows:
Experimental group:  R  X  O1
Control group:       R     O2
The effect of the experimental treatment is equal to O1 − O2. Suppose the manufacturer of a tinea (athlete's foot) remedy wishes to demonstrate by experimentation that its product is better than a competing brand. No pretest measure about the effectiveness of the remedy is possible. The design is to randomly select subjects, perhaps students,


who have contracted tinea and randomly assign them to the experimental or the control group. With only the posttest measurement, the effects of testing and instrument variation are eliminated. Furthermore, researchers make the same assumptions about extraneous variables described above – that is, that they operate equally on both groups, as in the before–after with control group design.

Solomon four-group design
By combining the pretest–posttest (before–after) with control group and the posttest-only with control group designs, the Solomon four-group design provides a means for controlling the interactive testing effect as well as other sources of extraneous variation. In the following diagram the two Xs symbolise the same experimental treatment given to each experimental group:
Experimental group 1:  R  O1  X  O2
Control group 1:       R  O3     O4
Experimental group 2:  R      X  O5
Control group 2:       R         O6

Although we will not go through the calculations, it is possible to isolate the effects of the experimental treatment and interactive testing in this design. This design allows for the isolation of the various effects, but it is rarely used in marketing research because of the effort, time and cost of implementing it. However, it points out that there are ways to isolate or control most sources of variation.

Compromise designs In many instances of marketing research, true experimentation is not possible; the best the researcher can do is approximate an experimental design. Such a compromise design may fall short of the requirement to assign subjects or treatments randomly to groups. Consider the situation in which the researcher wishes to implement a pretest–posttest control group design, but the subjects cannot be assigned randomly to the experimental and the control group. Because the researcher cannot change a workplace situation, one department of an organisation is used as the experimental group and another department is used as the control group. The researcher has no assurance that the groups are equivalent; he or she has compromised because of the nature of the situation. The alternative to the compromise design when random assignment of subjects is not possible is to conduct the experiment without a control group. Generally this is considered a greater weakness than using groups that have already been established. When the experiment involves a longitudinal study, circumstances usually dictate a compromise with true experimentation.

TIME SERIES DESIGNS
Many marketing experiments can be conducted in a short period of time (from a month to less than half a year). However, a marketing experiment to investigate long-term structural change may require a time series design. When experiments are conducted over long periods of time, they are most vulnerable to history effects due to changes in population, attitudes, economic patterns and the like. Although seasonal patterns and other exogenous influences may be noted, the experimenter can do little about them when time is a major factor in the design. Hence, time series designs are quasi-experimental because they generally do not allow the researcher full control over the treatment exposure or influence of extraneous variables. Political polls provide an example. A pollster normally uses a series of surveys to track candidates' popularity. Consider the candidate who plans a major speech (the experimental

Solomon four-group design A true experimental design that combines the pretest–posttest with control group design and the posttest-only with control group design, thereby providing a means for controlling the interactive testing effect and other sources of extraneous variation.
compromise design An approximation of an experimental design, which may fall short of the requirements of random assignment of subjects or treatments to groups.
time series design An experimental design used when experiments are conducted over long periods of time. It allows researchers to distinguish between temporary and permanent changes in dependent variables.


treatment) to refocus the political campaign. The simple time series design can be diagrammed as follows: O1  O2  O3  X  O4  O5  O6

Several observations have been taken to identify trends before the treatment (X) is administered. After the treatment has been administered, several observations are made to determine if the patterns after the treatment are similar to those before. If the longitudinal pattern shifts after the political speech, the researcher may conclude that the treatment had a positive impact on the pattern. Of course, this time series design cannot give the researcher complete assurance that the treatment caused the change in the trend. Problems of internal validity are greater than in more tightly controlled before-and-after designs for experiments of shorter duration. One unique advantage of the time series design is its ability to distinguish temporary from permanent changes. Exhibit 7.2 shows some possible outcomes in a time series experiment.

EXHIBIT 7.2 → SELECTED TIME SERIES OUTCOMES (four panels plot degree of change against period (time): temporary change, permanent change, no change, and continuation of trend)

There is another problem in our political campaign example: a political conversion during August may affect the number of political conversions in September, which may influence what happens in October. In time series designs there may be carry-over effects that cannot be controlled. An improvement on the basic time series design is a time series with control group design. For example, many test markets use different geographic areas that are similar demographically as a basis for experimental control. Rarely will geographic areas be identical in any characteristic of interest; thus, control will be less than perfect.
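A rough way to operationalise the distinction between temporary and permanent change is to project the pre-treatment trend forward and see whether the post-treatment observations stay above it. The sketch below is only an illustration under assumed numbers — the poll values and the two-point tolerance are hypothetical — and it is not a substitute for a formal analysis of quasi-experimental data.

```python
import numpy as np

def classify_shift(pre, post, tolerance=2.0):
    """Fit a linear trend to the pre-treatment observations, project it forward,
    and report how the post-treatment observations compare with the projection."""
    periods_pre = np.arange(1, len(pre) + 1)
    slope, intercept = np.polyfit(periods_pre, pre, 1)        # pre-treatment trend
    periods_post = np.arange(len(pre) + 1, len(pre) + len(post) + 1)
    projected = slope * periods_post + intercept              # continuation of trend
    deviations = np.asarray(post) - projected
    if np.all(deviations > tolerance):
        return "shift sustained after treatment (possibly permanent)", deviations
    if deviations[0] > tolerance and deviations[-1] <= tolerance:
        return "initial shift that fades (temporary change)", deviations
    return "no clear shift beyond the pre-treatment trend", deviations

# Hypothetical candidate-popularity polls: O1 O2 O3 before the speech, O4 O5 O6 after
label, devs = classify_shift(pre=[40, 41, 42], post=[48, 47, 48])
print(label, np.round(devs, 1))
```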

COMPLEX EXPERIMENTAL DESIGNS Complex experimental designs are statistical designs that isolate the effects of confounding extraneous variables or allow for manipulation of more than one independent variable in the experiment. These


include completely randomised designs, randomised block designs and factorial designs; the analysis of the results of these experimental designs is also discussed in Chapter 13 on tests of differences.

COMPLETELY RANDOMISED DESIGN
The completely randomised design is an experimental design that uses a random process to assign experimental units to treatments. Randomisation of experimental units is the researcher's attempt to control all extraneous variables while manipulating a single factor: the treatment variable. Several of the experiments discussed in previous chapters are completely randomised designs. As an example, consider a posttest-only with control group experiment to examine the effects of various incentives on response rate in a mail survey. Receiving a monetary incentive or having a contribution made to a charity of their choice were the two experimental treatments chosen to provide the incentives (see Table 7.1). Use of a control group made a total of three treatments: (1) no incentive to the control group; (2) $1 personal incentive; and (3) $1 charity incentive. Suppose the sample frame is divided into three groups of 150 each (n). Assigning treatments to groups is a simple random process. Table 7.1 shows how to compare response rates (the dependent variable) of each of the three treatment groups to determine which method of increasing response was the best. In this example the donation-to-charity incentive had the greatest influence on response rate.

TABLE 7.1 » A COMPLETELY RANDOMISED DESIGN

Experimental treatment        Control: No incentive    $1 personal incentive    $1 charity incentive
Response rate                 23.3%                    26.0%                    41.3%
Number of observations (n)    150                      150                      150
Overall response rate: 136/450 = 30.2 per cent

A pretest–posttest design (before–after) with control group(s) that replicates or repeats the same treatment on different experimental units is another example of a completely randomised design. Analysis of variance (ANOVA) is a statistical technique that involves investigating the effects of one treatment variable on an interval-scaled dependent variable. ANOVA is the appropriate form of statistical analysis for a completely randomised design that replicates a previous design. This topic is discussed in Chapter 13.
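In code, a completely randomised design amounts to shuffling the test units and dealing them out to the treatments. The sketch below illustrates the mail-survey example; the address identifiers and seed are invented, and the response counts (35, 39 and 62) are back-calculated from the rates reported in Table 7.1 rather than taken from any real fieldwork.

```python
import random

def assign_treatments(units, treatments, seed=7):
    """Completely randomised design: shuffle the test units and deal them
    out to the treatments in equal-sized groups."""
    units = list(units)
    random.Random(seed).shuffle(units)           # random assignment of test units
    group_size = len(units) // len(treatments)
    return {t: units[i * group_size:(i + 1) * group_size]
            for i, t in enumerate(treatments)}

treatments = ["no incentive", "$1 personal incentive", "$1 charity incentive"]
sample_frame = [f"address_{i}" for i in range(450)]   # hypothetical sample frame
groups = assign_treatments(sample_frame, treatments)

# After fieldwork, the dependent variable (response rate) is compared across treatments.
responses = {"no incentive": 35, "$1 personal incentive": 39, "$1 charity incentive": 62}
for t in treatments:
    rate = responses[t] / len(groups[t])
    print(f"{t}: {rate:.1%} response rate")
```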

RANDOMISED BLOCK DESIGN
The randomised block design is an extension of the completely randomised design. A form of randomisation is used to control for most extraneous variation; however, when the researcher has identified a single extraneous variable that might affect test units' response to the treatment, he or she can attempt to isolate the effects of this single variable by blocking out its effects. The term 'randomised block' originated in agricultural research that applied several levels of a treatment variable to each of several blocks of land. Systematic differences in agricultural yields due to the quality of the blocks of land may be controlled in the randomised block design. In marketing research the researcher may wish to isolate block effects such as shop size, territory location, market shares of the test brand or its major competition, per capita consumption levels for a product class, city size and so on. Grouping test units into homogeneous blocks on the basis of some relevant characteristic allows researchers to separately account for one known source of extraneous variation.

completely randomised design An experimental design that uses a random process to assign subjects (test units) to treatments to investigate the effects of only one independent variable.
randomised block design An extension of the completely randomised design in which a single extraneous variable that might affect test units' response to the treatment is identified and the effects of this variable are isolated by being blocked out.

Suppose that a manufacturer of Mexican food is considering two packaging alternatives. Marketers suspect that geographic region might confound the experiment. They have identified three regions where attitudes towards Mexican food may differ. They assume that within each region the relevant attitudinal characteristics are relatively homogeneous. In a randomised block design, each block must receive every treatment level. Assigning treatments to each block is a random process. In this example the two treatments will be randomly assigned to different cities within each region. Sales results such as those in Table 7.2 might be observed. The logic behind the randomised block design is similar to that underlying the selection of a stratified sample rather than a simple random one. By isolating the block effects, one type of extraneous variation is partitioned out and a more efficient experimental design therefore results. This is because experimental error is reduced with a given sample size.

TABLE 7.2 » RANDOMISED BLOCK DESIGN

Percentage who purchase product
Treatment        South Australia         Western Australia    New Zealand            Mean for treatments
Package A        14.0% (Adelaide)        12.0% (Perth)        7.0% (Auckland)        11.0%
Package B        16.0% (Port Augusta)    15.0% (Albany)       10.0% (Christchurch)   13.6%
Mean for cities  15.0%                   13.5%                8.5%
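Blocking can be sketched the same way: within each regional block, the package treatments are randomly assigned to the available test cities, so every block receives every treatment level. The city-to-block grouping below mirrors Table 7.2; the assignment function itself is a hypothetical illustration, not a procedure described in the text.

```python
import random

def randomised_block_assignment(blocks, treatments, seed=3):
    """Randomised block design: every block receives every treatment level,
    and which unit within the block gets which treatment is decided at random."""
    rng = random.Random(seed)
    assignment = {}
    for block, units in blocks.items():
        units = list(units)
        rng.shuffle(units)                      # random assignment within the block
        assignment[block] = dict(zip(treatments, units))
    return assignment

# Test cities grouped into regional blocks, as in Table 7.2
blocks = {
    "South Australia": ["Adelaide", "Port Augusta"],
    "Western Australia": ["Perth", "Albany"],
    "New Zealand": ["Auckland", "Christchurch"],
}
print(randomised_block_assignment(blocks, treatments=["Package A", "Package B"]))
```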

FACTORIAL DESIGNS
Suppose a brand manager believes that an experiment that only manipulates a price factor is too limited because price changes have to be communicated with increased promotional support. The brand manager suggests that more than one independent variable must be incorporated into the research design. Even though the single-factor experiments considered so far may have one specific variable blocked and other confounding sources controlled, they are still limited. Factorial designs allow for the testing of the effects of two or more treatments (factors) at various levels.

Consider the experimenter who wishes to answer the following questions:
1 What is the effect of varying the number of rows (facings) in a brand's shelf display?
2 What is the effect of varying the height of the display from the floor?
3 Is the effect of varying the number of rows different if an item is near the floor as opposed to being near the top? In other words, is there an interaction between the effect of the rows and the level of the display?16

A factorial design might be used to answer these questions because it allows for the simultaneous manipulation of two or more independent variables at various levels. In this example the independent variables are the number of rows (facings) and shelf height. Increases in sales, the dependent variable, attributed to each of these variables considered separately are referred to as 'main effects'. A main effect is the influence on the dependent variable of a single independent variable. Each individual variable has a separate main effect. The effect of the combination of these two shelf policy variables is the interaction effect. A major advantage of the factorial design is its ability to measure interaction effects, which may be greater or smaller than the total of the main effects.

factorial design An experiment that investigates the interaction of two or more independent variables on a single dependent variable.
main effect The influence of a single independent variable on a dependent variable.
interaction effect The influence on a dependent variable of combinations of two or more independent variables.

To further explain the terminology of experimental designs, let's use the example of a manufacturer of toy robots who wishes to measure the effect of different prices and packaging designs on consumers' perceptions of product quality. Table 7.3 indicates three experimental treatment levels of price ($25, $30 and $35) and two levels of packaging design (red and gold). The table shows that every combination of treatment levels requires a separate experimental group. In this experiment, with three levels of price and two levels of packaging design, we have a 3 × 2 (read 'three by two') factorial design because the first factor (variable) is varied in three ways and the second factor is varied in two ways. A 3 × 2 design requires six cells, or six experimental groups (3 × 2 = 6).


TABLE 7.3 » FACTORIAL DESIGN – TOY ROBOTS

Price ($)    Package design: Red    Package design: Gold
25           Cell 1                 Cell 4
30           Cell 2                 Cell 5
35           Cell 3                 Cell 6

The number of treatments (factors) and the number of levels of each treatment identify the factorial design. A 3 × 3 design incorporates two factors, each having three levels; a 2 × 2 × 2 design has three factors, each having two levels. The treatments need not have the same number of levels; for example, a 3 × 2 × 4 factorial design is possible. The important idea is that in a factorial experiment, each treatment level is combined with every other treatment level. A 2 × 2 experiment requires four different subgroups or cells for the experiment; a 3 × 3 experiment requires nine combinations of subgroups or cells.

In addition to the advantage of investigating two or more independent variables simultaneously, factorial designs allow researchers to measure interaction effects. In a 2 × 2 experiment, the interaction is the effect produced by treatments A and B simultaneously, which cannot be accounted for by either treatment alone. If the effect of one treatment differs at various levels of the other treatment, interaction occurs.

To illustrate the value of a factorial design, suppose a researcher is comparing two magazine ads. The researcher is investigating the believability of the ads on a scale from 0 to 100 and wishes to consider the gender of the reader as another factor. The experiment has two independent variables: gender and ads. This 2 × 2 factorial experiment permits the experimenter to test three hypotheses. Two hypotheses are about the main effects: which ad is more believable and which gender more consistently tends to believe magazine advertising. However, the primary research question deals with the interaction hypothesis.

A high score indicates a highly believable ad. Table 7.4 shows that the mean believability score for both genders is 65. This suggests that there is no main gender effect, because men and women evaluate the believability of the advertisements equally. The main effect for ads indicates that ad A is more believable than ad B (70 versus 60). However, if we inspect the data and look within the levels of the factors, we find that men find ad B more believable and women find ad A more believable. This is an interaction effect because the believability score of the advertising factor differs at different values of the other independent variable, gender.

TABLE 7.4 » A 2 × 2 FACTORIAL DESIGN THAT ILLUSTRATES THE EFFECTS OF GENDER AND AD CONTENT ON BELIEVABILITY

                Ad A    Ad B    Mean for gender (main effect of gender)
Men             60      70      65
Women           80      50      65
Mean for ads    70      60
(main effect of ad)
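The main effects and the interaction in Table 7.4 can be recovered directly from the four cell means. The short Python script below simply restates that arithmetic using the believability scores from the table; the 'interaction contrast' label is an informal name for the difference between the ad effect for men and the ad effect for women.

```python
# Cell means from Table 7.4 (believability scores, 0-100)
cells = {("men", "ad A"): 60, ("men", "ad B"): 70,
         ("women", "ad A"): 80, ("women", "ad B"): 50}

mean_ad_a = (cells[("men", "ad A")] + cells[("women", "ad A")]) / 2     # 70
mean_ad_b = (cells[("men", "ad B")] + cells[("women", "ad B")]) / 2     # 60
mean_men = (cells[("men", "ad A")] + cells[("men", "ad B")]) / 2        # 65
mean_women = (cells[("women", "ad A")] + cells[("women", "ad B")]) / 2  # 65

print("Main effect of ad (A - B):", mean_ad_a - mean_ad_b)              # 10
print("Main effect of gender (men - women):", mean_men - mean_women)    # 0

# Interaction: the ad effect is not the same at each level of gender
ad_effect_for_men = cells[("men", "ad A")] - cells[("men", "ad B")]         # -10
ad_effect_for_women = cells[("women", "ad A")] - cells[("women", "ad B")]   # +30
print("Interaction contrast:", ad_effect_for_men - ad_effect_for_women)     # -40: interaction present
```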


GRAPHIC INTERACTION
Exhibit 7.3 graphs the results of the believability experiment. The line for men represents the two mean believability scores for the advertising copy for ads A and B; the other line represents the same relationship for women. When there is a difference between the slopes of the two lines, as in this case, the graph indicates interaction between the two treatment variables. The difference in the slopes means that the believability of the advertising copy depends on whether a man or a woman is reading the advertisement.

EXHIBIT 7.3 → GRAPHIC ILLUSTRATION OF INTERACTION BETWEEN GENDER AND ADVERTISING COPY (believability, on a 0–100 scale, plotted for ad A and ad B; the line for women slopes down from ad A to ad B while the line for men slopes up)
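A chart like Exhibit 7.3 can be reproduced with an ordinary plotting library. The sketch below assumes matplotlib (a tooling choice, not something specified in the text) and uses the cell means from Table 7.4; crossing, non-parallel lines signal an interaction.

```python
import matplotlib.pyplot as plt

ads = ["Ad A", "Ad B"]
men = [60, 70]       # mean believability scores for men (Table 7.4)
women = [80, 50]     # mean believability scores for women (Table 7.4)

plt.plot(ads, men, marker="o", label="Men")
plt.plot(ads, women, marker="o", label="Women")
plt.ylabel("Believability")
plt.ylim(0, 100)
plt.legend()
plt.title("Interaction between gender and advertising copy")
plt.show()   # crossing (non-parallel) lines indicate an interaction effect
```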

Step 6: Address ethical issues in experimentation
Experimental researchers address privacy, confidentiality, deception, accuracy of reporting and other ethical issues common to other research methods. The question of subjects' right to be informed, however, tends to be very prominent in experimentation. Research codes of conduct often suggest that the experimental subjects should be fully informed and receive accurate information. Yet, experimental researchers who know that demand characteristics can invalidate an experiment may not give subjects complete information about the nature and purpose of the study. Simply put, experimenters often intentionally hide the true purpose of their experiments from the subjects.

debriefing The process of providing subjects with all pertinent facts about the nature and purpose of an experiment after its completion.

Debriefing is the process of providing subjects with all the pertinent facts about the nature and purpose of the experiment after the experiment has been completed. Debriefing experimental subjects by communicating the purpose of the experiment and the researcher’s hypotheses about the nature of consumer behaviour is expected to counteract negative effects of deception, relieve stress and provide an educational experience for the subject: Proper debriefing allows the subject to save face by uncovering the truth for himself. The experimenter should begin by asking the subject if he has any questions or if he found any part of the experiment odd, confusing or disturbing. This question provides a check on the subject’s suspiciousness and effectiveness of manipulations. The experimenter continues to provide the subject cues to the deception until the subject states that he believes there was more to the experiment than met the eye. At this time the purpose and procedure of the experiment [are] revealed.16

Researchers debrief subjects when there has been clear-cut deception or when they fear subjects may have suffered psychological harm in participating in an experiment (a rarity in marketing research). However, if the researcher does not foresee potentially harmful consequences in participation, debriefing may be omitted because of time and cost considerations.


EXPLORING RESEARCH ETHICS

AN EXTREME EXAMPLE OF A LACK OF DEBRIEFING

In an experiment conducted in a natural setting, independent food merchants in a number of Dutch towns were brought together for group meetings, in the course of which they were informed that a large organisation was planning to open up a series of supermarkets in the Netherlands.17 Subjects in the high threat condition were told that there was a high probability that their town would be selected as a site for such markets and that the advent of these markets would cause a considerable drop in their business. On the advice of the executives of the shopkeepers' organisations who had helped to arrange the group meetings, the investigators did not reveal the experimental manipulations to their subjects. Commenting on this study, Herbert Kelman, a social psychologist and expert on experimental design, noted:


I have been worried about these Dutch merchants ever since I heard about this study for the first time. Did some of them go out of business in anticipation of the heavy competition? Do some of them have an anxiety reaction every time they see a bulldozer?

The chances are that they soon forgot about this threat (unless, of course, supermarkets actually did move into town) and that it became just one of the many little moments of anxiety that must occur in every shopkeeper’s life. However, do we have a right to add to life’s little anxieties and to risk the possibility of more extensive anxiety purely for the purposes of our experiments? Particularly since deception deprives the subjects of the opportunity to choose whether or not they wish to expose themselves to the risks that might be entailed.

Another issue that may – but typically does not – arise in marketing experiments is the subject’s right to safety from physical and mental harm. Most researchers believe that if the subject’s experience may be stressful or cause physical harm, the subject should receive adequate information about this aspect of the experiment before agreeing to participate.

TEST MARKETING: AN APPLICATION OF FIELD EXPERIMENTS

Test marketing refers to scientific testing and controlled experimentation rather than merely 'trying something out in the marketplace'. Just because a businessperson introduces a product in a small marketing area before doing so on a national level does not mean a test market has been conducted. Those who use this loose definition of test marketing may wonder why they succeed in their test market, but fail in their national introduction. Test marketing is an experimental procedure that provides an opportunity to test a new product or a new marketing plan under realistic market conditions to measure sales or profit potential. Cities such as Adelaide, Newcastle, Perth and Christchurch are typical settings for field experiments where a new product is distributed and marketed.

The major advantage of test marketing is that no other form of research can beat the real world when it comes to testing actual purchasing behaviour and consumer acceptance of a product. For example, the Australian Tourism Commission conducted final concept testing of its advertising campaign 'Where the bloody hell are you?' in a number of countries before launching this campaign globally in 2006.18 Although the slogan worked well in test markets, it failed to register with consumers in many countries because of the idiosyncratic use of the word 'bloody'. In 2005, New Zealand's ripeSense and Australia's J-Tech launched ripeSense, a new form of fruit packaging, in Australia. The ripeSense label changes colour, from red to orange to yellow, as the fruit ripens.

test marketing A scientific testing and controlled experimental procedure that provides an opportunity to measure sales or profit potential for a new product or to test a new marketing plan under realistic marketing conditions.


The general manager of ripeSense, Cameron McInnes, cited test market research results showing that 85 per cent of respondents would make a repeat purchase of a ripeSense pear pack as a basis for the introduction of this new product.19

The following steps should be followed in test marketing:
1 Decide whether to test market or not.
2 Work out the functions of the test market.
3 Decide on the type of test market.
4 Decide the length of the test market.
5 Decide where to conduct the test market.
6 Estimate and project the results of the test market.

ONGOING PROJECT

Step 1: Decide whether to test market or not

Test marketing is an expensive research procedure. Developing local distribution, arranging media coverage and monitoring sales results take considerable effort. It should come as no surprise that this laborious process is costly. Test marketing a packaged-goods product typically costs millions of dollars. As with other forms of marketing research, the value of the information must be compared with the costs of the research. The expense of test marketing certainly is of great concern to marketing researchers.

Making a decision to test market would be easier if there were a guarantee that a product which was successful in test marketing would succeed nationally. Unfortunately, there are a great many uncertainties and risks even with test marketing.

The appropriate time period for a test market varies depending on the research objectives. A marketing executive at Mattel says that his firm, although deeply committed to marketing research, does little test marketing of toys: 'We telescope our market research and testing into a much shorter time. There's no time to put a new toy in a test market and attempt to learn about customer reaction.'20 The marketing research manager in a firm that does no test marketing may say: 'It takes too long. We have a good idea, we move fast with it.' This attitude is not unique, and it illustrates another disadvantage of test marketing: it takes a long time to do it properly. The average test market requires approximately nine to 12 months.

WHAT WENT RIGHT?

MAZDA MX-5 GT: NO TEST MARKET REQUIRED21

Companies do not always need to test market new products. One reason is that some organisations feel that test markets delay unnecessarily what is likely to be a successful product, are expensive and may stifle innovation. For example, Mazda developed a high-performance version of its popular roadster, the MX-5, in the United Kingdom in 2012. Named the MX-5 GT, this concept car developed by British engineering specialists featured a more powerful engine, fully adjustable suspension, a track-focused aero kit and interior features made of carbon fibre to reduce weight. The MX-5 GT was developed to examine whether demand existed in the United Kingdom for a high-performance road-going version of the iconic sports car. It foreshadowed the design of the new model in later years.


When a firm must commit a substantial amount of money to investments in plant and equipment, the cost of test marketing may appear minimal compared with that of a possible product failure. Mistakes uncovered in a test market are far less costly than those made after a full national introduction. Nevertheless, test marketing is warranted only if it will save the company money in the long run.

Because of time and money, test markets are used after other forms of research have been conducted. Only products with a high probability of success are test marketed. Usually the go/no-go decision has already been made; the product has already gone through a screening process. Exploratory and other research suggests the product will have an acceptable sales volume. Thus, the test market will be used to refine the marketing mix or to evaluate a proposed marketing plan.

The major advantage of test marketing is the opportunity to conduct a trial run in the marketplace. The benefits of this trial run, however, must be weighed against the probability of potential loss or failure inherent in a national introduction. Sometimes the wrong judgement is made. For instance, Unilever thought it had a winner with Persil Power, a concentrated laundry detergent, and the new product was launched nationally in the United Kingdom without a test market.22 After the market launch, it was discovered that the detergent was so strong it damaged the clothes of many consumers. Competitor Procter & Gamble discredited the product by demonstrating that it caused clothes to tear and fade. There is always some risk in decision-making, with or without test marketing. Risk may be minimised but never avoided.

LOSS OF SECRECY

If a firm delays national introduction of a product to allow time for test marketing, a competitor may find out about the experiment and 'read' the results of the test market. The firm, therefore, runs the risk of exposing a new product or its plans to competitors. If the competitor finds the product easy to imitate, it may beat the originating company to the national marketplace. While Clorox Super Detergent with Bleach remained in the test market stage, Procter & Gamble introduced Tide with Bleach nationally. Fab 1 Shot, a pouch laundry detergent from Colgate-Palmolive, pre-empted Cheer Power Pouches by Procter & Gamble, but P&G wasn't sorry. Fab 1 Shot was not a commercial success. Although customers tried the new product, over the long run they stayed with the more traditional means of doing laundry.

WHEN NOT TO TEST MARKET

Not all product introductions are test marketed. Expensive durables, such as refrigerators, motor vehicles and forklift trucks, are rarely test marketed because of the prohibitive cost of producing a test unit. Many line extensions and me-too products that do not change consumers' usage habits are considered relatively safe bets for national or regional introductions without test marketing. For example, Swatch Olympic-themed watches were introduced without test marketing. In other cases, test marketing is used only as a last resort because a new concept might be easily imitated by competitors – secrecy is more important than research. Research studies conducted before making the decision to test market may present findings that leave no doubt in management's minds that everything is right. Steve Jobs, from Apple, famously never test marketed any of Apple's successful products such as the iPod, the iPhone and the iPad. Other considerations – such as the seasonality of the product, distribution strength or experience with the product category – may also influence the test marketing decision.

Step 2: Work out the functions of the test market

Test marketing performs two useful functions for management. First, it offers the opportunity to estimate the outcomes of alternative courses of action. Estimates can be made about the optimal advertising expenditures, the need for product sampling, or how advertising and product sampling


will interact. Researchers may be able to predict the sales effects of specific marketing variables – such as package design, price or couponing – and select the best alternative action. Test marketing permits evaluation of the proposed national marketing mix. A marketing manager for Life-Savers confectionery vividly portrays this function of experimentation in the marketplace:

A market test may be likened to an orchestra rehearsal. The violinists have adjusted their strings, the trumpeters have tested their keys, the drummer has tightened his drums. Everything is ready to go. But all these instruments have not worked in unison. So a test market is like an orchestra rehearsal where you can practice with everything together before the big public performance.23

With test marketing, not only can researchers evaluate the outcomes of alternative actions on a product's sales volume, but they can also investigate the new product's impact on other items within the firm's product line. Test marketing allows a firm to determine whether a new product will cannibalise sales from already profitable company lines.

The second useful function of test market experimentation is that it allows management to identify and correct any weaknesses in either the product or its marketing plan, before committing the company to a national sales launch – by which time it normally will be too late to incorporate product modifications and improvements. Thus, if test market results fall short of management's expectations, advertising weights, package sizes and so on may be adjusted.

For example, McDonald's test marketed pizza in the 1980s and 1990s. In its first test market it learned competitors' reactions and the problems associated with offering small, individual-portion pizzas. The product strategy was repositioned and the product testing shifted to marketing a 14-inch pizza that was not available until late afternoon. The research then focused on how consumers reacted to these pizzas sold in experimental restaurants remodelled to include 'Pizza Shoppes', in which employees assembled ingredients on ready-made dough. Ultimately, McDonald's decided that pizza for adults should not be on its menu.

Note that just because a product turns out to be a marketing failure does not mean that the test market is a failure; rather, it is a research success. Encountering problems in a local testing situation enables management to make adjustments in marketing strategy before national introduction. The managerial experience gained in test marketing, therefore, can be invaluable.

WHAT WENT RIGHT?

HEAD AND SHOULDERS ABOVE THE REST: TEST MARKET SUCCESS FOR PROCTER & GAMBLE IN CHINA24

In the test markets in China, Procter & Gamble's shampoo brand, Head & Shoulders, fared very well. Since the Chinese had low disposable incomes, P&G offered the shampoo in small packs. P&G was able to get off to a flying start, thanks to a high-quality branded product offered in the right amounts that made it affordable to the customer.

ONGOING PROJECT

Step 3: Decide on the type of test market

The discussion so far has focused on the standard method of test marketing, which requires the firm to choose test markets and then obtain distribution using its own sales force. The major advantage of this method is that there is considerable external validity: everything is just as it would be in a national introduction. However, as we already pointed out, there may be several problems (cost, lack of secrecy and so on) with a standard test market. To reduce test market costs and the probability of competitive interference, researchers have been using


controlled store tests that simulate the retail conditions that would occur if the product were distributed nationally. The control method of test marketing involves a 'minimarket test' in a small city, using control store distribution, or forced distribution. A marketing research company that specialises in test marketing performs the entire test marketing task, including the initial sale to retailers (referred to as sell-in), warehousing, distribution and shelving of the test product. The research company pays retailers for shelf space and therefore can guarantee distribution to stores that represent a predetermined percentage of the market's all-commodity volume (the total dollar sales for that product in a defined market). Because the research agency performs the warehousing function and portions of the retailer's stocking function, the retailer is more willing to cooperate with the research: selling the product will be effortless. However, this raises the question of whether the retailer would react to the product in the same fashion if the product were going through its normal distribution channels. With controlled store testing, running out of stock – a potential problem in the traditional channel of distribution – rarely occurs. The research agency monitors the sales results without the use of an outside auditing firm.

ADVANTAGES OF THE CONTROL METHOD OF TEST MARKETING

The advantages of the control method of test marketing are:
1 reduced costs
2 shorter time period needed for reading test market results
3 increased secrecy from competitors
4 no distraction of company salespeople from regular product lines.

Lower costs result from the smaller market test. Because distribution is guaranteed, no waiting period is necessary to obtain regular channels. Secrecy is increased, and monitoring the test product's movement is more difficult for competitors.

One potential problem with the controlled store test is that distribution may be abnormally high. Also, retailers' complete cooperation with promotions, such as ensuring that the product is never out of stock, may result in higher-than-normal sales. This type of study becomes more like a laboratory study in which factors are increasingly controlled. Thus, if a firm's objective is to see if it can obtain distribution for a product, a standard test market will be much more appropriate. However, when the problem is to test a specific set of alternatives and determine which is the most appropriate marketing activity, controlled store testing may be superior to the standard test market.

HIGH-TECHNOLOGY SYSTEMS USING SCANNER DATA

Several research suppliers offer test marketing systems that combine scanner-based consumer panels (discussed in Chapter 6) with high-technology broadcasting systems that allow experimentation with different advertising messages via split-cable broadcasts or other technology. These systems, sometimes called electronic test markets, enable researchers to measure the immediate impact of commercial television viewing of specific programs on unit sales volume. A household's barcoded identification number is entered into a store's computer when a household member makes a purchase. The computer links the household's item-by-item purchases with television viewing data during extensive test marketing programs. For example, Project Apollo (as discussed in Chapter 6), in which 10 000 respondents used portable People Meter scanners linking advertising and consumer spending, did provide a consumer panel in which new products, preferably fast-moving consumer goods, could have been test marketed.
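As a rough illustration of the linkage described above, the following sketch joins hypothetical household purchase records to television-viewing records on a household identifier using the pandas library; all column names and figures are assumptions for illustration, not data from any actual electronic test market:

# Link household scanner purchases to television-viewing records
# on a shared household ID, then compare exposed and unexposed households.
import pandas as pd

purchases = pd.DataFrame({
    'household_id': [101, 101, 102, 103],
    'week': [1, 2, 1, 2],
    'brand': ['TestBrand', 'TestBrand', 'Rival', 'TestBrand'],
    'units': [2, 1, 3, 1],
})

viewing = pd.DataFrame({
    'household_id': [101, 102, 103],
    'saw_test_commercial': [True, False, True],
})

linked = purchases.merge(viewing, on='household_id')

# Unit sales of the test brand for households that saw the commercial
# versus households that did not.
test_brand = linked[linked['brand'] == 'TestBrand']
print(test_brand.groupby('saw_test_commercial')['units'].sum())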


control method of test marketing A 'minimarket test' using forced distribution in a small city. Retailers are paid for shelf space so that the test marketer can be guaranteed distribution.

electronic test markets A system of test marketing that measures results based on Universal Product Code data; often scanner-based consumer panels are combined with high-technology television broadcasting systems to allow experimentation with different advertising messages via split-cable broadcasts or other technology.


SIMULATED TEST MARKETS

simulated test market A research laboratory in which the traditional shopping process is compressed into a short time span.

Marketing research program strategies often include plans for simulated test markets because managers wish to minimise the number of products that go through the lengthy and costly process of full-scale test marketing. Simulated test markets are research laboratories in which the traditional shopping process is compressed into a short time span. Consumers visit a research facility, where they are exposed to advertisements (usually as part of a television program shown in a theatre setting). They then shop in a room that resembles a supermarket aisle. Researchers estimate trial purchase rates and how frequently consumers will repurchase the product based on their simulated purchases in the experimental store. Simulated test markets almost always use a computer model of sales to produce estimates of sales volume. For example, M/A/R/C Research offers the Assessor modelling system. Simulated test markets cannot replace full-scale test marketing, but they allow researchers to make early go/no-go predictions about a product's likelihood of success. These results become significant information for determining which products ultimately will be introduced into real test markets.

Advances in computer software, especially the development of agent-based models using a program called NetLogo, also allow managers and researchers to construct simple models of consumer behaviour in a simulated marketplace. Variables such as price and product features (which can often be represented as buttons or sliders) can then be manipulated in a simulated experiment of the marketplace. One such NetLogo model, on mobile phone network choice (3G versus 4G), is shown in Exhibit 7.4.

EXHIBIT 7.4 → A NetLogo SIMULATION OF A TEST MARKET FOR MOBILE PHONE PROVIDER CHOICE

Results from this model showed that consumer tolerance (or patience), the number of 4G access points, the capacity of each access point and the price of 3G (but not of 4G) seemed to influence the happiness of consumers and the mean use of each technology (4G and 3G). Loss of customers is determined mainly by tolerance. The implications for managers are that planning for capacity is important, but access does not have to be provided to all consumers; the price of the old technology is more important than the price of the new technology; and relationship marketing is important to increase tolerance and prevent churn. When compared to actual large-scale survey results, the predictions of this model in terms of customer satisfaction have been surprisingly accurate.25

A major problem with simulated test marketing occurs when the marketer does not execute the marketing plan that was tested during the simulation. If the marketer changes the advertising copy, price or another variable, the model used to measure product acceptance will no longer be accurate.
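The general flavour of such an agent-based simulation can be conveyed in a few dozen lines of code. The sketch below is a simplified, hypothetical analogue of the model described above (written in Python rather than NetLogo); the parameter values and behavioural rules are assumptions for illustration only, not the published model:

import random

N_CONSUMERS = 1000
N_4G_ACCESS_POINTS = 20      # assumed number of 4G access points
CAPACITY_PER_POINT = 30      # assumed simultaneous users per access point
PRICE_3G, PRICE_4G = 20, 35  # assumed monthly prices
TOLERANCE = 5                # consecutive failed attempts a consumer will tolerate

class Consumer:
    def __init__(self):
        self.failures = 0
        self.happiness = 0.5
        self.churned = False

    def try_network(self, slots_4g_left):
        # Prefer 4G while capacity remains; otherwise fall back to 3G.
        if slots_4g_left > 0:
            self.happiness = min(1.0, self.happiness + 0.05)
            self.failures = 0
            return '4G', slots_4g_left - 1
        # Fallback to 3G: a cheap old technology softens the disappointment.
        self.failures += 1
        self.happiness = max(0.0, self.happiness - 0.05 * (PRICE_3G / PRICE_4G))
        if self.failures > TOLERANCE:
            self.churned = True
        return '3G', slots_4g_left

consumers = [Consumer() for _ in range(N_CONSUMERS)]
for period in range(12):                      # simulate 12 periods
    slots = N_4G_ACCESS_POINTS * CAPACITY_PER_POINT
    random.shuffle(consumers)                 # arrival order varies each period
    for c in consumers:
        if not c.churned:
            _, slots = c.try_network(slots)

active = [c for c in consumers if not c.churned]
print(f"churn rate: {1 - len(active)/N_CONSUMERS:.1%}")
print(f"mean happiness of remaining customers: "
      f"{sum(c.happiness for c in active)/len(active):.2f}")

Manipulating parameters such as CAPACITY_PER_POINT or PRICE_3G and re-running the simulation is the programmatic equivalent of moving the sliders in the NetLogo interface.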


VIRTUAL-REALITY SIMULATIONS

Advances in computer graphics and three-dimensional modelling have led to the development of virtual-reality simulated test markets. A virtual-reality simulated test market attempts to reproduce the atmosphere of an actual retail store with visually compelling images appearing on a computer screen. The computer-simulated environment allows the research subject to 'move' through a store, pause in front of a shelf or display, and inspect the various product offerings. The subject, acting as a shopper, can 'pick up' a package displayed on the screen and, using a mouse or trackball, turn

virtual-reality simulated test market An experiment that attempts to reproduce the atmosphere of an actual retail store with visually compelling images appearing on a computer screen.

the package to examine labels, lists of ingredients and other information printed on all sides of the package. Subjects can make a ‘purchase’ by clicking on an icon (often a shopping cart) on the screen. During the shopping simulation, the computer can record the amount of time the subject spends shopping for each product category and inspecting information on the package, the number of items the subject purchases, and the order in which items are purchased.

REAL WORLD SNAPSHOT

TESTING IN A VIRTUAL WORLD26

Modern technologies are offering alternatives to conventional test markets, and these come in several different forms. High-tech firms have developed web-based software that allows consumers to enter virtual worlds where they can select, among other things, what types of clothes to wear to certain events, or what kind of shoes to wear to get there. Companies such as Levi's can provide a virtual world through their website and include some new jeans designs for consumers to try out virtually. Certainly much cheaper than a conventional test market, this approach lets the company get a sense of which types of consumers are drawn to which types of jeans.

Video-game-type test markets can also be conducted. These work much like the popular simulation games in which consumers control the daily lives of virtual families. However, these simulated people are programmed to act like real consumers. Among other things, a soft-drink company can use these to estimate how vending machine usage is affected by different arrangements of machines within a given building. Pepsi and Coke have both taken advantage of virtual test markets of this type. The simulated test market has even been used as a basis for pricing and promotion budgets in North Africa, where the simulated consumers were programmed based on the brand and media preferences of several thousand actual Moroccan consumers.

Only time will tell whether these virtual test markets have accuracy approaching that of actual test markets. Certainly, the cost advantages will make them attractive to many companies. In fact, technology is advancing so that Ferrari soon will be able to virtually test market new models. Consumers should virtually stand in line for a shot at a test drive!

Virtual-reality simulated test markets have many potential uses. Of course, traditional experimental manipulations of price, package design and advertising copy can be investigated with virtual-reality simulated test markets at a much lower cost than with traditional test marketing. However, the real advantage of a virtual-reality test market is its ability to modify environments that would be extremely difficult or too expensive to modify in an actual test market. For example, managers for a chain of fast-food restaurants noticed that customers would stand at registers, staring at the menu board, and take a long time to place an order. Often this created long lines, and customers waiting in these lines became frustrated and walked away. Managers speculated that the menu boards were too extensive


and confusing. A virtual-reality simulated test market with multiple virtual menu boards was easily designed on a computer. The research findings revealed that grouping products together into meals with a small discount increased ordering speed and total order size for a significant percentage of the research subjects. This issue would have been very difficult to study with another research design.

ONLINE TEST MARKETS

An extension of online survey panels, online test markets allow marketers to test advertising copy, sales promotions and possibly purchases by allowing respondents to later purchase the products. In some respects this is similar to a controlled method of test marketing, since distribution is still

online test market An online panel used to test market new products and advertising copy. Global coverage, but respondents are usually paid.

forced. Online test markets may be suitable for products and services that are increasingly purchased online, such as music, books and airline tickets, and may provide a cheap alternative to the testing of advertising copy. Some potential problems in the use of online test markets include sample bias and the fact that respondents are usually paid. Nevertheless, this is an emerging, cheap and very fast means of conducting a test market, and responses can be gained from around the world in a number of languages.

ONGOING PROJECT

Step 4: Decide on the length of the test market

In a discussion concerning the testing of packaged-goods products, a vice-president of research for the J. Walter Thompson advertising agency made this statement:

A new product's volume typically rises rapidly to a peak, then begins to decline. The product usually will peak out in 3 to 4 months and, about a year after introduction, volume will level out at between three-fourths and one-seventh of the peak.27

This suggests that a number of people try new products, but many do not repeat their purchases. Thus, test markets should be long enough for consumers to become aware of the product and to try it more than once. Test marketing for an adequate period of time minimises potential biases due to abnormal buying patterns. For instance, a test market that is too short may overestimate sales, because typically the early triers are heavy users of the product: Time must be allowed for sales to settle down from their initial honeymoon level; in addition, the share and sales levels must be allowed to stabilise. After the introduction of a product, peaks and troughs will inevitably stem from initial customer interest and curiosity as well as from competitive product retaliation.28

The time required for test marketing also depends on the product: a package of chewing gum is consumed much sooner than a bottle of shampoo. The average length of test markets for grocery and packaged goods is 10 months. After high initial penetration and when the novelty of the product has stabilised, the researcher may make an estimate of market share.

Step 5: Decide where to conduct the test market

Selecting test markets is for the most part a sampling problem. Say that the researcher wishes to choose a sample of markets that is representative of the population of all cities and towns throughout Australia and New Zealand. The test market cities should represent the competitive situation, distribution channels, media usage patterns, product usage and other relevant factors. Of course, there


is no single ideal test market that is a perfect miniature of an entire market such as Australia or New Zealand. Nevertheless, the researcher must avoid cities that are not representative of both nations. Regional or urban differences, atypical climates, unusual ethnic compositions or different lifestyles may dramatically affect a marketing program. Researchers who wish to select representative test markets face a more complex problem because it may be necessary to use three or four cities. Cities are selected as experimental units, and one or more additional cities may be required as control markets. Thus, each of the experimental and control markets should be similar in population size, income, ethnic composition and so on. Differences in these demographic factors and other characteristics among the experimental or control markets will affect the test results. Because of the importance of having representative markets for comparisons, certain cities are used repeatedly for test market operations.

FACTORS TO CONSIDER IN TEST MARKET SELECTION

Obtaining a representative test market requires considering many factors that may not be obvious to the inexperienced researcher. As with all decisions, the objectives of the decision-makers will influence the choice of alternative. The following factors should be considered in the selection of a test market.

Population size

The population should be large enough to provide reliable, projectable results, yet small enough to ensure that costs will not be prohibitive. Sydney and Melbourne are not popular test markets for the Australian market because their size makes them unacceptable.

Demographic composition and lifestyle considerations

Ethnic backgrounds, incomes, age distributions, lifestyles and so on within the market should be representative of the nation. For example, test marketing in Western Australia may not be representative because people there tend to be quick to accept innovations that might not be adopted elsewhere.

Competitive situation

Competitive market shares, competitive advertising and distribution patterns should be typical so that test markets will represent other geographic regions. If they are not representative, it will be difficult to project the test market results to other markets. Consider the firm that test markets in one of its strongest markets. Its sales force has an easy time getting trade acceptance, but might have difficulty in a market in which the firm is weak. That will influence the acceptance level, the cost of the sell-in (obtaining initial distribution) and the ultimate results of the test market. Hence, projecting the results of the test market into weaker markets becomes difficult. Selecting an area with an unrepresentative market potential may cause innumerable problems. Firms probably should not test Bundaberg Rum in Queensland, dairy products in Japan or antihistamines in the Northern Territory.

Media coverage and efficiency

Local media (television spots, newspapers) will never exactly replicate national media. However, duplicating the national media plan or using one similar to it is important. Using newspapers' Sunday supplements as a substitute for magazine advertising does not duplicate the national plan, but may provide a rough estimate of the plan's impact. Ideally, a market should be represented by the major television networks, typical pay television programming and newspaper coverage. Some magazines have regional editions or advertising inserts.


Media isolation

Advertising in communities outside the test market may contaminate the test market. Furthermore, advertising money is wasted when it reaches consumers who cannot buy the advertised product because they live outside the test area. Regional markets and centres are highly desirable because advertising does not spill over into other areas.

Self-contained trading area

Distributors should sell primarily or exclusively in the test market area. Shipments in and out of markets from chain warehouses can produce confusing shipping figures.

Overused test markets

If consumers or retailers become aware of the tests, they will react in a manner different from their norm. Thus, it is not a good idea to rely repeatedly on a single, favourite test market. Perth, Western Australia, for example, is less likely to be used as a test market for technology products because of low mainstream use compared to other centres.

Availability of scanner data

Markets in which a high proportion of stores can supply scanner data are attractive to many test marketers, particularly for fast-moving consumer goods such as grocery items and convenience products.

REAL WORLD SNAPSHOT

AUSTRALIA AS A TEST MARKET: McDONALD’S LOADED FRIES29

In 2016, Australia became the first country to sell McDonald’s Loaded Fries – or chips with toppings. Two versions of Loaded Fries – Bacon and Cheese or Guacamole and Salsa – were test marketed; a share pack cost A$6.95, while a standard size cost A$4.10. Australia has a track record for being a test market for new ideas for McDonald’s. McCafé was first trialled in Melbourne, as well as the custom burger range Create Your Taste. Both ideas are now in the US market.

Step 6: Estimate and project test market results

The main reasons for conducting a test market are to estimate sales, attitude change, repeat-purchase behaviour and the like, and to project the results on a national level. A number of methodological factors may cause problems in estimating the results on a national level. These problems usually result from mistakes in the design or execution of the test market.

OVERATTENTION

If too much attention is paid to testing a new product, the product may be more successful than it normally would be. The advertising agency may make sure that the test markets have excellent television coverage (which may or may not be representative of the national television coverage). If salespeople are aware that a test is being conducted in their territory, they may spend unusual amounts of time making sure the product is more available or better displayed.

UNREALISTIC STORE CONDITIONS

Store conditions may be set at the level of the market leader rather than at the national level. For example, extra shelf facings, eye-level stocking and other conditions resulting from artificial distribution may be obtained in the test market. This situation may result from research design problems or overattention, as previously described. For example, if retailers are made aware that someone is paying more attention to their efforts with a given product, they may give it artificially high distribution and extra retail support.


READING THE COMPETITIVE ENVIRONMENT INCORRECTLY

Another common mistake is to assume that the competitive environment will be the same nationally as it was in the test market. If the competition is unaware of the test market, the results will not measure competitors' reactions to company strategy. Competitors' responses after a national introduction may differ substantially from what occurred in the test market. On the other side of the coin, competitors may react to a test market by attempting to undermine it. If they know that a firm is testing, they may attempt to disrupt test market results with increased promotions and lower prices for their own products.

WHAT WENT WRONG?

THE HIDDEN IN HIDDEN VALLEY RANCH

Hidden Valley Ranch (HVR) once conducted a field experiment to examine how effective three new flavours of salad dressings would be in the US marketplace. Thus, there were three levels of the experimental variable, each representing a different flavour. HVR had to produce small batches of each flavour, get them bottled and ship them to their sales representatives, who then had to stock the dressings in the participating retail stores. All of this was very expensive, and the cost to produce each bottle used in the test market was almost $20.

The first day of the test was consumed with sales reps placing the products in the salad dressing sections of retail stores. The second day, each rep went back to each store to record the number of sales for each flavour. By the third day, all of the bottles of all flavours had sold. Amazing! Was every flavour a huge success?

Actually, one of HVR's competitors had sent their sales reps around, beginning on the second day of the test, to buy every bottle of the new HVR dressings in every store it had been placed in. Thus, HVR was unable to produce any valid sales data (the dependent variable), and the competitor was able to break down the dressing in their labs and determine the recipe. This illustrates one risk that comes along with field tests. Once a product is available for sale, there are no secrets. Also, you risk espionage of this type, which can render the experiment invalid.

INCORRECT VOLUME FORECASTS

In the typical test market, unit sales volume or market share is the focus of attention. Shipments, warehouse withdrawals or store scanner data may be the major basis for projecting sales. Forecasted volume for test markets should be adjusted to reflect test distribution levels, measurement problems with store data, and other differences between test markets and national markets. Initial penetration, if projected directly, may overstate the situation. Many consumers who make a trial purchase may not repurchase the product. Researchers must be concerned with repurchase rates as well as with initial trial purchases. Supplementing retail store scanner data with purchase diaries and panel data will help to indicate what sales volume will be over time.

[Figure: WOW! sales curves in the Indianapolis test, periods 22/3/97 to 24/1/98. Unit sales (10 000 to 110 000 units) by brand – Lay's Wow!, Ruffles Wow!, Doritos Nacho Cheese Wow! and Doritos Wow! – all sizes, in supermarkets with sales of $2 million-plus. Source: ACNielsen Corp.]

TIME LAPSE

One relatively uncontrolled problem results from the time lapse between the test market experiment and the national introduction of the product. If the time period between the national introduction and the test market is a year or more (which is not unusual), the time difference can have an important effect on consumers' receptivity to the product. In the late 1990s, Frito-Lay test marketed several brands of WOW! Chips in Indianapolis, USA, before the product was launched nationally.30 These sales curves illustrate why test markets often last a year or more. Initial trial purchases often do not reflect repeat-purchase rates. Marketers with substitutable items in a line (for example, Frito-Lay's Baked Lays) must also look for possible cannibalisation.

PROJECTING TEST MARKET RESULTS

Consumer surveys

In addition to actual sales data, most test marketers use consumer survey data. These help to measure levels of change in consumer awareness of and attitudes towards the product and rates of purchasing and repeat purchasing. Frequently this information is acquired via consumer panels. Sales-oriented measures, such as incidences of purchase and customer satisfaction ratings, are used to project sales volume nationwide.

Straight trend projections

Sales can be identified and the market share for the test area calculated. The simplest method of projecting test market results involves straight trend projections. Suppose the market share is 3.5 per cent in the test market region. A straight-line projection assumes that the market share nationwide will be 3.5 per cent. Rarely will every market be identical to the test market, but this gross prediction may indicate whether the product has a viable marketing mix.

Ratio of test product sales to total company sales

A measure of the company's competitive strength in the test market region might be used as a basis for adjusting test market results. Calculating a ratio of test product sales to total company sales in the area may provide a benchmark for modifying projections into other markets.

Market penetration × repeat-purchase rate

To calculate market share for products that are subject to repeat purchases, the following formula is used: market penetration (trial buyers) × repeat-purchase rate = market share. For example, suppose a product is tried by 30 per cent of the population and the repeat-purchase rate is 25 per cent. Market share will then be 7.5 per cent (30 per cent × 25 per cent = 7.5 per cent).

repeat-purchase rate The percentage of purchasers who make a second or repeat purchase.

market penetration The percentage of potential customers who make at least one trial purchase.

The repeat-purchase rate must be obtained from longitudinal research that establishes some form of historical record. Traditionally, a consumer panel has been necessary for recording purchases over time. Thus, panel data may indicate a cumulative product class buying rate, or market penetration, in the early weeks of the test market. As the test market continues, repeat purchases from these buyers can be recorded until the number of trial purchases has levelled off. Exhibit 7.5 indicates typical purchase and repurchase patterns for a new product in a test market.
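The projection arithmetic in this section is simple enough to express in a few lines. The sketch below (not part of the text) uses the illustrative figures given above – a 3.5 per cent test market share, 30 per cent penetration and a 25 per cent repeat-purchase rate; the function names are assumptions for illustration:

# Two of the projection methods described above, as plain functions.

def straight_trend(test_market_share: float) -> float:
    """Straight-line projection: assume the national share equals the test share."""
    return test_market_share

def penetration_times_repeat(market_penetration: float, repeat_rate: float) -> float:
    """Market share = market penetration (trial buyers) x repeat-purchase rate."""
    return market_penetration * repeat_rate

print(f"{straight_trend(0.035):.1%}")                  # 3.5% test share projected nationally
print(f"{penetration_times_repeat(0.30, 0.25):.1%}")   # 30% trial x 25% repeat = 7.5% share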

EXHIBIT 7.5 → NEW-PRODUCT TRIAL PURCHASE CURVE AND REPEAT-PURCHASE CURVE

[Figure: (a) Market penetration curve – percentage of potential customers making at least one trial purchase, plotted over time; (b) Repeat-purchase curve – percentage of purchasers making repeat purchases, plotted over time.]

SUMMARY

UNDERSTAND THE STEPS IN EXPERIMENTAL RESEARCH

Experimental research allows the investigator to control the research situation to evaluate causal relationships among variables. In an experiment, one variable (the independent variable) is manipulated to determine its effect on another (the dependent variable). The alternative manipulations of the independent variable are referred to as experimental treatments. When conducting experimental research six steps should be followed.

ADDRESS ISSUES OF VALIDITY IN EXPERIMENTS

Researchers can control extraneous variables by eliminating them or by holding them constant for all treatments. Some extraneous error may arise from the order of presentation. This can be controlled by counterbalancing the order. Blinding can be used by keeping subjects ignorant of the treatment they are receiving. Sometimes the blinding is extended to the person who administers the experimental treatment. Finally, random assignment is an attempt to control extraneous variables by chance.

Experiments are judged by two measures of validity. One is internal validity – whether the independent variable was the sole cause of the change in the dependent variable. Six types of extraneous variables may jeopardise internal validity: history, maturation, testing, instrumentation, selection and mortality. The second type of validity is external validity – the extent to which the results are applicable to the real world. Field experiments are lower than laboratory experiments on internal validity, but higher on external validity. Other errors may arise from using non-representative populations (for example, university students) as sources of samples, or from sample mortality or attrition (the withdrawal of subjects from the experiment before it is completed). In addition, marketing experiments often involve extraneous variables that may affect dependent variables and obscure the effects of independent variables. Experiments may also be affected by demand characteristics when experimenters inadvertently give cues about the desired responses. The guinea pig effect occurs when subjects modify their behaviour because they wish to cooperate with an experimenter.

DECIDE ON A FIELD OR LABORATORY EXPERIMENTAL DESIGN

Two main types of marketing experiments are field experiments conducted in natural environments (such as test markets) and laboratory experiments conducted in artificial settings contrived for specific purposes. Field experiments have low internal validity and high external validity. The reverse is true for laboratory experiments.

DECIDE ON THE CHOICE OF INDEPENDENT AND DEPENDENT VARIABLE(S)

The choice of dependent variable is crucial because this determines the kind of answer given to the research problem. In some situations deciding on an appropriate operational measure of the dependent variable is difficult.

SELECT AND DESIGN THE TEST UNITS

For experiments, random sampling error is especially associated with selection of subjects and their assignment to the treatments. The best way to overcome this problem is by random assignment of subjects to groups and of groups to treatments.


SELECT AND IMPLEMENT AN EXPERIMENTAL DESIGN

Experimental designs fall into two groups. A basic design manipulates only one variable. A complex design isolates the effects of extraneous variables or uses more than one treatment (independent variable). Poor basic designs include the one-shot design, the one-group pretest–posttest design and the static group design. Better basic designs include the pretest–posttest control group design, the posttest-only control group design and the Solomon four-group design. Time series designs are used when experiments are conducted over long periods. They allow researchers to distinguish between temporary and permanent changes in dependent variables. There are also various complex experimental designs that are used in marketing research. These include the completely randomised design, the randomised block design and various factorial designs. Factorial designs allow for investigation of interaction effects among variables.

DECIDE ON THE TYPE OF TEST MARKET

An alternative to standard test marketing, which uses a company's own sales force and normal distribution channels, is the control method. With this method, a marketing research service handles distribution to particular stores in a small city with controlled stocking and shelf facings. This type of test has the characteristics of a laboratory situation as well as a real-world field experiment. Electronic test markets use scanner-based systems to immediately reveal the impact of experimental manipulations. Simulated test markets are research laboratories that compress the shopping process into a short time span. Virtual-reality simulated test markets attempt to reproduce the atmosphere of an actual retail store with visually compelling images appearing on a computer screen. An extension of online survey panels, online test markets allow marketers to test advertising copy, sales promotions and possibly purchases by allowing respondents to later purchase the products. This is an emerging, cheap and very fast means of conducting a test market, and responses can be gained from around the world in a number of languages. There still remain problems of sample bias in that respondents are usually paid for their views. Controlled test markets, while cheaper and faster than a standard test market, are not as realistic, although there is likely to be a greater amount of secrecy in a controlled rather than a standard test market.

ADDRESS ISSUES OF ETHICS IN EXPERIMENTATION

Experiments can involve disguise and the choice of beneficial treatments (for example, a new medical treatment or a price discount) or harmful ones (for example, paying higher bank fees). In these cases it is important that respondents are informed at the end of the experiment what the purpose of the study was and are given the option of receiving a favourable experimental treatment and/or compensation for a 'harmful' treatment. Debriefing respondents at the end of an experiment is therefore crucial.

UNDERSTAND THE STEPS IN TEST MARKETING

Test marketing is an experimental procedure that provides an opportunity to test a new product or marketing plan under realistic conditions to obtain a measure of sales or profit potential. Its major advantage as a research tool is that it closely approximates reality. There are six major steps in test marketing.

DECIDE WHETHER TO TEST MARKET OR NOT

Test marketing is an expensive research procedure; the value of the information gained from test marketing must be compared with the cost. It can also expose the new product to competitive reaction before the product is introduced. Test marketing generally occurs late in the product development process when a high probability of success is predicted.

WORK OUT THE FUNCTION OF THE TEST MARKET

Test marketing provides the opportunity to estimate the outcomes of alternative courses of action. It also allows marketers to identify and correct weaknesses in the product or its marketing plan before a full-scale sales launch. Test market failures can be research successes if they point out the need for such adjustments.

DECIDE THE LENGTH OF THE TEST MARKET

The test market should allow enough time for consumers to use up the product and make a repeat purchase if they choose to do so. Too short a test period may overstate potential sales because many initial buyers may not repeat their purchases.

DECIDE WHERE TO CONDUCT THE TEST MARKET

Selecting test markets is a sampling problem. The researcher wishes to use sample markets that are representative of the whole population. Several factors are important in test market selection, including population size, demographics, lifestyles, competitive situation, media coverage and efficiency, media isolation and a self-contained trading area. Researchers should also avoid overuse of particular markets, as people who are aware that a test is going on may alter their purchase behaviour patterns.

ESTIMATE AND PROJECT THE RESULTS OF THE TEST MARKET

One objective of test marketing is to estimate the sales that eventually may be expected. Several problems must be overcome. One is overattention to the product or test, which can result in an overly high estimate of sales. A related problem is unrealistic store conditions. Also, competitive conditions in national introductions may be very different from those in a test market. Incorrect volume forecasts and time lapses between test and introduction may confound the results. Consumer surveys may be used to supplement the data gained in the test market. Projections may be straight trends, adjusted for the company's competitive strength, or adjusted for market penetration and repeat-purchase rate.

KEY TERMS AND CONCEPTS

basic experimental design, blinding, cohort effect, completely randomised design, compromise design, constancy of conditions, constant error, control group, control method of test marketing, controlled store test, debriefing, demand characteristics, dependent variable, double-blind design, electronic test markets, experiment, experimental group, experimental treatments, external validity, factorial design, field experiment, guinea pig effect, Hawthorne effect, history effect, independent variable, instrumentation effect, interaction effect, internal validity, laboratory experiment, main effect, market penetration, matching, maturation effect, mortality (sample attrition) effect, one-group pretest–posttest design, one-shot design, online test market, posttest-only control group design, pretest–posttest control group design, quasi-experimental design, random sampling error, randomisation, randomised block design, repeated measures, repeat-purchase rate, selection effect, simulated test market, Solomon four-group design, static group design, tachistoscope, test marketing, test units, testing effect, time series design, virtual-reality simulated test market

QUESTIONS FOR REVIEW AND CRITICAL THINKING

1 Read the opening vignette of this chapter. What are some potential problems in using an online panel as a test market?
2 Name some independent and dependent variables that a marketing manager would be interested in including in an experiment on beer tasting.
3 In a test of a new coffee, three styrofoam cups – labelled A, B and C – are placed before subjects. The subjects are instructed to taste the coffee from each cup. What problems might arise in this situation?
4 What are demand characteristics? Give some examples.
5 What is the difference between a main effect and an interaction in an experiment? In Question 3, what will create a main effect? Is interaction possible?
6 Why is an experimental confound so damaging to the conclusions drawn from an experiment?
7 Name the type of experiment described in each of the following situations. Evaluate the strengths and weaknesses of each design.
   a A major petroleum corporation is considering phasing out its premium unleaded petrol. It selects Queenstown, New Zealand, as an experimental market in which the product might be eliminated and decides to watch product line sales results.
   b A soft-drink manufacturer puts the same brand of orange drink into two different containers with different designs. Two groups are given a package and asked about the drink's taste. A third group is given the orange drink in an unlabelled package and asked the same question.
   c An advertising agency pretests a television advertisement with a portable television set, simulating an actual television program with the test advertisement inserted along with other advertisements. This program is shown to a focus group, and a group discussion follows.
   d A manufacturer of a new brand of cat food tests product sampling with a trial-size package versus no sampling and three price levels simultaneously to determine the best market penetration strategy.
8 Provide an example for each of the six major factors that influence internal validity.
9 Name the type of experiment described in each of the following situations. Evaluate the strengths and weaknesses of each design.
   a A major fast-food corporation is considering a drug-testing program for its counter workers. It selects its largest outlet in Sydney, implements the program and measures the impact on productivity.
   b A mass merchandiser conducts an experiment to determine whether a flexible work-time program (allowing employees to choose their own work hours between 6 a.m. and 7 p.m.) is better than the traditional working hours (9 a.m. to 5 p.m.) for sales personnel. Each employee in the Brisbane office is asked if he or she would like to be in the experimental or the control group. All employees in the Perth office remain on the traditional schedule.


10 Evaluate the ethical and research design implications of the following study.
   – Sixty-six willing Australian drinkers helped a Federal Court judge decide that Tooheys didn't engage in misleading or deceptive advertising for its 2.2 beer. The beer contains 2.2 per cent alcohol, compared with 6 per cent for other beers.
   – The volunteers were invited to a marathon drinking session after the Aboriginal Legal Service claimed Tooheys advertising implied beer drinkers could imbibe as much 2.2 as desired without becoming legally intoxicated. Drink-driving laws prohibit anyone with a blood-alcohol level above 0.05 from getting behind the wheel.
   – But the task wasn't easy; nor was it all fun. Some couldn't manage to drink in one hour the required 10 'middies', an Australian term for a beer glass of 10 fluid ounces.
   – Thirty-six participants could manage only nine glasses. Four threw up and were excluded; another two couldn't manage the 'minimum' nine glasses and had to be replaced.
   – Justice Beaumont observed that consuming enough 2.2 in an hour to reach the 0.05 level was 'uncomfortable and therefore an unlikely process'. Because none of the ads mentioned such extreme quantities, he ruled they couldn't be found misleading or deceptive.
11 A night-time cough relief formula contains alcohol. An alternative formulation contains no alcohol. During the experiment the subjects are asked to try the product in their homes. Alternative formulations are randomly assigned to subjects. No mention of alcohol is given in the instructions to subjects. Is this ethical?
12 In a 2 × 2 factorial design, there are eight possible patterns of effects. Assume that independent variable A and independent variable B have significant main effects, but there is no interaction between them. Another combination might be no effect of variable A, but a significant effect of variable B with a significant interaction effect between them. Create a diagram of each of these eight possible effects.

13 Which of the following products or marketing strategies are likely to be test marketed? Why or why not?
a a computerised robot lawn mower
b a line of 80 g servings of vegetarian dishes for aged consumers
c a forklift truck
d a new brand of eye drops especially for blue-eyed people
e a new, heavy-duty dishwashing powder
f an advertising campaign to get people to drink an energy drink in the morning
14 What measures should be used to project the results of test markets for the following?
a a new confectionery bar
b a new solar-powered radio built into sunglasses
c a toothpaste's new advertising campaign
15 How long should the test market periods last for the products in Question 14?
16 What is the difference between standard test marketing and controlled test marketing? What are the advantages and disadvantages of each?
17 How are the results of a test market projected to a national level? What problems arise in making such projections? How can projections be adjusted to be more accurate?
18 What factors are important in selecting test markets?
19 What advantages does simulated test marketing have over traditional test marketing? What limitations does it have?
20 A mouthwash manufacturer learns that a competitor is test marketing a new lemon-flavoured mouthwash in Brisbane. The marketing research department of the competing firm is told to read the results of the test market, and the marketing manager is told to lower the price of the company's brand to disrupt the test market. Is this ethical?

ONGOING PROJECT DOING AN EXPERIMENTAL STUDY? CONSULT THE CHAPTER 7 PROJECT WORKSHEET FOR HELP

This project worksheet examines both the use of experiments in marketing research and test markets. Decisions to be made with either design are shown in the Chapter 7 project worksheet, which can be downloaded from the CourseMate website. Make sure you have identified clearly the independent variables (experimental treatments), the dependent variables (outcomes) and how realistic you want the experiment or test market to be.


WRITTEN CASE STUDY 7.1 TEST MARKETING GUINNESS IN PERTH, WESTERN AUSTRALIA31
Anyone visiting Perth in the last few years may have found themselves being used as a guinea pig by one of the world's biggest brewers. Perth and some pubs in the south-west region of Western Australia, along with Northern Ireland, Malaysia and parts of the USA, were selected as test markets for Guinness Black Lager. Western Australia was chosen as the test market over other states because it has the highest percentage of premium beer sales in Australia – 33 per cent. It's about 22 per cent everywhere else. The new Guinness brand, Black Lager, is only sold in a bottle as a dark lager with 4.5 per cent alcohol. The company has made several attempts to expand the Guinness brand over the years without much success, because no matter what it does the name will forever be associated with its iconic dry stout, a beer that is an acquired taste with a crusty image. Whether new drinkers can be lured out of the mindset that if it is a black beer and its name is Guinness, it must be a heavy, bitter-tasting stout remains to be seen.

QUESTIONS

1 Why was Perth, Western Australia, chosen as the test market?
2 How realistic do you think this test market is?
3 How accurate do you think test market results are in predicting sales? Why?

WRITTEN CASE STUDY 7.2 KELLOGG AUSTRALIA REVAMPS NUTRI-GRAIN32
Kellogg Australia has announced that Nutri-Grain is changing for the better, with the introduction of a new and improved recipe. Kellogg Australia's Director of Innovation Tamara Howe said the team has been developing the new recipe for the past 18 months, and the challenge was to perfect the balance of taste and nutrition that Nutri-Grain fans are so passionate about. The revamped recipe sees Nutri-Grain Original receive a 4 Health Star Rating under the Government's Health Star Rating System. In the lead-up to the release, Kellogg Australia shared the new cereal with Nutri-Grain lovers and did extensive product testing to ensure it remained true to the classic Nutri-Grain crunch and flavour.

Almost 70 per cent thought the new food was better than they expected, while more than 90 per cent said they liked the taste.

QUESTIONS

1 What are some potential issues that may affect the results of the product test?
2 What type of experimental design would you recommend in this case?
3 What improvements would you suggest for the experiment?

ONGOING CASE STUDY MOBILE PHONE SWITCHING AND BILL SHOCK
After extensive qualitative and survey research, AusBargain, along with Sydney academics David, Leanne and Steve, felt confident enough to introduce a new product onto the mobile phone provider market. They are ready to launch a SIM card that gives access to a number of inexpensive providers in Europe, so that consumers can use this card when travelling overseas and not pay high roaming costs. As a start-up and small-to-medium enterprise, AusBargain has neither the money nor the time for a traditional test market, but would like some sort of market testing to be done.

QUESTIONS

1 What kind of test market would be suitable in this case?
2 Provide a design for this test market. What are some potential advantages and disadvantages of this approach?


NOTES
1 B&T Magazine (2015) 'Play Market Research launches pop up product community', 1 December, accessed at http://www.bandt.com.au.ezproxy.csu.edu.au/marketing/playmarket-research-launches-pop-up-product-testing-community, on 6 December 2015.
2 Allison, R. I. & Uhl, K. P. (1964) 'Influence of beer brand identification on taste perception', Journal of Marketing Research, 36–39.
3 Wansink, B. & van Ittersum, K. (2013) 'Portion size me: Plate-size induced consumption norms and win-win solutions for reducing food intake and waste', Journal of Experimental Psychology: Applied, 19(4), 320–332.
4 Ellingstad, Vernon & Heimstra, Norman (1974) Methods in the study of human behaviour, Monterey, CA: Brooks/Cole, pp. 61–2.
5 Sources: Committee on Identification of Research Needs Relating to Potential Biological or Adverse Health Effects of Wireless Communication Devices, National Research Council (2008) 'Identification of research needs relating to potential biological or adverse health effects of wireless communication devices', National Academies Press; Long, J., Tomak, K. & Whinston, A. B. (2003) 'Calling all customers', Marketing News, 20 January, p. 18; Grapentine, T. (2005) 'Don't cell yourself short', Marketing Research, 17 (Fall), p. 5.
6 Palmer, C. (2013) 'Coffee greenwashing works: Study', 5 December, accessed at http://theconversation.com/coffee-greenwashing-works-study-21119 on 24 July 2016.
7 News.com.au (2009) 'Kraft dumps Vegemite i-Snack 2.0', 30 September, accessed at http://www.news.com.au/finance/business/kraft-dumps-vegemite-isnack20/storye6frfkur-1225781284178, on 12 March 2016.
8 Anderson, Barry (1971) The psychological experiment: An introduction to the scientific method, Belmont, CA: Brooks/Cole, p. 28.
9 Anderson, Barry (1971) The psychological experiment: An introduction to the scientific method, Belmont, CA: Brooks/Cole.
10 Kelman, Herbert (1967) 'Human use of human subjects: The problem of deception in social psychological experiments', Psychological Bulletin, January, pp. 1–11.
11 Roethlisberger, J. & Dickson, W. (1939) Management and the worker, Cambridge, MA: Harvard University Press.
12 Campbell, Donald & Stanley, Julian (1963) Experimental and quasi-experimental designs for research, Chicago: Rand McNally, pp. 5–9.
13 D'Alessandro, Steven & Pecotich, Anthony (2013) 'Evaluation of wine by expert and novice consumers in the presence of variations in quality, brand and country of origin cues', Food Quality and Preference, 28(1): 287–303.
14 Iacobucci, Dawn, Calder, Bobby J., Malthouse, Edward & Duhachek, Adam (2002) 'Did you hear?', Marketing Health Services, 22(2), pp. 16–20.
15 Research News (2005) 'Pop culture researchers focus on what people do, not what they say', October, accessed at www.amsrs.com.au on 22 April 2006.
16 Based on: National Public Radio (1983) All things considered, 8 February; Soloman, Paul & Friedman, Thomas (1983) Life and death on the corporate battlefield, New York: Simon & Schuster, pp. 26–7, 352.
17 Kelman, Herbert (1967) 'Human use of human subjects: The problem of deception in social psychological experiments', Psychological Bulletin, January, pp. 1–11.
18 Research News (2006) 'Bloody hell, does research work', Research News, April, pp. 12–13.
19 Food Australia (2005) 'Packaging senses pear ripeness', July, p. 288.
20 Harris, Timothy (1987) 'Marketing research passes toy marketer test', Advertising Age, 24 August, pp. 5–8.
21 Motor (2012) 'Race Ready MX-5', distributed by Contify.com, accessed at http://www.factiva.com on 25 January 2013.
22 Koranteng, Juliana (1998) 'Unilever rolls out Persil tablets in Europe soap derby', Advertising Age, 4 May, p. 52.
23 Market testing consumer products (1967) National Industrial Conference Board, New York, p. 13.
24 Vedpuriswar, V. (2004) 'Delivering value at low price', The Hindu Business Line, p. 2.
25 D'Alessandro, Steven, Johnson, Lester, Gray, David & Carter, Leanne (2015) 'Consumer satisfaction versus churn in the case of upgrades from 3G to 4G cell networks', Marketing Letters, 26(4): 489–500.
26 Sources: Elkin, Tobi (2003) 'Virtual test markets', Advertising Age, 74(43), 27 October, p. 6; Fass, Allison (2005) 'Game theory', Forbes, 176, 14 November, pp. 93–9; Computer Workstations (2008) 'Moog selected to provide driving simulator for Ferrari', 21 July, pp. 4–5.
27 Hull, Barry (1979) 'Not all products deserve market testing', Marketing News, 20 April, p. 8.
28 Hull, Barry (1979) 'Not all products deserve market testing', Marketing News, 20 April, p. 8.
29 Lieu, Johnny (2015) 'McDonald's loaded fries with cheese and guacamole land in Australia', Mashable Australia, 4 December, accessed on 8 December 2015 at http://mashable.com/2015/12/03/taste-mcdonalds-loaded-fries/?utm_campaign=MashProd-RSS-Feedburner-All-Partial&utm_cid=Mash-Prod-RSS-Feedburner-AllPartial#IwfOTeenz5qG.
30 Pollack, Judann (1998) 'Frito claims success …', Advertising Age, 20 July, p. 30.
31 Gibson, Roy (2011) 'Guinness black lager', The Courier Mail, 4 February, p. 45.
32 Girl.com.au (2015) 'Kellogg revamps Nutri-Grain', 2 December, accessed at http://www.girl.com.au.ezproxy.csu.edu.au/kellogg-revamps-nutri-grain.htm, on 8 December 2015.

08 » WHAT YOU WILL LEARN IN THIS CHAPTER

» To determine what is to be measured.
» To determine how it is to be measured.
» To apply a rule of measurement.
» To determine if the measure consists of a number of measures.
» To determine the type of attitude and the scale to be used to measure it.
» To evaluate the measure.

MEASUREMENT

What is emma?

Australia's newest cross-platform audience insights survey is called emma (Enhanced Media Metrics Australia). Its website (see https://emma.com.au/what-is-emma) states: The new measurement survey has been developed for The Readership Works by independent research company Ipsos MediaCT, global leaders in local audience measurement. Ipsos conducts national audience surveys and is the official measurement system in over 40 countries including the UK, Italy and France.1

The measurement of readership by emma is 90 per cent CAWI (computer-aided web interview) and 10 per cent CAPI (computer-aided personal interview), and it measures cross-platform readership not only of print but also of online, smartphone and tablet editions. It uses a sample of 54 000 respondents in Australia. The results of the latest survey of readership, conducted in 2015, showed that a large number of Australians still read newspapers. The Sydney Morning Herald proved to be Australia's best-read publication and increased its readership to 5.37 million readers, using figures tracking readership across mobile, tablets, online and print.2 This was followed by News Corp's Melbourne tabloid, the Herald Sun, which had 4.35 million. The Daily Telegraph, another News Corp publication, was third with 4.23 million. These are the survey's only publications with more than 4 million readers. The Australian Financial Review, also published by Fairfax Media, was steady at 1.3 million readers, while News Corp broadsheet The Australian had slipped slightly to 3.14 million.

This chapter discusses the important issue of measurement. Organisations need measurements so that they can monitor performance and track consumer satisfaction. Measurement is also a means of simplifying the description of a market (in this case, the readership of each publication is summarised in a single figure). Like most things in market research, measurement is not an exact science, partly because many consumer measurements such as attitudes, motivation, personality and intention are not tangible aspects but internalised conditions, which the researcher tries to measure in concrete terms. This means that an ideal measurement is almost impossible, although we can say that some measurements are better than others. Measurement in market research can be improved by following a systematic process, which is described next.


THE MEASUREMENT PROCESS
There are six steps in the measurement process, each of which is addressed in detail throughout this chapter:
1 Determine what is to be measured.
2 Determine how it is to be measured.
3 Apply a rule of measurement.
4 Determine if the measure consists of a number of measures.
5 Determine the type of attitude and scale to be used to measure it.
6 Evaluate the measure.


STEP 1: DETERMINE WHAT IS TO BE MEASURED
An object, such as the edge of your textbook, can be measured with either side of a ruler (see Exhibit 8.1). Note that one side has inches and the other has centimetres. Thus, the scale of measurement varies depending on whether the metric side or the imperial side is used. Many measurement problems in marketing research are similar to this ruler with its alternative scales of measurement. Unfortunately, unlike the two edges of the ruler, many measurement scales used in marketing research are not directly comparable.

EXHIBIT 8.1 → A TWO-SIDED RULER THAT OFFERS ALTERNATIVE SCALES OF MEASUREMENT

The first question the researcher must answer is: 'What is to be measured?' This is not as simple a question as it may at first seem. The definition of the problem, based on exploratory research or managerial judgement, indicates the concept to be investigated (for example, sales performance). However, a precise definition of the concept may require a description of how it will be measured – and frequently there is more than one way to measure a particular concept. For example, if we are conducting research to determine which factors influence a sales representative's performance, we might use a number of measures to indicate a salesperson's success, such as dollar or unit sales volume or share of accounts lost. Furthermore, true measurement of concepts requires a process of precisely assigning scores or numbers to the attributes of people or objects. The purpose of assigning numbers is to convey information about the variable being measured. Hence, the key question becomes: 'On what basis will numbers or scores be assigned to the concept?' Suppose the task is to measure the height of a boy named Michael. There are a number of ways to do this.
1 We can create five categories:
a quite tall for his age
b moderately tall for his age


c about average for his age
d moderately short for his age
e quite short for his age.
Then we can measure Michael by saying that, because he is moderately tall for his age, his height measurement is 2.
2 We can compare Michael to 10 other neighbourhood children. We give the tallest child the rank of 1 and the shortest the rank of 11; using this procedure, Michael's height measurement is 4 if he is fourth tallest among the 11 neighbourhood children.
3 We can use some conventional measuring unit such as centimetres and, measuring to the nearest centimetre, designate Michael's height as 137.
4 We can define two categories:
a a nice height
b a not-so-nice height.
By our personal standard, Michael's height is a nice height, so his height measurement is 1.
In each measuring situation, a score has been assigned for Michael's height (2, 4, 137 and 1). In scientific marketing research, however, precision is the goal. These various scores have differing precision. The researcher must determine the best way to measure what is to be investigated. On university campuses, girl- or boy-watching constitutes a measurement activity: what might be a 7 to one person may be a 9 to another. Precise measurement in marketing research requires a careful conceptual definition, an operational definition and a system of consistent rules for assigning numbers or scores.

Concepts
Before the measurement process can occur, a marketing researcher must identify the concepts relevant to the problem. A concept (or construct) is a generalised idea about a class of objects, attributes, occurrences or processes. Concepts such as age, gender and number of children are relatively concrete properties, and they present few problems in definition or measurement. Other characteristics of individuals or properties of objects may be more abstract. Concepts such as brand loyalty, personality, channel power and happiness are more difficult to define and measure. For example, brand loyalty has been measured using the percentage of a person's purchases going to one brand in a given period of time, sequences of brand purchases, number of different brands purchased, amount of brand deliberation and various cognitive measures, such as attitude towards a brand. Take, for example, the measurement of social entrepreneurship. This is defined in some studies as: Initiatives that are associated with aspects of innovation and modes of earned income generation by nongovernmental organization.3

A category such as social entrepreneurs, on the other hand, is defined more broadly as: Nonprofit executives who pay attention to market forces without losing sight of their organizations’ underlying missions and seek to use the language and skills of the business world to advance the material wellbeing of their members or clients.4

As these definitions of constructs determine what is to be measured, the researcher must carefully consult the literature and find agreement wherever possible as to what is being measured. They may well have to justify the selection of their own definition of a construct if a different definition of the measure exists in the peer-reviewed literature.

concept A generalised idea about a class of objects, attributes, occurrences or processes.


WHAT WENT RIGHT?

MAKING DIETING EASIER WITH SIMPLE MEASURES: WEIGHT WATCHERS AND THE USE OF PROPOINTS

Rather than ask consumers to work out calories, kilojoules and percentages of fats and proteins when dieting, Weight Watchers converts information about food intake, along with exercise, into 'ProPoints' and then provides consumers with a daily allowance of ProPoints required for them to lose weight. Each dieter also has an additional 49 ProPoints as a weekly allowance to be used however they choose. This allows room for social occasions or a daily treat, and nothing is off limits, even alcohol.

STEP 2: DETERMINE HOW IT IS TO BE MEASURED
Concepts must be made operational in order to be measured. An operational definition gives meaning to a concept by specifying the activities or operations necessary to measure it.5 For example, the concept of nutrition consciousness might be reflected when a shopper reads the nutritional information on a cereal package. Inspecting a nutritional label is not the same as being nutrition conscious, but it is a clue that a person may be nutrition conscious. The operational definition specifies what the researcher must do to measure the concept under investigation. If we wish to measure consumer interest in a specific advertisement, we may operationally define interest as a certain increase in pupil dilation. Another operational definition of interest might rely on direct responses – what people say they are interested in. Each operational definition has advantages and disadvantages. An operational definition tells the investigator: 'Do such-and-such in so-and-so a manner.'6 Table 8.1 presents a conceptual definition and an operational definition from a study on social entrepreneurship.

operational definition An explanation that gives meaning to a concept by specifying the activities or operations necessary to measure it. conceptual definition A verbal explanation of the meaning of a concept. It defines what the concept is and what it is not.

TABLE 8.1 » SOCIAL ENTREPRENEURSHIP: AN OPERATIONAL DEFINITION7

Concept: Social entrepreneurship
Conceptual definition: Initiatives that are associated with aspects of innovation and modes of earned income generation by non-governmental organisations.
Operational definition: Self-report, open-ended questions. Non-profit organisations were asked if any changes to their budget in the past five years reflected any of the following: the development of a new business enterprise, commercial sales activity and charging fees for services. Social entrepreneurship is measured as a value of 1 if the respondent reported the adoption of one or more of these approaches and 0 if the respondent did not report undertaking any of these strategies in the five-year time period.

REAL WORLD SNAPSHOT

HOW TO MEASURE CUSTOMER LOYALTY8

Bill Etter, a senior research executive with US-based Research International, uses three questions to assess the degree of customer loyalty. These are:
1 On a score out of 10, how likely are you to recommend this company/product to others? (A nine or 10 on this score is a promoter, zero to six is a detractor and seven or eight is neutral.)
2 On a score out of 10, how likely are you to continue to use this product/company?
3 On a score out of 10, how much does this product/service meet your overall requirements?
Bill notes that his definition of a loyal customer would require a very positive response to all questions, while an at-risk customer is defined by a somewhat negative response to any one of the three questions. His overall definition of net loyalty, often called the 'net promoter score' of a brand or company, is the difference between per cent loyal and per cent at risk.
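To make the arithmetic concrete, here is a minimal Python sketch of the kind of net-loyalty calculation described above. The question keys and the exact cut-off scores applied to the three questions are illustrative assumptions, not Etter's published rules.

# Minimal sketch of a net-loyalty ('net promoter'-style) calculation.
# The question keys and the cut-off scores below are illustrative assumptions only.

respondents = [
    {"recommend": 9, "continue": 9, "requirements": 8},
    {"recommend": 6, "continue": 7, "requirements": 5},
    {"recommend": 10, "continue": 10, "requirements": 10},
    {"recommend": 4, "continue": 8, "requirements": 7},
]

def classify(answers):
    """Label a respondent as loyal, at risk or neutral from the three 0-10 questions."""
    if all(score >= 9 for score in answers.values()):   # very positive on all questions
        return "loyal"
    if any(score <= 6 for score in answers.values()):   # somewhat negative on any one question
        return "at_risk"
    return "neutral"

labels = [classify(r) for r in respondents]
pct_loyal = 100 * labels.count("loyal") / len(labels)
pct_at_risk = 100 * labels.count("at_risk") / len(labels)
net_loyalty = pct_loyal - pct_at_risk                   # per cent loyal minus per cent at risk
print(f"Loyal {pct_loyal:.0f}%, at risk {pct_at_risk:.0f}%, net loyalty {net_loyalty:.0f}")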


STEP 3: APPLY A RULE OF MEASUREMENT


A rule is a guide that tells someone what to do. An example of a measurement rule might be: ‘Assign the numerals 1 through 7 to individuals according to how brand loyal they are. If the individual is extremely brand loyal, assign a 7. If the individual is a total brand switcher with no brand loyalty, assign the numeral 1.’ Operational definitions help the researcher specify the rules for assigning numbers. If the purpose of an advertising experiment is to increase the amount of time shoppers spend in a department store, for example, shopping time must be operationally defined. Once shopping time is defined as the interval between entering the door and receiving the receipt from the sales assistant, assignment of numbers via a stopwatch is facilitated. If a study on ethyl-petrol (a blend of ethyl alcohol and petrol) is not concerned with a person’s depth of experience, but classifies people as users or non-users, it could assign a 1 for experience with ethyl-petrol or a 0 for no experience with ethyl-petrol. The values assigned in the measuring process can be manipulated according to certain mathematical rules. The properties of the scale of numbers may allow the researcher to add, subtract or multiply answers. In some cases there may be problems with the simple addition of the numbers, or other mathematical manipulations may not be permissible.
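The two rules just described can be written down directly as a short coding script. The Python sketch below is purely illustrative: the record fields, times and codes are hypothetical, and it simply shows a rule being applied consistently to every observation.

# Illustrative sketch of applying rules of measurement to raw observations.
# The record fields, times and the 1/0 coding below are hypothetical examples.

raw_records = [
    {"used_ethyl_petrol": True, "minutes_entered": 5, "minutes_receipt": 32},
    {"used_ethyl_petrol": False, "minutes_entered": 41, "minutes_receipt": 50},
]

for record in raw_records:
    # Rule 1: classify people as users (1) or non-users (0) of ethyl-petrol.
    experience_code = 1 if record["used_ethyl_petrol"] else 0
    # Rule 2: shopping time operationally defined as the interval between entering
    # the store and receiving the receipt (recorded here in minutes).
    shopping_time = record["minutes_receipt"] - record["minutes_entered"]
    print(experience_code, shopping_time)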

Types of scales
A scale may be defined as any series of items that are arranged progressively according to value or magnitude, into which an item can be placed according to its quantification.9 In other words, a scale is a continuous spectrum or series of categories. The purpose of scaling is to represent, usually quantitatively, an item's, person's or event's place in the spectrum. Marketing researchers use many scales or number systems. It is traditional to classify scales of measurement on the basis of the mathematical comparisons that are allowable with them. The four types of scale are the nominal, ordinal, interval and ratio scales.

NOMINAL SCALE
Harry Kewell wore the number 10 shirt for the Socceroos and Ronaldinho wore that shirt number for Brazil. These numbers nominally identified these superstars. A nominal scale is the simplest type of scale. The numbers or letters assigned to objects serve as labels for identification or classification. These are scales in name only. In Perth, census tract 25 and census tract 87 are merely labels. The number 87 does not imply that this area has more people or higher income than number 25. An example of a typical nominal scale in marketing research would be the coding of males as 1 and females as 2. As another example, the first drawing in Exhibit 8.2 depicts the number 7 on a horse's colours. This is merely a label to allow bettors and racing enthusiasts to identify the horse.

ORDINAL SCALE
If you have been to a racecourse, you know that when your horse finishes in a 'place' position it has come in second or third behind the 'win' horse (see the second drawing in Exhibit 8.2). An ordinal scale arranges objects or alternatives according to their magnitude in an ordered relationship. When respondents are asked to rank order their shopping centre preferences, they assign ordinal values to them. In our racehorse example, if we assign 1 to the win position, and 2 and 3 to place positions, we can say that 1 was before 2 and 2 was before 3. However, we cannot say anything about the degree of distance or the interval between the win and place horses or the two place horses.

scale Any series of items that are arranged progressively according to value or magnitude; a series into which an item can be placed according to its quantification. nominal scale A scale in which the numbers or letters assigned to objects serve as labels for identification or classification. ordinal scale A scale that arranges objects or alternatives according to their magnitude in an ordered relationship.


EXHIBIT 8.2 → NOMINAL, ORDINAL, INTERVAL AND RATIO SCALES PROVIDE DIFFERENT INFORMATION


A typical ordinal scale in marketing asks respondents to rate brands, companies and the like as excellent, good, fair or poor. Researchers know excellent is higher than good; but they do not know by how much.

INTERVAL SCALE
The third drawing in Exhibit 8.2 depicts a horse race in which the win horse is two lengths ahead of the second-placed horse, which is one length ahead of the third-placed horse. Not only is the order of finish known, but the distance between the horses is also known. (The example assumes a standard measurement for the term 'length'.) Interval scales not only indicate order, but they also measure order (or distance) in units of equal intervals. The location of the zero point is arbitrary. In the Consumer Price Index, if the base year is 2015 then the price level during 2015 will be set at 100. Although this is an equal-interval measurement scale, the zero point is arbitrary. The classic example of an interval scale is the Celsius temperature scale. If the temperature is 40°, it cannot be said that it is twice as hot as 20°, because 0° represents not the lack of temperature but a relative point on the Celsius scale. Due to the lack of an absolute zero point, an interval scale does not allow the conclusion that the number 36 is three times as great as the number 12, but only that the distance is three times as great. Likewise, when an interval scale is used to measure psychological attributes, the researcher can comment about the magnitude of differences or compare the average differences on the attributes that were measured, but cannot determine the actual strength of the attitude towards an object. However, the changes in concepts over time can be compared if the researcher continues to use the same scale in longitudinal research.

interval scale A scale that both arranges objects according to their magnitudes and distinguishes this ordered arrangement in units of equal intervals.
ratio scale A scale that has absolute rather than relative quantities, and an absolute zero where there is an absence of a given attribute.

RATIO SCALE

To be able to say that winning tickets pay 40 to 1 or that racehorse number 7 is twice as heavy as racehorse number 5, we need a ratio scale (see the fourth drawing in Exhibit 8.2). Ratio scales have absolute rather than relative quantities. For example, both money and weight are measured with ratio scales that possess absolute zeros and interval properties. The absolute zero represents a point on the scale at which there is an absence of the given attribute. If we hear that a person has zero grams of gold, we understand the natural zero value for weight. In the measurement of temperature, the Kelvin scale (a ratio scale) begins at absolute zero, a point that corresponds to −273.16° on the Celsius scale (an interval scale). In distribution or logistical research it may be appropriate to think of physical attributes such as weight or distance as ratio scales in which the ratio of scale values is meaningful.


For most behavioural marketing research, however, interval scales typically are the appropriate measurements. However, if a researcher wishes to construct ratios derived from the original scales, the scale of measurement must be ratio.

Mathematical and statistical analysis of scales
The type of scale used in marketing research will determine the form of the statistical analysis. For example, certain operations, such as the calculation of a mean (mathematical average), can be conducted only if the scale is of an interval or ratio nature; they are not permissible with nominal or ordinal scales. Table 8.2 shows the appropriate descriptive statistics for each type of scale. The most sophisticated form of statistical analysis for nominal scale data is counting. Because numbers in such a scale are merely labels for classification purposes, they have no quantitative meaning. The researcher tallies the frequency in each category and identifies which category contains the highest number of observations (individuals, objects and so on). An ordinal scale provides data that may be rank ordered from lowest to highest. Observations may be associated with percentile ranks such as the median. Because all statistical analyses appropriate for lower-order scales are suitable for higher-order scales, an interval scale may be used as a nominal scale to uniquely classify or as an ordinal scale to preserve order. In addition, an interval scale's property of equal intervals allows researchers to compare differences among scale values and perform arithmetic operations such as addition and subtraction. Numbers may be changed, but the numerical operations must preserve order and relative magnitudes of differences. The mean and standard deviation may be calculated from true interval-scale data.

TABLE 8.2 » FACTS ABOUT FOUR LEVELS OF SCALES

Type of scale | Example | Numerical operation | Descriptive statistics
Nominal | Yes–No; Female–Male; Buy–Did not buy; Postcode | Counting | Frequencies; Mode
Ordinal | Rankings; Indicate your level of education: high school / some university / university degree / graduate university degree | Counting and rank ordering | Frequencies; Mode; Median; Range
Interval | Most attitude scales; Agree–Disagree 5-point scale | Arithmetic operations that preserve order and relative magnitudes | Mean; Median; Standard deviation; Variance
Ratio | Amount purchased; Purchase probability; Salesperson sales volume; Time spent on a webpage; Number of stores visited; Webpage hits | All arithmetic | Mean; Median; Standard deviation; Variance

Note: All statistics appropriate for lower-order scales (nominal being the lowest) are appropriate for higher-order scales (ratio being the highest).


A ratio scale has all the properties of nominal, ordinal and interval scales. In addition, it allows researchers to compare absolute magnitudes because the scale has an absolute zero point. Using the actual quantities for arithmetic operations is permissible. Thus, the ratios of scale values are meaningful. Chapters 12 to 15 further explore the limitations scales impose on the mathematical analysis of data.
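The restrictions summarised in Table 8.2 can be expressed directly in analysis code. The short Python sketch below uses hypothetical data and simply shows which descriptive statistics are permissible at each level of measurement; the variable names are illustrative, not drawn from the chapter.

# Sketch of the permissible descriptive statistics for each level of measurement.
# The data values are hypothetical; the point is which statistics may be computed.
from statistics import mean, median, mode, stdev

gender = ["F", "M", "F", "F", "M"]              # nominal: counting and the mode only
education_rank = [1, 2, 2, 3, 4]                # ordinal: add the median and the range
satisfaction = [4, 5, 3, 4, 2]                  # interval: add the mean and standard deviation
spend_dollars = [25.5, 12.0, 60.0, 33.0, 5.0]   # ratio: ratios of values are also meaningful

print("Nominal  - mode:", mode(gender))
print("Ordinal  - median:", median(education_rank), "range:", max(education_rank) - min(education_rank))
print("Interval - mean:", mean(satisfaction), "sd:", round(stdev(satisfaction), 2))
print("Ratio    - one shopper spent", spend_dollars[2] / spend_dollars[1], "times as much as another")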

STEP 4: DETERMINE IF THE MEASURE CONSISTS OF A NUMBER OF MEASURES


So far we have focused on measuring a concept with a single question or observation. Measuring brand awareness, for example, might involve one question, such as: ‘Are you aware of?’ However, measuring more complex concepts may require more than one question because the concept has several attributes. An attribute is a single characteristic or fundamental feature of an object, person, situation or issue. Multi-item instruments for measuring a single concept with several attributes are called index measures, or composite measures. For example, one index of social class is based on three weighted variables: residence, occupation and education. Measures of cognitive phenomena often are composite indexes of sets of variables or scales. Items are combined into composite measures. For example, a salesperson’s morale may be measured by combining questions such as: ‘How satisfied are you with your job? How satisfied are you with your territory? How satisfied are you in your personal life?’ Measuring the same underlying concept using a variety of techniques is one method for increasing accuracy. Asking different questions to measure the same concept provides a more accurate cumulative measure than does a single-item estimate.

attribute A single characteristic or fundamental feature of an object, person, situation or issue.
index (or composite) measure A composite measure of several variables used to measure a single concept; a multi-item instrument.
summated scale A scale created by simply summing (adding together) the response to each item making up the composite measure. The scores can be (but do not have to be) averaged by the number of items making up the composite scale.
reverse coding Where the value assigned for a response is treated oppositely from the other items.

Computing scale values

Exhibit 8.3 demonstrates how a composite measure can be created from rating scales. This particular scale can be used to determine, for example, how much a consumer trusts a website. This particular composite represents a summated scale. A summated scale is created simply by summing the response to each item making up the composite measure. In this case, the consumer would have a trust score based on responses to five items. The items summed have to be measuring different aspects of the same concept: website trust. The items would therefore be related to each other. Before summing items the researcher would need to examine internal consistency of the measure (often called the internal reliability). Sometimes, a response may need to be reverse coded before computing summated score. Reverse coding means that the value assigned to each response is treated oppositely from the other items. If a sixth item was included on the trust scale that said ‘I do not trust this website’, reverse coding would be necessary to make sure the composite scale made sense. The content of this item is the reverse of trust (distrust), so the scale should be reversed. Thus, on a 5-point scale, the values are reversed as follows:

5 becomes 1
4 becomes 2
3 stays 3
2 becomes 4
1 becomes 5


EXHIBIT 8.3 → COMPUTING A COMPOSITE SCALE

Each item is rated from Strongly disagree (SD) to Strongly agree (SA): SD | D | N | A | SA
This site appears to be more trustworthy than other sites I have visited.
My overall trust in this site is very high.
My overall impression of the believability of the information on this site is very high.
My overall confidence in the recommendations on this site is very high.
The company represented in this site delivers on its promises.

Computation: Scale values: SD = 1, D = 2, N = 3, A = 4, SA = 5
Thus, the trust score for this consumer is 2 + 3 + 2 + 2 + 4 = 13

Therefore, if the same consumer described in Exhibit 8.3 responded to this new item as 5, it would be recoded as a 1 before computing the summated scale. The new summated scale would therefore become 14. The Real world snapshot shows how a recode can be carried out using SPSS, a frequently used statistical package in the market research industry.
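The same computation can also be scripted outside a statistical package. Below is a minimal Python sketch that reverse-codes the negatively worded sixth item and sums the responses, mirroring the worked example above; the reverse_code() helper is our own illustrative function, not part of any library.

# Sketch of reverse coding and a summated scale, following the worked example above.
# Scale values: SD = 1, D = 2, N = 3, A = 4, SA = 5; reverse_code() is a small helper of our own.

responses = [2, 3, 2, 2, 4]     # the five trust items from Exhibit 8.3
sixth_item = 5                  # 'I do not trust this website' (negatively worded)

def reverse_code(value, points=5):
    """Reverse a rating on a 1-to-points scale (5 becomes 1, 4 becomes 2, and so on)."""
    return points + 1 - value

trust_score = sum(responses)                 # 13, as computed in the exhibit
trust_score += reverse_code(sixth_item)      # the 5 is recoded to 1, giving 14
print("Summated trust score:", trust_score)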

REAL WORLD SNAPSHOT
RECODING MADE EASY

Most computer statistical software makes scale recoding easy. The screenshot shown here is from SPSS, perhaps the most widely used statistical software in business-related research. All that needs to be done to reverse code a scale is to go through the right click-through sequence. In this case, follow these steps:
1 Click on transform.
2 Click on recode.
3 Choose to recode into the same variable.
4 Select the variable(s) to be recoded.
5 Click on old and new values.
6 Use the menu that appears to enter the old values and the matching new values. Click add after entering each pair.
7 Click continue.
This would successfully recode variable X13 in this case.


SURVEY THIS!

Take a look at the section of the student survey shown in the screenshot. Suppose someone thought the items made a composite scale and you are asked to analyse its quality. Answer the following questions:
1 Is the level of measurement nominal, ordinal, interval or ratio?
2 Assuming the scale items represent studying concentration, do you think any of the items need to be reverse-coded before a summated scale could be formed? If so, which ones?
3 Using the data for these items, compute the coefficient alpha and draw some conclusion about the scale's reliability.
4 At this point, how much can be said about the scale's validity? Are there any items that do not belong on the scale?
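For question 3 above, the reliability coefficient usually computed for a summated scale is coefficient alpha (Cronbach's alpha). The Python sketch below applies the standard formula to a small made-up set of item responses; the data are hypothetical and only illustrate the calculation.

# Sketch of coefficient (Cronbach's) alpha for a summated scale:
# alpha = (k / (k - 1)) * (1 - sum of item variances / variance of the total score)
from statistics import pvariance

data = [            # hypothetical responses: rows are respondents, columns are items (1-5 ratings)
    [4, 5, 4, 4],
    [2, 3, 2, 3],
    [5, 5, 4, 5],
    [3, 3, 3, 2],
    [4, 4, 5, 4],
]

k = len(data[0])                                                   # number of items
item_variances = [pvariance([row[i] for row in data]) for i in range(k)]
total_variance = pvariance([sum(row) for row in data])

alpha = (k / (k - 1)) * (1 - sum(item_variances) / total_variance)
print("Coefficient alpha:", round(alpha, 2))   # higher values indicate better internal consistency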

STEP 5: DETERMINE THE TYPE OF ATTITUDE AND SCALE TO BE USED TO MEASURE IT
There are many definitions of attitude. Attitude usually is viewed as an enduring disposition to consistently respond in a given manner to various aspects of the world, including people, events and objects. One conception of attitude is reflected in this brief statement: 'Sally loves shopping at Sam's. She believes it's clean, conveniently located, and has the lowest prices. She intends to shop there every Thursday.' This short description has identified three components of attitudes: affective, cognitive and behavioural. The affective component reflects an individual's general feelings or emotions towards an object. Statements such as 'I love my Mazda MX-5', 'I liked that book A Corporate Bestiary' and 'I hate cranberry juice' reflect the emotional character of attitudes. The way one feels about a product, an advertisement or an object is usually tied to one's beliefs or cognitions. The cognitive component represents one's awareness of and knowledge about an object. One person might feel happy about the purchase of a car because he or she believes it 'gets great fuel economy' or knows that the dealer is 'the best in Singapore'. The behavioural component includes buying intentions and behavioural expectations and reflects a predisposition to action.

attitude An enduring disposition to consistently respond in a given manner to various aspects of the world; composed of affective, cognitive and behavioural components.
normative scale A measurement designed to make comparisons among other people on a particular construct item, usually a rating scale where items are evaluated in isolation from other items.

Normative vs ipsative scales
Sometimes we want to find an objective measure of the average score for one variable across a sample. Other times we are interested to learn how much one item is preferred over another item. Do you like butterscotch ice-cream? Do you like vanilla ice-cream? Which one would you like to have right now? These require quite different measurement techniques.

NORMATIVE SCALES
Normative scales are intended to make comparisons of one person's measures against other people's measures of the same thing. Often these are ratings scales, such as the 1–5 'agree-disagree' rating scales. The evaluation of any one item is made in isolation from evaluations of other items – so two items may be given the same rating. For example, say our TV subscription service is broadcasting five movies tonight. A normative scale would ask respondents to rate each movie individually, according to how much they would like to see the movie. An ipsative scale would ask respondents to select the one movie they would prefer to watch, or to rank all of the movies, or similar.


Normative measure: Please rate each of the movies presented below according to how much you would like to watch the movie tonight. Rate each movie from 0 (Definitely not!) to 10 (Yes Definitely!).

Kung Fu Panda 3: 0 1 2 3 4 5 6 7 8 9 10
Hail Caesar!: 0 1 2 3 4 5 6 7 8 9 10
Star Wars: The Force Awakens: 0 1 2 3 4 5 6 7 8 9 10
The Revenant: 0 1 2 3 4 5 6 7 8 9 10
Pride and Prejudice and Zombies: 0 1 2 3 4 5 6 7 8 9 10

IPSATIVE SCALES
Ipsative scales (from Latin 'of the self') are not readily comparable with measures from other people, but are relative to other measures from the same person. A 'forced-choice' task such as ranking or paired comparison tells us that one item is preferred over another, even when both may be highly desirable – so even if two items are attractive, one must be chosen as 'better'. For example, consider the same five movies on TV tonight. As we've said, an ipsative scale would ask respondents to select the one movie they would prefer to watch, or to rank all of the movies, or similar.

Ipsative measure: Please rank each of the movies presented below according to how much you would like to watch the movie tonight. Rank your first preference with '1', your second preference with '2', and so on.

Kung Fu Panda 3: Rank ____
Hail Caesar!: Rank ____
Star Wars: The Force Awakens: Rank ____
The Revenant: Rank ____
Pride and Prejudice and Zombies: Rank ____

We can see that the normative question and the ipsative question are somewhat different. They give us very different data. The normative scale lets us easily calculate an average score, and we can easily see how each person scored each movie compared with other people. Normative rating scales also assume that each person rates items in the same way – that my score of 8 has the same meaning as your score of 8 – but we can never be sure of that. Comparisons between any two movies may not be so easy though. How do we interpret a respondent who gives all five movies a rating of 9 or 10? That is, he or she regards all of them as really attractive right now. Can we easily see which one is more likely to be chosen tonight? An ipsative scale lets us see that, even if a movie is rated very highly, it may be chosen last from among a set of other highly attractive alternatives. If you are interested in market share, relative preference or actual behaviour instead of ‘liking’, then an ipsative measure is to be preferred over a normative measure.
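To see how differently the two kinds of data are analysed, here is a short Python sketch contrasting them. The movie titles come from the example above, but the ratings and rankings themselves are made up for illustration.

# Sketch contrasting the analysis of normative ratings and ipsative rankings.
# The movie titles come from the example above; the ratings and rankings are hypothetical.
movies = ["Kung Fu Panda 3", "Hail Caesar!", "Star Wars: The Force Awakens",
          "The Revenant", "Pride and Prejudice and Zombies"]

ratings = [            # normative: each respondent rates every movie 0-10 independently
    [9, 6, 10, 9, 3],
    [8, 7, 9, 10, 5],
]
average_rating = {m: sum(r[i] for r in ratings) / len(ratings) for i, m in enumerate(movies)}

rankings = [           # ipsative: each respondent ranks the movies 1 (first choice) to 5
    [2, 4, 1, 3, 5],
    [3, 4, 2, 1, 5],
]
first_choice_share = {m: sum(1 for r in rankings if r[i] == 1) / len(rankings)
                      for i, m in enumerate(movies)}

print(average_rating)       # how attractive each movie is on average
print(first_choice_share)   # the share of respondents who would actually pick it first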

Attitudes as hypothetical constructs
Many variables that marketing researchers wish to investigate are psychological variables that cannot be directly observed. For example, someone may have an attitude towards a particular brand of shaving cream, but we cannot observe this attitude. To measure an attitude, we must infer it from the way an individual responds (by a verbal expression or overt behaviour) to some stimulus. The term hypothetical construct describes a variable that is not directly observable but is measurable through indirect indicators, such as verbal expression or overt behaviour.

ipsative scale A measurement designed to compare items from the same person, usually ranking or choice tasks where items are explicitly evaluated on their relative attractiveness. hypothetical construct A variable that is not directly observable but is measurable through indirect indicators, such as verbal expression or overt behaviour.


REAL WORLD SNAPSHOT

THE C-OAR-SE MEASUREMENT PROCEDURE REVOLUTION10

C-OAR-SE stands for Construct definition, Object classification, Attribute classification, Rater identification, Scale formation and Enumeration and reporting. A measurement procedure developed by distinguished Australian marketing researcher and academic Professor John Rossiter, this approach to measurement relies more on rational and content analysis than on purely empirical analysis. Rossiter argues strongly against the formation of scales based on after-the-fact statistical analysis, instead positing that measurement should be formed on the basis of good theory and understanding of consumer experiences. Rossiter implores market researchers to use only concrete and single items, where possible, rather than build long questionnaires using multi-item measures. His views on measurement have received notable academic recognition and, in 2012, the market research industry in Australia short-listed his measurement approach as a national finalist for the AMSRS Research Effectiveness Awards.

Measuring attitudes is important to managers
Most marketing managers hold the belief that changing consumers' or prospects' attitudes towards a product is a major marketing goal. At the individual level this is a complicated issue; however, aggregate attitude change has been shown to be related to aggregate sales volume changes. Because modifying attitudes plays a pervasive role in marketing strategies, the measurement of attitudes is an important task. For example, Whiskas brand cat food had been sold in Europe by Mars' Pedigree Petcare division for 30 years. Over time, as the brand faced increased competition from new premium brands, consumers had difficulty identifying with the brand. The company conducted attitude research to determine how people felt about their cats and their food alternatives. The study revealed that cat owners see their pets both as independent and as dependent fragile beings.11 Cat owners held the attitude that cats wanted to enjoy their food but needed nutrition. This attitude research was directly channelled into managerial action. Because of the owners' concern, Whiskas marketers began positioning the product as having 'Catisfaction'. Its advertisements featured a purring kitty, a silver tabby – a pedigree cat – symbolising premium quality but also presenting an image of a sweet cat. The message was: 'Give cats what they like with the nutrition they need. If you do, they'll be so happy that they'll purr for you.' This effort reversed the sales decline the brand had been experiencing.

ranking A measurement task that requires respondents to rank order a small number of stores, brands or objects on the basis of overall preference or some characteristic of the stimulus.
rating A measurement task that requires respondents to estimate the magnitude of a characteristic or quality that a brand, store or object possesses.
sorting A measurement task that presents a respondent with several objects or product concepts and requires the respondent to arrange the objects into piles or to classify the product concepts.

The attitude-measuring process
A remarkable variety of techniques has been devised to measure attitudes. This variety stems in part from lack of consensus about the exact definition of the concept. Furthermore, the affective, cognitive and behavioural components of an attitude may be measured by different means. For example, sympathetic nervous system responses may be recorded using physiological measures to quantify affect, but they are not good measures of behavioural intentions. Direct verbal statements concerning affect, belief or behaviour are used to measure behavioural intent. However, attitudes may also be measured indirectly using the qualitative exploratory techniques discussed in Chapter 3. Obtaining verbal statements from respondents generally requires that the respondents perform a task such as ranking, rating, sorting or making choices.
A ranking task requires the respondent to rank order a small number of stores, brands or objects on the basis of overall preference or some characteristic of the stimulus. Rating asks the respondent to estimate the magnitude of a characteristic or quality that an object possesses. A quantitative score, along a continuum that has been supplied to the respondent, is used to estimate the strength of the person's attitude or belief; in other words, the respondent indicates the position on one or more scales at which he or she would rate the object. A sorting task might present the respondent with several product concepts printed on cards and require the respondent to arrange the cards into a number of piles or otherwise classify the product concepts. Choice between two or more alternatives is another type of attitude measurement. If a respondent chooses one object over another, the researcher can assume that the respondent prefers the chosen object over the other. The following sections describe the most popular techniques for measuring attitudes.

PHYSIOLOGICAL MEASURES OF ATTITUDES
Measures of galvanic skin response, blood pressure, pupil dilations and other physiological measures (discussed in Chapter 6) may be used to assess the affective components of attitudes. These measures provide a means of assessing attitudes without verbally questioning the respondent. In general, they can provide a gross measure of likes or dislikes, but they are not extremely sensitive to the different gradients of an attitude.

ATTITUDE RATING SCALES
Using rating scales to measure attitudes is perhaps the most common practice in marketing research. This section discusses many rating scales designed to enable respondents to report the intensity of their attitudes.

Simple attitude scales
In its most basic form, attitude scaling requires that an individual agree or disagree with a statement or respond to a single question. For example, respondents in a political poll may be asked whether they agree or disagree with the statement: 'The Prime Minister should call an election now.' Alternatively, an individual might indicate whether he or she likes or dislikes beetroot dip. This type of self-rating scale merely classifies respondents into one of two categories; thus, it has only the properties of a nominal scale, and the types of mathematical analysis that can be used with this basic scale are limited. Despite the disadvantages, simple attitude scaling can be used when questionnaires are extremely long, when respondents have little education, or for other specific reasons. A number of simplified scales are merely checklists – a respondent indicates past experience, preference and the like merely by checking an item. In many cases the items are adjectives that describe a particular object. Most attitude theorists believe that attitudes vary along continua. Early attitude researchers pioneered the view that the task of attitude scaling is to measure the distance from 'good' to 'bad', 'low' to 'high', 'like' to 'dislike' and so on. Thus, the purpose of an attitude scale is to find an individual's position on the continuum. Simple scales do not allow for fine distinctions between attitudes. Several other scales have been developed for making more precise measurements.

Category scales
The example just given is a rating scale that contains only two response categories: agree or disagree. Expanding the response categories provides the respondent with more flexibility in the rating task. Even more information is provided if the categories are ordered according to a particular descriptive or evaluative dimension. Consider the following question:
How often do you disagree with your spouse about how much to spend on various things?
❏ Never  ❏ Rarely  ❏ Sometimes  ❏ Often  ❏ Very often
This category scale is a more sensitive measure than a scale that has only two response categories; it provides more information. Question wording is an extremely important factor in the usefulness of these scales. Table 8.3 shows some common wordings used in category scales. The issue of question wording is discussed in Chapter 9 (Question wording and measurement scales for commonly researched topics).

choice A measurement task that identifies preferences by requiring respondents to choose between two or more alternatives. category scale A rating scale that consists of several response categories, often providing respondents with alternatives to indicate positions on a continuum.


Likert scale A measure of attributes designed to allow respondents to rate how strongly they agree or disagree with carefully constructed statements, ranging from very positive to very negative attitudes towards some object; several scale items may be used to form a summated index.

Method of summated ratings: The Likert scale
Marketing researchers' adaptation of the method of summated ratings, developed by Likert, is an extremely popular means for measuring attitudes, because it is simple to administer.12 With the Likert scale, respondents indicate their attitudes by checking how strongly they agree or disagree with carefully constructed statements, ranging from very positive to very negative attitudes towards some object. Individuals generally choose from approximately five response alternatives – strongly agree, agree, uncertain, disagree and strongly disagree – although the number of alternatives may range from three to nine. Consider the following example from a study of food shopping behaviour:

In buying food for my family, price is no object.
Strongly disagree (1) | Disagree (2) | Uncertain (3) | Agree (4) | Strongly agree (5)

TABLE 8.3 » SELECTED CATEGORY SCALES

Quality: Excellent | Good | Fair | Poor
Quality: Very good | Fairly good | Neither good nor bad | Not very good | Not good at all
Quality: Well above average | Above average | Average | Below average | Well below average
Importance: Very important | Fairly important | Neutral | Not so important | Not at all important
Interest: Very interested | Somewhat interested | Not very interested
Satisfaction: Completely satisfied | Somewhat satisfied | Neither satisfied nor dissatisfied | Somewhat dissatisfied | Completely dissatisfied
Satisfaction: Very satisfied | Quite satisfied | Somewhat satisfied | Not at all satisfied
Frequency: All of the time | Very often | Often | Sometimes | Hardly ever
Frequency: Very often | Often | Sometimes | Rarely | Never
Frequency: All of the time | Most of the time | Some of the time | Just now and then
Truth: Very true | Somewhat true | Not very true | Not at all true
Truth: Definitely yes | Probably yes | Probably no | Definitely no
Uniqueness: Very different | Somewhat different | Slightly different | Not at all different
Uniqueness: Extremely unique | Very unique | Somewhat unique | Slightly unique | Not at all unique


To measure the attitude, researchers assign scores, or weights, to the alternative responses. In this example, weights of 5, 4, 3, 2 and 1 are assigned. (The weights, shown in parentheses, would not be printed on the questionnaire.) Strong agreement indicates the most favourable attitude on the statement, and a weight of 5 is assigned to this response. The statement given in this example is positive towards the attitude. If the statement given were negative towards the object (such as: ‘I carefully budget my food expenditures’), the weights would be reversed and ‘strongly disagree’ would be assigned a weight of 5. A single scale item on a summated scale is an ordinal scale. A Likert scale may include several scale items to form an index. Each statement is assumed to represent an aspect of a common attitudinal domain. For example, Table 8.4 shows the items in a Likert scale for measuring attitudes towards patients’ interaction with a physician’s service staff. The total score is the summation of the weights assigned to an individual’s responses. Here the maximum possible score for the index would be 20 if a 5 were assigned to ‘strongly agree’ responses for each of the positively worded statements and a 5 to ‘strongly disagree’ responses for the negative statement. (Item 3 is negatively worded and therefore is reverse coded.) TABLE 8.4 » LIKERT SCALE ITEMS FOR MEASURING ATTITUDES TOWARDS PATIENTS’ INTERACTION WITH A PHYSICIAN’S SERVICE STAFF13

1 My doctor's office staff takes a warm and personal interest in me.
2 My doctor's office staff is friendly and courteous.
3 My doctor's office staff is more interested in serving the doctor's needs than in serving my needs.
4 My doctor's office staff always acts in a professional manner.

In Likert's original procedure, a large number of statements are generated and an item analysis is performed. The purpose of the item analysis is to ensure that final items evoke a wide response and discriminate among those with positive and negative attitudes. Items that are poor because they lack clarity or elicit mixed response patterns are eliminated from the final statement list. However, many marketing researchers do not follow the exact procedure prescribed by Likert. Hence, a disadvantage of the Likert-type summated rating method is that it is difficult to know what a single summated score means. Many patterns of response to the various statements can produce the same total score. Thus, identical total scores may reflect different attitudes because respondents endorsed different combinations of statements.
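To make the scoring concrete, the following minimal sketch sums the four Table 8.4 items for one respondent, reverse coding the negatively worded item 3 as described above. It assumes 1–5 coding from 'strongly disagree' to 'strongly agree'; the function and variable names are illustrative only.

```python
# Minimal sketch of summated Likert scoring for the four items in Table 8.4.
# Responses are coded 1 ('strongly disagree') to 5 ('strongly agree');
# item 3 is negatively worded, so it is reverse coded as 6 - response.

NEGATIVE_ITEMS = {3}  # item numbers that are reverse coded

def summated_score(responses):
    """Return the summated Likert score; responses maps item number to a 1-5 answer."""
    total = 0
    for item, raw in responses.items():
        if not 1 <= raw <= 5:
            raise ValueError(f"Item {item}: response {raw} is outside 1-5")
        total += (6 - raw) if item in NEGATIVE_ITEMS else raw
    return total

# A respondent who strongly agrees with items 1, 2 and 4 and strongly
# disagrees with the negative item 3 earns the maximum index score of 20.
print(summated_score({1: 5, 2: 5, 3: 1, 4: 5}))  # prints 20
```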

PERSONAL INVOLVEMENT INVENTORY
Product involvement refers to the level of personal importance a person places on a particular product category or brand. We often think that price is the dominant factor in determining involvement, but that is rarely the case. Think about how important it is to you to find the right hairdresser if your regular hairdresser moves away, or what features you want in your new gaming PC. Involvement can be a multifaceted construct. The scale in Exhibit 8.4 measures the involvement construct: 14 semantic differential items are designed to capture three components of involvement – importance, pleasure and risk. Sometimes involvement is calculated by simply averaging all 14 items for each respondent. More often, the three subscales are calculated, also by averaging the items associated with each subscale, to gain a more nuanced measure of the type of involvement. Note that this scale measures involvement with a product category; it works less well for individual brands. We find that hairstyle scores highest on all three subscales. Personal computers score high on risk for most people but low on pleasure for many, except for enthusiasts. Something as mundane as laundry detergent can also score very high on importance for many people. How would you score such categories as 'shoes', 'car' or 'coffee shop'?




EXHIBIT 8.4 → REVISED PERSONAL INVOLVEMENT INVENTORY (RPII)14

| | 1 2 3 4 5 | | Subscale |
| Irrelevant | 1 2 3 4 5 | Relevant | I (R) |
| Important | 1 2 3 4 5 | Unimportant | I |
| Of no concern | 1 2 3 4 5 | Of concern | I |
| Doesn't matter | 1 2 3 4 5 | Matters to me | I (R) |
| Means a lot to me | 1 2 3 4 5 | Means nothing to me | I |
| Fun | 1 2 3 4 5 | Not fun | P |
| Unexciting | 1 2 3 4 5 | Exciting | P (R) |
| Appealing | 1 2 3 4 5 | Unappealing | P |
| Says something about me | 1 2 3 4 5 | Says nothing about me | P |
| Tells me about a person | 1 2 3 4 5 | Says nothing | P |
| Boring | 1 2 3 4 5 | Interesting | P (R) |
| Easy to go wrong | 1 2 3 4 5 | Hard to go wrong | R |
| No risk | 1 2 3 4 5 | Risky | R (R) |
| Hard to pick | 1 2 3 4 5 | Easy to choose | R |

Notes: Subscales: I = Importance, P = Pleasure, R = Risk. (R) = scale is reversed.
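The subscale scores can be computed mechanically. The sketch below assumes 1–5 responses and the usual 6 − x recoding for items marked (R), then averages the items into the three subscales described above; the item labels are illustrative shorthand for the anchor pairs in Exhibit 8.4.

```python
# Illustrative scoring of the RPII in Exhibit 8.4: every item is answered on
# a 1-5 scale, items marked (R) are recoded as 6 - x, and each subscale
# (I = Importance, P = Pleasure, R = Risk) is the mean of its items.

RPII_ITEMS = [  # (shorthand item label, subscale, reverse coded?)
    ("irrelevant_relevant", "I", True),  ("important_unimportant", "I", False),
    ("no_concern_of_concern", "I", False), ("doesnt_matter_matters", "I", True),
    ("means_a_lot_nothing", "I", False), ("fun_not_fun", "P", False),
    ("unexciting_exciting", "P", True),  ("appealing_unappealing", "P", False),
    ("says_something_nothing", "P", False), ("tells_about_person", "P", False),
    ("boring_interesting", "P", True),   ("easy_hard_to_go_wrong", "R", False),
    ("no_risk_risky", "R", True),        ("hard_easy_to_pick", "R", False),
]

def rpii_subscales(answers):
    """Average the 14 one-to-five answers into the three involvement subscales."""
    totals, counts = {}, {}
    for label, subscale, reverse in RPII_ITEMS:
        score = 6 - answers[label] if reverse else answers[label]
        totals[subscale] = totals.get(subscale, 0) + score
        counts[subscale] = counts.get(subscale, 0) + 1
    return {s: totals[s] / counts[s] for s in totals}

# Example: a respondent who answers 4 to every item.
print(rpii_subscales({label: 4 for label, _, _ in RPII_ITEMS}))
```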

CUSTOMER EXPERIENCE QUALITY (CXQ)
Direct customer experience is a subtly different construct from service quality or product satisfaction. The customer experience quality (CXQ) index is a Likert-scale measure comprising four dimensions, summarised by the acronym POMP:
1 Product experience – customers' perception of having choices and the ability to compare offers, even if from the same provider.
2 Outcome focus – associated with reducing customers' transaction costs, such as seeking out and qualifying new providers.
3 Moments of truth – the importance of service recovery and flexibility when faced with unforeseen complications.
4 Peace of mind – emotional aspects of service, based on the perceived expertise of the service provider and the guidance provided throughout the process.
Each dimension is captured with between four and six 5-point Likert-scale items, as shown in Exhibit 8.5. The example comes from a study of bank mortgage customers; the questions should be modified and carefully pretested for other product categories.



EXHIBIT 8.5 → CUSTOMER EXPERIENCE QUALITY15

| Dimension | Attribute | Example for mortgage bank |
| Peace of mind | Expertise | I am confident in their expertise; they know what they are doing. |
| Peace of mind | Process ease | The whole process was so easy; they took care of everything. |
| Peace of mind | Relationship versus transaction | It is not just about the now; this company is looking after me. |
| Peace of mind | Convenience retention | I am already a customer; they know me and take good care of me, so why should I go elsewhere? |
| Peace of mind | Familiarity | I have dealt with them before, so getting a mortgage was really easy. |
| Peace of mind | Independent advice | I choose them because they give independent advice. |
| Outcome focus | Inertia | Yes, there are other companies, but I would rather stay with mine; it makes the process much easier. |
| Outcome focus | Result focus | It was more important to get the mortgage than to shop around for a better rate. |
| Outcome focus | Past experience | I stay with my company because I am not confident about using an alternative provider. |
| Outcome focus | Common grounding | It was important that the adviser had a mortgage too; he/she knew what I was going through. |
| Moments of truth | Flexibility | It was important that the company was flexible in dealing with me and looking out for my needs. |
| Moments of truth | Pro-activity | It is important that they keep me up to date and inform me about new options. |
| Moments of truth | Risk perception | I want to deal with a safe company, because a mortgage is a lot of money. |
| Moments of truth | Interpersonal skills | It is important that the people I am dealing with are good people; they listen, are polite and make me feel comfortable. |
| Moments of truth | Service recovery | The way they deal(t) with me when things go (went) wrong will decide if I stay with them. |
| Product experience | Freedom of choice | I want to choose between different options to make certain I get the best offer. |
| Product experience | Cross-product comparison | It is important to me to receive mortgage offers from different companies. |
| Product experience | Comparison necessity | Unless I can compare different options, I will not know which one is the best for me. |
| Product experience | Account management | It would be great if I could deal with one designated contact through the entire process of getting my mortgage. |

Semantic differential The semantic differential is actually a series of attitude scales. This popular attitude measurement technique consists of presenting an identification of a product, brand, store or other concept, followed by a series of 7-point bipolar rating scales. Bipolar adjectives – such as ‘good’ and ‘bad’, ‘modern’ and ‘oldfashioned’ or ‘clean’ and ‘dirty’ – anchor the beginning and the end (or poles) of the scale. The subject makes repeated judgements about the concept under investigation on each of the scales. Exhibit 8.6 shows a series

semantic differential A measure of attitudes that consists of a series of 7-point rating scales that use bipolar adjectives to anchor the beginning and end of each scale.

of scales to measure attitudes towards time pressure affecting the choice of a mobile phone service provider. EXHIBIT 8.6 → THE MEASUREMENT OF TIME PRESSURES WHEN SELECTING A NEW MOBILE PHONE SERVICE PROVIDER16

HOW MUCH TIME DID YOU HAVE TO MAKE A DECISION WHEN YOU LAST SWITCHED FROM ONE MOBILE PHONE SERVICE PROVIDER TO ANOTHER No time pressure More than adequate time available Need a lot more time to do this task

Too much time pressure Not adequate time available No more time needed to do this task



The scoring of the semantic differential can be illustrated using the scale bounded by the anchors 'No time pressure' and 'Too much time pressure'. Respondents are instructed to check the place that indicates the nearest appropriate adjective. The other scales in Exhibit 8.6 are anchored by 'More than adequate time available' and 'Not adequate time available', and 'Need a lot more time to do this task' and 'No more time needed to do this task'. The scale could be coded from 1 ('No time pressure') to 7 ('Too much time pressure'), and each of the other scales is coded in the same way. The researchers can then use combined scores by summing the responses to each scale.

The semantic differential technique originally was developed by Osgood and others as a method for measuring the meanings of objects or the 'semantic space' of interpersonal experience.17 Marketing researchers have found the semantic differential versatile and have modified it for business applications. Replacing the bipolar adjectives with descriptive phrases is a frequent adaptation in image studies. For example, the phrases 'aged a long time' and 'not aged a long time', and 'not watery looking' and 'watery looking' were used in a beer brand image study. A banking study might use the phrases 'low interest on savings' and 'favourable interest on savings'. These phrases are not polar opposites. Consumer researchers have found that respondents often are unwilling to use the extreme negative side of a scale. Research with industrial salespeople, for example, found that in rating their own performances, salespeople would not use the negative side of the scale. Hence, it was eliminated, and the anchor opposite the positive anchor showed 'satisfactory' rather than 'extremely poor' performance.

For scoring purposes, a weight is assigned to each position on the rating scale. Traditionally scores are 7, 6, 5, 4, 3, 2, 1 or +3, +2, +1, 0, −1, −2, −3. Many marketing researchers find it desirable to assume that the semantic differential provides interval data. This assumption, although widely accepted, has its critics, who argue that the data have only ordinal properties because the weights are arbitrary. Exhibit 8.7 illustrates a typical image profile based on semantic differential data. Depending on whether the data are assumed to be interval or ordinal, the arithmetic mean or the median will be used to compare the profile of one product, brand or store with that of a competing product, brand or store.

image profile A graphic representation of semantic differential data for competing brands, products or stores to highlight comparisons.

EXHIBIT 8.7 → IMAGE PROFILE OF COMMUTER AIRLINES VERSUS MAJOR AIRLINES18

[Figure: the semantic differential profiles of commuter airlines and major airlines are plotted side by side, from positive through neutral to negative, on the following bipolar scales.]
Consistently on time – Typically late
Reliable baggage handling – Undependable baggage handling
Desirable schedule – Inconvenient schedule
Security conscious – Not security conscious
Quiet equipment – Loud equipment
Roomy planes – Crowded planes
Clean equipment – Dirty equipment
Polite personnel – Discourteous personnel
Knowledgeable personnel – Uninformative personnel
Prompt service by personnel – Slow service by personnel
High value for money spent – Low value for money spent
Economical – Expensive
Profitable – Unprofitable
Reliable – Unsafe
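An image profile like Exhibit 8.7 is simply the mean (or median) rating on each bipolar scale, computed separately for each airline type and then compared. The sketch below illustrates that calculation with invented 1–7 ratings (7 being the positive pole); only the attribute names come from the exhibit.

```python
# Sketch of how semantic differential ratings could be summarised into an
# image profile like Exhibit 8.7: the mean rating on each bipolar scale is
# computed per airline type and the profiles are compared.
# The ratings below are invented purely for illustration.

from statistics import mean

ratings = {  # attribute -> {airline type: list of 1-7 ratings, 7 = positive pole}
    "Consistently on time":       {"Commuter": [5, 6, 4], "Major": [6, 6, 5]},
    "Roomy planes":               {"Commuter": [2, 3, 3], "Major": [5, 6, 5]},
    "High value for money spent": {"Commuter": [6, 5, 6], "Major": [4, 4, 5]},
}

for attribute, by_type in ratings.items():
    profile = {name: round(mean(scores), 2) for name, scores in by_type.items()}
    print(f"{attribute:28s} {profile}")
```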



Numerical scales
Numerical scales have numbers, rather than semantic space or verbal descriptions, as response options to identify categories (response positions). For example, if the scale items have five response positions, the scale is called a 5-point numerical scale; with seven response positions, it is called a 7-point numerical scale, and so on. Consider the following numerical scale:
Now that you've had your car for about one year, please tell us how satisfied you are with your Ford Focus.
Extremely satisfied   7   6   5   4   3   2   1   Extremely dissatisfied
This numerical scale uses bipolar adjectives in the same manner as the semantic differential. In practice, researchers have found that for educated populations a scale with numerical labels for intermediate points on the scale is as effective a measure as the true semantic differential. Another example of a numerical scale is the net promoter score, which is discussed later in this chapter.

Stapel scale
The Stapel scale was originally developed in the 1950s to measure simultaneously the direction and intensity of an attitude. Modern versions of the scale, with a single adjective, are used as a substitute for the semantic differential when it is difficult to create pairs of bipolar adjectives. The modified Stapel scale places a single adjective in the centre of an even number of numerical values (ranging, perhaps, from +3 to −3). It measures how close to or distant from the adjective a given stimulus is perceived to be. Table 8.5 illustrates a Stapel scale item used in measurement of a retailer's store image. The advantages and disadvantages of the Stapel scale are very similar to those of the semantic differential. However, the Stapel scale is markedly easier to administer, especially over the telephone. Because the Stapel scale does not require bipolar adjectives, it is easier to construct than the semantic differential. Research comparing the semantic differential with the Stapel scale indicates that results from the two techniques are largely the same.19

TABLE 8.5 » A STAPEL SCALE FOR MEASURING A STORE'S IMAGE20

David Jones
+3
+2
+1
Wide selection
−1
−2
−3

Select a plus number for words that you think describe the store accurately. The more accurately you think the word describes the store, the larger the plus number you should choose. Select a minus number for words you think do not describe the store accurately. The less accurately you think the word describes the store, the larger the minus number you should choose. Therefore, you can select any number from +3 for words that you think are very accurate all the way to −3 for words that you think are very inaccurate.

numerical scale An attitude rating scale similar to a semantic differential except that it uses numbers (instead of verbal descriptions) as response options to identify response positions.
Stapel scale A measure of attitudes that consists of a single adjective in the centre of an even number of numerical values.



Constant-sum scale
Suppose a mobile phone service provider wishes to determine the importance of the attributes involved in the choice of a mobile phone plan. Qualitative research has suggested four important factors: price, network coverage, a new handset and customer service. Respondents might be asked to divide a constant sum to indicate the relative importance of the attributes. For example:
Divide 100 points among the following characteristics of a mobile phone plan that are important to you:
Price ____
Network coverage ____
New handset ____
Customer service ____

This constant-sum scale works best with respondents who have high educational levels. If respondents follow the instructions correctly, the results will approximate interval measures. Online surveys handle the constant-sum scale well because responses can be automatically tallied and checked. Constant-sum scales are more problematic in telephone and other self-report questionnaires, since they demand quite a lot of respondents when completing the questionnaire. As the number of stimuli increases, this technique becomes increasingly complex. Although the constant-sum scale is widely used, it can be flawed because the last response is completely determined by the way the respondent has scored the other choices. One means of avoiding this problem is to randomise the order of items on the scale.

Brand preference may be measured using this technique. The approach, which is similar to the paired-comparison method, is as follows:
Divide 100 points among the following brands according to your preference for each mobile phone service provider brand:
Telstra ____
Optus ____
Vodafone ____

In this case, the constant-sum scale is a rating technique. However, with minor modifications it can be classified as a sorting technique.

constant-sum scale A measure of attitudes in which respondents are asked to divide a constant sum to indicate the relative importance of attributes; respondents often sort cards, but the task may also be a rating task.
graphic rating scale A measure of attitude that allows respondents to rate an object by choosing any point along a graphic continuum.
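A simple way to process constant-sum data is to confirm that each respondent's allocation totals 100 points and then average the allocations to obtain relative importance weights. The sketch below does this for the four plan attributes above; the respondent data are invented for illustration.

```python
# Sketch of basic checks and aggregation for a 100-point constant-sum task
# on the four plan attributes in the example above. Data are invented.

ATTRIBUTES = ["Price", "Network coverage", "New handset", "Customer service"]

respondents = [
    {"Price": 50, "Network coverage": 30, "New handset": 10, "Customer service": 10},
    {"Price": 40, "Network coverage": 40, "New handset": 5,  "Customer service": 15},
]

# Keep only answers that allocate exactly 100 points
valid = [r for r in respondents if sum(r.values()) == 100]
print(f"{len(valid)} of {len(respondents)} respondents allocated exactly 100 points")

# Average importance weight for each attribute across valid respondents
for attr in ATTRIBUTES:
    avg = sum(r[attr] for r in valid) / len(valid)
    print(f"{attr:18s} {avg:5.1f}")
```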

Graphic rating scales
A graphic rating scale presents respondents with a graphic continuum. The respondents are allowed to choose any point on the continuum to indicate their attitude. Exhibit 8.8 shows a traditional graphic scale, ranging from one extreme position to the opposite position. Typically, a respondent's score is determined by measuring the length (in millimetres) from one end of the graphic continuum to the point marked by the respondent. Many researchers believe that scoring in this manner strengthens the assumption that graphic rating scales of this type are interval scales. Alternatively, the researcher may divide the line into predetermined scoring categories (lengths) and record respondents' marks accordingly. In other words, the graphic rating scale has the advantage of allowing the researcher to choose any interval desired for scoring purposes. The disadvantage of the graphic rating scale is that there are no standard answers.

EXHIBIT 8.8 → GRAPHIC RATING SCALE

Please evaluate each attribute in terms of how important it is to you by placing an X at the position on the horizontal line that most reflects your feelings.
Seating comfort    Not important  ______________________________  Very important
In-flight meals    Not important  ______________________________  Very important
Airfare            Not important  ______________________________  Very important


Graphic rating scales are not limited to straight lines as sources of visual communication. Picture response options or another type of graphic continuum may be used to enhance communication with respondents. A frequently used variation is the ladder scale, which also includes numerical options:

Here is a ladder scale. [Respondent is shown Exhibit 8.9.] It represents the 'ladder of life'. As you see, it is a ladder with 11 rungs numbered 0 to 10. Let's suppose the top of the ladder represents the best possible life for you as you describe it, and the bottom rung represents the worst possible life for you as you describe it.
On which part of the ladder do you feel your life is today?  0  1  2  3  4  5  6  7  8  9  10

← EXHIBIT 8.9 A LADDER SCALE
[Figure: a vertical ladder with rungs numbered from 10 ('Best possible life') down to 0 ('Worst possible life').]

Research to investigate children's attitudes has used happy face scales (see Exhibit 8.10). The children are asked to indicate which face shows how they feel about confectionery, a toy or some other concept. Research with the happy face scale indicates that children tend to choose the faces at the ends of the scale. Although this may be because children's attitudes fluctuate more widely than adults' or because they have stronger feelings both positively and negatively, the tendency to select the extremes is a disadvantage of the scale.

← EXHIBIT 8.10 GRAPHIC RATING SCALE WITH PICTURE RESPONSE CATEGORIES THAT STRESS VISUAL COMMUNICATION
[Figure: Happy Face Scale with three faces, from 1 'Very Poor' to 3 'Very Good'.]

Online questionnaires can use a variation of the graphic scale called a slider scale. An example of a slider scale is shown in Exhibit 8.11. Here respondents drag the slider to show amounts, or a position on a semantic scale between two adjectives.

Among the attitude rating approaches discussed in this chapter, Likert scales and semantic differentials account for the majority of applications. As shown in Table 8.6, each of the rating scale methods has advantages and disadvantages. The choice of which approach to employ depends very much on the research objectives, the choice of survey method and the type of respondents.

← EXHIBIT 8.11 A SLIDER SCALE FOR PREFERRED COST OF A MOBILE PLAN
What do you think would be a reasonable monthly cost for your preferred plan?
Reasonable cost for your preferred mobile phone service plan: $ [slider: 0  20  40  60  80  100  120  140  160  180  200]



TABLE 8.6 » SUMMARY OF ADVANTAGES AND DISADVANTAGES OF RATING SCALES

| Rating measure | Subject must | Advantages | Disadvantages |
| Category scale | Indicate a response category | Flexible, easy to respond to | Items may be ambiguous; with few categories, only gross distinctions can be made |
| Likert scale | Evaluate statements on a scale of agreement | Easiest scale to construct | Hard to judge what a single score means |
| Semantic differential and numerical scales | Choose points between bipolar adjectives on relevant dimensions | Easy to construct; norms exist for comparison, such as profile analysis | Bipolar adjectives must be found; data may be ordinal, not interval |
| Stapel scale | Choose points on a scale with a single adjective in the centre | Easier to construct than semantic differential; easy to administer | Endpoints are numerical, not verbal, labels |
| Constant-sum scale | Divide a constant sum among response alternatives | Approximates an interval measure | Difficult for respondents with low education levels |
| Graphic scale | Choose a point on a continuum | Visual impact; unlimited scale points | No standard answers |
| Graphic scale with picture response categories | Choose a visual picture | Visual impact; easy for poor readers | Hard to attach a verbal explanation to a response |

Measuring behavioural intention
The behavioural component of an attitude involves the behavioural expectations of an individual towards an attitudinal object. Typically, this represents a buying intention, a tendency to seek additional information or plans to visit a showroom. Category scales for measuring the behavioural component of an attitude ask about a respondent's likelihood of purchase or intention to perform some future action, using questions like the following:

How likely is it that you will purchase an iPhone 7?
I definitely will buy
I probably will buy
I might buy
I probably will not buy
I definitely will not buy

I would write a letter to my representative in parliament or other government official in support of this company if it were in a dispute with government.
Extremely likely
Very likely
Somewhat likely
Likely, about a 50-50 chance
Somewhat unlikely
Very unlikely
Extremely unlikely

The wording of statements used in these scales often includes phrases such as ‘I would recommend’, ‘I would write’ or ‘I would buy’ to indicate action tendencies.



A scale of subjective probabilities, ranging from 100 for ‘absolutely certain’ to 0 for ‘absolutely no chance’, may be used to measure expectations. Researchers have used the following subjective probability scale to estimate the chance that a job candidate will accept a sales position:

100 per cent (Absolutely certain) I will accept
90 per cent (Almost sure) I will accept
80 per cent (Very big chance) I will accept
70 per cent (Big chance) I will accept
60 per cent (Not so big a chance) I will accept
50 per cent (About even) I will accept
40 per cent (Smaller chance) I will accept
30 per cent (Small chance) I will accept
20 per cent (Very small chance) I will accept
10 per cent (Almost certainly not) I will accept
0 per cent (Certainly not) I will accept

MEASURING BEHAVIOURAL INTENTIONS WITH THE JUSTER SCALE21

The Juster scale has been popular among commercial and academic researchers. It is a verbal probability scale designed to estimate future behaviour, developed by F. T. Juster, an American economist, in 1966 as an aid to forecasting national consumer demand. It is an 11-point stated probability scale running from 0 to 10. Respondents' answers, 0 to 10, are recoded to probabilities ranging from 0.01 to 0.99 as suggested by the scale. The forecast demand is simply the average of all respondents' answers: if the average score is 0.24, we conclude that 24 per cent of people in the group will buy.

'Taking everything into account, what are the chances that you personally will buy a new tablet computer some time within the next twelve weeks; that is, between now and the beginning of next semester?'
'Now please consider four brands of tablet computer: Apple iPad, Samsung Galaxy Tab, Microsoft Surface, and Google Nexus.'
'What are the chances that you personally would buy an Apple iPad in the next twelve weeks?' (Repeated for all four brands.)

The 11-point scale consists of the following:
10 Absolutely certain to buy
9 Almost certain to buy
8 Much better than even chance
7 Somewhat better than even chance
6 Slightly better than even chance
5 About even chance
4 Slightly less than even chance
3 Somewhat less than even chance
2 Much less than even chance
1 Almost no chance
0 Absolutely no chance

Note that the questions are very specific about the behaviour and the timescale. Generally, the Juster scale is a better predictor of purchase probability than other 'likelihood of purchase' direct question measures. It is not a good predictor of individual purchase, but it is very good at summarising a group's purchase probability. Product categories tend to be better predicted than individual brands, but this can be ameliorated by priming people to think first about the category and then about individual brands, as shown above. The scale has become popular because it is easy to administer and is sensitive enough to measure purchase intent.

TIPS OF THE TRADE
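A minimal sketch of the forecasting arithmetic described in the box above: each 0–10 answer is recoded to a purchase probability and the probabilities are averaged. The 0 → 0.01, 1–9 → 0.1–0.9, 10 → 0.99 recoding used here is one common convention consistent with the 0.01–0.99 range mentioned above; the answers are invented.

```python
# Sketch of a Juster-scale demand forecast: the 0-10 answers are recoded to
# purchase probabilities and averaged. The recoding below (0 -> 0.01,
# 1-9 -> 0.1-0.9, 10 -> 0.99) is an assumed, commonly used convention.

JUSTER_PROBABILITY = {0: 0.01, 10: 0.99, **{k: k / 10 for k in range(1, 10)}}

answers = [2, 0, 5, 8, 1, 10, 3, 0, 6, 4]   # one 0-10 answer per respondent (invented)

forecast = sum(JUSTER_PROBABILITY[a] for a in answers) / len(answers)
print(f"Estimated proportion of buyers: {forecast:.2f}")   # about 0.39 for these answers
```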



Behavioural differential
A general instrument, the behavioural differential, is used to measure the behavioural intentions of subjects towards an object or category of objects. As in the semantic differential, a description of the object to be judged is followed by a series of scales on which subjects indicate their behavioural intentions towards this object. For example, one item might be:
A 25-year-old female sales representative
would ___:___:___:___:___:___:___:___:___ would not
ask this person for advice.

Ranking
Consumers often rank order their preferences. An ordinal scale may be developed by asking respondents to rank order (from most preferred to least preferred) a set of objects or attributes. Respondents easily understand the task of rank ordering the importance of product attributes or arranging a set of brand names according to preference.

PAIRED COMPARISONS
Consider a situation in which a chainsaw manufacturer learned that a competitor had introduced a new lightweight (3 kg) chainsaw. The manufacturer's lightest chainsaw weighed just over 4 kg. Executives wondered if they needed to introduce a 3 kg chainsaw into the product line. The research design chosen was a paired comparison. A 3 kg chainsaw was designed and a prototype built. To control for colour preferences, the competitor's chainsaw was painted the same colour as the 4 kg and 3 kg chainsaws. Respondents were presented with two chainsaws at a time, then asked to pick the one they preferred. Three pairs of comparisons were required to determine the most preferred chainsaw. The following question illustrates the typical format for asking about paired comparisons.

I would like to know your overall opinion of two brands of adhesive bandages. They are Elastoplast and Band-Aid. Overall, which of these two brands – Elastoplast or Band-Aid – do you think is the better one? Or are both the same?
Elastoplast is better ______
Band-Aid is better ______
They are the same ______

If researchers wish to compare four brands of pens on the basis of attractiveness or writing quality, six comparisons [(n)(n − 1)/2] will be necessary.

behavioural differential A rating scale instrument similar to a semantic differential, developed to measure the behavioural intentions of subjects towards future actions.
paired comparison A measurement technique that involves presenting the respondent with two objects and asking the respondent to pick the preferred object. More than two objects may be presented, but comparisons are made in pairs.

Ranking objects with respect to one attribute is not difficult if only a few items, such as products or advertisements, are compared. As the number of items increases, however, the number of comparisons increases rapidly. If the number of comparisons is too large, respondents may become fatigued and no longer carefully discriminate among them. Ranking items also implicitly assumes that each item is more attractive or less attractive than each other item in a list. A ranking technique known as 'best–worst' scaling22 can be used to simplify rankings of a large number of items, and it allows for some items to be equally attractive or unattractive. Under best–worst scaling the respondent identifies the most-preferred option and also the least-preferred option. Experimental blocks can be used to present different groups of attributes and objects, which the respondent then ranks. An example of a best–worst scale is shown in Exhibit 8.12. Marley and Louviere23 have shown that simply calculating the difference between best and worst for each option can give almost identical results to much more complicated methods. This approach is discussed later in this chapter.


← EXHIBIT 8.12 A BEST–WORST RANKING APPROACH24

Please drag one assessment option into each of the boxes, according to your preferences.
Items:
• 60% Short answers Final, 20% Short answers Mid-term, 20% Individual Project, 0% Tutorials
• 40% MCQ Final, 20% MCQ Mid-term, 30% Group Project, 10% Tutorials
• 30% MCQ Final, 30% MCQ Mid-term, 30% Group Project, 10% Tutorials
Best: [          ]
Worst: [          ]

Sorting
Sorting tasks require that respondents indicate their attitudes or beliefs by arranging items on the basis of perceived similarity or some other attribute. Exhibit 8.13 shows a scale where respondents are asked to categorise services that are easier and harder to switch than a mobile phone service provider. Note that the sorting process can sometimes yield similar results to rankings when respondents are asked to group items as either best or worst, or in this case, easier and harder than switching mobile phone providers.

← EXHIBIT 8.13 A SORTING SCALE
Using your mouse, select which services you think are easier or harder to switch than switching to a different mobile phone service provider. Drag each item to their respective groups.
Items: Car insurance, Home insurance, Gas provider, Electricity provider, Water provider, Home phone provider, Internet provider, Bank, Mortgage provider, Other
Easier than switching a mobile phone service provider: [          ]
Harder than switching a mobile phone service provider: [          ]

A variant of the constant-sum technique uses physical counters (for example, poker chips or coins) to be divided among the items being tested. In an airline study of customer preferences, the following sorting technique could be used.



Here is a sheet that lists several airlines. Next to the name of each airline is a pocket. Here are 10 cards. I would like you to put these cards in the pockets next to the airlines with whom you would prefer to fly on your next trip. Assume that all of the airlines fly to wherever you would choose to travel. You can put as many cards as you want next to an airline, or you can put no cards next to an airline.

Cards
Singapore Airlines   [pocket]
Malaysia Airlines    [pocket]
United Airlines      [pocket]
Air New Zealand      [pocket]
Qantas Airways       [pocket]

Randomised response questions

randomised response questions A research procedure used for dealing with sensitive topics, in which a random procedure determines which of two questions a respondent will be asked to answer.

In special cases, such as when respondents are being asked to provide sensitive or embarrassing information in a survey, the researcher may use randomised response questions. To understand this procedure, it is helpful to consider a portion of a questionnaire from an Australian Tax Office survey on income tax cheating.

In this section you will be asked some questions about different things you might have done when filling out your tax return. A flip of a coin will determine which questions you are to answer. All that we will know is your answer of either 'yes' or 'no'; we will not know which question you are answering. I'll show you how it works in a minute, but the important thing to know is that your answers are completely anonymous. Using special kinds of statistics we will never know what you do. So, we hope you will be completely honest with us. Only in this way will this survey be of help to us.

For example, let's flip the coin. [HAVE EXAMPLE CARD READY.] Let's say it comes up heads. Then you will respond to the 'heads' statement: 'I had scrambled eggs for breakfast this morning.' If you did have scrambled eggs, you would say 'Yes'. If you did not have scrambled eggs, you would say 'No'. Now, the coin could come up tails and you would respond to the 'tails' question: 'I had potatoes for dinner last night.' You would say 'Yes' if you did and 'No' if you didn't.

1 Heads: Some time in the past, I have failed to file a tax return when I think I should have.
  Tails: I have lived in this community for over five years.
  Yes…1  No…2
2 Heads: Some time in the past, I purposely listed more deductions than I was entitled to.
  Tails: I voted in the last federal election.
  Yes…1  No…2
3 Heads: Some time in the past, I purposely failed to report some income on my tax return – even just a minor amount.
  Tails: I own a car.
  Yes…1  No…2
4 Heads: On at least one occasion, I have added a dependant that I wasn't entitled to.
  Tails: I have been to a film within the last year.
  Yes…1  No…2
5 Heads: To the best of my knowledge, my tax return for [year] was filled out with absolute honesty.
  Tails: I have eaten out in a restaurant within the last six months.
  Yes…1  No…2


6 Heads: I stretched the truth just a little in order to pay fewer taxes for [year].
  Tails: Generally, I watch one hour or more of television each day.
  Yes…1  No…2

This is the end of the interview. Thank you very much for your cooperation.

The coin flipping randomly determines which of the two questions the respondent answers. Thus, the interviewer does not know whether the sensitive question about income tax cheating or the meaningless question is being answered, because the responses ('Yes' or 'No') are identical for both questions. The proportion of 'Yes' answers to the income tax question is calculated by a formula that includes previous estimates of the proportion of respondents who answer 'Yes' to the meaningless question and the probability (Pr) that the meaningless question is being answered:

Pr('Yes' answer) = Pr('Yes' on question A) + Pr('Yes' on question B)
                 = Pr(question A is chosen) × Pr('Yes' on question A) + Pr(question B is chosen) × Pr('Yes' on question B)

Although estimates are subject to error, the respondent remains anonymous, and response bias is thereby reduced. Obviously researchers cannot know the answers for any individual, but the average for a reasonably large group tends to be quite accurate. The randomised response method originally was applied in personal interview surveys. However, randomised response questions in a slightly modified format have been successfully applied in other situations.
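Rearranging the formula above gives the quantity researchers actually report: the proportion saying 'Yes' to the sensitive question. The sketch below works through one hypothetical case, with an assumed coin-flip probability of 0.5 and an assumed known proportion for the innocuous question; all numbers are illustrative, not from the survey described above.

```python
# Sketch of estimating the sensitive proportion from randomised responses,
# rearranging the formula above:
#   Pr(Yes) = Pr(A chosen) * Pr(Yes | A) + Pr(B chosen) * Pr(Yes | B)
# so Pr(Yes | A) = (Pr(Yes) - Pr(B chosen) * Pr(Yes | B)) / Pr(A chosen).
# The numbers below are purely illustrative.

p_a = 0.5            # coin flip: probability the sensitive question A is asked
p_yes_b = 0.60       # assumed known proportion answering 'Yes' to the innocuous question B
observed_yes = 0.38  # observed proportion of 'Yes' answers in the sample

p_yes_a = (observed_yes - (1 - p_a) * p_yes_b) / p_a
print(f"Estimated proportion answering 'Yes' to the sensitive question: {p_yes_a:.2f}")
# (0.38 - 0.5 * 0.60) / 0.5 = 0.16
```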

Other methods of attitude measurement
Attitudes, as hypothetical constructs, cannot be measured directly. Therefore, measurement of attitudes is to an extent subject to the imagination of the researcher. The traditional methods used for attitude measurement have been presented here, but several other techniques that are discussed in the published literature (for example, the Guttman scale) can be used when a situation dictates. Advanced students will seek out these techniques when the traditional measures do not apply to their research problems. With the growth of computer technology, techniques such as multidimensional scaling and conjoint analysis are used more frequently. These complex techniques require knowledge of multivariate statistical analysis (see Chapter 15).

Selecting a measurement scale: Some practical decisions
Now that we have looked at a number of attitude measurement scales, a natural question arises: 'Which is most appropriate?' As in the selection of a basic research design, there is no single best answer for all research projects. The answer to this question is relative, and the choice of scale will depend on the nature of the attitudinal object to be measured, the manager's problem definition, and the backward and forward linkages to choices already made (for example, telephone survey versus mail survey). However, several questions will help focus the choice of a measurement scale:
1 Is a ranking, sorting, rating or choice technique best?
2 Should a monadic or a comparative scale be used?
3 What type of category labels, if any, will be used for the rating scale?
4 How many scale categories or response positions are needed to accurately measure an attitude?
5 Should a balanced or unbalanced rating scale be chosen?
6 Should a scale that forces a choice among predetermined options be used?
7 Should a single measure or an index measure be used?
We will discuss each of these issues now.

IS A RANKING, SORTING, RATING OR CHOICE TECHNIQUE BEST?
The answer to this question is determined largely by the problem definition and especially by the type of statistical analysis desired. For example, ranking provides only ordinal data, limiting the statistical techniques that may be used.

SHOULD A MONADIC OR A COMPARATIVE SCALE BE USED?
If the scale to be used is not a ratio scale, the researcher must decide whether to include a standard of comparison in the verbal portion of the scale. Consider the following rating scale.
Now that you've had your car for about one year, please tell us how satisfied you are with its engine power and acceleration.
Completely satisfied   Very satisfied   Fairly well satisfied   Somewhat dissatisfied   Very dissatisfied

This is a monadic rating scale because it asks about a single concept (the brand of car the individual actually purchased) in isolation. The respondent is not given a specific frame of reference. A comparative rating scale asks a respondent to rate a concept, such as a specific brand, in comparison with a benchmark – perhaps another similar concept, such as a competing brand – explicitly used as a frame of reference. In many cases the comparative rating scale presents an ideal situation as a reference point for comparison with the actual situation. For example:
Please indicate how the amount of authority in your present position compares with the amount of authority that would be ideal for this position.
Too much   About right   Too little

monadic rating scale Any measure of attitudes that asks respondents about a single concept in isolation.
comparative rating scale Any measure of attitudes that asks respondents to rate a concept in comparison with a benchmark explicitly used as a frame of reference.


WHAT TYPE OF CATEGORY LABELS, IF ANY, WILL BE USED FOR THE RATING SCALE?
We have discussed verbal labels, numerical labels and unlisted choices. Many rating scales have verbal labels for response categories because researchers believe that these help respondents better understand the response positions. The maturity and educational levels of the respondents will influence this decision. The semantic differential (with unlabelled response categories between two bipolar adjectives) and the numerical scale (with numbers to indicate scale positions) often are selected because the researcher wishes to assume interval-scale data.

HOW MANY SCALE CATEGORIES OR RESPONSE POSITIONS ARE NEEDED TO ACCURATELY MEASURE AN ATTITUDE?
Should a category scale have four, five or seven response positions or categories? Or should the researcher use a graphic scale with an infinite number of positions? The original developmental research on the semantic differential indicated that five to eight points was optimal. However, the researcher must determine the number of meaningful positions that is best for the specific project. This issue of identifying how many meaningful distinctions respondents can practically make is basically a matter of sensitivity, but at the operational rather than the conceptual level.
[Photo caption: Many rating scales have verbal labels. The scale shown in this humorous ad gives the reader a choice of appropriate category labels for ice-cream pleasure.]


SHOULD A BALANCED OR UNBALANCED RATING SCALE BE CHOSEN?
The fixed-alternative format may be balanced or unbalanced. For example, the following question, which asks about parent–child decisions relating to television program watching, is a balanced rating scale.
Who decides which television programs your children watch?
→→ Child decides all of the time.
→→ Child decides most of the time.
→→ Child and parent decide together.
→→ Parent decides most of the time.
→→ Parent decides all of the time.
This scale is balanced because a neutral point, or point of indifference, is at the centre of the scale. Unbalanced rating scales may be used when responses are expected to be distributed at one end of the scale. Unbalanced scales, such as the following one, may eliminate this type of 'end piling'.
Satisfied     Neither satisfied nor dissatisfied     Quite dissatisfied     Very dissatisfied
The nature of the concept or the researcher's knowledge about attitudes towards the stimulus to be measured generally will determine the choice of a balanced or unbalanced scale.

balanced rating scale A fixed-alternative rating scale with an equal number of positive and negative categories; a neutral point or point of indifference is at the centre of the scale.
unbalanced rating scale A fixed-alternative rating scale that has more response categories piled up at one end and an unequal number of positive and negative categories.
forced-choice rating scale A fixed-alternative rating scale that requires respondents to choose one of the fixed alternatives.
non-forced-choice rating scale A fixed-alternative rating scale that provides a 'no opinion' category or that allows respondents to indicate that they cannot say which alternative is their choice.

SHOULD A SCALE THAT FORCES A CHOICE AMONG PREDETERMINED OPTIONS BE USED?
In many situations a respondent has not formed an attitude towards the concept being studied and simply cannot provide an answer. If a forced-choice rating scale compels the respondent to answer, the response is merely a function of the question. If answers are not forced, the midpoint of the scale may be used by the respondent to indicate unawareness as well as indifference. If many respondents in the sample are expected to be unaware of the attitudinal object under investigation, this problem may be eliminated by using a non-forced-choice rating scale that provides a 'no opinion' category. For example:
How does Westpac compare with NAB?
Westpac is better than NAB.
Westpac is about the same as NAB.
Westpac is worse than NAB.
Can't say.
Asking this type of question allows the investigator to separate respondents who cannot make an honest comparison from respondents who have had experience with both banks. The argument for forced choice is that people really do have attitudes, even if they are unfamiliar with the banks, and should be required to answer the question. Higher incidences of 'no answer' are associated with forced-choice questions.




SHOULD A SINGLE MEASURE OR AN INDEX MEASURE BE USED?
How complex is the issue to be investigated? How many dimensions does the issue contain? Are individual attributes of the stimulus part of a holistic attitude, or are they seen as separate items? The researcher's conceptual definition will be helpful in making this choice. A single-item scale can assess very simple or concrete concepts:
Did you watch MasterChef last night?   Yes   No
Do you like pistachio ice-cream?   Don't like at all  __________  Like a lot

Other measures, such as social class, require several items to form an index. Latent constructs like the personality trait of innovativeness generally require multiple items. The researcher also has many scaling options. Generally, the choice is influenced by plans for the later stages of the research project. Again, problem definition becomes a determining factor influencing the research design.

EXPLORING RESEARCH ETHICS

THE CLIENT CHANGES THE MEASURES

In August 2009, a market research company reported to the Australian Market & Social Research Society (AMSRS) that some of the measurement scales in a tracking study had been changed between two periods of the study. A selection of the measures was published in the client's annual report; some of these were measured on the new scales and others were not. It appeared that the client had taken a rough average to convert from the old scales to the new ones for all the time-series charts in the annual report, and the charts were footnoted with the fact that the scales had changed. Overall, using the more precise conversion, where available, does not change the picture much, but a couple of the measures did look quite different, and it concerned the market research company that this might be the case for others too, and so be misleading to readers. Under Section 27 of the AMSRS code of ethics, the market research company can elect to have its name removed from the report, since the findings can be potentially misleading.

Attitudes and intentions measurement reliability

Market researchers often model behaviour as a function of intentions, which in turn are considered a function of a person's beliefs about an object's characteristics weighted by the person's evaluations of those characteristics. This type of research is sometimes called the theory of reasoned action approach (TRA)25 and entails the use of a multi-attribute model. For example, a consumer's decision to change mobile phone service providers can be modelled by their intention to do so, which in turn is a function of their attitude.26 Likewise, a researcher may first measure the attitude as a way of knowing how likely a consumer would be to respond to a survey on Facebook.

multi-attribute model A means of measuring an attitude to an object by asking respondents to evaluate each part of the object. Attitude scores are based on the product of belief strength and the evaluation of the consequences.

Multi-attribute attitude score
Attitudes are often modelled with a multi-attribute approach: belief scores, assessed with a rating scale like those already described, are each multiplied by the evaluation of the consequences, also supplied on a rating scale, and the resulting products are summed. For instance, a series of Likert statements might assess a respondent's beliefs about Internet providers' connections being established successfully, being established speedily, dropping out, and the price or monthly fee charged. The respondents use a simple rating scale to assess how good or bad each characteristic is. In this example, the connection dropping out in the middle of the session is rated as bad (−3), which is much worse than the price being paid for the service (−1).

TABLE 8.7 » ATTITUDE TOWARDS TWO INTERNET PROVIDERS

Evaluation of attribute: Unlikely  1  2  3  4  5  Likely
Evaluation of the consequences: Very bad  −3  −2  −1  0  +1  +2  +3

| Attribute | iiNet | Telstra | Evaluation of the consequences |
| 1 Connection will be established successfully every time. | 5 | 3 | +3 |
| 2 The connection will be established speedily. | 3 | 4 | +2 |
| 3 The connection will drop out in the middle of the session. | 3 | 3 | −3 |
| 4 The price (monthly fee) will be high. | 5 | 3 | −1 |

The respondent's attitude towards Telstra as an Internet Service Provider (ISP), as shown in Table 8.8, can be found by multiplying belief scores by evaluations.

TABLE 8.8 » ATTITUDE TOWARDS TELSTRA AS AN ISP

Evaluation of attribute: Unlikely  1  2  3  4  5  Likely
Evaluation of the consequences: Very bad  −3  −2  −1  0  +1  +2  +3

| Attribute | Telstra | Evaluation of the consequences | Belief × evaluation of the consequences |
| 1 Connection will be established successfully every time. | 3 | × (+3) = | 9 |
| 2 The connection will be established speedily. | 4 | × (+2) = | 8 |
| 3 The connection will drop out in the middle of the session. | 3 | × (−3) = | −9 |
| 4 The price (monthly fee) will be high. | 3 | × (−1) = | −3 |
| Attitude towards Telstra as an ISP | | | 5 |

A researcher may also ask the respondent to rate a competitor's product or service – in this case iiNet, whose attitude score using the same approach is 7. This would suggest that a possible competitive advantage exists for iiNet compared with Telstra. The multi-attribute model is popular with market researchers because it can provide diagnostic information not only on how brands compare with each other, but also on which attributes matter most; in this case the two most heavily weighted attributes are the connection being established successfully every time (+3 for the evaluation of the consequences) and the connection dropping out in the middle of the session (−3 for the evaluation of the consequences).
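The arithmetic in Tables 8.7 and 8.8 can be reproduced in a few lines. The sketch below multiplies each belief score by the evaluation of the consequences and sums the products for each provider; it simply restates the tables, and the variable names are illustrative.

```python
# Sketch of the multi-attribute (belief x evaluation) calculation used in
# Tables 8.7 and 8.8: each belief score is multiplied by the evaluation of
# the consequences and the products are summed for each provider.

evaluations = [+3, +2, -3, -1]            # evaluation of the consequences, items 1-4
beliefs = {
    "iiNet":   [5, 3, 3, 5],              # belief scores from Table 8.7
    "Telstra": [3, 4, 3, 3],
}

for provider, b in beliefs.items():
    attitude = sum(bi * ei for bi, ei in zip(b, evaluations))
    print(f"Attitude towards {provider}: {attitude}")
# iiNet scores 15 + 6 - 9 - 5 = 7; Telstra scores 9 + 8 - 9 - 3 = 5.
```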

ONGOING PROJECT

DISADVANTAGES OF THE ATTITUDE-TOWARDS-THE-OBJECT MODEL
Because the attitude-towards-the-object model is simple, it is also often simplistic. One important assumption is that respondents are able to atomise their judgements about an object. That is, it is assumed that people are able to think about a product category in terms of its component attributes and are able to consciously calculate and declare the relative importance of each of those attributes. Do you think you could do a good job of working out how much more important 'price' is to you than 'travelling distance' for a restaurant? How about 'prestige' over 'location' for a university? Further, the model assumes that respondents are not just able to atomise their evaluations but are also willing to divulge such evaluations honestly. Do you think everyone would honestly declare the relative importance of, say, recyclable packaging over price for a grocery product?

Another important assumption is that respondents are able to separate their beliefs about an object and its attributes from evaluations of the importance of those attributes. Studies have shown that when respondents feel that an attribute is very important (ei), they tend to rate objects with more extreme values on those attributes (bi). If an attribute is unimportant then every object gets an 'average' score on that attribute, but on the important attributes respondents will give much higher and lower scores. That is, the 'importance' weight tends to be contained within the bi score.

Also, mathematically, multiplying two rating scales together is plain wrong. When two numbers are multiplied it is a requirement that at least one of those numbers is ratio scaled and the other is at least interval scaled. Thus, the brand evaluation measure (bi) is assumed to be interval scaled, and the evaluation of goodness (ei) is assumed to be ratio scaled. So a mark of 3 on the goodness (ei) scale literally means that the attribute is three times as important as a mark of 1, and that 1 is half as important as 2. It is quite likely that the ratio-scaled ei measure assumption would not hold up.

Finally, as an overall measure of attitude towards an object, the multi-attribute approach seems to work well only when the product category has high enduring involvement for the respondent. It must be sufficiently important to the respondent that he or she is willing to spend the mental effort to think about the relative importance of each attribute. Even then, the product category must be sufficiently immune to perceived social judgement that respondents will not consciously or unconsciously modify their evaluations in order to gain social approval, or avoid social disapproval.

BEST–WORST SCALING
When people make choices they usually have to make a compromise. If I buy one ice-cream cone, then I am also deciding not to buy a different flavour. Best–worst scaling helps to measure how attractive one option is when compared with the alternatives. It is becoming a very popular form of comparative rating scale.


Let's say we're interested in holiday destinations for young people. We could ask respondents to rate these in terms of attractiveness, or likelihood of purchase, or we could ask people to rank them on the same criteria. Ratings, despite their popularity (see the scales we have shown you already), have many shortcomings. Ratings assume that we can make decisions in isolation from competitive options, and they assume that all respondents use the scales in the same way – that one respondent's 8 out of 10 is the same as another respondent's 8 out of 10. If all options are attractive, then how do we deal with answers that say that everything scores 8, 9 or 10 out of 10? How do we deal with ties when we just want to know which option 'wins'? Best–worst scaling helps us with all of these problems. As professional marketers we really are not interested in what the objective score of one item might be – we are more interested in whether one item is more attractive than another. Best–worst scaling is a measurement technique that does not assume that all people use the same evaluation rules, and that explicitly takes into account that evaluations are usually relative to alternatives.

Best–worst scaling is an extension of paired-comparison data gathering. If we have three objects (A, B, C) that we want to compare, then we have three possible pairs: AB, AC and BC. Four objects give us six pairs. If we have eight objects, then there are 28 pairs. The number of pairs grows so quickly that respondents soon become fatigued and make nonsensical or random choices. Fortunately, if we make some simple assumptions about how people make decisions (for example, that people are logically consistent), then we can get a great deal of information from a smaller number of choice tasks.

Let's say there are eight options for a semester-break holiday – see Exhibit 8.14. These eight options can be presented in a balanced incomplete block (BIB) experimental design so that choices are made much simpler for respondents. This BIB design has 14 blocks of four options. Note that each option appears seven times, and that each option appears with each other option three times. The first block, row #1 in the table, is made up of levels 8 – Rottnest Island, 2 – Queenstown, 3 – Byron Bay and 5 – Surfers Paradise. The first block might appear in a questionnaire as in Exhibit 8.16.

EXHIBIT 8.14 → OPTIONS FOR SEMESTER BREAK
1 Bali (Indonesia)
2 Queenstown (New Zealand)
3 Byron Bay (North Coast, New South Wales)
4 Noosa (Sunshine Coast, Queensland)
5 Surfers Paradise (Gold Coast, Queensland)
6 Magnetic Island (Great Barrier Reef)
7 Fiji
8 Rottnest Island (Western Australia)

EXHIBIT 8.15 → BALANCED INCOMPLETE BLOCK DESIGN (EIGHT LEVELS, 14 BLOCKS, SEVEN REPS/LEVEL, THREE PAIR FREQ.)

| Block# | Options |
| 1 | 8  2  3  5 |
| 2 | 1  4  7  6 |
| 3 | 8  3  4  6 |
| 4 | 2  5  1  7 |
| 5 | 8  4  5  7 |
| 6 | 3  6  2  1 |
| 7 | 8  5  6  1 |
| 8 | 4  7  3  2 |
| 9 | 8  6  7  2 |
| 10 | 5  1  4  3 |
| 11 | 8  7  1  3 |
| 12 | 6  2  5  4 |
| 13 | 8  1  2  4 |
| 14 | 7  3  6  5 |
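The balance properties claimed for Exhibit 8.15 (every option appears seven times, and every pair of options appears together three times) are easy to verify mechanically. The sketch below does exactly that for the 14 blocks listed above.

```python
# Sketch that checks the balance properties of the BIB design in Exhibit 8.15:
# with 8 options in 14 blocks of 4, every option should appear 7 times and
# every pair of options should appear together 3 times.

from itertools import combinations
from collections import Counter

blocks = [
    (8, 2, 3, 5), (1, 4, 7, 6), (8, 3, 4, 6), (2, 5, 1, 7), (8, 4, 5, 7),
    (3, 6, 2, 1), (8, 5, 6, 1), (4, 7, 3, 2), (8, 6, 7, 2), (5, 1, 4, 3),
    (8, 7, 1, 3), (6, 2, 5, 4), (8, 1, 2, 4), (7, 3, 6, 5),
]

option_counts = Counter(option for block in blocks for option in block)
pair_counts = Counter(pair for block in blocks
                      for pair in combinations(sorted(block), 2))

print(set(option_counts.values()))  # {7}: each option appears seven times
print(set(pair_counts.values()))    # {3}: each pair appears together three times
```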




EXHIBIT 8.16 → BLOCK

Considering the following options for a holiday during the mid-semester break, please tick the one destination you find most attractive, and the one option you find least attractive.
Rottnest            Most attractive [ ]   Least attractive [ ]
Queenstown          Most attractive [ ]   Least attractive [ ]
Byron Bay           Most attractive [ ]   Least attractive [ ]
Surfers Paradise    Most attractive [ ]   Least attractive [ ]

The same choice task is repeated for all 14 blocks so that respondents have evaluated each option multiple times.

Obviously this is a process that takes some time, but it tends to provide much more reliable results than ordinary ranking or ratings tasks because respondents are forced to think more about the evaluation task. Let's say one respondent, Pam, is consistent in her evaluations and she has a preference that could be ranked 1 to 8 in our list. Her judgements of the eight options in the 14 choice tasks might look like that in Exhibit 8.17. Pam lives in Sydney, so she felt that Western Australia was too far for only a 10-day break. She also had previously visited most other mainland destinations, so international locations appealed to her. Other people are likely to have quite different utility functions.

EXHIBIT 8.17 → PAM'S BEST–WORST SELECTIONS FOR SEMESTER-BREAK HOLIDAY DESTINATION

Block#   Options                                                Best          Worst
  1      Rottnest, Queenstown, Byron Bay, Surfers Paradise      Queenstown    Rottnest
  2      Bali, Noosa, Fiji, Magnetic Island                     Bali          Fiji
  3      Rottnest, Byron Bay, Noosa, Magnetic Island            Byron Bay     Rottnest
  4      Queenstown, Surfers Paradise, Bali, Fiji               Bali          Fiji
  5      Rottnest, Noosa, Surfers Paradise, Fiji                Noosa         Rottnest
  6      Byron Bay, Magnetic Island, Queenstown, Bali           Bali          Magnetic Island
  7      Rottnest, Surfers Paradise, Magnetic Island, Bali      Bali          Rottnest
  8      Noosa, Fiji, Byron Bay, Queenstown                     Queenstown    Fiji
  9      Rottnest, Magnetic Island, Fiji, Queenstown            Queenstown    Rottnest
 10      Surfers Paradise, Bali, Noosa, Byron Bay               Bali          Surfers Paradise
 11      Rottnest, Fiji, Bali, Byron Bay                        Bali          Rottnest
 12      Magnetic Island, Queenstown, Surfers Paradise, Noosa   Queenstown    Magnetic Island
 13      Rottnest, Bali, Queenstown, Noosa                      Bali          Rottnest
 14      Fiji, Byron Bay, Magnetic Island, Surfers Paradise     Byron Bay     Fiji

Summarising Pam’s preferences, we would see a table like that in Exhibit 8.18. (This can be done very quickly with a spreadsheet.) We can derive a score for each option by simply subtracting the frequency of Worst from the frequency of Best. Note that the scores range from +7 to −7, meaning that each item can be located on a 15-point scale. Of course, if we used a different BIB design with, say, fewer repeated evaluations, then the scale would be different. The negative scores do not mean that


an item is unattractive – it simply means that it is less attractive than the other options. If you prefer to see scores that are all positive, then you can derive another scale – the square root of the ratio of best to worst frequencies. (We overcome the problem of zeros in the numerator or denominator by adding 1/k to each frequency, where k = the total number of options. Here, k = 8.)

EXHIBIT 8.18 → SUMMARY RESULTS OF PAM’S BEST–WORST EVALUATIONS

Option#   Option              Frequency Best   Frequency Worst   B–W    sqrt(B/W)(a)
  1       Bali                       7                0            7        7.55
  2       Queenstown                 4                0            4        5.74
  3       Byron Bay                  2                0            2        4.12
  4       Noosa                      1                0            1        3.00
  5       Surfers Paradise           0                1           −1        0.33
  6       Magnetic Island            0                2           −2        0.24
  7       Fiji                       0                4           −4        0.17
  8       Rottnest                   0                7           −7        0.13

(a) To address the problem of zeros in the frequencies, we add 1/k (where k = number of options) to each frequency. So here, for example, Bali is coded sqrt(7.125/0.125).

It has been suggested that the square root of the B–W ratio can be used as a ratio scale (that is, we could claim, for example, that Queenstown is nearly twice as attractive as Noosa), but this is only true if we confine ourselves to one particular BIB design and to the exact options used in that design – that is, only if the choice alternatives used represent the universe of all possible options that could be considered. Given that both the BIB design and the choice options are arbitrarily chosen by the researcher, we are hardly measuring a ratio scale of respondents’ preferences. In both cases the measures, B–W and sqrt(B/W), may be considered interval scales. So now that we have attractiveness measures, or utility scores, for each of the options under consideration, we can use these data to predict how many people are likely to actually buy each alternative. Using a rather exotic statistical technique called multinomial logit (MNL), we can estimate market share for each alternative – the proportion of people who will buy each one. We have found that the square root of the B–W ratio is approximately proportional to the market shares of people in the same market segment. We can discover groups of people with similar utility functions using cluster analysis, discussed briefly in Chapter 15. These clusters are market segments based on utility or preference. Results like these can be gathered very quickly with online survey software, and the scores for each option for each respondent can be calculated very quickly using a standard spreadsheet. Some worked examples are available from the website for this book.
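As a concrete illustration of the arithmetic just described, here is a minimal Python sketch (ours, not the authors’ worked examples) that reproduces the B–W and sqrt(B/W) columns of Exhibit 8.18 from Pam’s best and worst counts, and then normalises the sqrt(B/W) scores as a rough indication of within-segment shares, on the assumption stated in the text that the square root of the B–W ratio is approximately proportional to market share within a segment.

from math import sqrt

# Pam's best and worst frequencies from Exhibit 8.18 (each option was shown seven times).
counts = {
    "Bali": (7, 0), "Queenstown": (4, 0), "Byron Bay": (2, 0), "Noosa": (1, 0),
    "Surfers Paradise": (0, 1), "Magnetic Island": (0, 2), "Fiji": (0, 4), "Rottnest": (0, 7),
}
k = len(counts)          # number of options; 1/k is added to avoid zeros in the ratio

scores = {}
for option, (best, worst) in counts.items():
    b_minus_w = best - worst                              # interval-type B-W score
    ratio = sqrt((best + 1 / k) / (worst + 1 / k))        # sqrt(B/W) with the 1/k adjustment
    scores[option] = (b_minus_w, ratio)

total = sum(ratio for _, ratio in scores.values())
for option, (b_minus_w, ratio) in scores.items():
    share = ratio / total   # rough within-segment share implied by normalising sqrt(B/W)
    print(f"{option:17s}  B-W = {b_minus_w:+d}   sqrt(B/W) = {ratio:4.2f}   implied share = {share:5.1%}")

Running this reproduces the sqrt(B/W) column shown in Exhibit 8.18 (7.55 down to 0.13).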

ADVANTAGES OF BEST–WORST SCALING

As we noted at the beginning of this section, best–worst scaling has the advantage over most other measurement techniques in that it does not assume that all respondents use the same judgement rules, or that similar scores mean the same thing. It relies only on respondents making evaluations of whether one object is better than another. It is consistent with how people really do make decisions – usually a trade-off is made where making a choice means that another option is


passed over. Choices are an easy task for most respondents. Can you imagine yourself looking at a restaurant menu and saying ‘I give this dish a score of 7, and this one a score of 5’? Why would we expect our respondents to think like that? Instead, most people make quick relative judgements: one is more attractive than another, and so on.

DISADVANTAGES OF BEST–WORST SCALING

While best–worst scaling is much more natural and easy for respondents, it clearly is much more complicated for the researcher. The researcher must select an experimental design that includes as many options as possible, and design a question protocol so that respondents are not confused or fatigued with the task. The calculations can take some time and effort on a spreadsheet, and these may be difficult for some managers to understand. If a manager doesn’t understand something, then she may not trust it. A best–worst scaling task can take considerably longer for respondents than alternative scoring methods. Often this is a good thing because respondents think more about the task and give much more reliable answers. On the other hand, it takes up time, and that can tax respondents’ patience.

Sample balanced incomplete block (BIB) designs

The following tables in Exhibit 8.19 are popular BIB experimental designs that can be used for best–worst scaling. They can be classified according to the number of levels (objects to be evaluated), the number of times a level is evaluated in the whole design, the number of blocks (choice tasks), the number of levels in each block, and the number of times any pair appears together in one block. In the design labels below the numbers give, in order, the number of levels, the number of blocks, the repetitions of each level, the number of levels per block and the pair frequency; for example, BIB 8.14.7.4.3 has eight levels, 14 blocks, seven repetitions of each level, four levels per block, and each pair appearing together in three blocks.

EXHIBIT 8.19 → SAMPLE BALANCED INCOMPLETE BLOCK (BIB) DESIGNS

BIB 8.14.7.4.3
Block  1:  8  2  3  5
Block  2:  1  4  7  6
Block  3:  8  3  4  6
Block  4:  2  5  1  7
Block  5:  8  4  5  7
Block  6:  3  6  2  1
Block  7:  8  5  6  1
Block  8:  4  7  3  2
Block  9:  8  6  7  2
Block 10:  5  1  4  3
Block 11:  8  7  1  3
Block 12:  6  2  5  4
Block 13:  8  1  2  4
Block 14:  7  3  6  5

BIB 9.12.4.3.1
Block  1:  2  4  8
Block  2:  1  4  5
Block  3:  4  7  9
Block  4:  3  4  6
Block  5:  1  2  3
Block  6:  2  5  7
Block  7:  2  6  9
Block  8:  1  8  9
Block  9:  5  6  8
Block 10:  3  7  8
Block 11:  1  6  7
Block 12:  3  5  9


BIB 10.15.6.4.2
Block  1:  4  7  8  9
Block  2:  3  6  8  10
Block  3:  2  5  9  10
Block  4:  1  8  9  10
Block  5:  4  5  6  10
Block  6:  3  5  7  9
Block  7:  2  6  7  8
Block  8:  1  5  6  7
Block  9:  2  3  7  10
Block 10:  2  4  6  9
Block 11:  3  4  5  8
Block 12:  1  2  3  4
Block 13:  1  4  7  10
Block 14:  1  3  6  9
Block 15:  1  2  5  8

Youden 13.13.4.4.1
Block  1:  11  8  5  2
Block  2:  7  1  6  5
Block  3:  12  5  4  10
Block  4:  5  13  9  3
Block  5:  4  3  2  1
Block  6:  13  2  10  7
Block  7:  2  6  12  9
Block  8:  1  12  11  13
Block  9:  9  11  7  4
Block 10:  6  10  3  11
Block 11:  10  9  1  8
Block 12:  3  7  8  12
Block 13:  8  4  13  6

Youden 16.16.6.6.2
Block  1:  12   8  15  13  14   4
Block  2:  16  11  14   7   3  13
Block  3:   6  13   2  16  15  10
Block  4:   9  15   1  14  16   5
Block  5:   4  16   8   9  10  11
Block  6:  10  12   7   3   9  15
Block  7:  11  14   9   2   6  12
Block  8:  13   5  10  11  12   1
Block  9:   7   4  16  12   5   6
Block 10:   5   3   6  15  11   8
Block 11:  14  10   5   8   7   2
Block 12:   8   7  13   6   1   9
Block 13:   3   2  12   1   8  16
Block 14:  15   1  11   4   2   7
Block 15:   1   6   3  10   4  14
Block 16:   2   9   4   5  13   3
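When a design like these is keyed in by hand it is easy to introduce a typing error, so it is worth checking that the design still has the properties its label claims. The following Python sketch (ours, for illustration only) checks the BIB 9.12.4.3.1 design above: each of the nine levels should appear four times, and every pair of levels should appear together in exactly one block.

from collections import Counter
from itertools import combinations

# BIB 9.12.4.3.1 from Exhibit 8.19: 9 levels, 12 blocks, 4 reps per level, 3 per block, pair frequency 1
design = [
    (2, 4, 8), (1, 4, 5), (4, 7, 9), (3, 4, 6), (1, 2, 3), (2, 5, 7),
    (2, 6, 9), (1, 8, 9), (5, 6, 8), (3, 7, 8), (1, 6, 7), (3, 5, 9),
]

level_counts = Counter(level for block in design for level in block)
pair_counts = Counter(frozenset(pair) for block in design for pair in combinations(block, 2))

print("Replications per level:", dict(sorted(level_counts.items())))   # each should be 4
print("All pairs appear exactly once:", set(pair_counts.values()) == {1})

The same two checks, with the expected counts changed, apply to any of the other designs in Exhibit 8.19.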


Pricing

Pricing decisions are among the most difficult for any marketing manager. Price too high and you lose customers, but price too low and you lose profits. It’s a balancing game that must take into account costs, seasonality, competition, demand, market segments and other issues. Consumers can help us decide on pricing by helping us know what prices to expect, and what they are willing to pay. In the following paragraphs we look briefly at two measures: one for price expectations and another for willingness to pay.

EXPECTED PRICE

Usually, expected price is measured with a direct question: ‘At what price would you expect to see BLANK advertised for?’

The main problem with this sort of question is that we assume that the respondent knows enough about the product category to give a reasonable answer. We can overcome this problem by anchoring respondents’ perceptions with the prices of competitive products already on the market: ‘Samsung Galaxy 5 is priced at $XXX, and Apple iPhone 6 is priced at $YYY. What would you expect the new HTC Mind to be priced at?’

Responses can be recorded as marks on a linear scale, or by asking respondents to write in an amount. Online questions can show comparative anchors of competitors and ask respondents to indicate where the target brand would be. Almost always the better response method is the one that is easiest for the respondent.

WILLINGNESS TO PAY

Again, new researchers often use a direct question such as: ‘What price would you be willing to pay for BLANK?’

Such a question rarely gives a reliable answer. It is a very blunt instrument. It gives you a point estimate of a price, when managers usually want a range to work within. Respondents tend to ‘low-ball’ their answers, quoting a lower price in the survey than they would actually pay because they want to get a bargain. Finally, like the price expectations question, the product is tested in isolation from competitors. One solution is the Van Westendorp Price Sensitivity Meter.

VAN WESTENDORP PRICE SENSITIVITY METER

The Van Westendorp Price Sensitivity Meter27 takes into account the fact that price affects two aspects of our judgement. If the price is too high, it is too much for someone’s budget; at a slightly lower price the same person may decide that an item is expensive but not totally eliminated from consideration. We know from our own experience that sometimes a high price connotes higher quality. At the other end, a low price may be regarded as a bargain. But if the price is too low, a product may become unattractive again – not because people can’t afford it but because they infer that it is not of sufficiently high quality. A good pricing strategist wants to be above that bargain level, but not down at the too-cheap level. Similarly, you can price up to the expensive level, but not at the too-expensive level.

Van Westendorp Price Sensitivity Meter questions

1 At what price would you consider BLANK to be so expensive that you would not consider buying it?
2 At what price would you consider BLANK starting to get expensive, so that it is not out of the question, but you would have to give some thought to buying it?


3 At what price would you consider BLANK to be a bargain, a great buy for the money?
4 At what price would you consider BLANK to be priced so low that you would feel the quality couldn’t be very good?

Diagram the results as cumulative distributions: the proportion of responses that regard a particular price, or any lower price, as too cheap, overlaid with the proportion of responses that regard that price, or any higher price, as too expensive. Two of the questions – Bargain and Getting expensive – are plotted as inverse cumulative distributions (1 minus the proportion), labelled ‘(not)’ in Exhibit 8.20.

EXHIBIT 8.20 → TOO CHEAP AND GETTING EXPENSIVE

[Line chart: cumulative percentage of respondents (0% to 100%) plotted against price (approximately $120 to $310), showing four curves – Too cheap, Getting expensive (not), Bargain (not) and Too expensive.]
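To make the charting step concrete, here is a minimal Python sketch (ours, with invented response data) of the cumulative proportions that underlie a chart like Exhibit 8.20. It computes, for a range of candidate prices, the share of respondents who would still call that price too cheap and the share for whom it is already too expensive; the Bargain (not) and Getting expensive (not) curves are built the same way from the other two questions and plotted as 1 minus the proportion.

# Sketch: cumulative proportions behind a Van Westendorp-style chart (invented data).
too_cheap_answers = [130, 140, 150, 160, 170, 180]        # Q4: priced so low that quality is doubted
too_expensive_answers = [260, 270, 280, 290, 300, 310]    # Q1: so expensive it would not be considered

def pct_too_cheap(price):
    # Proportion of respondents whose 'too cheap' point is at this price or above,
    # i.e. who would still regard this price as too cheap.
    return sum(price <= answer for answer in too_cheap_answers) / len(too_cheap_answers)

def pct_too_expensive(price):
    # Proportion of respondents for whom this price has already reached their 'too expensive' point.
    return sum(price >= answer for answer in too_expensive_answers) / len(too_expensive_answers)

for price in range(120, 321, 20):
    print(f"${price}:  too cheap {pct_too_cheap(price):4.0%}   too expensive {pct_too_expensive(price):4.0%}")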

Such a question system assumes rational decision making by respondents. It wouldn’t make sense if a ‘bargain’ price was actually lower than a ‘too cheap’ price. With an online, computerised questionnaire, we can check that responses are in the proper order. The question set also examines pricing in isolation from competition. It is possible that respondents will low-ball the prices, but there is debate on that among professional marketing researchers. The consensus is that the technique probably works best with new products where there are few competitors on which to anchor one’s judgements.

The net promoter score

The net promoter score (NPS) is popular with business as a quick and efficient means of estimating competitive position and determining whether customer feedback is likely to lead to growth. To calculate an organisation’s NPS, a simple 11-point (0–10) scale is used which asks respondents: How likely is it that you would recommend [brand] to a friend or colleague? Respondents are grouped as follows:
→ Promoters (score 9–10) are loyal enthusiasts who will keep buying and refer others, fuelling growth.
→ Passives (score 7–8) are satisfied but unenthusiastic customers who are vulnerable to competitive offerings.
→ Detractors (score 0–6) are unhappy customers who can damage your brand and impede growth through negative word-of-mouth.
Subtracting the percentage of detractors from the percentage of promoters yields the net promoter score, which can range from a low of –100 (if every customer is a detractor) to a high of 100 (if every customer is a promoter).


EXHIBIT 8.21 → NET PROMOTER SCORE28

[The question ‘On a scale from 0–10, how likely are you to recommend [insert company name here] to a friend or colleague?’ is answered on an 11-point scale (0 = Not at all likely, 5 = Neutral, 10 = Extremely likely). NPS (Net Promoter Score) = Promoters (%) (9s and 10s) − Detractors (%) (0 through 6s).]
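As a small illustration of the calculation in Exhibit 8.21, the following Python sketch (ours, with invented scores) groups a set of 0–10 responses into promoters, passives and detractors and computes the NPS, along with the simple average advocacy score discussed in the next paragraph.

# Sketch: NPS arithmetic for a set of 0-10 'likelihood to recommend' scores (invented data).
scores = [10, 9, 9, 8, 7, 7, 6, 5, 10, 3, 8, 9, 2, 7, 10]

promoters = sum(s >= 9 for s in scores)       # 9s and 10s
detractors = sum(s <= 6 for s in scores)      # 0 through 6
passives = len(scores) - promoters - detractors

nps = 100 * (promoters - detractors) / len(scores)
print(f"Promoters {promoters}, passives {passives}, detractors {detractors}, NPS = {nps:.0f}")

# The simpler alternative mentioned in the text: report the average advocacy score instead.
print(f"Average advocacy score = {sum(scores) / len(scores):.1f}")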

There are some serious issues raised in academic research about the net promoter score. Studies have shown it does not perform as well as other measures, such as satisfaction, in predicting firm success, and it is less accurate than a composite set of questions. The categorisation of promoters, passives and detractors has no basis in reality: there is no evidence that customers who tick 9 or 10 actually promote the brand, or that those who tick anything between 0 and 6 actually detract from it. The NPS calculation arbitrarily removes one potentially important group, the passives, and assumes that a mark of 0 has the same value as a mark of 6. The claimed benefits of using NPS for customer retention and profit rarely hold up in real cases. In fact, the evidence shows little relationship between actual advocacy and NPS measures. The researcher is probably better off simply using an average NPS advocacy score rather than the difference between promoters and detractors.29 Despite these concerns, as will be seen in the next section of this chapter, the NPS is still preferred by many companies because of its practicality and simplicity as a single-item measure of customer loyalty.

STEP 6: EVALUATE THE MEASURE

There are four major criteria for evaluating measurements: reliability, validity, sensitivity and, most importantly, practicality.

Reliability

reliability The degree to which measures are free from random error and therefore yield consistent results.
test–retest method Administering the same scale or measure to the same respondents at two separate points in time to test for stability.

A tailor measuring fabric with a tape measure obtains a ‘true’ value of the fabric’s length. If the tailor repeatedly measures the fabric and each time comes up with the same length, it is assumed that the tape measure is reliable. When the outcome of the measuring process is reproducible, the measuring instrument is reliable. Reliability applies to a measure when similar results are obtained over time and across situations. Broadly defined, reliability is the degree to which measures are free from random error and therefore yield consistent results. For example, ordinal measures are reliable if they consistently rank order items in the same manner; reliable interval measures consistently rank order and maintain the same distance between items. Imperfections in the measuring process that affect the assignment of scores or numbers in different ways each time a measure is taken,

such as when a respondent misunderstands a question, cause low reliability. The actual choice among plausible responses may be governed by such transitory factors as mood, whim or the context set by surrounding questions; measures are not always error free and stable over time. There are two dimensions underlying the concept of reliability: repeatability and internal consistency. Assessing the repeatability of a measure is the first aspect of gauging reliability. The test–retest method of determining reliability involves administering the same scale or measure to the same respondents at two separate times to test for stability. If the measure is stable over time, a test that is administered under the same conditions each time should obtain similar results. For example, suppose a researcher at one time attempts to measure buying intentions and finds that 12 per cent of


the population is willing to purchase a product. If the study is repeated a few weeks later under similar conditions and the researcher again finds that 12 per cent of the population is willing to purchase the product, the measure appears to be reliable. The high stability correlation or consistency between the two measures at time one and time two indicates a high degree of reliability. As an example at the individual (rather than the aggregate) level, assume that a person does not change his or her attitude about pilsner beer. If repeated measurements of that individual’s attitude towards pilsner beer are taken with the same attitude scale, a reliable instrument will produce the same results each time the attitude is measured.30 When a measuring instrument produces unpredictable results from one testing to the next, the results are said to be unreliable because of error in measurement. As another example, consider these remarks a Gillette executive made about the reliability problems in measuring reactions to razor blades:

‘There is a high degree of noise in our data, a considerable variability in results. It’s a big mish-mash, what we call the night sky in August. There are points all over the place. A man will give a blade a high score one day, but the next day he’ll cut himself a lot and give the blade a terrible score. But on the third day, he’ll give the same blade a good score. What you have to do is try to see some pattern in all this. There are some gaps in our knowledge.’31

Measures of test–retest reliability pose two problems that are common to all longitudinal studies. First, the premeasure, or first measure, may sensitise the respondents to their participation in a research project and subsequently influence the results of the second measure. Furthermore, if the time between measures is long, there may be an attitude change or other maturation of the subjects. Thus, a reliable measure can indicate a low or a moderate correlation between the first and second administration, but this low correlation may be due to an attitude change over time rather than to a lack of reliability. The second underlying dimension of reliability concerns the homogeneity of the measure. An attempt to measure an attitude may require asking several similar (but not identical) questions or presenting a battery of scale items. To measure the internal consistency of a multiple-item measure, scores on subsets of the items within the scale must be correlated. The technique of splitting halves is the most basic method of checking internal consistency when a measure contains a large number of items. In the split-half method the researcher may take the results obtained from one half of the scale items (for example, odd-numbered items) and check them against the results from the other half (even-numbered items). The equivalent-form method is used when two alternative instruments are designed to be as equivalent as possible. The two measurement scales are administered to the same group of subjects. A high correlation between the two forms suggests that the scale is reliable. However, a low correspondence between the two instruments creates a problem. The researcher will be uncertain whether the measure has intrinsically low reliability or whether the particular equivalent form has failed to be similar to the other form. Both the equivalent-form and the split-half approaches to measuring reliability assume that the concept being measured is unidimensional; they measure homogeneity or inter-item consistency, rather than stability over time. Reliability is a necessary condition for validity, but a reliable instrument may not be valid. For example, a purchase intention measurement technique may consistently indicate that 20 per cent of those sampled are willing to purchase a new product. Whether the measure is valid depends on whether 20 per cent of the population indeed purchases the product. A reliable but invalid instrument will yield consistently inaccurate results.

split-half method A method for assessing internal consistency by checking the results of one-half of a set of scaled items against the results from the other half.
equivalent-form method A method that measures the correlation between alternative instruments, designed to be as equivalent as possible, administered to the same group of subjects.
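The two checks just described come down to correlations. The following Python sketch (ours, with invented scores) computes a test–retest correlation for the same scale administered at two points in time, and a split-half correlation between the odd- and even-numbered items of a multi-item scale.

from math import sqrt

def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Test-retest: the same attitude scale given to the same respondents at two points in time (invented data).
time_1 = [22, 18, 25, 30, 27, 19, 24]
time_2 = [21, 19, 26, 29, 28, 18, 23]
print(f"Test-retest reliability r = {pearson(time_1, time_2):.2f}")

# Split-half: total scores on the odd-numbered items versus the even-numbered items of one scale.
item_scores = [  # one row per respondent, six items each (invented data)
    [4, 5, 4, 4, 5, 4],
    [2, 3, 2, 2, 3, 3],
    [5, 5, 4, 5, 5, 5],
    [3, 2, 3, 3, 2, 2],
]
odd_half = [sum(row[0::2]) for row in item_scores]
even_half = [sum(row[1::2]) for row in item_scores]
print(f"Split-half reliability r = {pearson(odd_half, even_half):.2f}")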


Validity

The purpose of measurement is to measure what we intend to measure. Achieving this obvious goal is not, however, as simple as it sounds. Consider the student who takes a test (measurement) in a statistics class and receives a poor grade. The student may say: ‘I really understood that material because I studied hard. The test measured my ability to do arithmetic and to memorise formulas rather than measuring my understanding of statistics.’ The student’s complaint is that the test did not measure understanding of statistics, which was what the lecturer had intended to measure; it measured something else. One method of measuring the intention to buy is the gift method. Respondents are told that a drawing will be held at some future period for a year’s supply of a certain product. Respondents report which of several brands they would prefer to receive if they were to win. Do the respondents’ reports of the brands they would prefer to win necessarily constitute a valid measure of the brands they will actually purchase in the marketplace if they do not win the contest? Could there be a systematic bias to identify brands they wish they could afford rather than the brands they would usually purchase? This is a question of validity. Another example of a validity question might involve a media researcher who wonders what it means when respondents indicate they have been exposed to a magazine. The researcher wants to know if the measure is valid. The question of validity expresses the researcher’s concern with accurate measurement. Validity addresses the problem of whether a measure (for example, an attitude measure used in marketing) indeed measures what it is supposed to measure; if it does not, there will

be problems. Students should be able to empathise with the following validity problem. Consider the controversy about police officers using radar guns to clock speeders. A driver is clocked at 120 km/h in an 80 km/h zone, but the same radar gun aimed at a house registers 60 km/h. The error occurred because the radar gun had picked up impulses from the electrical system of the police car’s idling engine. The house wasn’t speeding – and the test was not valid.

REAL WORLD SNAPSHOT

SPACE A PRIORITY WHEN YOU’RE UP IN THE AIR32

Airline travellers put ‘space’, especially the distance between seat backs and leg room, high on their list of preferences. Space becomes extremely important if a flight is several hours in length. Plog Travel Research found that when an airline gave people more leg room, they rated the meals more favourably – even though the meals did not change. In fact, a halo effect influenced the ratings of most other characteristics, not just the food. Were the measures of food quality and airline characteristics other than leg room valid?

ESTABLISHING VALIDITY

validity The ability of a scale to measure what was intended to be measured.
face (or content) validity Professional agreement that a scale’s content logically appears to accurately reflect what was intended to be measured.

Researchers have attempted to assess validity in many ways. They attempt to provide some evidence of a measure’s degree of validity by answering a variety of questions: ‘Is there a consensus among my colleagues that my attitude scale measures what it is supposed to measure? Does my measure correlate with other measures of the same concept? Does the behaviour expected from my measure predict actual observed behaviour?’ The three basic approaches to establishing validity are face or content validity, criterion validity and construct validity.

Face (or content) validity refers to the subjective agreement among professionals that a scale logically appears to accurately reflect what it purports to measure; the content of the scale appears to


be adequate. So when it appears evident to experts that the measure provides adequate coverage of the concept, the measure has face validity. Clear, understandable questions such as ‘How many children do you have?’ generally are agreed to have face validity. In scientific studies, however, researchers generally prefer stronger evidence because of the elusive nature of attitudes and other marketing phenomena. For example, the OzTAM television rating system is based on the People Meter system, which mechanically records whether a sample household’s television is turned on and records the channel selection. If one of the viewers leaves the room or falls asleep, the measure is not a valid measure of audience. Researchers who wish to establish criterion validity attempt to answer the question: ‘Does my measure correlate with other measures of the same construct?’ Consider the physical concept of length. Length can be measured with tape measures, callipers, odometers and other variations of the ruler. If a new measure of length were to be developed (for example, through laser technology), finding that the new measure correlated with the other measures of length (the criteria) could provide some assurance that the new measure was valid. Criterion validity may be classified as either concurrent validity or predictive validity, depending on the time sequence in which the new measurement scale and the criterion measure are correlated. If the new measure is taken at the same time as the criterion measure and is shown to be valid, then it has concurrent validity. Predictive validity is established when a new measure predicts a future event. The two measures differ only on the basis of a time dimension – that is, the criterion measure is separated in time from the predictor measure. A practical example of predictive validity is illustrated by a commercial research firm’s test of the relationship between a rough advertisement’s effectiveness (as determined, for example, by recall scores) and a finished advertisement’s effectiveness (also by recall scores). Ad agencies often test ‘animatic rough’, ‘photomatic rough’ or ‘live-action rough’ advertisements before developing actual finished advertisements. One marketing research consulting firm suggests that this testing has high predictive validity. Rough advertisement recall scores provide correct estimates of the final finished advertisement recall scores more than 80 per cent of the time.33 While face (content) validity is a subjective evaluation, criterion validity provides a more rigorous empirical test. Construct validity is established by the degree to which the measure confirms a network of related hypotheses generated from a theory based on the concepts. Construct validity is established during the statistical analysis of the data. Construct validity implies that the empirical evidence generated by a measure is consistent with the theoretical logic behind the concepts. In its simplest form, if the measure behaves the way it is supposed to in a pattern of intercorrelation with a variety of other variables, there is evidence of construct validity. For example, in a measurement of subjective happiness, researchers found this measure to be positively correlated to self-esteem, positive emotions and optimism.34 Subjective happiness was found to be negatively correlated with depression and neuroticism. This is a complex method of establishing validity and of less concern to the applied researcher than to the basic researcher.

Reliability versus validity

Let’s compare the concepts of reliability and validity. A tailor using a ruler may obtain a reliable measurement of length over time with a bent ruler. A bent ruler cannot provide perfect accuracy, however, and it is not a valid measure. Thus, reliability, although necessary for validity, is not sufficient by itself. In marketing, a measure of a subject’s physiological reaction to a package (for example, pupil dilation) may be highly reliable, but it will not necessarily constitute a valid measure of purchase intention.

criterion validity The ability of a measure to correlate with other standard measures of the same construct or established criterion.
construct validity The ability of a measure to provide empirical evidence consistent with a theory based on the concepts.


The differences between reliability and validity can be illustrated by the rifle targets in Exhibit 8.22. Suppose an expert sharpshooter fires an equal number of rounds, first with a century-old rifle and then a modern rifle.35 The shots from the older gun are considerably scattered, but those from the new gun are closely clustered. The variability of the old rifle compared with that of the new one indicates it is less reliable. The target on the right illustrates the concept of a systematic bias influencing validity. The new rifle is reliable (because it has little variance), but the sharpshooter’s vision is hampered by glare from the sun. Although shots are consistent, the sharpshooter is unable to hit the bull’s eye.

EXHIBIT 8.22 → RELIABILITY AND VALIDITY ON TARGET

[Three rifle targets: Target A – old rifle, low reliability; Target B – new rifle, high reliability; Target C – new rifle with sun glare, reliable but not valid.]

sensitivity A measurement instrument’s ability to accurately measure variability in stimuli or responses.

Sensitivity

The sensitivity of a scale is an important measurement concept, particularly when changes in attitudes or other hypothetical constructs are under investigation. Sensitivity refers to an instrument’s ability to accurately measure variability in stimuli or responses. A dichotomous response category, such as ‘agree or disagree’, does not allow the recording of subtle attitude changes. A more sensitive measure with numerous categories on the scale may be needed. For example, adding ‘strongly agree’, ‘mildly agree’, ‘neither agree nor disagree’, ‘mildly disagree’ and ‘strongly disagree’ will increase the scale’s sensitivity. The sensitivity of a scale based on a single question or single item can also be increased by adding questions or items. In other words, because index measures allow for a greater range of possible scores, they are more sensitive than single-item scales.

REAL WORLD SNAPSHOT

CLICK, CLICK, CLICK36

Marketing metrics, as discussed earlier, are increasingly being applied. The quantitative results can be extremely useful in helping to determine how much a marketing tactic is really worth. For example, when marketers use a banner ad on a website, the ad agency can monitor the way click-through rates change with varying ad placements and characteristics. For instance, do click-throughs become more likely when the ad is placed in a content-consistent site, such as an ad for an airline on an island resort’s website? Using metrics like this, the ad agency might be able to inform clients how much an ad placement is worth. However, recent research suggests these numbers do not tell the whole story because they ignore consumer attitudes. A banner ad, for instance, may affect consumer awareness and attitudes even if a consumer does not click through to view the contents of the actual Web page. For example, when British Airways introduced its Club World perks for good customers, the click-through rate on banner ads may have been low, but a test showed that respondents who were exposed to the ads were approximately 20 per cent more likely to be aware of Club World than respondents who were not exposed to these ads. Thus, the click metrics did not tell the whole story. Similarly, leading financial institutions supplement their marketing metrics with attitudinal research to better understand which consumers are likely to become loyal. The attitudinal research helps firms to better understand customer-based brand equity. In the end, behavioural metrics have not replaced attitudinal research. The two approaches complement each other quite well.


Practicality

Lastly, measures must be practical. Practical measures are shorter (have fewer items) while still being sensitive, and are easy to administer, timely and simple enough for respondents to understand. Results from practical measures should be easy to interpret.37 This is not always the case with some measures, such as those sometimes found in psychology which, for example, may measure personality but whose results may not be practical for marketing researchers. Skill in designing practical measures comes largely from good questionnaire design, which is discussed in the next chapter.

SUMMARY

DETERMINE WHAT IS TO BE MEASURED

We must know what we want to measure, and how it fits our research questions or hypotheses. Examination of published research and findings from past studies should help us determine what it is we want to measure.

DETERMINE HOW IT IS TO BE MEASURED

In order to measure, we must determine some rules of measurement, or operational definitions. This will determine the choice of rules of measurement, the number of measurement items to be used and the type of attitude scales. Operational definitions tell the researcher how to measure and what rules of measurement to apply. For example, our measure of social entrepreneurship near the start of this chapter is made up of a composite additive score of several items.

APPLY A RULE OF MEASUREMENT

There are four types of measuring scales that can be applied. Nominal scales assign numbers or letters to objects only for identification or classification. Ordinal scales arrange objects or alternatives according to their magnitudes in an ordered relationship. Interval scales measure order (or distance) in units of equal intervals. Ratio scales are absolute scales, starting with absolute zero, at which there is a total absence of the attribute. The type of scale determines the form of statistical analysis to use.

DETERMINE IF THE MEASURE CONSISTS OF A NUMBER OF MEASURES

Index (composite) measures often are used to measure complex concepts with several attributes. Asking several questions may yield a more accurate measure than basing measurement on a single question. The items in these measures need to be correlated with one another so that the composite is internally consistent.


DETERMINE THE TYPE OF ATTITUDE AND THE SCALE TO BE USED TO MEASURE IT

Many methods for measuring have been developed, such as ranking, rating, sorting and choice techniques. One class of rating scales, category scales, provides several response categories to allow respondents to indicate the intensity of their attitudes. The simplest attitude scale calls for a ‘yes/no’ or ‘agree/disagree’ response to a single question. The Likert scale uses a series of statements with which subjects indicate agreement or disagreement. The responses are assigned weights that are summed to indicate the respondents’ attitudes. The semantic differential uses a series of attitude scales anchored by bipolar adjectives. The respondent indicates where his or her attitude falls between the polar attitudes. Variations on this method, such as numerical scales and the Stapel scale, are also used. The Stapel scale puts a single adjective in the centre of a range of numerical values from +3 to −3. Graphic rating scales use continua on which respondents indicate their attitudes. Constant-sum scales require the respondent to divide a constant sum into parts, indicating the weights to be given to various attributes of the item being studied. Several scales, such as the behavioural differential, have been developed to measure the behavioural component of attitude. People often rank order their preferences. Thus, ordinal scales that ask respondents to rank order a set of objects or attributes may be developed. In the paired-comparison technique, two alternatives are paired and respondents are asked to pick the preferred one. Sorting requires respondents to indicate their attitudes by arranging items into piles or categories. The accuracy of answers to sensitive questions may be enhanced by using randomised response questions and calculations based on probability theory. The researcher can choose among a number of attitude scales. Choosing among the alternatives requires considering several questions, each of which is generally answered by comparing the advantages of each alternative to the problem definition.


A monadic rating scale asks about a single concept. A comparative rating scale asks a respondent to rate a concept in comparison with a benchmark used as a frame of reference. Scales may be balanced or unbalanced. Unbalanced scales prevent responses from piling up at one end. Forced-choice scales require the respondent to select an alternative; non-forced-choice scales allow the respondent to indicate an inability to select an alternative.

EVALUATE THE MEASURE

Measuring instruments are evaluated by reliability, validity and sensitivity. Reliability refers to the measuring instrument’s ability

to provide consistent results in repeated uses. Validity refers to the degree to which the instrument measures the concept the researcher wants to measure. Sensitivity is the instrument’s ability to accurately measure variability in stimuli or responses. Reliability may be tested using the test–retest method, the split-half method or the equivalent-form method. The three basic approaches to evaluating validity are to establish content validity, to establish criterion validity and to establish construct validity. The sensitivity of a scale can be increased by allowing for a greater range of possible scores. Measures must also be practical and easy to use if respondents are to understand and interpret them.

KEY TERMS AND CONCEPTS

attitude, attribute, balanced rating scale, behavioural differential, category scale, choice, comparative rating scale, concept, conceptual definition, constant-sum scale, construct validity, criterion validity, equivalent-form method, face (or content) validity, forced-choice rating scale, graphic rating scale, hypothetical construct, image profile, index (or composite) measure, interval scale, ipsative scale, Likert scale, monadic rating scale, nominal scale, non-forced-choice rating scale, normative scale, numerical scale, operational definition, ordinal scale, paired comparison, randomised response questions, ranking, rating, ratio scale, reliability, reverse coding, scale, semantic differential, sensitivity, sorting, split-half method, Stapel scale, summated scale, test–retest method, unbalanced rating scale, validity

QUESTIONS FOR REVIEW AND CRITICAL THINKING

1 Read the vignette at the start of this chapter. What do you think are the advantages and disadvantages of the new method to measure newspaper readership?
2 Consider the following research questions or hypotheses. What variables need to be measured in each?
  a Do consumers who spend more time and who are committed to LinkedIn spend more money on education?
  b What are the characteristics of consumers who belong to the frequent shopper segment of EB Games?
  c The temperature inside the store will determine how much time consumers spend inside the store, such that when the store is hot, they will spend less time shopping than when the store is cold.
  d Product price is related positively to product quality.
  e Consumer perceptions of value are negatively related to the likelihood of switching mobile phone providers.
3 What is the difference between a conceptual definition and an operational definition?
4 A scholarship officer at a university is considering an applicant for a travel scholarship to an overseas university. He or she uses the following processes. If the student has a grade point average in the 90th percentile among the admission, five points are added. If the student had a university entrance score in the 90th percentile, five points are added. If a student comes from a household whose combined income is less than $50 000 a year for the last three years, five points are added. If the student is from the country or a rural area, five points are added. What variables are being used to determine eligibility for the travel scholarship? What would be your approximate score on this scale? Is this best described as an index or a composite scale? Explain your response.
5 Classify each of the following measurement scales as either a normative scale or an ipsative scale:
  a Likert scale
  b Best–worst scale
  c Constant-sum scale.
6 What descriptive statistics are allowable with nominal, ordinal and interval scales?
7 Discuss the differences between validity and reliability.
8 What measurement problems might be associated with the People Meter method of audience ratings? Would any special problem arise in rating children's programs?
9 Why might a researcher wish to use more than one question to measure satisfaction with a particular aspect of retail shopping?
10 Comment on the validity and reliability of the following:
  a A respondent's report of an intention to subscribe to Choice magazine is highly reliable. A researcher believes this constitutes a valid measurement of dissatisfaction with the economic system and alienation from big business.
  b A general-interest magazine claims that it is a better advertising medium than television programs with similar content. Research has indicated that for a soft drink and other test products, recall scores are higher for the magazine ads than for 30-second television advertisements.
  c A respondent's report of frequency of magazine reading consistently indicates that she regularly reads Property Report and Gourmet, and never reads Cosmopolitan.
11 Indicate whether the following measures use a nominal, ordinal, interval or ratio scale:
  a prices on the stock market
  b marital status, classified as 'married' or 'never married'
  c whether a respondent has ever been unemployed
  d academic rank (lecturer, senior lecturer, associate professor, professor)
  e grades (HD, D, C, P and N)
  f blood alcohol content
  g number of Facebook friends
  h the colour of one's eyes.
12 Define each of the following concepts, and then operationally define each one:
  a a good golfer
  b television audience for The X Factor
  c purchasing intention for an Apple watch
  d consumer involvement with fashion
  e a shopaholic
  f service standard at a fast-food restaurant
  g the 'Australian dream'.
13 Education is often used as an indicator of a person's socioeconomic status. Historically, the number of years of schooling completed has been recorded by the Australian Bureau of Statistics as a measure of education. Critics say that this measure is no longer accurate as a measure of education. Comment.
14 Many Internet surveys want to know the demographic characteristics of their respondents and how technologically sophisticated they are. Create a conceptual definition of 'technographics' and use it.
15 Two academic researchers create a psychographic scale to measure travel behaviour. Without measuring reliability or validity of the measuring instrument, they submit an article to a scholarly publication for review. Is the measure still useful?
16 What is an attitude? Why do businesses, governments and not-for-profits place so much emphasis on attitudes?
17 Distinguish between rating and ranking. Which is a better attitude measurement technique? Why?
18 In what type of situation would the choice technique be most appropriate?
19 In what type of situation would the sorting technique be most appropriate?
20 What advantages do numerical scales have over semantic differential scales?
21 In the following table, look at the b columns, which represent belief scores for two competing Internet providers. The e column represents the evaluation of the consequences. Compute the attitude score for each competitor and comment on the competitive positioning of each.

Attribute                                                       b column: iiNet   b column: Telstra   e column
1 Connection will be established successfully every time.             3                  5               +3
2 The connection will be established speedily.                        4                  3               +2
3 The connection will drop out in the middle of the session.          3                  3               −3
4 The price (monthly fee) will be high.                               2                  5               −1

b columns: evaluation of attribute (Unlikely  1  2  3  4  5  Likely)
e column: evaluation of the consequences (Very bad  −3  −2  0  +2  +3)


22 Should a Likert scale ever be treated as though it had ordinal properties?
23 In each of the following identify the type of scale and evaluate it:
  a A New Zealand government representative's questionnaire sent to constituents: Do you favour or oppose a constitutional amendment to balance the budget? ❏ Favour  ❏ Oppose  ❏ Don't know
  b An academic study on consumer behaviour: Most people who are important to me think I [scale from −3 (Definitely should not buy) to +3 (Definitely should buy)] [test brand] some time during the next week.
  c A psychographic statement: I shop a lot for specials. Strongly agree (5)  Moderately agree (4)  Neither agree nor disagree (3)  Moderately disagree (2)  Strongly disagree (1)
24 What are the advantages of a slider scale?
25 If a Likert summated scale has 10 scale items, do all 10 items have to be phrased as either positive or negative statements, or can the scale contain a mix of positive and negative statements?
26 If a semantic differential has 10 scale items, should all the positive adjectives be on the right and all the negative adjectives on the left?
27 A researcher thinks many respondents will answer 'Don't know' or 'Can't say' if these options are printed on an attitude scale along with categories indicating level of agreement. The researcher does not print either 'Don't know' or 'Can't say' on the questionnaire because the resulting data would be more complicated to analyse and report. Is this proper?

ONGOING PROJECT MEASURING SOMETHING IN YOUR RESEARCH STUDY? CONSULT THE CHAPTER 8 PROJECT WORKSHEET FOR HELP

Download the Chapter 8 project worksheet from the CourseMate website. It outlines a series of steps taken in this chapter to develop

a measure (or measures). Make sure that your measurement is reliable, valid and practical.

COURSEMATE ONLINE STUDY TOOLS

Flip to the start of your textbook and use the tear-out card to log in to CourseMate for Marketing Research. There you can test your understanding and revise chapter concepts with:
☑ interactive quizzes
☑ crosswords on key concepts
☑ research activities
☑ videos
☑ case projects.

WRITTEN CASE STUDY 8.1 MEASURING AUSTRALIA'S SOCIAL PROGRESS38

Rather than just using an economic measure of progress, Gross Domestic Product (GDP) or GDP per capita, the Sydney Morning Herald/Age–Lateral Economics Index of Australia's Wellbeing takes in social, environmental, educational and health outcomes. Beginning with net national income (NNI), which is essentially GDP adjusted for the depreciation of physical capital, the index is adjusted to reflect the depletion of non-renewable resources by mining and the discovery of

new mineral resources and allows for any degradation of agricultural land. The index is further adjusted by estimating a cost to represent the risk of substantial climate change in the future. Human capital or ‘Australian know how’ is also included in the index and is measured by the risk of early childhood deprivation, school results when compared with other countries and post-secondary education.


Evidence from surveys, such as happiness or life satisfaction, is also included in the index. Indices built from these data rarely move. The data suggest that, on average, for those Australians on very low incomes a $6000 increase in income is associated with an increase in self-reported wellbeing of one percentage point. By contrast, the same increment in happiness requires more than $100 000 for a household earning above $100 000 a year. These relativities enable us to adjust aggregate income growth in the economy for its efficacy in improving people's reported wellbeing. When national income flows to lower-income households it funds more improvement in wellbeing than when the same amount goes to higher-income families. Overall, the figures for Australia were as follows in 2010:
• Total wellbeing: $1190b, year ended 30 June 2010
• Income: Consists of the nation's net disposable income. +$1010.9b
• Environment: Includes the depletion in the value of natural resources and the costs of climate change. −$1.4b
• Human capital: The value of human capital depends on


early childhood development, education and innovation minus the long-term unemployed. +$346.5b
• Job satisfaction: The value of our job satisfaction depends on the degree of unemployment, under-employment and overwork. −$35.4b
• Health: Considers the value of changes to life expectancy, preventable hospitalisation, mental health and obesity. −$129.2b
• Inequality: The index adjusts for the growing income disparity between rich and poor. −$0.2b

QUESTIONS

1 How well do you think the index represented social progress in Australia in 2010?
2 What are some validity issues that might be associated with this measure?
3 Can a single index or number represent total wellbeing in Australia?

WRITTEN CASE STUDY 8.2 NEW ZEALAND CONSUMER CONFIDENCE EDGES UP IN 201539

The Nielsen Global Consumer Confidence survey is based on consumers' confidence in the job market, status of their personal finances and their readiness to spend. The survey for the second quarter of 2015 showed an index score of 102 – an increase of one point from the fourth quarter of 2014 and an increase of two points from a year ago. New Zealand remained ahead of both Australia (95) and global (97) confidence. The Nielsen Global Online Consumer Confidence Survey tracks consumer confidence, major concerns and spending intentions among more than 3000 Internet consumers across 56 countries. Consumer confidence levels above and below a baseline of 100 indicate degrees of optimism and pessimism. Fifty-nine per cent of New Zealand respondents said local job prospects were good or excellent, a four percentage point increase from the last quarter of 2014. Personal finances also improved with 62 per cent having positive perceptions of their finances, up from 59 per cent in the previous quarter. Whether it

is a good time to buy things over the next 12 months remained at 44 per cent. New Zealanders’ discretionary spending intentions remained steady or increased in the first quarter across lifestyle categories measured. About three-in-10 respondents (29 per cent) planned to spend on holidays. One-in-five New Zealanders planned to spend on new clothes (21 per cent), out of home entertainment (20 per cent) and new technology products (19 per cent).

QUESTIONS

1 How is the measure of consumer confidence operationalised?
2 How well did this reflect the actual economic conditions of consumers in New Zealand in late 2015?
3 How predictive do you think consumer confidence surveys are of economic changes?

ONGOING CASE STUDY MOBILE PHONE SWITCHING AND BILL SHOCK

After completing their first survey research with AusBargain, David, Leanne and Steve were asked by the client to develop a measure of bill shock – that is, an unexpected charge on a mobile account which leads to customer stress. 'How do we

go about measuring this?' asked Leanne. David also thought about this issue. 'Do we need one item or a number of items to do so?' 'How do we know in the end that our measure is valid?', Steve added.


QUESTIONS

1 Do a quick online search at your university library on bill shock. Develop a conceptual and operational definition of this measure.

2 Should one item or a set of items be used? How would you combine a set of items, if you think you needed more than one? Justify your answer.
3 How would you evaluate the validity of your measure?

NOTES 1


APPENDIX 8A

CONJOINT ANALYSIS: MEASURING CONSUMER UTILITY

BACKGROUND TO CONJOINT ANALYSIS

What are the important product features for my brand? If we improve the quality of our ingredients, will customers then be prepared to pay a little more? How much more? Do different product features appeal to different people and in different circumstances? A cosmetics company plans to add a new line of hair shampoo, but wants to know how much the existing range will be cannibalised, and how much share will be taken from competitors. If the city introduces a more frequent train service, will there be a reduction in traffic congestion on the freeway? These are questions that senior managers struggle with regularly. Understanding such issues affects all aspects of the marketing process: product design, target market selection and communications objectives, distribution decisions and, of course, pricing.

This appendix deals with a general class of consumer utility modelling known as compensatory models, where an unattractive product feature, such as price, may be compensated for by an attractive feature, such as convenience. We can think about compensatory situations using a linear-additive model such as we use with multiple regression. Thinking back to your Business Statistics class, or ahead to Chapter 15, the multiple regression equation is:

Y = b0 + b1X1 + b2X2 + b3X3 + … + bnXn

or, more simply:

Y = Σ biXi

While it may not have been obvious, we used the same model in Chapter 8 where we examined the multi-attribute attitude-to-the-object model. In that model the Xs are company service features, and the bs represent the importance or value of each service feature for one person. We can just as easily extend this model to the domain of human judgement of products. Let's say that the Xs in the model are product features, and the bs represent the importance or value of each product feature. Then it should be possible to estimate the overall evaluation of a product for that person. These evaluations can then be compared with evaluations of other competitive products and among other people. There are three components to the model: the overall evaluation (Y) of the product, the product features (X) and the importance of each of the features (b). If we know two of the three components, we can estimate the value of the third component. That is, if we know the brand attributes (X) and how important they are (b), then we can estimate, or forecast, the overall brand evaluation (Y). This can be regarded as a compositional approach to utility measurement because the overall evaluation is built up from the product features.


Alternatively, if we know the overall brand evaluations (Y) for a reasonable cross-section of possible combinations of features, such as brand, price etc. (X), then we can use multiple regression techniques (discussed in Chapter 15) to estimate the relative importance of the product features (b). This can be regarded as a decompositional approach because attribute importance is inferred from overall judgements. For the Fishbein models we examined techniques for the compositional approach. In this section we examine the decompositional approach to measuring consumer utility. Where a compositional approach to deriving brand utility requires respondents to describe the relative importance of product attributes, from which overall attitude evaluations may be computed, the decompositional approach works in the opposite direction: respondents give their overall evaluations of an object, from which the researcher decomposes the scores to find the relative importance of individual attributes.1 The techniques involved tend to be more natural for respondents, but may require some technical and statistical expertise from the researcher.

conjoint analysis: A range of techniques for inferring the relative importance of product attributes by decomposing overall evaluations of different patterns of stimuli.

In general terms, conjoint analysis proceeds like this. A set of attribute bundles (products) is presented to the respondent, who then makes some sort of judgement about those products. The products are presented in a particular pattern designed by the researcher. The pattern of responses is compared with the pattern of stimuli, so that the researcher may estimate which attributes were most important in driving the responses. The nature of the evaluation task may vary: objects may be presented in pairs and the respondent must choose which one is better; objects may be presented all together and the respondent must rank them from most attractive to least attractive; objects may be presented one at a time and the respondent must give an overall evaluation on a rating scale; or objects may be presented only a few at a time and the respondent must choose the one to buy. For each of these approaches there are different assumptions about human judgement and choice, different workloads and different levels of realism for respondents, and different levels and complexity of statistical analysis for researchers. In this section we will look at the two most popular approaches to conjoint analysis: ratings-based conjoint and choice-based conjoint. We conclude with a relatively new approach to conjoint analysis using best–worst scaling.

RATINGS-BASED CONJOINT

Consider the regression model we looked at earlier:

Y = Σ biXi

We can see that it is made up of three pieces of information: Y (overall evaluations), Xi (product attribute levels) and bi (importance weights of product attributes). Recall that in a multi-attribute attitude model we ask respondents to give us both Xi and bi, from which we can calculate Y. In ratings-based conjoint, we give people example products (the Xi) and ask for their evaluations (Y). With those evaluations we can estimate the values for bi using multiple regression. In the example shown in Exhibit 8A.1, three attributes – distance to beach, distance to night-life and quality rating – are binary, so they can be represented as binary dummy variables; for example, distance to beach is 0 if 5 minutes and 1 if 15 minutes. There are four levels of price, which is ratio scaled, so here we represent price as the number of thousands of dollars. The results of a multiple regression of holiday apartment attributes against Anne's ratings are presented in Table 8A.2A. Results for Ben's ratings are presented in Table 8A.2B.


EXHIBIT 8A.1 » EXAMPLE OF RATINGS-BASED CONJOINT

Consider this situation: you and some friends plan to share a holiday apartment near a beach resort for a week of the summer holidays. A catalogue of holiday apartments lists the following features:
• walking distance from a surfing beach (within 5 minutes, about 15 minutes)
• walking distance from night-life (within 10 minutes, about 20 minutes)
• quality rating (3-star, 4-star)
• price per week ($595, $735, $945, $1085).
If holiday apartments are described only in terms of the first three attributes, which have two levels each, and of price with four levels, then we have 2 × 2 × 2 × 4 = 32 possible combinations. This may be too many for any one person to judge and give valid, well-considered responses without becoming fatigued and giving random answers. Fortunately, researchers in experimental design have come up with ways of reducing the number of experimental treatments needed to get data. A fractional factorial array is an arrangement of all attributes of an experiment, in this case product features, designed to be able to measure the influence of each attribute independently of the other attributes. One possible fractional factorial design is presented in Table 8A.1. Note that each attribute level appears an equal number of times, and each attribute level appears with each level of the other attributes an equal number of times. Now the task for each respondent is to give a rating score for each 'profile' or example of a holiday apartment, say a score from 1 to 10, with 10 for the most attractive offering, 1 for the least attractive and every other profile somewhere in between. Alternatively, respondents may be asked to indicate their 'likelihood of purchase' on a similar 1 to 10 rating scale marked from 0% to 100%. This gives us a pattern of responses to the pattern of stimuli. We then discover how important each product attribute is for each person using a straightforward multiple regression. Table 8A.1 also shows ratings of the profiles from two students, Anne and Ben.

TABLE 8A.1 » FRACTIONAL FACTORIAL DESIGN FOR HOLIDAY APARTMENTS CONJOINT STUDY

Apartment # | Walking distance to beach | Walking distance to night-life | Quality rating | Price per week | Anne's ratings | Ben's ratings
#01 | 15 min | 10 min | 4-star | $595  | 10 | 3
#02 | 5 min  | 10 min | 3-star | $735  | 9  | 10
#03 | 5 min  | 10 min | 4-star | $945  | 7  | 7
#04 | 15 min | 10 min | 3-star | $1085 | 3  | 1
#05 | 15 min | 20 min | 3-star | $595  | 5  | 4
#06 | 15 min | 20 min | 4-star | $1085 | 1  | 2
#07 | 5 min  | 20 min | 3-star | $1085 | 2  | 6
#08 | 5 min  | 20 min | 3-star | $735  | 7  | 8
#09 | 5 min  | 10 min | 3-star | $945  | 6  | 5
#10 | 15 min | 10 min | 4-star | $735  | 8  | 4
#11 | 15 min | 20 min | 3-star | $945  | 4  | 3
#12 | 5 min  | 20 min | 4-star | $595  | 7  | 9

First we refer to the R-squared statistic and the significance of the F-statistic to check that the regression model gives us a reasonable fit. We may decide to discard any respondents who made random or nonsense ratings. Then we can check the regression coefficients.


TABLE 8A.2A » MULTIPLE REGRESSION OF HOLIDAY APARTMENT ATTRIBUTES AGAINST ANNE'S RATINGS

Regression statistics
Multiple R: 0.96 | R-square: 0.92 | Adjusted R-square: 0.87 | Std. error: 1.01 | Observations: 12

ANOVA
           | d.f. | SS     | MS     | F      | Sig.-F
Regression | 4    | 79.080 | 19.770 | 19.303 | 0.001
Residual   | 7    | 7.170  | 1.024  |        |
Total      | 11   | 86.250 |        |        |

           | Coefficients | Std. error | t-stat | p-value
Intercept  | 16.84        | 1.43       | 11.80  | 0.00
Beach      | -1.17        | 0.58       | -2.00  | 0.09
Night-life | -2.83        | 0.58       | -4.85  | 0.00
Rating     | 0.17         | 0.58       | 0.29   | 0.78
Price      | -10.92       | 1.55       | -7.04  | 0.00

TABLE 8A.2B » MULTIPLE REGRESSION OF HOLIDAY APARTMENT ATTRIBUTES AGAINST BEN'S RATINGS

Regression statistics
Multiple R: 0.94 | R-square: 0.89 | Adjusted R-square: 0.83 | Std. error: 1.19 | Observations: 12

ANOVA
           | d.f. | SS     | MS     | F      | Sig.-F
Regression | 4    | 79.747 | 19.937 | 14.069 | 0.002
Residual   | 7    | 9.920  | 1.417  |        |
Total      | 11   | 89.667 |        |        |

           | Coefficients | Std. error | t-stat | p-value
Intercept  | 12.16        | 1.68       | 7.24   | 0.00
Beach      | -4.67        | 0.69       | -6.79  | 0.00
Night-life | 0.33         | 0.69       | 0.49   | 0.64
Rating     | 0.00         | 0.69       | 0.00   | 1.00
Price      | -5.75        | 1.82       | -3.15  | 0.02


We interpret the regression for Anne's ratings as follows: Anne's score out of 10 starts at 16.84; take away 1.17 points if the apartment is 15 minutes from the beach, take away 2.83 points if the apartment is 20 minutes from the night-life, add 0.17 points if the apartment is 4-star instead of 3-star, and then take away 10.92 points for each thousand dollars of rent. We interpret the regression for Ben's ratings as follows: Ben's score out of 10 starts at about 12; take away 4.67 points if the apartment is not within 5 minutes of the beach, add just 0.33 points if the apartment is 20 minutes from the nightspots, add nothing for the quality rating of the apartment, and then take away 5.75 points for each thousand dollars of rent. It would seem that Ben is most concerned that he is located close to the beach and, of course, price is important to him. Anne seems to be even more budget conscious, and while she'd like to be close to the beach she'd much rather be closer to the resort nightspots. Neither seems much concerned with the quality of the apartment itself.
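To make the decompositional step concrete, here is a minimal sketch in Python (an illustration only, not the textbook's own software) that recovers Anne's part-worths from the ratings in Table 8A.1 with ordinary least squares, using the dummy coding described in the text. The coefficients it prints should sit close to those reported in Table 8A.2A.

import numpy as np

# columns: beach (1 = 15 min), night-life (1 = 20 min), rating (1 = 4-star), price ($000)
X = np.array([
    [1, 0, 1, 0.595],   # profile 01
    [0, 0, 0, 0.735],   # profile 02
    [0, 0, 1, 0.945],   # profile 03
    [1, 0, 0, 1.085],   # profile 04
    [1, 1, 0, 0.595],   # profile 05
    [1, 1, 1, 1.085],   # profile 06
    [0, 1, 0, 1.085],   # profile 07
    [0, 1, 0, 0.735],   # profile 08
    [0, 0, 0, 0.945],   # profile 09
    [1, 0, 1, 0.735],   # profile 10
    [1, 1, 0, 0.945],   # profile 11
    [0, 1, 1, 0.595],   # profile 12
])
anne = np.array([10, 9, 7, 3, 5, 1, 2, 7, 6, 8, 4, 7])   # Anne's ratings, Table 8A.1

# add an intercept column and solve Y = Xb by least squares
design = np.column_stack([np.ones(len(X)), X])
coef, resid, _, _ = np.linalg.lstsq(design, anne, rcond=None)

labels = ['intercept', 'beach (15 min)', 'night-life (20 min)', '4-star', 'price ($000)']
for name, b in zip(labels, coef):
    print(f'{name:>20}: {b:6.2f}')

# R-squared as a quick sanity check on model fit
ss_res = float(resid[0]) if resid.size else 0.0
ss_tot = float(((anne - anne.mean()) ** 2).sum())
print('R-squared:', round(1 - ss_res / ss_tot, 2))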

Advantages of ratings-based conjoint

Ratings-based conjoint analysis is the most established of the conjoint techniques. It is reasonably reliable and relatively easy to do. Many standard statistical packages have ratings-based conjoint analysis routines, which allow the design of an appropriate orthogonal array, the design of a questionnaire and a regression procedure so that many respondents may be processed simultaneously. These are available in statistical packages such as SAS, SPSS (not the student version), Minitab and Systat. With a little effort researchers may analyse ratings-based conjoint data using a spreadsheet program, such as OpenOffice Calc, Apple Numbers or Microsoft Excel. Output is familiar and easy to interpret for anyone who has done a little multiple regression analysis. Typically, the regression results are just the beginning for the marketing researcher. With a utility formula for each person, we have the tools to compare the total utility scores for product modifications and play 'what if' games: what if we offered a less feature-rich model but with a lower price? We can even make forecasts of likely scores for products that don't exist. These, in turn, can be used to make predictions of market share and, with a bit of intermediate-level algebra, to estimate the price-elasticity of demand. With four price points, as presented in this example, we could estimate the price coefficients as a quadratic model, allowing for only very small changes in utility score at lower prices but, as prices rise, utility falling at an increasing rate.

Disadvantages of ratings-based conjoint

Most of us rarely give a rating out of 10 to the products and services that we think about buying. We're more likely to simply decide that one option is the most attractive of all that are available. As such, ratings are an unnatural task, and difficult for many people to do without some practice. As with most rating scales, people often don't use the full range of the available scale. If all options presented are acceptable, then a respondent may not give any option a score lower than the notional mid-point. Tied scores must be interpreted as of equal value when, in many cases, one option should be considered superior. Analysis of ratings-based conjoint data assumes that people make their evaluations using a linear scale and that the scale is in the same dimensions as the one imposed by the researcher (in the example here, we assume that people really do make their judgements in a range from 1 to 10). In truth, we have no reason to believe that judgements are linear or independent or rescaled to a score out of 10 and, because such data are, at best, interval scaled, we certainly cannot interpret a score of 8 as twice as much as a score of 4. Similarly, because people use response scales differently, the parameter estimates for different people have different meanings. Importantly, ratings are interpreted as independent of the other profiles offered. This also is not really true – most product evaluations are made on a comparative basis. Ideally, we want a measurement and analysis system that does not impose response constraints on people, and that does not make unrealistic assumptions about the nature of responses.

CHOICE-BASED CONJOINT

Choice-based conjoint analysis techniques recognise an important issue in evaluations of objects (products): a high score is no guarantee that a product will be chosen. Consider a simple example: two different courses are available to you as a university elective. Both look interesting and useful. Student evaluations of the courses from last year are published in the handbook. Option A scores an average of 89 per cent, while option B scores an average of 91 per cent. Which option will you choose? In our classroom experiments most students choose option B: it typically earns more than 75 per cent share even though the student evaluation scores are nearly identical.

The theory relating to choice-based conjoint analysis rests upon the work of economist Daniel McFadden, who won the Nobel Prize for his contributions to the study of evaluation and choice. McFadden showed that we don't need arbitrary ratings to model choice; we just need to know how often one option is chosen over other options. The probability of choice of a certain option is the utility score for that option divided by the sum of the utility scores for all options. The utility scores are the natural exponents of value scores, which are a function of the attributes of the available options. That is:

Pi = Ui / ΣUj

where:
Ui = exp(Σj bj Xij)
Pi = probability of choosing option i
Ui = utility of option i
Xij = level of attribute j for option i
bj = importance of attribute j

Overall, if we know the distribution of choice probabilities, Pi, and we know the product attributes, Xij, then we can estimate the importance of those attributes, bj. Choice-based conjoint analysis removes many of the assumptions of ratings-based conjoint, but at the cost of more complexity for the researcher. In choice-based conjoint, a limited number of options are presented to a respondent, who is asked to simply select the one option that he or she would buy. The task is repeated several times with varying options presented for consideration. With many repetitions, usually among several people, there are sufficient data to make a reliable count of the frequency with which each option has been chosen among all of the times it was offered. This frequency distribution is then applied to a statistical technique called multinomial logit (MNL) analysis to derive estimates of the influence each attribute has on the choices. Because in choice-based conjoint the evaluations are simple choices instead of ratings or rankings of each product profile, some researchers have argued that the technique is more properly called stated preference discrete choice modelling (SPDCM).2


EXHIBIT 8A.2 » EXAMPLE OF CHOICE-BASED CONJOINT

Let's consider again the holiday apartments problem we looked at in the previous section. A slightly smaller orthogonal array may be constructed, such as the one in Table 8A.3A. This would then be presented as the profiles in Table 8A.3B.

TABLE 8A.3A » PRODUCT PROFILES FROM THE ORTHOGONAL ARRAY (CODED ATTRIBUTE LEVELS)

Apartment # (profile #) | Walking distance to beach | Walking distance to night-life | Quality rating | Price per week
#01 | 0 | 1 | 1 | 2
#02 | 0 | 0 | 0 | 1
#03 | 0 | 1 | 0 | 3
#04 | 1 | 1 | 1 | 4
#05 | 1 | 1 | 0 | 2
#06 | 0 | 0 | 0 | 4
#07 | 1 | 0 | 1 | 1
#08 | 1 | 0 | 1 | 3

TABLE 8A.3B » PRODUCT PROFILES FROM THE ORTHOGONAL ARRAY

Apartment # (profile #) | Walking distance to beach | Walking distance to night-life | Quality rating | Price per week
#01 | 5 min  | 20 min | 4-star | $735
#02 | 5 min  | 10 min | 3-star | $595
#03 | 5 min  | 20 min | 3-star | $945
#04 | 10 min | 20 min | 4-star | $1085
#05 | 10 min | 20 min | 3-star | $735
#06 | 5 min  | 10 min | 3-star | $1085
#07 | 10 min | 10 min | 4-star | $595
#08 | 10 min | 10 min | 4-star | $945

So far, the derivation of profiles is the same as for ratings-based conjoint analysis. Now we break the full list into subsets of small numbers of options. We must find a list of subsets so that each option is compared with each other option several times in combination. This is achieved with a very special pattern of permutations known as a balanced incomplete block (BIB) design, such as the one in Table 8A.4. (A block is the group of profiles presented together; the blocks are ‘incomplete’ because not all of the profiles are presented at once, and the whole list is ‘balanced’ in that each profile appears an equal number of times and with each other profile an equal number of times.) Note in this example that each profile is presented seven times.


TABLE 8A.4 » BALANCED INCOMPLETE BLOCK DESIGN – 8 PROFILES IN 14 BLOCKS OF 4 (note 3)

Block | Profile #
#1  | 1 | 2 | 3 | 4
#2  | 5 | 6 | 7 | 8
#3  | 1 | 2 | 7 | 8
#4  | 3 | 4 | 5 | 6
#5  | 1 | 3 | 6 | 8
#6  | 2 | 4 | 5 | 7
#7  | 1 | 4 | 6 | 7
#8  | 2 | 3 | 5 | 8
#9  | 1 | 2 | 5 | 6
#10 | 3 | 4 | 7 | 8
#11 | 1 | 3 | 5 | 7
#12 | 2 | 4 | 6 | 8
#13 | 1 | 4 | 5 | 8
#14 | 2 | 3 | 6 | 7

Thus a questionnaire, or other instrument, may be designed as a series of choice tasks. Alternatives are presented in as realistic a way as possible, such as in a catalogue, newspaper advertisements or an in-store display. For example, block #1 from the BIB design in Table 8A.4 might be presented as in Table 8A.5. Note that apartments #1, #2, #3 and #4 are presented. The second choice task would include apartments #5, #6, #7 and #8, the third choice task includes apartments #1, #2, #7 and #8, and so on. In this experimental design, each person must complete 14 choice tasks similar to the one in Table 8A.5.

TABLE 8A.5 » BLOCK #1 FROM BIB DESIGN

(Profile #1) Just 5 minutes from the beach and 20 minutes from great nightspots, 4-star accommodation at $735 for the week
(Profile #2) Just 5 minutes from the beach and 10 minutes from great nightspots, 3-star accommodation at $595 for the week
(Profile #3) Just 5 minutes from the beach and 20 minutes from great nightspots, 3-star accommodation at $945 for the week
(Profile #4) Just 10 minutes from the beach and 20 minutes from great nightspots, 4-star accommodation at $1085 for the week

When all 14 choice tasks are completed then we can see the frequency with which each profile has been selected as the more attractive alternative. We then have a record of the frequency of times that each profile has been selected compared with the number of times it has been offered.


One respondent, Steve, answered all 14 choice tasks and a summary of his answers appears in Table 8A.6.

TABLE 8A.6 » STEVE'S HOLIDAY APARTMENT CHOICE DATA

Profile # | Beach | Night-life | Rating | Price | Frequency offered | Frequency chosen by Steve
#1 | 0 | 1 | 1 | 0.735 | 7 | 5
#2 | 0 | 0 | 0 | 0.595 | 7 | 4
#3 | 0 | 1 | 0 | 0.945 | 7 | 1
#4 | 1 | 1 | 1 | 1.085 | 7 | 0
#5 | 1 | 1 | 0 | 0.735 | 7 | 1
#6 | 0 | 0 | 0 | 1.085 | 7 | 0
#7 | 1 | 0 | 1 | 0.595 | 7 | 3
#8 | 1 | 0 | 1 | 0.945 | 7 | 0

These choice data are then applied to a multinomial logit analysis. The details of multinomial logit are beyond the scope of this text, but suffice it to say that the proportions of choices are transformed to their natural logarithms before a technique similar to multiple regression is applied.

TABLE 8A.7 » MULTINOMIAL LOGIT ANALYSIS OF STEVE'S CHOICES

           | Beta    | se_b  | t
Intercept  | 7.104   | 3.052 | 2.33
Beach      | -1.747  | 0.841 | -2.08
Night-life | 1.307   | 1.147 | 1.14
Rating     | 0.907   | 0.829 | 1.09
Price      | -11.295 | 4.738 | -2.39

LOGIT model: LN(p/(1 - p)) = Σ(bi Xi). Pearson goodness-of-fit Chi-square = 0.484, d.f. = 3, p = 0.922.

Results can be interpreted similarly to multiple regression. We interpret the model for Steve's choices as follows: Steve's value score starts at 7.10; take away 1.75 points if the apartment is 10 minutes from the beach, add 1.31 points if the apartment is 20 minutes from the night-life, add 0.90 points if the apartment is 4-star instead of 3-star, and then take away 11.30 points for each thousand dollars of rent. Clearly, Steve would like to be close to the beach, he'd prefer to be away from the noise of late-night revellers and he's cautious about price. Note that this is not a score out of 10 as we had for the ratings-based conjoint – it is a ratio-scaled score computed so that it gives us a good prediction of the likelihood of choice when comparing other objects. Thus a utility score may be extracted for any combination of given product attributes. And that utility score may then be compared with other attribute combinations to estimate the likelihood of choice. Let's confirm that the model works by reconstructing the choice situation for block #1, which includes profiles #1, #2, #3 and #4. Calculations are summarised in Table 8A.8. Steve's value score for profile #1 is estimated by:

V1 = Σ biX1i = Intercept coefficient × 1 + Beach coefficient × 0 + Night-life coefficient × 1 + Rating coefficient × 1 + Price coefficient × 0.735
   = 7.104 × 1 - 1.747 × 0 + 1.307 × 1 + 0.907 × 1 - 11.295 × 0.735
   = 1.015


Utility is then the natural exponent of this value score: U1 = exp(V1) = exp(1.015) = 2.760. The probability of choosing profile #1, when considered against profiles #2, #3 and #4, is then:

P1 = U1/ΣUi = 2.760/(2.760 + 1.467 + 0.104 + 0.009) = 0.64

Probabilities may be calculated for each other option in turn. We can see that there is a 64 per cent chance that Steve would select profile #1, a 34 per cent chance of choosing profile #2 and virtually no chance of choosing either of the other options in this particular choice set. In fact, Steve chose #1 in the study. It had his preferred location and was a better quality establishment. While profile #2 was at a more favourable price level, it was too close to the night-life and of poorer quality.

TABLE 8A.8 » CHOICE FORECAST BASED ON STEVE'S MNL ESTIMATES

   | Intercept | Beach  | Night-life | Rating | Price   | Vi = ΣbiXi | Ui = exp(Vi) | Pi = Ui/ΣUi
b  | 7.104     | -1.747 | 1.307      | 0.907  | -11.295 |            |              |
#1 | 1         | 0      | 1          | 1      | 0.735   | 1.015      | 2.760        | 0.64
#2 | 1         | 0      | 0          | 0      | 0.595   | 0.383      | 1.467        | 0.34
#3 | 1         | 0      | 1          | 0      | 0.945   | -2.264     | 0.104        | 0.02
#4 | 1         | 1      | 1          | 1      | 1.085   | -4.685     | 0.009        | 0.00
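As a check on the arithmetic, here is a minimal sketch in Python (an illustration only, not the textbook's software) that reproduces the block #1 forecast in Table 8A.8, using the MNL coefficients reported in Table 8A.7 and the profile coding of Table 8A.6.

import numpy as np

coef = np.array([7.104, -1.747, 1.307, 0.907, -11.295])   # intercept, beach, night-life, rating, price

# block #1: profiles 1-4; columns = intercept, beach, night-life, rating, price ($000)
block1 = np.array([
    [1, 0, 1, 1, 0.735],   # profile 1
    [1, 0, 0, 0, 0.595],   # profile 2
    [1, 0, 1, 0, 0.945],   # profile 3
    [1, 1, 1, 1, 1.085],   # profile 4
])

v = block1 @ coef          # value scores Vi = sum(bi * Xi)
u = np.exp(v)              # utilities Ui = exp(Vi)
p = u / u.sum()            # choice probabilities Pi = Ui / sum(Uj)

for i, (vi, ui, pi) in enumerate(zip(v, u, p), start=1):
    print(f'profile {i}: V = {vi:7.3f}  U = {ui:6.3f}  P = {pi:.2f}')
# output matches Table 8A.8: P = 0.64, 0.34, 0.02, 0.00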

Advantages of choice-based conjoint

The data-gathering task for respondents in choice-based conjoint is clearly a more natural activity. Respondents find it simpler and easier to compare objects and to simply select the one option they would buy. Analysis of the data does not depend on unrealistic assumptions of linear response; it only needs to know the extent to which each option was preferred over others. The multinomial logit procedure can give remarkably accurate predictive models, and the procedure is well suited to testing for interaction effects. The choice task also may be made in a realistic fashion. Instead of a rating scale questionnaire, for example, the researcher may create a dummy shop display with prototype package designs for respondents to evaluate.

Disadvantages of choice-based conjoint

Choice-based conjoint can be complicated. The derivation and interpretation of MNL output requires considerable expertise in experimental design and statistics. Most marketing professionals employ a consultant specialising in choice-based conjoint. Even these specialists tend to rely heavily on customised packaged software programs for the task. The dominant specialist software for choice-based conjoint (CBC) is distributed by Sawtooth Software.4 It is not cheap. The example of choice-based conjoint presented in this section was a relatively simple one. It was analysed using the multinomial logit procedure in SPSS, and forecasts were made using Microsoft Excel.


Usually a product is defined by more than just four attributes (variables), and they usually have more than just two levels. More product attributes and more levels lead to larger experimental designs, giving block designs that have far too many choice tasks for one person to deal with. The usual solution is to spread the block design among many people, giving each respondent only a few choices to complete. Where ratings-based conjoint usually gives us a model for each respondent, choice-based conjoint usually derives models of utility and choice probability for a whole group rather than many individual-level models. The researcher then must be very careful to define beforehand the market segments to be surveyed.

Forecasts based on MNL estimates of group-level choice-based conjoint models can suffer from a form of bias known as independence of irrelevant alternatives (IIA), sometimes called the 'red bus/blue bus' problem. Say a group of commuters in one suburb can either drive to work or take the red bus. For these consumers the car has a utility score of 2 while the red bus has a utility score of 1. That means the probability of driving is exp(2)/(exp(2) + exp(1)) = 0.73 and the probability of taking the red bus is exp(1)/(exp(2) + exp(1)) = 0.27. So about 73 per cent of commuters from this suburb would travel to work by car. Now what if we add a new bus service? The blue bus has exactly the same timetable, the same route and the same price as the red bus. They're the same except for the colour. Not surprisingly, it earns a utility score of 1. Now, with three options to choose from, the predicted choice probabilities become: P(Car) = exp(2)/(exp(2) + exp(1) + exp(1)) = 0.58; P(Red bus) = exp(1)/(exp(2) + exp(1) + exp(1)) = 0.21; and P(Blue bus) = exp(1)/(exp(2) + exp(1) + exp(1)) = 0.21. Now we predict that, with the introduction of another bus service that is the same as the existing one, car traffic will fall from 73 per cent to 58 per cent. This result, of course, doesn't make sense – the colour of the buses is trivial, and so the utility and shares of the red bus and the blue bus should be regarded as belonging to just one option. The IIA problem can make forecasts of probability of purchase or market share inaccurate. Ideally we'd want to gather more data from each person, to reduce the need for group-level models, and have a relatively simple method of analysis.
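The red bus/blue bus arithmetic can be reproduced in a few lines. The sketch below (a hypothetical illustration, not from the textbook) shows how adding a duplicate alternative shifts the predicted car share from 73 per cent to 58 per cent under the MNL model.

import math

def mnl_shares(utilities):
    """Logit choice probabilities from a dict of option utilities."""
    expu = {name: math.exp(v) for name, v in utilities.items()}
    total = sum(expu.values())
    return {name: round(v / total, 2) for name, v in expu.items()}

print(mnl_shares({'car': 2, 'red bus': 1}))
# {'car': 0.73, 'red bus': 0.27}

print(mnl_shares({'car': 2, 'red bus': 1, 'blue bus': 1}))
# {'car': 0.58, 'red bus': 0.21, 'blue bus': 0.21}  -- the IIA artefact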

CONJOINT ANALYSIS USING BEST–WORST SCALING

Recent applications of stated-preference discrete choice modelling and conjoint analysis use best–worst as the scaling technique.5 Consider the usual choice-based conjoint task where we would ask a respondent to select the one preferred option of, say, four brands labelled A, B, C and D. If A is chosen then we know A>B, A>C and A>D. That's three of the possible six paired comparisons. Under best–worst scaling the respondent gives the most-preferred option and also the least-preferred option. If A is best and D is worst then we know A>B, A>C, A>D, B>D and C>D. That's five of the six possible paired comparisons. By nearly doubling the amount of information from each respondent, we greatly reduce the need for large group-based models. Such data may be rescaled into paired-comparisons data to give probability of choice, which can be applied to multinomial logit modelling as for choice-based conjoint. For most purposes, that's unnecessarily difficult. Marley and Louviere6 have shown that simply calculating the difference between best and worst counts for each option can give almost identical results to the complicated multinomial logit approach. There are several approaches for analysing best–worst data in a conjoint situation. Those offered by Sawtooth Software's proprietary solutions are quite complex, and can simultaneously find respondents similar to each other so that group-level models can be created. If such sophisticated options are not needed then analysis can be much simpler, requiring only multiple regression techniques for useful results.
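A tiny illustration of the counting argument above (hypothetical code, not from the textbook): choosing only the best of four options reveals three of the six paired comparisons, while naming both the best and the worst reveals five.

from itertools import combinations

options = ['A', 'B', 'C', 'D']
print(len(list(combinations(options, 2))))            # 6 possible paired comparisons

best_only = [('A', other) for other in options if other != 'A']
print(best_only)                                      # [('A','B'), ('A','C'), ('A','D')]

best, worst = 'A', 'D'
best_worst = {(best, o) for o in options if o != best} | \
             {(o, worst) for o in options if o not in (best, worst)}
print(sorted(best_worst))                             # five of the six possible pairs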

choice modelling: Closely related to conjoint analysis, but focuses on the patterns of choices made by respondents instead of ratings data.


Advantages of best–worst conjoint

Conjoint analysis using best–worst data is much simpler than choice-based conjoint – all the analysis presented in this example was done using Microsoft Excel. The best–worst task for respondents is only slightly more difficult than choosing the best only. The evaluation and selection task is more natural and straightforward than ratings-based conjoint. And the method imposes no constraints or assumptions on the scale of the responses. Nearly twice as many data are gathered from respondents than with choice-based conjoint, which somewhat reduces the need for group-based models, although large numbers of choice sets will still make this necessary in many cases. Making forecasts with best–worst data can suffer from the IIA problem in the same way as with choice-based conjoint data. Forecasts made at the level of the individual or for small groups, however, suffer much less than those for large groups.

EXHIBIT 8A.3 » EXAMPLE OF BEST–WORST CONJOINT

Consider the same holiday apartments problem we looked at in the choice-based conjoint example. We have the same fractional factorial orthogonal array for the product attribute combinations shown in Table 8A.3 and the same BIB design as shown in Table 8A.4. Now, instead of asking each respondent to give only the most preferred option for each choice set (block), we ask each respondent to give the single most preferred option and the single least preferred option. We asked Steve once again to look over the choice sets and indicate the best and worst. His responses are summarised in Table 8A.9A.

TABLE 8A.9A » STEVE'S BEST–WORST EVALUATIONS OF HOLIDAY APARTMENTS
(The 14 choice sets are the blocks of four profiles from Table 8A.4; Steve's 'best' and 'worst' selection is marked in each block, and his selections are summarised by profile in Table 8A.9B.)


TABLE 8A.9B » SUMMARY OF BEST–WORST EVALUATIONS BY PROFILE

Profile # | Beach  | Night-life | Rating | Price | Freq. best | Freq. worst | B–W
#1 | 5 min  | 20 min | 4-star | $735  | 5 | 0 | 5
#2 | 5 min  | 10 min | 3-star | $595  | 4 | 0 | 4
#3 | 5 min  | 20 min | 3-star | $945  | 1 | 1 | 0
#4 | 10 min | 20 min | 4-star | $1085 | 0 | 4 | -4
#5 | 10 min | 20 min | 3-star | $735  | 1 | 1 | 0
#6 | 5 min  | 10 min | 3-star | $1085 | 0 | 4 | -4
#7 | 10 min | 10 min | 4-star | $595  | 3 | 0 | 3
#8 | 10 min | 10 min | 4-star | $945  | 0 | 4 | -4

As discussed previously, the measure for each profile is calculated simply as the number of times each profile is best, minus the number of times it is worst. The data are then applied to a multiple regression analysis to derive the importance weights for each product attribute. The results are presented in Table 8A.10. From the regression result, we interpret Steve's preferences as follows: Steve's value score starts at 14.34; subtract 3.33 points if the apartment is 10 minutes from the beach, add 1.70 points if the apartment is 20 minutes from the night-life, add 1.67 points if the apartment is 4-star instead of 3-star, and then subtract 17.09 points for each thousand dollars of rent.

TABLE 8A.10 » MULTIPLE REGRESSION OF STEVE'S BEST–WORST DATA ON HOLIDAY APARTMENTS

Regression statistics
Multiple R: 0.997 | R-square: 0.993 | Adjusted R-square: 0.984 | Std. error: 0.474 | Observations: 8

ANOVA
           | d.f. | SS     | MS     | F       | Sig.-F
Regression | 4    | 97.327 | 24.332 | 108.524 | 0.001
Residual   | 3    | 0.673  | 0.224  |         |
Total      | 7    | 98     |        |         |

           | Coefficients | Std. error | t-stat | p-value
Intercept  | 14.34        | 0.79       | 18.17  | 0.00
Beach      | -3.33        | 0.39       | -8.62  | 0.00
Night-life | 1.70         | 0.34       | 4.98   | 0.02
Rating     | 1.67         | 0.39       | 4.31   | 0.02
Price      | -17.09       | 0.90       | -18.91 | 0.00
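The whole best–worst analysis can be run with nothing more than a least-squares routine. The sketch below (a minimal Python illustration, not the textbook's spreadsheet) computes the best-minus-worst scores from Table 8A.9B and regresses them on the attributes, coded as in Table 8A.3A with price in thousands of dollars; the coefficients should land close to those in Table 8A.10.

import numpy as np

# columns: beach (1 = 10 min), night-life (1 = 20 min), rating (1 = 4-star), price ($000) -- profiles 1 to 8
X = np.array([
    [0, 1, 1, 0.735],
    [0, 0, 0, 0.595],
    [0, 1, 0, 0.945],
    [1, 1, 1, 1.085],
    [1, 1, 0, 0.735],
    [0, 0, 0, 1.085],
    [1, 0, 1, 0.595],
    [1, 0, 1, 0.945],
])
freq_best  = np.array([5, 4, 1, 0, 1, 0, 3, 0])   # from Table 8A.9B
freq_worst = np.array([0, 0, 1, 4, 1, 4, 0, 4])
bw = freq_best - freq_worst                        # best-minus-worst score per profile

design = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(design, bw, rcond=None)

for name, b in zip(['intercept', 'beach', 'night-life', 'rating', 'price'], coef):
    print(f'{name:>10}: {b:7.2f}')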


The interpretation turns out to be the same as for the choice-based conjoint; however, the scaling (mean and variance of the value scale) is clearly different, in the same way that miles and kilometres measure the same thing but with a different scale. Likelihood-of-purchase forecasts can be made in the same way as for the MNL forecasts. So, now we have utility scores for each of the components of a holiday apartment for one person, Steve. We can do the same with a large sample of respondents. Of course we would run the calculations using computer software. With a little matrix algebra we can create a complete conjoint analysis solution in a spreadsheet program such as Microsoft Excel. Some example spreadsheets are available from the Cengage CourseMate website which accompanies this book. The utility importance coefficients are only the beginning of the task. We now have a measurement that permits us to estimate or forecast the utility value of any other combination of, in this case, proximity to the beach and night-life, plus star rating and price. For example, let's say we have two competing apartments and we want to know which one is more attractive to Steve. The two apartments have attributes as presented in Table 8A.11A, which can be represented mathematically as in Table 8A.11B.

TABLE 8A.11A » COMPETING APARTMENTS

Profile # | Beach | Night-life | Rating | Price
#A | 5 min | 20 min | 4-star | $720
#B | 5 min | 10 min | 4-star | $740

TABLE 8A.11B » FORECASTS OF STEVE'S UTILITY FOR COMPETING APARTMENTS

                     | Intercept | Beach | Night-life | Rating | Price  | Steve's estimated total utility
Steve's coefficients | 14.34     | -3.33 | 1.70       | 1.67   | -17.09 |
#A                   | 1         | 0     | 1          | 1      | 0.720  | 5.4
#B                   | 1         | 0     | 0          | 1      | 0.740  | 3.4

These two test profiles do not exist in the original measurements. We can use any combination we like, within reason. (We shouldn’t use a price much higher or lower than those used in the analysis, for example.) With a consumer like Steve, we can see that the cheaper option, which is equally close to the beach but further from the nightlife, is much more attractive to Steve than Option B, which is more costly and closer to the noise. If we do this with a large number of people then we can estimate market shares and price sensitivities, and discover market segments based on utility, which is much more useful and reliable than simple demographics. Again, you will find some worked examples in the website for this textbook.
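As an illustration of that forecasting step, the short sketch below (hypothetical code, not from the textbook) applies Steve's coefficients from Table 8A.10 to the two competing apartments and reproduces the utility scores in Table 8A.11B.

# Steve's best-worst coefficients (Table 8A.10); price is in thousands of dollars
steve = {'intercept': 14.34, 'beach': -3.33, 'nightlife': 1.70,
         'rating': 1.67, 'price': -17.09}

# coding follows the earlier tables: nightlife = 1 if 20 min, rating = 1 if 4-star
apartments = {
    'A': {'beach': 0, 'nightlife': 1, 'rating': 1, 'price': 0.720},
    'B': {'beach': 0, 'nightlife': 0, 'rating': 1, 'price': 0.740},
}

for name, attrs in apartments.items():
    utility = steve['intercept'] + sum(steve[k] * v for k, v in attrs.items())
    print(f'Apartment {name}: estimated utility = {utility:.1f}')
# Apartment A: estimated utility = 5.4
# Apartment B: estimated utility = 3.4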

Researchers have found that respondents spend more time considering the options in each evaluation and choice task, which tends to give much more consistent and therefore accurate data for analysis. Respondents tend to become more slapdash with ratings scales, considerably increasing the level of error.

Disadvantages of best–worst conjoint

While conjoint analysis using best–worst scaling resolves a number of difficulties with the evaluation and decision task for respondents, and resolves some problems with data analysis, other difficulties remain. The main problems are with finding a good balanced incomplete block (BIB) design and dealing with large numbers of choice sets.


As with other non-ratings forms of conjoint data, the researcher may have difficulty defining a suitable BIB design. Combinatorial mathematicians spend considerable effort deriving block designs that are efficient; that is, where all of the required data may be gathered using the smallest number of blocks. The number of levels of each variable (product attribute) can cause problems. Most published BIB designs are for just two levels of each variable. It's easy then to combine two variables to give a new variable with four levels, or join three variables to give eight levels. Some catalogues of BIB designs include special designs for variables of three levels each, but these are small in number. Thus, combinations of designs that include variables with three or five levels, and so on, can cause difficulty. Typically, the researcher must choose between reducing the number of levels in one or more variables or accepting a design that is much larger than necessary, because for some block design problems there is no efficient solution yet discovered.

The relatively simple example offered in this section required 14 choice sets (blocks) of four profiles each in order to properly evaluate just eight profiles. A larger number of profiles can increase the number of choice sets exponentially. In such circumstances it is unlikely that we could persuade respondents to evaluate all of them. When this happens the researcher must split the design among several people and accept parameter estimates that apply to a group of respondents rather than to one individual respondent. This can simplify the analysis in many ways, because there are fewer models to derive, but it means that the researcher must be careful to group respondents who are similar to each other in terms of their evaluations of the product category. As we know from our studies in consumer behaviour, demographics alone are fairly crude methods of finding similar market segments. Often the criteria for a priori segmentation of respondents are just speculation.

We rarely want to derive utility models for just one person. Typically, we want representatives of several segments, known or unknown, and we should have about 30 or so of each so that random fluctuations in preferences and judgements can be evened out. It can be a tedious job for the researcher to run hundreds of separate regressions one at a time. Specialised software procedures included in SPSS, SAS and other products allow for the simultaneous calculation of parameter estimates for very large samples. With some patience, and a basic knowledge of matrix algebra, this also can be accomplished using a spreadsheet program such as Microsoft Excel or OpenOffice Calc. Professional marketing research consultants will buy Sawtooth Software or invest in designing their own software with the help of development tools such as Shazam7 and MatLab8.

FINAL COMMENTS ON CONJOINT ANALYSIS

Conjoint analysis can be a very powerful technique for deriving consumer utility and price elasticity. These, in turn, can become the input for forecasting tools used for testing the attractiveness of new products (attribute bundles), changes to existing brands and the effects of competitive response to product-price changes. But the researcher should remember that the various measurement and analysis techniques are an aid to, not a substitute for, managerial judgement. In all these consumer utility techniques the researcher must decide beforehand which attributes are sufficiently salient to include in the modelling process. If an important product feature is left out of the model, then we can never learn the extent to which it affects consumers' judgements. Sometimes relatively trivial product features make the difference between being chosen and being left on the shelf. If all competitors are much the same, or are acceptable in all of the 'must have' attributes, then the choice comes down to less-important brand features.


The conjoint analysis process generally requires all the different levels of each product attribute (variable) to be compared and traded off against the levels of each other attribute. Sometimes this is unrealistic. We are unlikely, for example, to find a five-star hotel room offered at a three-star price. If we do, consumers may wonder if there's something wrong. In such cases people naturally assume in their own minds additional benefits and risks that are not in the conjoint model. These, in turn, create biases and interaction effects that cannot be accounted for. The solution is to design a fractional factorial design without some attribute combinations, or to remove or substitute problematic combinations, allowing for some collinearity among the regressors.

Several assumptions underlie data gathering in conjoint analysis experiments. We assume that a choice will be made and that the selection will be made from only those options presented in the choice set. This requirement often is relaxed in choice-based conjoint with the addition of another option in all choice tasks – a respondent may select 'None of these, I wouldn't buy at all'. This option then becomes a component in the statistical model. Forecasts based on conjoint analysis results must take account of all four Ps in the marketing mix: product, price, promotion and distribution (place). If a model predicts the market share of a new brand at a particular price against existing competitors, we assume that consumers are equally aware of the options and that all options in the model are accessible to customers. To the extent that this is not the case, adjustments should be made accordingly.

Be careful how you use the many individual-level models from conjoint analysis results for forecasting. The most accurate predictions are achieved by making forecasts for each person and then aggregating these to get a whole-market prediction. The next best approach is to take the individual responses and combine these to make a group-level model. The least accurate approach is to average the parameter estimates from all models and then make a single prediction – this washes out all the important similarities and differences among people and often gives market share estimates with all options about the same.

The examples presented in this discussion of conjoint analysis measured only the main effects of product attributes. If the researcher is interested in going deeper than simply making a good forecasting model, and in understanding why choices are made, then interaction effects should be taken into account. You probably know someone who doesn't like dark chocolate or coconut with milk chocolate, but a chocolate bar that combines dark chocolate and coconut together … well, that's different! Main-effects models are usually quite sufficient for making good predictions, but the parameter estimates often are biased. Explanatory models often are better with interaction effects, but predictions often are no better or even may be worse. So the researcher must decide between a simple prediction model and a more complex explanatory model. As elsewhere in conjoint analysis, choosing which approach to use is a trade-off.

NOTES
1 Green, P. & Srinivasan, V. (1978) 'Conjoint analysis in consumer research: Issues and outlook', Journal of Consumer Research, 5(2), September.
2 Louviere, J. J. (2000) 'Why stated preference discrete choice modelling is not conjoint analysis (and what SPDCM is)', Memetrics White Paper, 1, pp. 1–11.
3 This and many other BIB designs can be drawn from Colbourn, Charles J. & Dinitz, Jeffrey (eds) (2006) CRC handbook of combinatorial designs, 2nd edn, Boca Raton, FL: CRC Press.
4 A very large amount of data, research results and information about different types of choice-based conjoint is available from http://www.sawtoothsoftware.com, accessed January 2016.
5 Finn, A. & Louviere, J. J. (1992) 'Determining the appropriate response to evidence of public concern: The case of food safety', Journal of Public Policy and Marketing, 11(1), pp. 12–25; Louviere, J. J., Swait, Joffre & Anderson, Donald (1995) 'Best/worst conjoint: A new preference elicitation method to simultaneously identify overall attribute importance and attribute level partworths', unpublished working paper, University of Sydney.
6 Marley, A. A. J. & Louviere, J. J. (2005) 'Some probabilistic models of best, worst, and best–worst choices', Journal of Mathematical Psychology, 49(6), pp. 464–80.
7 Shazam Econometrics Software, accessed at http://econometrics.com in January 2016.
8 The MathWorks, accessed at http://au.mathworks.com/products/matlab in January 2016.

09 » WHAT YOU WILL LEARN IN THIS CHAPTER

• To specify what information will be sought when designing a questionnaire.
• To determine the type of questionnaire and type of survey research methods.
• To determine the content of individual questions.
• To determine the form of response to each question.
• To determine the wording of each question.
• To determine question sequence.
• To determine the physical characteristics of the questionnaire.
• To re-examine and revise steps 1–7 if necessary.
• To pretest the questionnaire.

QUESTIONNAIRE DESIGN

Which Australian state is the most generous?

Australians seem to be a generous lot. Between October 2010 and September 2011, 70 per cent of Australians reported making at least one donation to charity in the previous year. As of September 2015, this had fallen to 66 per cent. Yet the average annual amount given rose from $264 to $302. The incidence of charitable giving is fairly consistent throughout the country, generally hovering around the national average – except for Western Australia, where the figure rises to 71 per cent, and Tasmania, where it drops to 63 per cent. As well as being home to the country's greatest proportion of charitable givers, WA also distinguishes itself in terms of average annual value given per donor. Western Australians who donate to charity hand over around $355 each per year, ahead of donors from NSW/ACT ($331) and Victoria ($285) – and some $115 more than the average South Australian.1

Well-designed questionnaires are an important means of obtaining accurate information on social trends, such as donations, and on market behaviour. When asked how research could be improved in Australia, market research interviewers nominated the following:2
• Reduce the length of interviews – 10 to 15 minutes is optimum for telephone interviews (80 per cent).
• Provide respondent incentives – charity donations, lottery tickets etc. (53 per cent).
• Cut out obvious or apparent repetition (48 per cent).
• Use clear, simple, less wordy language – be concise (39 per cent).
• Use questions that are easier to understand, less confusing or ambiguous (34 per cent).
Note that four of the five suggestions relate to questionnaire design. This chapter outlines a procedure for questionnaire design and illustrates that this is as much an art as it is a science. Nonetheless, it is best to follow a recipe or series of nine steps. These are listed at the outset of this chapter and are discussed in detail below.




STEP 1: SPECIFY WHAT INFORMATION WILL BE SOUGHT

A questionnaire is relevant if no unnecessary information is collected and only the information needed to solve the marketing problem is obtained. Asking the wrong question or an irrelevant question is a common pitfall. If the marketing task is to pinpoint store image problems, questions asking for general information about clothing style preferences will be irrelevant. To ensure information relevance, the researcher must be specific about data needs and have a rationale for each item of information. Before constructing a questionnaire, the researcher should first list, in order of importance, the specific research objectives and the information required to meet those objectives. This is illustrated in Table 9.1, a study of the mobile phone market. Note that these are only some of the questions that could be asked, but all research objectives relevant to the management decision or problem (for example, what features to include in a new mobile phone) need to be addressed in the study.

TABLE 9.1 » POSSIBLE RESEARCH OBJECTIVES AND INFORMATION REQUIRED FOR A MOBILE PHONE STUDY

Research objectives | Information required (questions)
Brand of mobile | What is the brand of your mobile phone? What is the brand of your father's/mother's/best friend's mobile phone?
Usage | How often on average do you use your mobile phone per week? How many calls per day (today) do you make on the mobile? How much do you pay per month?
Features of mobile | What features do you use? What features would you like? What are the main benefits you get from using your mobile phone?
Income | What is your yearly gross income? What is your disposable income (after rent and taxes)? How much on average do you have to spend in a week?

Many researchers, after conducting surveys, find that they omitted some important questions. Therefore, when planning the questionnaire design, researchers must think about possible omissions. Is information on the relevant demographic and psychographic variables being collected? Are there any questions that might clarify the answers to other questions? Will the results of the study provide the answer to the marketing manager's problem?
Sometimes research objectives can be expressed as hypotheses. A hypothesis is an implied relationship between two variables or measures – usually an independent and a dependent variable – and is clearly stated in unambiguous terms. Hypotheses are often used to determine what information needs to be collected via a questionnaire in causal research. Here are some examples:
→→ There is a positive relationship between the dollars spent on advertising and the dollar sales of a firm.
→→ There is an association between the brand of a friend's or family member's mobile phone and the brand of mobile purchased.
→→ The fuel economy of a car will be positively associated with purchase intent.


In the case of the second hypothesis, the information that would need to be collected is the brand of the respondent's mobile phone and the brands of their friends' and family's mobile phones. Certain decisions made during the early stages of the research process will influence the questionnaire design. The preceding chapters stressed the need to have a good problem definition and clear objectives for the study. The problem definition will indicate the type of information that must be collected to answer the manager's questions; different types of questions may be better at obtaining certain types of information than others. The questions that are asked should also take the planned form of data analysis into account: when designing the questionnaire, the researcher should consider the types of statistical analysis that will be conducted.
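To make this step concrete, the sketch below shows how the second hypothesis might be tested once the two brand questions have been asked and coded. It is a minimal illustration only: the pandas/SciPy approach is one common option rather than a prescribed method, and the brand labels and responses are invented for the example.

```python
# Minimal sketch: checking the hypothesised association between the brand of a
# friend's or family member's mobile phone and the brand the respondent bought.
# The survey answers below are invented purely for illustration.
import pandas as pd
from scipy.stats import chi2_contingency

responses = pd.DataFrame({
    "friend_family_brand": ["Apple", "Apple", "Samsung", "Samsung", "Apple",
                            "Other", "Samsung", "Apple", "Other", "Samsung"],
    "respondent_brand":    ["Apple", "Apple", "Samsung", "Apple", "Apple",
                            "Other", "Samsung", "Samsung", "Other", "Samsung"],
})

# Cross-tabulate the two closed-response questions from the questionnaire
crosstab = pd.crosstab(responses["friend_family_brand"], responses["respondent_brand"])

# Chi-square test of independence: a small p-value is evidence of an association
chi2, p_value, dof, expected = chi2_contingency(crosstab)
print(crosstab)
print(f"chi-square = {chi2:.2f}, p = {p_value:.3f}")
```

The point is not the particular test but that both brand questions must appear in the questionnaire for any such analysis to be possible.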

SURVEY THIS!

By now you are probably quite familiar with the online course questionnaire. As you went through the questionnaire, did you spot any problem questions? You should be able to describe the problems better after finishing this chapter. Here are some questions to consider:
»» What are the problem items, if any?
»» Are there any topics covered in the survey that would result in more valid responses through: (a) a telephone interview; or (b) a personal interview?
»» Look at the section of the questionnaire shown. Describe any potential problems with these particular items using terminology from this chapter. In other words, do the items display characteristics that should be avoided?


STEP 2: DETERMINE THE TYPE OF QUESTIONNAIRE AND SURVEY RESEARCH METHOD

The type of survey to be used depends on the types of respondents and the nature of the information needed. Information from a small sample of experts on Italian cooking, for example, is best obtained by personal interview. Generally, as noted in Chapter 5, personal interviews are best for more open-ended and exploratory studies and when visuals or product interactions (for example, tasting or test driving) are required. Telephone interviews are good for short questionnaires, while mail and online surveys are good for collecting information about sensitive issues such as health, criminal behaviour, drug taking and personal relationships.

ONGOING PROJECT


STEP 3: DETERMINE THE CONTENT OF INDIVIDUAL QUESTIONS

It is a good idea at this stage of the process to consider the following:
1 Is the question necessary?
2 Are several questions needed instead of one?
3 Do the respondents have the necessary information to respond to the question?
4 Will respondents give the information freely and accurately?

In asking whether the question is necessary, you need to think about whether the information you want is merely interesting to know or something you need to know. In our mobile phone example, it might be interesting to know how many single people own a mobile phone, but is that something that the marketer needs to know? Stick to asking questions that relate only to your research objectives and/or hypotheses.

In addition, you may need to ask more than one question in order to obtain the necessary information. For example, instead of asking simply 'How did you first happen to use that detergent?', you may need to ask:
→→ Where did you buy that detergent?
→→ Why did you select it over other brands?

It is also important to remember that questions need to be written so that respondents can easily respond to them. Respondents, like all of us, tend to forget events quickly and to perceive that events happened more recently than they did. For example:
→→ When was the last time the government did something positive for you?
→→ What did you do last night?
→→ What did you do on the day of the terrorist attacks in Paris on 13 November 2015?
People can forget very quickly what government actions made them feel better, and remembering what you did last night is a lot easier than trying to remember what you did on the day that Paris was attacked. Generally, respondents will struggle to recall events that occurred more than six months ago (see also the section on the wording of individual questions).

Respondents' willingness to answer your questions will be determined by the work involved in producing an answer, their ability to articulate an answer and the sensitivity of the issue. Respondents will find it difficult to answer questions that involve calculating a quantity (for example, a body mass index (kg/m2) or the number of calories or standard drinks consumed in a week). Sometimes, respondents may not be able to articulate an answer due to their age, literacy level or lack of knowledge about a particular topic. In this case, simpler and more direct language relevant to the respondent should be used in framing questions. The problem of respondent literacy, common in many developing countries, means that personal interviews with simple, direct and structured questions may be the most appropriate means of obtaining information.

Asking sensitive questions

Asking sensitive questions is very difficult and is to be avoided if possible, as respondents may refuse to answer any more questions in a survey. Table 9.2 lists some sensitive issues for respondents.


TABLE 9.2 » SENSITIVE ISSUES IN QUESTIONNAIRES3

Topic – Very uneasy
Masturbation – 56.4%
Marijuana – 42.0%
Intercourse – 41.5%
Stimulants and depressants – 31.3%
Intoxication – 29.0%
Petting and kissing – 19.7%
Income – 12.5%
Gambling with friends – 10.5%
Drinking – 10.3%
General leisure – 2.4%
Sports activity – 1.3%

Some of these issues would appear obvious to you, but it may not be readily apparent that other issues (such as income, gambling and alcohol consumption) are also considered sensitive. Some issues may be more sensitive depending on culture or gender. For example, asking about sexual health in a Muslim country would be a highly sensitive issue. Asking people whether they are single, and what their telephone number is, is also sensitive and, some would say, unnecessary. However, in some research studies, particularly health and social research, it is still important to ask sensitive questions (for example, about the amount of alcohol imbibed in a week).

In persuading respondents to provide information about sensitive issues, a number of approaches can be used. The first is to 'hide' the question in a group of less sensitive questions. A second is to state that the behaviour or attitude is not that unusual before asking the question. For example:
→→ One in four households has problems meeting its monthly financial obligations.
→→ Do you find that you have problems meeting your financial obligations at least once a month?
Another approach is to phrase the question in terms of others and how they might feel or act (in the third person). This usually elicits a response that is not threatening, but still applies to the respondent:
→→ Do you think most people cheat on their income tax? Why?
Stating the responses in the form of categories that the respondent may simply check or tick, or as closed-response questions, is also likely to yield information on sensitive issues. This is used especially when asking questions about income. Another way to deal with sensitive issues is to ask about them only on a random basis. This is called the randomised-response model: the respondent answers either the sensitive question or a paired innocuous question, with the choice made randomly (for example, by drawing a red or black ball from an urn), so the interviewer never knows which question a given respondent actually answered.
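Because the interviewer never knows which question an individual answered, the incidence of the sensitive behaviour must be estimated rather than observed directly. The sketch below illustrates that arithmetic for the unrelated-question variant described above; the sample size, probabilities and counts are invented for the example, and the helper function is hypothetical.

```python
# Minimal sketch of the randomised-response estimate. With probability p the
# respondent answers the sensitive question; with probability (1 - p) they
# answer an innocuous question whose true 'yes' rate q is known
# (e.g. 'Is your birthday in January?'). Only 'yes'/'no' is recorded.
# All numbers below are invented for illustration.

def estimate_sensitive_rate(observed_yes_rate: float, p: float, q: float) -> float:
    """Solve observed = p * pi + (1 - p) * q for pi, the sensitive 'yes' rate."""
    return (observed_yes_rate - (1 - p) * q) / p

# Example: 1000 respondents, 240 'yes' answers; the urn directs 70% of
# respondents to the sensitive question; the innocuous question has an
# expected 'yes' rate of 1/12.
observed = 240 / 1000
pi_hat = estimate_sensitive_rate(observed_yes_rate=observed, p=0.7, q=1 / 12)
print(f"Estimated proportion reporting the sensitive behaviour: {pi_hat:.3f}")
```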

TIPS OF THE TRADE

BE CAREFUL OF WORDS WITH MORE THAN ONE MEANING

Care must be taken in wording questions – even a simple question like 'Where did you read the article on rice production?' This could mean:
»» in The Age
»» in front of the television
»» in Hong Kong.


ONGOING PROJECT

STEP 4: DETERMINE THE FORM OF RESPONSE TO EACH QUESTION

Two basic types of questions can be identified based on the amount of freedom respondents have in answering. An open-ended response question poses a problem or topic and asks respondents to answer in their own words. If the question is asked in a personal interview, the interviewer may probe for more information. For example:
→→ What names of local banks can you think of offhand?
→→ What comes to mind when you look at this advertisement?
→→ In what way, if any, could this product be changed or improved? I'd like you to tell me anything you can think of, no matter how minor it seems.
→→ What things do you like most about Australia Post's Express service?
→→ Why do you buy more of your clothing in Sportsgirl than in other shops?
→→ How can our stores better serve your needs?
→→ Please tell me anything at all that you remember about the BMW advertisement you saw last night.

open-ended response question A question that poses some problem and asks the respondent to answer in her own words.

REAL WORLD SNAPSHOT

WHO REALLY DOES THE HOUSEWORK?4

Who does housework? What seems like a simple question becomes not so simple when one needs a precise answer. One survey suggested that, on average, women spent approximately 42 hours a week doing housework compared to approximately 23 hours a week for men. According to these results, women do almost twice as much housework as men. On closer inspection, however, these results imply that the average married couple spends 65 hours a week doing housework. Really? Do couples really put in nearly 10 hours a day on housework? That result doesn't seem plausible at first glance, but a number of factors related to survey design may influence it.

First, what is housework? Does housework include driving the kids to school, driving to the supermarket or driving to work? Does it include time going out to get the newspaper or time spent perusing food magazines for recipe ideas? A broader definition of housework will yield higher numbers. Second, respondents who do very little housework are not likely to answer such a survey. Thus, response bias may occur based on the type of person who does respond. Third, the question is prone to socially desirable responding. The socially desirable response for both men and women is to admit to doing a significant amount of housework.

Perhaps an interesting side note is that the more housework couples report doing, the higher the frequency of intimacy they report. On top of this, men report an average of 34 hours a week of work outside the home and women about 20. This doesn't seem to leave a lot of time for other activities. Perhaps one factor behind the apparent relationship is that respondents who report a lot of housework exhibit a response pattern that uses the upper ends of scales more than the lower ends. The end result, if researchers want accurate answers to such questions, is that they should ensure anonymity, have a very good definition of the phenomena being studied, and be able to convey that definition in a survey instrument. Sometimes, behavioural evidence can validate (or invalidate) survey results. People who do more housework do not have more children than other couples. Does this behavioural result say anything about potential bias in the survey results?

Open-ended response questions are free-answer questions. They may be contrasted with fixed-alternative questions – sometimes called closed questions – which give respondents specific, limited-alternative responses and ask them to choose the one closest to their own viewpoints. For example:


Did you use any commercial feed or supplement for livestock or poultry in 2015?

❏ Yes ❏ No

As compared with 10 years ago, would you say that the quality of most products made in China is higher, about the same, or not as good?

❏ Higher ❏ About the same ❏ Not as good

Do you think the government’s tax policy has affected your business?

❏ Yes, for the better ❏ Yes, for the worse ❏ Not especially

In which type of bookshop is it easier for you to shop – a regular bookshop or a bookshop on the Internet?

❏ Regular bookshop ❏ Internet bookshop

How much of your shopping for clothes and household items do you do in wholesale club stores?

❏ All of it
❏ Most of it
❏ About one-half of it
❏ About one-quarter of it
❏ Less than one-quarter of it

fixed-alternative question A question in which the respondent is given specific, limited alternative responses and asked to choose the one closest to his own viewpoint.

EXPLORING RESEARCH ETHICS

WHAT A DIFFERENCE WORDS MAKE5

Some surveys may be conducted to further the interests of their sponsors. By carefully wording questions, enough bias can be built in to achieve the result desired by the sponsor of the research. Consider the following examples:
»» 'The legislation would generate more revenue' versus 'The legislation will implement tax reform'.
»» 'Are you in favour of the MX missile?' versus 'Are you in favour of the Peacekeeper?'
»» 'Are you in favour of abortion?' versus 'Are you in favour of pro-choice?'
»» 'Are you in favour of welfare?' versus 'Are you in favour of public assistance?'
»» 'Are you in favour of a department of war?' versus 'Are you in favour of a Department of Defence?'
»» 'Are you in favour of equality in marriage rights for all?' versus 'Do you support the traditional role of marriage?'


Consider this example of a hypothetical question in a mail survey sponsored by Greenpeace, an environmental lobby group:

Depletion of the earth's protective ozone layer leads to skin cancers and numerous other health and environmental problems. Do you support Greenpeace's demand that DuPont, the world's largest producer of ozone-destroying chemicals, stops making unneeded ozone-destroying chemicals immediately?

How we word questions will thus influence how people respond. Good questions need to be clear and unbiased.

Open-ended response questions are most beneficial when the researcher is conducting exploratory research, especially when the range of responses is not known. Such questions can be used to learn which words and phrases people spontaneously give to the free-response question. Respondents are free to answer with whatever is uppermost in their minds. By obtaining free and uninhibited responses, the researcher may find some unanticipated reaction towards the product. Such responses will reflect the flavour of the language that people use in talking about products or services and thus may provide a source of new ideas for advertising copywriting. Open-ended response questions are also valuable at the beginning of an interview. They are good first questions because they allow respondents to warm up to the questioning process.

The cost of administering open-ended response questions is substantially higher than that of administering fixed-alternative questions, because the job of editing, coding and analysing the data is quite extensive. As each respondent's answer is unique, there is some difficulty in categorising and summarising them all. The process requires that an editor go over a sample of responses to develop a classification scheme; then all the answers must be reviewed and coded according to that scheme.

Another potential disadvantage of the open-ended response question is the possibility that interviewer bias will influence the answer. While most interviewer instructions state that answers are to be recorded verbatim, rarely does even the best interviewer get every word spoken by the respondent. Interviewers have a tendency to take shortcuts in recording answers, but changing even a few of the respondent's words may substantially influence the results; thus, the final answer may reflect a combination of the respondent's and the interviewer's ideas rather than the respondent's ideas alone. In addition, articulate individuals tend to give longer answers to open-ended response questions. Such respondents often are better educated and from higher income groups, and therefore may not be representative of the entire population, though they may give a large share of the responses.

In contrast, fixed-alternative questions require less interviewer skill, take less time and are easier for the respondent to answer. This is because answers to closed questions are classified into standardised groupings before data collection. Standardising alternative responses to a question provides comparability of answers, which facilitates coding, tabulating and, ultimately, interpreting the data. Where possible, use closed-response questions, as they require less interviewer skill and are easier to administer and code.
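As a rough illustration of the coding step for open-ended answers, the sketch below applies a simple keyword-based classification scheme. The categories, keywords and answers are invented; in practice an editor develops the scheme from a sample of completed questionnaires and trained coders assign the codes, rather than relying on keyword matching.

```python
# Minimal sketch of coding open-ended answers against a classification scheme.
# The categories, keywords and answers are invented for illustration.
CODING_SCHEME = {
    1: ("price", "cheap", "expensive", "cost"),     # price-related comments
    2: ("quality", "durable", "last", "well made"), # quality-related comments
    3: ("service", "staff", "friendly", "helpful"), # service-related comments
}

def code_response(text: str) -> int:
    """Return the first matching category code; 9 means 'other/unclassified'."""
    lowered = text.lower()
    for code, keywords in CODING_SCHEME.items():
        if any(word in lowered for word in keywords):
            return code
    return 9

answers = [
    "I shop there because it is cheap",
    "The staff are always friendly",
    "Their clothes seem to last longer",
]
print([code_response(answer) for answer in answers])  # -> [1, 3, 2]
```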

Types of closed-response questions

simple-dichotomy (dichotomous alternative) question A fixed-alternative question that requires the respondent to choose one of two alternatives.

There are a number of types of closed-response questions, some of which have been discussed in Chapter 8 on measurement. The simple-dichotomy (dichotomous alternative) question requires the respondent to choose one of two alternatives. The answer can be a simple 'yes' or 'no' or a choice between 'this' and 'that'. For example:

Did you make any international calls last week?

❏ Yes

❏ No


Several types of questions provide the respondent with multiple-choice alternatives. The determinant-choice question requires the respondent to choose one – and only one – response from among several possible options. For example:

Please give us some information about your flight. In which section of the aircraft did you sit?

❏ First class     ❏ Business class     ❏ Economy class

The frequency-determination question is a determinant-choice question that asks for an answer about the general frequency of occurrence. For example:

How frequently do you watch the Channel 9 News?

❏ Every day
❏ 5–6 times a week
❏ 2–4 times a week
❏ Once a week
❏ Less than once a week
❏ Never

Attitude rating scales – such as the Likert scale, semantic differential and Stapel scale – are also fixed-alternative questions. These are discussed in Chapter 8.

The checklist question allows the respondent to provide multiple answers to a single question. The respondent indicates past experience, preference and the like merely by ticking off items. In many cases the choices are adjectives that describe a particular object. A typical checklist question might ask the following:

Please check which of the following sources of information about investments you regularly use, if any.

❏ Personal advice of your broker(s)
❏ Brokerage newsletters
❏ Brokerage research reports
❏ Investment advisory service(s)
❏ Conversations with other investors
❏ Web page(s)
❏ None of these
❏ Other (please specify)

frequency-determination question A fixed-alternative question that asks for an answer about general frequency of occurrence.

checklist question A fixed-alternative question that allows the respondent to provide multiple answers to a single question by ticking off items.


determinant-choice question A fixed-alternative question that requires the respondent to choose one response from multiple alternatives.

REAL WORLD SNAPSHOT

GIVING TO CHARITY: DONOR BEHAVIOUR IN THREE COUNTRIES6

According to the Donor Perspectives report, approximately 25 per cent of donors from each country reported that the amount they donated to charity increased from the previous year. US donors were primarily motivated by a change in their financial situation, while donors in the UK and Australia cited an innate desire to help. On average, US respondents reported donating nearly three times as much to charity as UK or Australian donors.


A major problem in developing dichotomous or multiple-choice alternatives is the framing of the response alternatives. There should be no overlap among categories: alternatives should be mutually exclusive, so that each respondent fits into one and only one category, and only one dimension of an issue should be related to each alternative. The following listing of income groups illustrates a common error:

❏ Under $15 000
❏ $15 000–30 000
❏ $30 000–55 000
❏ $55 000–70 000
❏ Over $70 000

How many people with incomes of $30 000 will be in the second group, and how many will be in the third group? There is no way to determine the answer. Grouping alternatives without forethought about analysis is likely to diminish accuracy. Note, too, that few people relish being in the lowest category. Including a category lower than the lowest expected answers often helps to negate the potential bias caused by respondents' tendency to avoid an extreme category.

When a researcher is unaware of the potential responses to a question, fixed-alternative questions obviously cannot be used. If researchers assume what the responses will be but are, in fact, wrong, they will have no way of knowing the extent to which the assumption was incorrect. Unanticipated alternatives emerge when respondents feel that closed answers do not adequately reflect their feelings. They may make comments to the interviewer or write additional answers on the questionnaire, indicating that the exploratory research did not yield a complete array of responses. After the fact, little can be done to correct a closed question that does not provide enough alternatives; therefore, before writing a descriptive questionnaire, a researcher may find it valuable to spend time conducting exploratory research with open-ended response questions to identify the most likely alternatives. The researcher should strive to ensure that there are sufficient response choices to include almost all possible answers.

Respondents may select obvious alternatives, such as price or durability, if they do not see the choice they would prefer. Also, a fixed-alternative question may tempt respondents to check an answer that is more prestigious or socially acceptable than the true answer. Rather than stating that they do not know why they chose a given product, they may select an alternative among those presented, or, as a matter of convenience, they may opt for a given alternative rather than think of the most correct response.

Most questionnaires mix open-ended and closed questions. As we have discussed, each form has unique benefits. In addition, a change of pace can eliminate respondent boredom and fatigue.

ONGOING PROJECT

STEP 5: DETERMINE THE WORDING OF EACH QUESTION

There are no hard-and-fast rules in determining how to develop a questionnaire. Fortunately, research experience has yielded some guidelines that help prevent the most common mistakes.

Avoid complexity: use simple, conversational language

Words used in questionnaires should be readily understandable by all respondents. The researcher usually has the difficult task of adopting the conversational language of people at the lower education levels without talking down to better-educated respondents. Remember, not everyone has the vocabulary of a university student; a substantial number of people have never gone beyond high school.


Respondents can probably tell an interviewer whether they are married, single, divorced, separated or widowed, but asking them for their 'marital status' may present a problem. The technical jargon of top corporate executives should be avoided when surveying retailers or industrial users. 'Brand image', 'positioning', 'marginal analysis' and other corporate language may not have the same meaning for – or even be understood by – a shop owner-operator in a retail survey. The vocabulary used in the following example question, from a hypothetical attitude survey on social problems, would probably confuse many respondents:

When effluents from a paper mill can be drunk and exhaust from factory smokestacks can be breathed, then humankind will have done a good job in saving the environment ... Don't you agree that what we want is zero toxicity: no effluents?

Besides being too long and confusing, this question is leading. Table 9.3 gives a list of complex words with their simpler equivalents. Note that some words have multiple meanings.

TABLE 9.3 » COMPLEX AND SIMPLE WORDS7

Acquaint – Inform, tell
Assist – Help
Consider – Think
Desire – Wish
Factor – Fact, consideration, circumstance, feature, element, constituent, cause
Function (verb) – Work, operate, act
Inform – Tell
Locate – Find
Practically – Virtually, almost, nearly, all but
Purchase – Buy
Require – Want, need
Residence – Home
State – Say
Sufficient – Enough
Terminate – End
Visualise – Imagine, picture

Avoid leading and loaded questions

Leading and loaded questions are a major source of bias in question wording. A leading question suggests or implies certain answers. A study of the dry-cleaning industry asked this question:
→→ Many people are using dry-cleaning less because of improved wash-and-wear clothes. How do you feel wash-and-wear clothes have affected your use of dry-cleaning facilities in the past four years?

❏ Use less          ❏ No change          ❏ Use more

The potential 'bandwagon effect' implied in this question threatens the study's validity. Partial mention of alternatives is a variation of this phenomenon:
→→ Do small imported cars, such as Toyotas, get better economy than small Australian-built cars?
→→ How do you generally spend your free time, playing computer games or what?

leading question A question that suggests or implies certain answers.


Sometimes questions may be designed to get a particular response. Consider an online poll conducted by The Australian newspaper:
→→ Do you think the Australian government is too soft in its war on crime?
Most respondents would agree that the government is soft on a number of issues, including crime. The question also does not specify what aspect of crime it refers to, or whether the respondent agrees (or not) with any particular policy relating to a war against crime. One could also phrase the question as:
→→ Are you in favour of the Australian government's humane rehabilitation policy that makes past criminals productive members of society?
This would get a more sympathetic response than the previous question.

Merely mentioning an alternative may have a dramatic effect. The following question was asked in a research study for a court case (Universal City Studios, Inc. v. Nintendo Co., Ltd (1984)).8
→→ To the best of your knowledge, was Donkey Kong made with the approval or under the authority of the people who produced the King Kong films?

Of the respondents, 18 per cent answered 'Yes'. In contrast, no respondent (0 per cent) made that connection when asked the unprompted question: 'As far as you know, who makes Donkey Kong?'

loaded question A question that suggests a socially desirable answer or is emotionally charged.

A loaded question suggests a socially desirable answer or is emotionally charged. Consider the following:

In light of today’s farm crisis, would it be in the public’s best interest to offer interest-free loans to farmers?



❏ Strongly agree     ❏ Agree     ❏ Disagree     ❏ Strongly disagree

Answers might be different if the loaded portion of the statement, 'farm crisis', were worded to suggest a problem of less magnitude than a crisis. A television station produced the following 10-second spot asking for viewer feedback:
→→ We are happy when you like programs on Channel X. We are sad when you dislike programs on Channel X. Write to us and let us know what you think of our programming.
Few people wish to make others sad. This request is likely to elicit only positive comments.

WHAT WENT WRONG? ALLEGATIONS OF PUSH POLLING IN THE ACT ELECTION9

In 2012, the Canberra Liberals were accused of 'blatant push polling' in a telephone survey of Canberra residents about the ACT election. Participants called by an interstate market research company complained that the 40-question survey was loaded with up to a dozen questions about ACT Labor 'doubling or even tripling rates'. Push polling, which is not illegal, is a campaign technique conducted in the guise of a survey of voter intentions but intended to use leading questions to influence a respondent's vote. A voter in a marginal seat, who did not want to be named but said she had no political connections, said the survey was 'blatant push polling'. 'They were setting up a proposition which isn't true and asking your opinion on it repeatedly,' she said. 'They attacked the question in different ways each time but each question used the wording doubled or even tripled.' The Liberal Party denied any allegations of push polling.

Certain answers to questions are more socially desirable than others. For example, a truthful answer to the following classification question might be painful:

Where did you rank academically in your end-of-high-school exam results?



❏ Top quarter     ❏ 2nd quarter     ❏ 3rd quarter     ❏ Bottom quarter


When taking personality or psychographic tests, respondents frequently can interpret which answers are most socially acceptable, even if those answers do not portray their true feelings. For example, which are the socially desirable answers to the following questions/statements on a self-confidence scale?

I feel capable of handling myself in most social situations.
❏ Agree     ❏ Disagree

I seldom fear my actions will cause others to have low opinions of me.
❏ Agree     ❏ Disagree

Invoking the status quo is a form of loading resulting in bias because most people tend to resist change.10 A Newspoll survey of 400 Sydneysiders11 found that support for a desalination plant was only 15 per cent, with 72 per cent preferring that the $1.3 billion to be spent on this project be allocated to recycling water. Respondents were not asked if they would accept drinking recycled water or water produced by desalination, although 65 per cent had reservations about the price of desalinated water and 77 per cent expressed a concern about the impact of a desalination plant on greenhouse gases. The survey was sponsored by an alliance of local councils, scientists, engineering experts and environmental groups.

Asking respondents 'how often' they use a product or visit a shop leads them to generalise about their habits, because there usually is some variance in their behaviour. In generalising, one is likely to portray one's ideal behaviour rather than one's average behaviour. For instance, brushing one's teeth after each meal may be ideal, but busy people may skip a brushing or two. An introductory counterbiasing statement or preamble to a question that reassures respondents that their 'embarrassing' behaviour is not abnormal may yield truthful responses:

Some people have the time to brush three times daily; others do not. How often did you brush your teeth yesterday?

If a question embarrasses the respondent, it may elicit no answer or a biased response. This is particularly true with respect to personal or classification data such as income or education. The problem may be mitigated by introducing the section of the questionnaire with a statement such as:

To help classify your answers, we'd like to ask you a few questions. Again, your answers will be kept in strict confidence.

A question statement may be leading because it is phrased to reflect either the negative or the positive aspects of an issue. To control for this bias, the wording of attitudinal questions may be reversed for 50 per cent of the sample. This split-ballot technique is used with the expectation that two alternative phrasings of the same question will yield a more accurate total response than will a single phrasing. For example, in a study on small-car buying behaviour, one-half of a sample of imported-car purchasers received a questionnaire in which they were asked to agree or disagree with the statement: 'Australian cars are cheaper to maintain than small imported cars.' The other half of the imported-car owners received a questionnaire in which the statement read: 'Small imported cars are cheaper to maintain than Australian cars.'
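A split-ballot allocation can be produced with a few lines of code. The sketch below is a minimal, hypothetical illustration: the respondent IDs are invented, the two statements are taken from the small-car example above, and the function name is not from any survey package.

```python
# Minimal sketch of allocating respondents to the two question wordings in a
# split-ballot design. Respondent IDs are invented for illustration.
import random

VERSION_A = "Australian cars are cheaper to maintain than small imported cars."
VERSION_B = "Small imported cars are cheaper to maintain than Australian cars."

def assign_split_ballot(respondent_ids, seed=42):
    """Randomly allocate half of the sample to each wording of the statement."""
    rng = random.Random(seed)      # fixed seed so the allocation is reproducible
    ids = list(respondent_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    return {rid: (VERSION_A if i < half else VERSION_B) for i, rid in enumerate(ids)}

allocation = assign_split_ballot(range(1, 11))
for respondent, wording in sorted(allocation.items()):
    print(respondent, "->", wording)
```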

counterbiasing statement An introductory statement or preamble to a potentially embarrassing question that reduces a respondent's reluctance to answer by suggesting that certain behaviour is not unusual.

split-ballot technique Using two alternative phrasings of the same question for respective halves of a sample to elicit a more accurate total response than would a single phrasing.

Avoid ambiguity: Be as specific as possible

Items on questionnaires often are ambiguous because they are too general. Consider indefinite words such as often, occasionally, regularly, frequently, many, good, fair and poor. Each of these words has many different meanings. For one person, frequently reading Fortune magazine may mean reading six or seven issues a year; for another it may mean two issues a year. The word fair has a great variety of meanings; the same is true for many other indefinite words. Questions such as the following one, used in a study measuring the reactions of consumers to a television boycott, should be interpreted with care:

Please indicate the statement that best describes your family’s television viewing during the boycott of Channel 7.

❏ We did not watch any television programs on Channel 7.
❏ We watched hardly any television programs on Channel 7.
❏ We occasionally watched television programs on Channel 7.
❏ We frequently watched television programs on Channel 7.

Some marketing scholars have suggested that the rate of diffusion of an innovation is related to the perception of product attributes such as divisibility, which refers to the extent to which the innovation may be tried or tested on a limited scale.12 An empirical attempt to test this theory using semantic differentials was a disaster. Pretesting found that the bipolar adjectives divisible–not divisible were impossible for consumers to understand because they did not have the theory in mind as a frame of reference. A revision of the scale used these bipolar adjectives:

Testable (sample use possible)    — : — : — : — : — : — : — : —    Not testable (sample use not possible)

However, the question remained ambiguous because its meaning was still unclear to respondents. A brewing industry study on point-of-purchase advertising (store displays) asked:

What degree of durability do you prefer in your point-of-purchase advertising?

❏ Permanent (lasting more than 6 months)
❏ Semipermanent (lasting from 1 to 6 months)
❏ Temporary (lasting less than 1 month)

Here the researchers clarified the terms permanent, semipermanent and temporary by defining them for the respondent. However, the question remained somewhat ambiguous. Beer marketers often use a variety of point-of-purchase devices to serve different purposes – in this case, what is the purpose? Furthermore, analysis was difficult because respondents were merely asked to indicate a preference rather than a degree of preference. Thus, the meaning of a question may not be clear because the frame of reference is inadequate for interpreting the context of the question. A student research group asked this question:

What media do you rely on most?



❏ Television     ❏ Radio     ❏ Internet     ❏ Newspapers

This question is ambiguous because it does not ask about the content of the media. 'Rely on most' for what – news, sports, entertainment?

Avoid double-barrelled items

double-barrelled question A question that may induce bias because it covers two issues at once.

A question covering two issues at once is referred to as a double-barrelled question and should always be avoided. Making the mistake of asking two questions rather than one is easy. For example: 'Please indicate your degree of agreement with the following statement: "Wholesalers and retailers are responsible for the high cost of meat."' Which intermediaries are responsible: the wholesalers or the retailers? When multiple questions are asked in one question, the results may be exceedingly difficult to interpret. For example, consider the following question from a magazine survey entitled 'How Do You Feel about Being a Woman?':

Between you and your husband, who does the housework (cleaning, cooking, dishwashing, laundry) over and above that done by any hired help?

❏ I do all of it.
❏ I do almost all of it.
❏ I do over half of it.
❏ We split the work fifty-fifty.
❏ My husband does over half of it.

The answers to this question do not tell us if the wife cooks and the husband washes the dishes. A survey by a consumer-oriented library asked:

Are you satisfied with the present system of handling ‘closed-reserve’ and ‘open-reserve’ readings? (Are enough copies available? Are the required materials ordered promptly? Are the borrowing regulations adequate for students’ use of materials?)

❏ Yes

❏ No

A respondent may feel torn between a 'yes' to one part of the question and a 'no' to another part. The answer to this question does not tell the researcher which problem or combination of problems concerns the library user. Consider this comment about double-barrelled questions:

Generally speaking, it is hard enough to get answers to one idea at a time without complicating the problem by asking what amounts to two questions at once. If two ideas are to be explored, they deserve at least two questions. Since question marks are not rationed, there is little excuse for the needless confusion that results [from] the double-barrelled question.13

Avoid making assumptions

Consider the following question:

Should Myer continue its excellent gift-wrapping program?

❏ Yes

❏ No

This question has a built-in assumption: that people believe the gift-wrapping program is excellent. By answering 'Yes', the respondent implies that the program is, in fact, excellent and that things are fine just as they are; by answering 'No', the implication is that the store should discontinue wrapping gifts. Researchers should not place respondents in that sort of bind by including an implicit assumption in the question.

Another frequent mistake is assuming that the respondent has previously thought about an issue. For example, the following question appeared in a survey concerning Hungry Jack's: 'Do you think Hungry Jack's restaurants should consider changing their name?' It is very unlikely that respondents had thought about this question before being asked it. Most respondents answered the question even though they had no prior opinion concerning the name change. Research that induces people to express attitudes on subjects they do not ordinarily think about is meaningless.


Avoid burdensome questions that may tax the respondent's memory

A simple fact of human life is that people forget. Researchers writing questions about past behaviour or events should recognise that certain questions may make serious demands on the respondent's memory. Writing questions about prior events requires a conscientious attempt to minimise the problems associated with forgetting.

In many situations respondents cannot recall the answer to a question. For example, a telephone survey conducted during the 24-hour period following the airing of the FIFA World Cup might establish whether the respondent watched the World Cup and then ask: 'Do you recall any advertisements on that program?' If the answer is positive, the interviewer might ask: 'What brands were advertised?' These two questions measure unaided recall, because they give the respondent no clue as to the brand of interest.

If the researcher suspects that the respondent may have forgotten the answer to a question, he or she may rewrite the question in an aided-recall format – that is, in a format that provides a clue to help jog the respondent's memory. For instance, the question about an advertised beer in an aided-recall format might be: 'Do you recall whether there was a brand of beer advertised on that program?' Or: 'I am going to read you a list of beer brand names. Can you pick out the name of the beer that was advertised on the program?' While aided recall is not as strong a test of attention or memory as unaided recall, it is less taxing to the respondent's memory.

Telescoping and squishing are two additional consequences of respondents forgetting the exact details of their behaviour. Telescoping occurs when respondents believe that past events happened more recently than they actually did. The opposite effect, squishing, occurs when respondents think that recent events took place longer ago than they really did. A solution to this problem may be to refer to a specific event that is memorable. For example: 'How often have you gone to a sporting event since the last Ashes series in Australia?' Because forgetting tends to increase over time, the question may concern a recent period: 'How often did you watch the History Channel on pay TV last week?' (During the editing stage, the results can be transposed to the appropriate time period.) In situations in which 'I don't know' or 'I can't recall' is a meaningful answer, simply including a 'Don't know' response category may solve the question writer's problem.

By this stage of questionnaire development, it is recommended that researchers follow the guide to avoiding common wording mistakes in Exhibit 9.1.

EXHIBIT 9.1 → AVOIDING COMMON WORDING MISTAKES IN QUESTIONNAIRE DESIGN

Mistakes in wording items – Solutions for better wording
Item too complex – Use fewer and simpler words
Leading the respondent – Pretest
Ambiguous wording – Use concrete terminology
Double-barrelled items – 1 issue = 1 item
Item assumes too much – Know respondents' knowledge level
Taxing respondent's memory – Provide memory prompt or use aided recall


WHAT WENT WRONG?

Questionnaire design features not only influence the validity of data – they can also affect whether the questionnaire gets any data at all. No matter what the interview mode, many respondents give up before finishing and abandon the survey. With snail-mail questionnaires, the number is impossible to determine. However, phone interviews and online surveys allow an assessment not only of how many break-offs occur, but also of when they occur. When designing online surveys, keep the following guidelines in mind to minimise nonresponse problems due to break-offs:
»» Make sure the questionnaire is visually appealing and easy to read. Clutter causes respondents to give up.
»» Don't put too many questions on a single page, or the task looks burdensome and leads respondents to give up.
»» Sensitive questions lead respondents to give up.
»» Respondents give up in the face of long questions.
»» Open-ended questions in a majority closed-ended survey lead respondents to give up.
»» The more sophisticated the sample, the more effectively items capturing greater variance (like sliders and high response rates) and items not containing labels on all response categories can be used.
»» One important element in a pretest is estimating how many people give up without finishing.
Follow these rules and you won't have to give up on Web-based surveys.14

STEP 6: DETERMINE QUESTION SEQUENCE


ONGOING PROJECT

The order of questions, or the question sequence, may serve several functions for the researcher. If the opening questions are interesting, simple to comprehend and easy to answer, respondents' cooperation and involvement can be maintained throughout the questionnaire. Asking easy-to-answer questions teaches respondents their role and builds their confidence – they know that this is a professional researcher and not another salesperson posing as one. If respondents' curiosity is not aroused at the outset, they can become uninterested and terminate the interview. A mail survey among department store buyers drew an extremely poor return rate. A substantial improvement in response rate occurred, however, when researchers added some introductory questions seeking opinions on pending legislation of great importance to these buyers. Respondents completed all the questions, not only those in the opening section.

In their attempt to 'warm up' respondents towards the questionnaire, student researchers frequently ask demographic or classificatory questions at the beginning. This generally is not advisable, because asking for personal information such as income level or education may embarrass or threaten respondents. It usually is better to ask potentially embarrassing questions at the middle or end of the questionnaire, after rapport has been established between respondent and interviewer.

order bias Bias caused by the influence of earlier questions in a questionnaire or by an answer's position in a set of answers.

Order bias can result from a particular answer's position in a set of answers or from the sequencing of questions. In political elections in which candidates lack high visibility, such as Senate elections in Australia, a candidate's position on the ballot paper can influence the vote they receive, due to 'donkey voting' (disinterested voters ticking the boxes in order). For this reason, each candidate's position on the ballot paper is randomly assigned at each election.

Order bias can also distort survey results. For example, suppose a questionnaire's purpose is to measure levels of awareness of several charitable organisations. If Care Australia is always mentioned first, the Red Cross second and Guide Dogs Australia third, Care Australia may receive an artificially high awareness rating because respondents are prone to yea-saying (by indicating awareness of the first item in the list).

Asking specific questions before asking about broader issues is a common cause of order bias. For example, bias may arise if questions about a specific clothing store are asked before those concerning the general criteria for selecting a clothing shop. Suppose a respondent indicates in the first portion of a questionnaire that he shops at a store where parking needs to be improved. Later in the questionnaire, to avoid appearing inconsistent, he may state that parking is less important than he really believes it is. Specific questions may thus influence the more general ones. Therefore, it is advisable to ask general questions before specific questions to obtain the freest of open-ended responses.

funnel technique Asking general questions before specific questions in order to obtain unbiased responses.

This procedure, known as the funnel technique, allows the researcher to understand the respondent's frame of reference before asking more specific questions about the level of the respondent's information and the intensity of his or her opinions. Consider how later answers might be biased by previous questions in this questionnaire on environmental pollution:

Circle the number on the following table that best expresses your feelings about the severity of each environmental problem:

Problem (1 = Not a problem, 5 = Very severe problem)
Air pollution from motor vehicle exhausts – 1 2 3 4 5
Air pollution from open air burning – 1 2 3 4 5
Air pollution from industrial smoke – 1 2 3 4 5
Air pollution from foul odours – 1 2 3 4 5
Noise pollution from planes – 1 2 3 4 5
Noise pollution from cars, trucks and motorcycles – 1 2 3 4 5
Noise pollution from industry – 1 2 3 4 5

Not surprisingly, researchers found that the responses to the air pollution questions were highly correlated – in fact, almost identical.

With attitude scales, there also may be an anchoring effect. The first concept measured tends to become a comparison point from which subsequent evaluations are made. Randomisation of items on a questionnaire susceptible to the anchoring effect helps minimise order bias.

A related problem is bias caused by the order of alternatives on closed questions. To avoid this problem, the order of these choices should be rotated if producing alternative forms of the questionnaire is possible. However, marketing researchers rarely print alternative questionnaires to eliminate problems resulting from order bias. A more common practice is to pencil in Xs or check marks on printed questionnaires to indicate where the interviewer should start a series of repetitive questions. For example, the interviewer instructions in the following question (shown in parentheses, together with the checked brand) tell the interviewer to 'rotate' brands, starting with the one checked:

I would like to determine how likely you would be to buy certain brands of confectionery in the future. Let’s start with (X’ed brand). (Record below under appropriate brand. Repeat questions for all remaining brands.)


Start here:                   ( ) Mars Bar    (X) Cherry Ripe    ( ) Snickers
Definitely would buy              21               21                21
Probably would buy                22               22                22
Might or might not buy            23               23                23
Probably would not buy            24               24                24
Definitely would not buy          25               25                25


One advantage of Internet surveys is the ability to reduce order bias by having the computer randomly order questions and/or response alternatives. With complete randomisation, question order is random and respondents see response alternatives in different random positions.

Provide good survey flow

Survey flow refers to the ordering of questions through a survey. Good survey flow makes a questionnaire easy for the respondent to follow, and so increases response quality. As discussed, the funnel technique should be followed, with more sensitive questions asked towards the end of the survey. It is also important that respondents are given clear directions as to why they are being asked certain groups of questions and, where possible, are not asked questions that are irrelevant to them.

Asking a question that does not apply to the respondent, or that the respondent is not qualified to answer, may be irritating or cause a biased response because the respondent wishes to please the interviewer or to avoid embarrassment. Including a filter question minimises the chance of asking questions that are inapplicable. Asking 'What kind of problems have you had with your mobile phone provider?' may elicit a response even though the respondent has never had a problem with their mobile phone; he or she may wish to please the interviewer with an answer. A filter question such as 'Do you ever have a problem with your mobile phone provider? – Yes – No' would screen out the people who are not qualified to answer.

Another form of filter question, the branch question, can be used to obtain income information and other data that respondents may be reluctant to provide. For example:
→→ 'Is your total family income over or under $50 000?' IF UNDER, ASK: 'Is it over or under $25 000?' IF OVER, ASK: 'Is it over or under $75 000?'
→→ Under $25 000
→→ $25 001–$50 000
→→ $50 001–$75 000
→→ Over $75 000

filter question A question that screens out respondents who are not qualified to answer a second question.

branch question A filter question used to determine which version of a second question will be asked.

Exhibit 9.2 gives an example of a flowchart plan for a questionnaire. Structuring the order of the questions so that they are logical will help to ensure the respondent's cooperation and eliminate confusion or indecision. The researcher maintains legitimacy by making sure that the respondent can comprehend the relationship between a given question (or section of the questionnaire) and the overall purpose of the study. Furthermore, a logical order may aid the individual's memory. Transitional comments explaining the logic of the questionnaire may ensure that the respondent continues. Here are two examples:
→→ We have been talking so far about general shopping habits in this city. Now I'd like you to compare two types of grocery stores – regular supermarkets and grocery departments in wholesale outlets.
→→ So that I can combine your answers with those of other farmers who are similar to you, I need some personal information about you. Your answers to these questions – as to all of the others you've answered – are confidential, and you will never be identified to anyone without your permission. Thanks for your help so far. If you'll answer the remaining questions, it will help me analyse all your answers.
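Returning to the filter and branch questions discussed above, the sketch below expresses that skip logic in code. It is an illustration only: the ask() helper, function names and flow simply mirror the examples in the text, and a real online survey platform would implement the same branching through its built-in skip-logic rules.

```python
# Minimal sketch of filter and branch logic. ask() stands in for however a
# survey platform presents a question and records the answer; the functions
# and wording are illustrative only.

def ask(question: str) -> str:
    """Placeholder for presenting a question and capturing the response."""
    return input(question + " ").strip().lower()

def income_branch() -> str:
    """Two-step branch question that yields one of four income categories."""
    if ask("Is your total family income over or under $50 000? (over/under)") == "under":
        return ("Under $25 000"
                if ask("Is it over or under $25 000? (over/under)") == "under"
                else "$25 001-$50 000")
    return ("$50 001-$75 000"
            if ask("Is it over or under $75 000? (over/under)") == "under"
            else "Over $75 000")

def mobile_problem_section():
    """Filter question: only qualified respondents see the follow-up."""
    if ask("Do you ever have a problem with your mobile phone provider? (yes/no)").startswith("y"):
        return ask("What kind of problems have you had with your mobile phone provider?")
    return None  # unqualified respondents skip the follow-up entirely
```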

EXHIBIT 9.2 → FLOW OF QUESTIONS TO DETERMINE THE LEVEL OF PROMPTING REQUIRED TO STIMULATE RECALL15

1 Were you watching television last night between 7.00 and 10.00 pm?
  • No → Tabulate and terminate.
  • Yes → go to 2.
2 When you were watching last night, did you see a commercial for a brand of orange breakfast drink?
  • Yes → What brand?
    – Tang → Classify: Category prompt, Unaided recaller → go to 5.
    – Not Tang → What other brand?
      – Tang → Classify: Category prompt, Unaided recaller → go to 5.
      – Not Tang → go to 3.
  • No → go to 3.
3 Do you remember seeing a commercial last night for Tang?
  • Yes → Classify: Brand prompt, Aided recaller → go to 5.
  • No → go to 4.
4 Series of questions on details of the show's dramatic content, ratings of the show's quality, and overall television viewing frequency; then: Do you recall a scene when Ben's landlady went back to bed and watched television after he told her he would go to the housing authority because she wanted to increase his rent to repair the ceiling in his apartment? Right after this, there was a commercial for Tang showing (then describe the general content of the commercial being tested). Do you recall seeing this commercial?
  • Yes → Classify: Commercial prompt, Prompted recaller → go to 5.
  • No → Classification questions on product usage, television viewing habits, household characteristics, etc.; terminate interview.
5 Seven questions regarding the content and persuasiveness of the Tang commercial being tested.


STEP 7: DETERMINE PHYSICAL CHARACTERISTICS OF THE QUESTIONNAIRE


ONGOING PROJECT

Good layout and physical attractiveness are crucial in mail, Internet and other self-administered questionnaires. For different reasons it is also important to have a good layout in questionnaires designed for personal and telephone interviews.

Traditional questionnaires

Exhibit 9.3 shows a page from a phone questionnaire. The layout is neat and attractive, and the instructions for the interviewer (all bold-face capital letters) are easy to follow. The responses 'It depends', 'Refused' and 'Don't know' are enclosed in a box to indicate that these answers are acceptable, but responses from the 5-point scale are preferred. Rather than budgeting for an incentive, the money is often better spent on the questionnaire's attractiveness and quality to increase its rate of return.

Mail questionnaires should never be overcrowded. Margins should be of decent size, white space should be used to separate blocks of print, and the unavoidable columns of multiple boxes should be kept to a minimum. A question should not begin on one page and end on another page. Splitting questions may cause a respondent to read only part of a question, to pay less attention to answers on one of the pages, or to become confused. Questionnaires should be designed to appear as short as possible. Sometimes it is advisable to use a booklet form of questionnaire rather than stapling a large number of pages together.

In situations in which it is necessary to conserve space on the questionnaire or to facilitate data entry or tabulation of the data, a multiple-grid layout may be used. The multiple-grid question presents several similar questions and corresponding response alternatives arranged in a grid format. For example:

Airlines often offer special fare promotions. On a holiday trip would you take a connecting flight instead of a nonstop flight if the connecting flight were longer?

                          Yes     No      Not sure
One hour longer?          ❏       ❏       ❏
Two hours longer?         ❏       ❏       ❏
Three hours longer?       ❏       ❏       ❏

Experienced researchers have found that it pays to phrase the title of a questionnaire carefully. In self-administered and mail questionnaires, a carefully constructed title may capture the respondent’s interest, underline the importance of the research (‘Nationwide Study of Blood Donors’), emphasise the interesting nature of the study (‘Study of Internet Usage’), appeal to the respondent’s ego (‘Survey among Top Executives’) or emphasise the confidential nature of the study (‘A Confidential Survey among ...’). The researcher should take steps to ensure that the wording of the title will not bias the respondent in the same way that a leading question might. By using several forms, special instructions and other tricks of the trade, the researcher can design the questionnaire to facilitate the interviewer’s job of following interconnected questions. Exhibits 9.4 and 9.5 illustrate portions of telephone and personal interview questionnaires. Note how the layout and easy-to-follow instructions for interviewers in Questions 1, 2 and 3 of Exhibit 9.4 help the interviewer follow the question sequence.

multiple-grid question Several similar questions arranged in a grid format.


EXHIBIT 9.3 → LAYOUT OF A PAGE FROM A TELEPHONE QUESTIONNAIRE

5. Now I'm going to read you some types of professions. For each one, please tell me whether you think the work that profession does, on balance, has a very positive impact on society, a somewhat positive impact, a somewhat negative impact, a very negative impact, or not much impact either way on society. First … (START AT X'D ITEM. CONTINUE DOWN AND UP THE LIST UNTIL ALL ITEMS HAVE BEEN READ AND RATED.)

Start here: (at the item marked X)

Response codes – Very positive impact: 1; Somewhat positive impact: 2; Somewhat negative impact: 3; Very negative impact: 4; Not much impact: 5; (do not read) It depends: 0; Refused: X; Don't know: Y.

[ ] Members of Federal Parliament ............................ 1 2 3 4 5 0 X Y (24)
[ ] Business executives ...................................... 1 2 3 4 5 0 X Y (25)
[ ] Doctors .................................................. 1 2 3 4 5 0 X Y (26)
[ ] Political pollsters – that is, people who conduct surveys for public officials or political candidates ... 1 2 3 4 5 0 X Y (27)
[ ] Researchers in the media – that is, people in media such as television, newspapers, magazines and radio who conduct surveys about issues later reported in the media ... 1 2 3 4 5 0 X Y (28)
[ ] Telemarketers – that is, people who sell products or services over the phone ... 1 2 3 4 5 0 X Y (29)
[ ] Used car salesmen ........................................ 1 2 3 4 5 0 X Y (30)
[ ] Market researchers – that is, people who work for commercial research firms who conduct surveys to see what the public thinks about certain kinds of consumer products or services ... 1 2 3 4 5 0 X Y (31)
[ ] Biomedical researchers ................................... 1 2 3 4 5 0 X Y (32)
[ ] Public-opinion researchers – that is, people who work for commercial research firms who conduct surveys to see what the public thinks about important social issues ... 1 2 3 4 5 0 X Y (33)
[ ] University academics ..................................... 1 2 3 4 5 0 X Y (34)
[ ] Lawyers .................................................. 1 2 3 4 5 0 X Y (35)
[ ] Members of the clergy .................................... 1 2 3 4 5 0 X Y (36)
[ ] Journalists .............................................. 1 2 3 4 5 0 X Y (37)


EXHIBIT 9.4 → TELEPHONE QUESTIONNAIRE WITH SKIP QUESTIONS

1 Did you take the car you had checked to the Standard Auto Repair Centre for repairs?
   ❏ 1 Yes (SKIP TO Q. 3)    ❏ 2 No

2 (If no, ask:) Did you have the repair work done?
   ❏ 1 Yes    ❏ 2 No
   (If yes:) 1 Where was the repair work done?
             2 Why didn't you have the repair work done at the Standard Auto Repair Centre?
   (If no:)  1 Why didn't you have the car repaired?

3 (If yes to Q. 1, ask:) How satisfied were you with the repair work? Were you …
   ❏ 1 Very satisfied   ❏ 2 Somewhat satisfied   ❏ 3 Somewhat dissatisfied   ❏ 4 Very dissatisfied
   (If somewhat or very dissatisfied:) In what way were you dissatisfied?

4 (Ask everyone:) Do you ever buy petrol at the Main Road Standard Centre?
   ❏ 1 Yes    ❏ 2 No (SKIP TO Q. 6)

5 (If yes, ask:) How often do you buy petrol there?
   ❏ 1 Always   ❏ 2 Almost always   ❏ 3 Most of the time   ❏ 4 Part of the time   ❏ 5 Hardly ever

6 Have you ever had your car washed there?
   ❏ 1 Yes    ❏ 2 No

7 Have you ever had an oil change or lubrication done there?
   ❏ 1 Yes    ❏ 2 No

Instructions are often capitalised or in bold to alert the interviewer that it may be necessary to proceed in a certain way. For example, if a particular answer is given, the interviewer or respondent may be instructed to skip certain questions or go to a special sequence of questions. Note that Questions 3 and 6 in Exhibit 9.5 instruct the interviewer to hand the respondent a card bearing a list of alternatives. Cards may help respondents grasp the intended meaning of the question and remember all the brand names or other items they are being asked about. Also, Questions 2, 3 and 5 in Exhibit 9.5 instruct the interviewer that rating of the banks will start with the bank that has been checked in red pencil on the printed questionnaire. The name of the red-checked bank is not the same on every questionnaire.


By rotating the order of the check marks, the researchers attempted to reduce order bias caused by respondents’ tendency to react more favourably to the first set of questions. To facilitate coding, question responses should be precoded when possible, as in Exhibit 9.5.
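The same rotation idea is easy to implement in software. The short Python sketch below is illustrative only (the function and variable names are not from the text): it rotates the starting bank across respondents so that no single bank is always rated first, the electronic equivalent of the red-checked starting item.

import random

BANKS = ["National Australia Bank", "Westpac", "Commonwealth Bank",
         "ANZ Bank", "BankWest", "HSBC Australia"]

def rotated_order(items, respondent_index):
    """Start the list at a different position for each respondent, so no
    single item is always rated first."""
    start = respondent_index % len(items)
    return items[start:] + items[:start]

def random_start(items):
    """Alternative: pick the starting item at random for each questionnaire."""
    start = random.randrange(len(items))
    return items[start:] + items[:start]

# Respondent 0 starts with NAB, respondent 1 with Westpac, and so on.
for r in range(3):
    print(r, rotated_order(BANKS, r))

A simple modulo rotation guarantees each item leads the list equally often across a run of respondents, whereas a purely random start only does so on average.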

EXHIBIT 9.5 → PERSONAL INTERVIEW QUESTIONNAIRE16

'Hello, my name is ________. I'm a Public Opinion Interviewer with Research Services, Inc. We're making an opinion survey about banks and banking, and I'd like to ask you …'

1 What are the names of local banks you can think of offhand? (INTERVIEWER: List names in order mentioned.)
   a ____  b ____  c ____  d ____  e ____  f ____  g ____

2 Thinking now about the experiences you have had with the different banks here in Blackheath, have you ever talked to or done business with … (INTERVIEWER: Insert name of bank checked in red below.)?
   a Are you personally acquainted with any of the employees or officers at ________?
   b (If YES) Who is that?
   c How long has it been since you have been inside ________?
   (INTERVIEWER: Now go back and repeat 2–2c for all other banks listed.)
   Recording grid (one row per bank: National Australia Bank, Westpac, Commonwealth Bank, ANZ Bank, BankWest, HSBC Australia), with precoded columns:
   (2) Talked to or done business with: 1 2 3 4 5
   (2a and 2b) Know employee or officer: Yes 1 / No 2 (record name)
   (2c) Been in bank in: Last year / 1–5 / 5-plus / No / DK

3 (HAND BANK RATING CARD) On this card there are a number of contrasting phrases or statements – for example, 'Large' and 'Small'. We'd like to know how you rate (NAME OF BANK CHECKED IN RED BELOW) in terms of these statements or phrases. Just for example, let's use the terms 'fast service' and 'slow service'. If you were to rate a bank #1 on this scale, it would mean you find their service 'very fast'. On the other hand, a 7 rating would indicate you feel their service is 'very slow', whereas a 4 rating means you don't think of them as being either 'very fast' or 'very slow'. Are you ready to go ahead? Good! Tell me then how you would rate (NAME OF BANK CHECKED IN RED) in terms of each of the phrases or statements on that card. How about (READ NEXT BANK NAME)? … (INTERVIEWER: Continue on until respondent has evaluated all six banks.)
   Rating grid (one column per bank: National Australia Bank, Westpac, ANZ Bank, BankWest, HSBC Australia):
   a Service   b Size   c Business vs Family   d Friendliness   e Big/Small Business   f Rate of Growth   g Modernness
   h Leadership   i Loan Ease   j Location   k Hours   l Ownership   m Community Involvement

4 Suppose a friend of yours who has just moved to Perth asked you to recommend a bank. Which local bank would you recommend? Why would you recommend that particular bank?
   National Australia Bank 1   Westpac Bank 2   Commonwealth Bank 3   ANZ Bank 4   BankWest 5   Hongkong Shanghai Bank 6   Other (Specify) ____   Don't Know (DK)/Wouldn't 9

5 Which of the local banks do you think of as: (INTERVIEWER: Read red-checked item first, then read each of the other five.)
   … the newcomer's bank?
   … the student's bank?
   … the Personal Banker bank?
   … the bank where most faculty and staff bank?
   … the bank most interested in this community?
   … the most progressive bank?

6 Which of these financial institutions, if any, (HAND CARD 2) are you or any member of your immediate family who lives here in this home doing business with now?
   Bank 1   Credit Union 2   Finance Company 3   Building Society 4   Merchant Bank 5   None of these 6   Don't Know (DK)/Not Sure 7
   (IF NONE, Skip to 19.)

7 If a friend asked you to recommend a place where he or she could get a loan with which to buy a home, which financial institution would you probably recommend? (INTERVIEWER: Probe for specific name.) Why would you recommend (INSTITUTION NAMED)?
   Would Recommend: ________   Wouldn't 0   Don't Know (DK)/Not Sure 9


EXHIBIT 9.6 →    EXAMPLE OF A SKIP QUESTION

1 If you had to buy a computer tomorrow, which of the following three types of computers do you think you would buy?
   1 Laptop/Netbook   2 Palm-sized (PDA)   3 Wearable technology
2 (If 'wearable technology' on Q. 1, ask:) What brand do you think you would buy?
3 What is your age?

Exhibit 9.6 illustrates a series of questions that includes a skip question. Either skip instructions or an arrow drawn pointing to the next question informs the respondent which question comes next. Layout is extremely important when questionnaires are long or require the respondent to fill in a large amount of information. In many circumstances using headings or subtitles to indicate groups of questions will help the respondent grasp the scope or nature of the questions to be asked. Thus, at a glance, the respondent can follow the logic of the questionnaire.

Internet questionnaires

Layout is also an important issue for questionnaires appearing on the Internet. A questionnaire on a website should be easy to use, flow logically and have a graphic look and overall feel that motivate the respondent to cooperate from start to finish. Many of the guidelines for layout of paper questionnaires apply to Internet questionnaires. There are, however, some important differences. With graphical user interface (GUI) software, the researcher can exercise control over the background, colours, fonts and other visual features displayed on the computer screen to create an attractive and easy-to-use interface between the computer user and the Internet survey. GUI software allows the researcher to design questionnaires in which respondents click on the appropriate answer rather than having to type answers or codes. Researchers often use Web publishing software, such as Qualtrics and Survey Monkey, to format a questionnaire so that they will know how it should appear online. However, several features of a respondent's computer may influence the appearance of an Internet questionnaire. For example, discrepancies between the designer's browser and computer type and the respondent's device (such as a PC, mobile or tablet) and browser may result in questions not being fully visible on the respondent's screen, misaligned text or other visual problems.17 The possibility that the questionnaire the researcher/designer constructs on his or her computer may look different from the questionnaire that appears on the respondent's computer should always be considered when designing Internet surveys. One sophisticated remedy is to use the first few questions on an Internet survey to ask about operating system, browser software and other computer configuration issues so that the questionnaire that is delivered is as compatible as possible with the respondent's computer. A simpler solution is to limit the horizontal width of the questions to 70 characters or fewer, to decrease the likelihood of wrap-around text.
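As a rough, hypothetical illustration of the 70-character guideline, question wording could be screened before the survey is published. The Python sketch below (names are illustrative, not from the text) flags questions whose lines exceed the limit and shows how they would wrap.

import textwrap

MAX_WIDTH = 70  # characters, per the guideline above

def check_question(text):
    """Return whether any line exceeds MAX_WIDTH, plus a wrapped version."""
    too_long = any(len(line) > MAX_WIDTH for line in text.splitlines())
    wrapped = textwrap.fill(text, width=MAX_WIDTH)
    return too_long, wrapped

flag, wrapped = check_question(
    "Airlines often offer special fare promotions. On a holiday trip, "
    "would you take a connecting flight instead of a nonstop flight "
    "if the connecting flight were one hour longer?")
print("needs wrapping:", flag)
print(wrapped)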

LAYOUT ISSUES

Even if the questionnaire designer's computer and the respondents' devices are compatible, there are several layout issues a Web questionnaire designer should consider. The first decision is whether the questionnaire will appear page by page, with individual questions on separate screens (Web pages), or on a scrolling basis, with the entire questionnaire appearing on a single Web page that the respondent scrolls


from top to bottom. The paging layout (going from screen to screen) greatly facilitates skip patterns. Based on a respondent’s answers to filter questions, the computer can automatically insert relevant questions on subsequent pages. If the entire questionnaire appears on one page (the scrolling layout), the display should advance smoothly, as if it were a piece of paper being moved up or down. The scrolling layout gives the respondent the ability to read any portion of the questionnaire at any time, but the absence of page boundaries can cause problems. For example, suppose a Likert scale consists of 15 statements in a grid-format layout, with the response categories Strongly Agree, Agree, Disagree and Strongly Disagree at the beginning of the questionnaire. (See Exhibit 9.7.) Once the respondent has scrolled down beyond the first few statements, he or she may not be able to see both the statements at the end of the list and the response categories at the top of the grid simultaneously. Thus, avoiding the problems associated with splitting questions and response categories may be difficult with scrolling questionnaires.

← EXHIBIT 9.7  AN EXAMPLE OF A GRID LAYOUT OF QUESTIONS

THINKING ABOUT YOUR ATTITUDES TO YOUR CURRENT MOBILE PHONE SERVICE PROVIDER, WOULD YOU SAY THAT:
(Response columns run from 'Strongly disagree' through 'Neither agree nor disagree' to 'Strongly agree'.)
• I use the services of this provider because it is the best choice for me
• To me, service quality this provider offers is higher than the service quality of other service providers
• I have grown to like this provider in this category
• This service provider is my preferred provider in this category
• I have used the services of this service provider less frequently than before
• I have switched to a competitor of the service organisation
• I will not use services of this organisation any more in the future

When a scrolling questionnaire is long, category or section headings are helpful to respondents. It is also a good idea to provide links to the top and bottom parts of each section, so that users can navigate through the questionnaire without having to scroll through the entire document.18 Whether a Web survey is page-by-page or scrolling format, push buttons with labels should clearly describe the actions to be taken. For example, if the respondent is to go to the next page, a large arrow labelled 'NEXT' might appear in colour at the bottom of the screen. Decisions must be made about the use of colour, graphics, animation, sound and other special features that the Internet makes possible. One thing to remember is that, although sophisticated graphics are not a problem for people with very powerful computers, some respondents' devices are not powerful enough to deliver complex graphics at a satisfactory speed – if at all. A textured background, coloured headings and small graphics can make a questionnaire more interesting and appealing, but they may present problems for respondents with older computers and/or low-bandwidth Internet connections. With a paper questionnaire, the respondent knows how many questions he or she must answer. Because many Internet surveys offer no visual clues about the number of questions to be asked, it is important to provide a status bar or some other visual indicator of questionnaire length. For example, including a partially filled rectangular box as a visual symbol and a statement (such as 'The status bar at top right indicates approximately what portion of the survey you have completed') increases the likelihood that the respondent will finish the entire sequence of questions.

push button  A small outlined area on a dialogue box, such as a rectangle or an arrow, that the respondent clicks on to select an option or perform a function (such as Submit) on an Internet questionnaire.
status bar  In an Internet questionnaire, a visual indicator that tells the respondent what portion of the survey he or she has completed.
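A status bar is simple to compute from the number of questions answered so far. The following Python sketch is a minimal illustration (not from the text); with skip logic, 'total' would be the number of questions this particular respondent is actually expected to see.

def progress_percent(answered, total):
    """Approximate portion of the survey completed, for a status bar."""
    if total == 0:
        return 0
    return round(100 * answered / total)

def status_bar(answered, total, width=20):
    """Render a simple text status bar, e.g. [########------------] 40%."""
    pct = progress_percent(answered, total)
    filled = round(width * pct / 100)
    return "[" + "#" * filled + "-" * (width - filled) + "] " + str(pct) + "%"

print(status_bar(8, 20))   # [########------------] 40%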


An Internet questionnaire uses windows known as dialogue boxes to display questions and record answers. Exhibit 9.8 portrays four common ways of displaying questions on a computer screen. Many Internet questionnaires require the respondent to activate his or her answer by clicking on the radio button for a response. Radio buttons work like push buttons on car radios: clicking on an alternative response deactivates the first choice and replaces it with the new response. A drop-down box, such as the one shown in Exhibit 9.8, is a space-saving device that allows the researcher to provide a list of responses that are hidden from view until they are needed. A general statement, such as 'Please select' or 'Click here', is shown initially. Clicking on the downward-facing arrow makes the full range of choices appear. If the first choice in a list, such as 'Strongly Agree', is shown while the other responses are kept hidden, the chance that response bias will occur is increased. Drop-down boxes may present a problem for individuals with minimal computer skills, as they may not know how to reveal hidden responses behind a drop-down menu or how to move from one option to another in a moving-bar menu.

EXHIBIT 9.8 → ALTERNATIVE WAYS OF DISPLAYING INTERNET QUESTIONS: RADIO BUTTONS, CHECKLISTS AND OPEN-ENDED BOXES

radio button  In an Internet questionnaire, a circular icon, resembling a button, that activates one response choice and deactivates others when a respondent clicks on it.
drop-down box  In an Internet questionnaire, a space-saving device that reveals responses when they are needed but otherwise hides them from view.
check box  In an Internet questionnaire, a small graphic box, next to an answer, that a respondent clicks on to choose that answer; typically, a tick or an X appears in the box when the respondent clicks on it.
open-ended box  In an Internet questionnaire, a box where respondents can type in their own answers to open-ended questions.
pop-up boxes  In an Internet questionnaire, boxes that appear at selected points and contain information or instructions for respondents.

Checklist questions may be followed by check boxes, several, none or all of which may be checked or ticked by the respondent. Open-ended boxes are boxes in which respondents type their answers to open-ended questions. Open-ended boxes may be designed as one-line text boxes or scrolling text boxes, depending on the breadth of the expected answer. Of course, open-ended questions require that respondents have both the skill and the willingness to keyboard lengthy answers on the computer. Some open-ended boxes are designed so that respondents can enter numbers for frequency response, ranking or rating questions. For example: Below you will see a series of statements that might or might not describe how you feel about your career. Please rate each statement using a scale from 1 to 4, where 4 means ‘Totally agree’, 3 means ‘Somewhat agree’, 2 means ‘Somewhat disagree’ and 1 means ‘Totally disagree’. Please enter your numeric answer in the box provided next to each statement. Would you say that ...?

A lack of business knowledge relevant to my field/career could hurt my career advancement.



My career life is an important part of how I define myself.

Pop-up boxes are message boxes that can be used to highlight important information.

For example, pop-up boxes may be used to provide a privacy statement, such as the following:

CHAPTER 09 > QUESTIONNAIRE DESIGN

329

IBM would like your help in making our website easier to use and more effective. Choose to complete the survey now or not at all.



Complete

 No thank you

Privacy statement

Clicking on Privacy Statement opens the following pop-up box:

Survey privacy statement This overall Privacy Statement verifies that IBM is a member of the TRUSTe program and is in compliance with TRUSTe principles. This survey is strictly for market research purposes. The information you provide will be used only to improve the overall content, navigation, and usability of http://ibm.com.

In some cases, respondents can learn more about how to use a particular scale or get a definition of a term by clicking on a link, which generates a pop-up box. One of the most common reasons for using pop-up boxes is error trapping or prompting. Most allow validation of responses; that is, they remind respondents to complete questions, or they can be used to accept only valid responses, such as the four digits required for postcodes in Australia. This can greatly increase completion rates for online questionnaires. See Exhibit 9.9.

← EXHIBIT 9.9  ILLUSTRATION OF PROMPT AND STATUS BAR
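Error trapping of this kind usually amounts to small validation rules attached to each question. The Python sketch below is a minimal illustration, assuming a four-digit Australian postcode rule and a 'question left blank' prompt like the one described later in this chapter; the function names are hypothetical, not from any particular survey platform.

import re

def validate_postcode(value):
    """Accept only a four-digit Australian postcode; return an error
    prompt (as a pop-up box would) when the entry is invalid."""
    if re.fullmatch(r"\d{4}", value.strip()):
        return None  # valid, no prompt needed
    return "Please enter a four-digit postcode, e.g. 2000."

def validate_required(value):
    """Remind the respondent to complete a question left blank."""
    if value.strip():
        return None
    return ("You cannot leave a question blank. Please choose the response "
            "that best represents your opinion or experience.")

for entry in ["3000", "30O0"]:
    print(repr(entry), "->", validate_postcode(entry) or "OK")
print(repr(""), "->", validate_required(""))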

Chapter 8 described graphic rating scales, which present respondents with a graphic continuum. On the Internet, researchers can take advantage of scroll bars or other GUI software features to make these scales easy to use. For example, the graphic continuum may be drawn as a measuring rod with a plus sign on one end and a minus sign on the other. The respondent then moves a small rectangle back and forth between the two ends of the scale to scroll to any point on the continuum. Scoring, as discussed in Chapter 8, is in terms of some measure of the length (millimetres) from one end of the graphic continuum to the point marked by the respondent.
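Scoring an on-screen graphic rating scale is a matter of converting the handle's position on the track into a number, just as a ruler converts a pencil mark into millimetres. A minimal Python sketch, assuming an illustrative 0–100 scale and pixel-based positions:

def slider_score(position_px, track_length_px, scale_max=100):
    """Convert the pixel position of a slider handle on a graphic rating
    scale into a score, analogous to measuring millimetres from the
    minus end of a paper scale."""
    position_px = max(0, min(position_px, track_length_px))  # clamp to the track
    return round(scale_max * position_px / track_length_px, 1)

print(slider_score(150, 300))   # handle at the midpoint -> 50.0
print(slider_score(300, 300))   # plus end -> 100.0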


Online surveys can also record how long respondents take to answer the survey as a whole and each individual question. Timing can provide behavioural data and help identify types of response error. For instance, if respondents spend less time on a page of the survey than it would take to read the question, they are not paying close attention, which raises doubt about their responses. Similarly, if the total time spent responding is too fast, the respondent may be marking responses randomly or without much thought. Respondents who take one and a half standard deviations less time than average appear to be particularly likely to be offering 'satisficing' responses, leading to high response error.19 Finally, it is a good idea to include a customised thank-you page at the end of an Internet questionnaire, so that a brief acknowledgement pops onto respondents' screens when they click on the Submit button.20
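The 'one and a half standard deviations faster than average' screen described above can be applied directly to recorded completion times. A minimal Python sketch, with made-up times in seconds purely for illustration:

from statistics import mean, stdev

def flag_speeders(completion_times, k=1.5):
    """Flag respondents whose total completion time is more than k standard
    deviations below the mean - a rough screen for satisficing responses."""
    m, s = mean(completion_times), stdev(completion_times)
    cutoff = m - k * s
    return [i for i, t in enumerate(completion_times) if t < cutoff]

times = [620, 540, 580, 610, 95, 560, 600, 130, 575, 590]  # seconds
print(flag_speeders(times))  # -> [4, 7], the suspiciously fast respondents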

SOFTWARE THAT MAKES QUESTIONNAIRES INTERACTIVE

Computer code can be written to make Internet questionnaires interactive and less prone to errors. The writing of software programs is beyond the scope of this discussion. However, several of the interactive functions that software makes possible should be mentioned here. As discussed in Chapter 5, Internet software allows the branching off of questioning into two or more different lines, depending on a respondent's particular answer, and the skipping or filtering of questions. Questionnaire-writing software with Boolean skip and branching logic is readily available. Most of these programs have hidden skip logic so that respondents never see any evidence of skips. It is best if the questions the respondent sees flow in numerical sequence.21 However, some programs number all potential questions in numerical order, and respondents see only the numbers on the questions they answer. Thus, a respondent may answer questions 1 through 11 and then next see a question numbered 15 because of the skip logic. Software can systematically or randomly manipulate the questions a respondent sees. Variable piping software allows variables, such as answers from previous questions, to be inserted into unfolding questions. Other software can randomly rotate the order of questions, blocks of questions and response alternatives from respondent to respondent. This means it is possible to do online experiments by randomly displaying blocks corresponding to treatments, and to account for random order bias of questions.

Researchers can use software to control the flow of a questionnaire. Respondents can be blocked from backing up, or they can be allowed to stop mid-questionnaire and come back later to finish. A questionnaire can be designed so that if the respondent fails to answer a question or answers it with an incorrect type of response, an immediate error message appears. This is called error trapping. With forced answering software, respondents cannot skip over questions as they do in mail surveys. The program will not let them continue if they fail to answer a question.22 The software may insert a bold-faced error message on the question screen or insert a pop-up box instructing the respondent how to continue. For example, if a respondent does not answer a question and tries to proceed to another screen, a pop-up box might present the following message: 'You cannot leave a question blank. On questions without a "Not sure" or "Decline to answer" option, please choose the response that best represents your opinions or experiences.' The respondent must close the pop-up box and answer the question in order to proceed to the next screen. Some designers include interactive help desks in their Web questionnaires, so that respondents can solve problems they encounter in completing a questionnaire. A respondent might email questions to the survey help desk or get live, interactive, real-time support via an online help desk.

variable piping software  Software that allows variables to be inserted into an Internet questionnaire as a respondent is completing it.
error trapping  Using software to control the flow of an Internet questionnaire – for example, to prevent respondents from backing up or failing to answer a question.
forced answering software  Software that prevents respondents from continuing with an Internet questionnaire if they fail to answer a question.
interactive help desk  In an Internet questionnaire, a live, real-time support feature that solves problems or answers questions respondents may encounter in completing the questionnaire.
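To make the ideas of hidden skip logic and variable piping concrete, the Python sketch below simulates a tiny branching flow. It is illustrative only: the questions loosely echo Exhibit 9.4, and get_answer is a stand-in for however a real survey platform collects responses.

def branching_flow(get_answer):
    """A sketch of hidden skip logic and variable piping."""
    answers = {}
    answers["uses_centre"] = get_answer(
        "Do you ever buy petrol at the Main Road centre? (yes/no)")
    if answers["uses_centre"].lower().startswith("y"):
        answers["frequency"] = get_answer("How often do you buy petrol there?")
        # Variable piping: the earlier answer is inserted into later wording.
        answers["reason"] = get_answer(
            f"You said you buy petrol there {answers['frequency']}. Why is that?")
    # Respondents who answered 'no' never see the follow-ups (a hidden skip).
    return answers

# Example run with canned answers instead of a live respondent:
canned = iter(["yes", "most of the time", "it is on my way to work"])
print(branching_flow(lambda prompt: next(canned)))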


Some respondents will leave the questionnaire website, prematurely terminating the survey. In many cases, sending an email message to these respondents at a later date, encouraging them to revisit the website, may persuade them to complete the questionnaire. Through the use of software and cookies, researchers can make sure that the respondent who revisits the website will be able to pick up at the point where he or she left off. Once an Internet questionnaire has been designed, it is important to pretest it to ensure that it works with Internet Explorer, Firefox, Chrome and other browsers. Some general-purpose programming languages, such as Java, do not always work with all browsers. Because different browsers have different peculiarities, a survey that works perfectly well with one may not function at all with another.23
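Letting a respondent pick up where he or she left off only requires saving partial answers against some identifier, typically carried in a cookie or a personalised re-invitation link. A minimal Python sketch with illustrative names (the in-memory dictionary here stands in for a server-side database):

import uuid

SAVED = {}  # token -> {"answers": {...}, "next_question": int}

def save_progress(token, answers, next_question):
    SAVED[token] = {"answers": dict(answers), "next_question": next_question}

def resume(token):
    """Return saved answers and the question to continue from, or start fresh."""
    state = SAVED.get(token)
    if state is None:
        return {}, 0
    return state["answers"], state["next_question"]

token = str(uuid.uuid4())                 # issued when the survey is first opened
save_progress(token, {"q1": "yes", "q2": "weekly"}, next_question=3)
print(resume(token))                      # a later visit picks up at question 3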

STEP 8: RE-EXAMINE AND REVISE STEPS 1–7 IF NECESSARY

ONGOING PROJECT

Many novelists write, rewrite, revise and rewrite again certain chapters, paragraphs or even sentences. The researcher works in a similar world. Rarely does he or she write only a first draft of a questionnaire. Usually the questionnaire is tried out on a group – selected on a convenience basis – that is similar in make-up to the one that ultimately will be sampled. Although the researcher should not select a group too divergent from the target market (for example, selecting business students as surrogates for businesspeople), pretesting does not require a statistical sample. The pretesting process allows the researcher to determine whether respondents have any difficulty understanding the questionnaire and whether there are any ambiguous or biased questions. This process is exceedingly beneficial. Catching a mistake with 25 or 50 pretest subjects can avoid the potential disaster of administering an invalid questionnaire to several hundred individuals. Tabulating the results of a pretest helps determine whether the questionnaire will meet the objectives of the research. A preliminary tabulation often illustrates that, although respondents can easily comprehend and answer a given question, the question is inappropriate because it does not provide relevant information to help solve the marketing problem. Consider the following example from a survey among distributors of powder-actuated tools, such as stud drivers, concerning the percentage of sales to given industries:

Please estimate what percentage of your fastener and load sales go to the following industries:
   ____ % heating, plumbing and air-conditioning
   ____ % carpentry
   ____ % electrical
   ____ % maintenance
   ____ % other (please specify)

The researchers were fortunate to learn that asking the question in this manner made it virtually impossible to obtain the information actually desired. Most respondents' answers did not total 100 per cent, and the question had to be revised. Getting respondents to add everything correctly is a problem. Notice how the questions in Exhibit 9.10, from a survey on secretarial support, are designed to mitigate this problem. Pretesting difficult questions such as these is essential.

preliminary tabulation A tabulation of the results of a pretest to help determine whether the questionnaire will meet the objectives of the research.


EXHIBIT 9.10 → MITIGATING A RESPONSE PROBLEM WITH QUESTIONNAIRE DESIGN24

5. Of your work that is typed on a word processor, what percentage consists of … (Your answers should equal 100%.)
   A. Memos and short letters   ____ %
   B. Reports (3+ pages)        ____ %
   C. Other                     ____ %
   Check your responses to A, B and C. They must equal 100%.
6. Estimate how many hours in an average week you spend making copies. Include time spent walking to and from the copier, waiting to use it, and actually making copies.   ____ hours
   The next three responses must equal 100%. Estimate what percentage of your total copying time is spent:
7. Walking to and from the copier   ____ %
8. Waiting to use the copier        ____ %
9. Making copies                    ____ %
   Check your responses to 7, 8 and 9. They must equal 100%.
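The 'must equal 100%' checks in Exhibit 9.10 are also easy to automate in an online pretest or during preliminary tabulation. A minimal Python sketch, with illustrative names and made-up answers:

def check_percentages(responses, expected_total=100, tolerance=0):
    """Pretest/tabulation check: do the percentage answers add to 100?"""
    total = sum(responses.values())
    return abs(total - expected_total) <= tolerance, total

answers = {"heating, plumbing and air-conditioning": 40, "carpentry": 25,
           "electrical": 20, "maintenance": 10, "other": 10}
ok, total = check_percentages(answers)
print(ok, total)   # False, 105 -> prompt the respondent (or flag the case)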

ONGOING PROJECT


STEP 9: PRETEST THE QUESTIONNAIRE

What administrative procedures should be implemented to maximise the value of a pretest? Administering a questionnaire exactly as planned in the actual study often is not possible; for example, mailing out a questionnaire might require several weeks. Pretesting a questionnaire in this manner would provide important information on response rate, but it might not point out why questions were skipped or why respondents found certain questions ambiguous or confusing. Personal interviewers can record requests for additional explanation or comments that indicate respondents' difficulty with question sequence or other factors. This is the primary reason why interviewers are often used for pretest work. Self-administered questionnaires are not reworded to become personal interviews, but interviewers are instructed to observe respondents and ask for their comments after they complete the questionnaire. When pretesting personal or telephone interviews, interviewers may test alternative wordings and question sequences to determine which format best suits the intended respondents. No matter how the pretest is conducted, the researcher should remember that its purpose is to uncover any problems that the questionnaire may cause. Thus, pretests typically are conducted to answer questions about the questionnaire such as the following:
→ Can the questionnaire format be followed by the interviewer?
→ Does the questionnaire flow naturally and conversationally?
→ Are the questions clear and easy to understand?
→ Can respondents answer the questions easily?
→ Which alternative forms of questions work best?
Pretests also provide a means of testing the sampling procedure – to determine, for example, whether interviewers are following the sampling instructions properly and whether the procedure is efficient. Pretests also provide estimates of the response rates for mail surveys and the completion rates for phone surveys. Usually a questionnaire goes through several revisions. The exact number of revisions depends on the researcher's and client's judgement. The revision process usually ends when both agree that the desired information is being collected in an unbiased manner.


DESIGNING QUESTIONNAIRES FOR GLOBAL MARKETS

Now that marketing research is being conducted around the globe, researchers must take cultural factors into account when designing questionnaires. The most common problem involves translating a questionnaire into other languages. A questionnaire developed in one country may be difficult to translate because equivalent language concepts do not exist or because of differences in idiom and vernacular. For example, the concepts of uncles and aunts are not the same in Australia and New Zealand as in India. In India, the words for 'uncle' and 'aunt' are different for the maternal and paternal sides of the family.25 Although Spanish is spoken in both Mexico and Venezuela, one researcher found out that the Spanish translation of the English term 'retail outlet' works in Mexico, but not in Venezuela. Venezuelans interpreted the translation to refer to an electrical outlet, an outlet of a river into an ocean or the passageway into a patio. International marketing researchers often have questionnaires back translated. Back translation is the process of taking a questionnaire that has previously been translated from one language to another and having it translated back again by a second, independent translator. The back translator is often a person whose native tongue is the language that will be used for the questionnaire. This process can reveal inconsistencies between the English version and the translation.

back translation Taking a questionnaire that has previously been translated into another language and having a second, independent translator translate it back to the original language.

For example, when a soft-drink company translated its slogan 'Baby, it's cold inside' into Cantonese for research in Hong Kong, the result read: 'Small mosquito, on the inside, it is very cold.' In Hong Kong, small mosquito is a colloquial expression for a small child. Obviously the intended meaning of the advertising message had been lost in the translated questionnaire.26 In another international marketing research project, 'Out of sight, out of mind' was back translated as 'Invisible things are insane'.27 As indicated in Chapter 8, literacy influences the design of self-administered questionnaires and interviews. Knowledge of the literacy rates in foreign countries, especially those that are just developing modern economies, is vital.
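The back-translation workflow can be sketched in a few lines. In the Python sketch below, translate() is a hypothetical placeholder for a human translator or a translation service (no real API is implied), and the word-overlap screen is only a crude first pass before bilingual researchers review flagged items.

def back_translate(original, translate):
    """Forward translation by one translator, back translation by a second,
    independent translator; both are represented by the translate() stand-in."""
    forward = translate(original, source="en", target="zh")   # translator 1
    back = translate(forward, source="zh", target="en")       # translator 2
    return forward, back

def flag_discrepancy(original, back, threshold=0.6):
    """Crude screen: flag items whose back translation shares few words
    with the original wording (threshold chosen for illustration only)."""
    original_words = set(original.lower().split())
    back_words = set(back.lower().split())
    overlap = len(original_words & back_words) / max(len(original_words), 1)
    return overlap < threshold

# Example with an identity stub in place of real translators:
identity = lambda text, source, target: text
forward, back = back_translate("How satisfied are you with your bank?", identity)
print(flag_discrepancy("How satisfied are you with your bank?", back))  # False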

SUMMARY

SPECIFY WHAT INFORMATION WILL BE SOUGHT WHEN DESIGNING A QUESTIONNAIRE

Good questionnaire design is a key to obtaining accurate survey results. The specific questions to be asked will be a function of the type of information needed to answer the manager's questions and the communication medium of data collection. Relevance and accuracy are the basic criteria for judging questionnaire results. A questionnaire is relevant if no unnecessary information is collected and the information needed for solving the marketing problem is obtained. Accuracy means that the information is reliable and valid.

DETERMINE THE TYPE OF QUESTIONNAIRE AND TYPE OF SURVEY RESEARCH METHODS

Questionnaires that do not use an interviewer (mail and Internet surveys) are best for sensitive issues and when respondents are


educated and are to be asked a long series of questions. Fixed-format questions are best suited for this purpose. Personal interviews are best used when interviewing requires respondents to answer open-ended questions, when it is necessary for the respondent to trial a product (such as a taste test) and/or when the respondent is less educated or is not fully literate.

DETERMINE THE CONTENT OF INDIVIDUAL QUESTIONS

It is a good idea to ensure that all questions are necessary. All questions should match the information needs of the survey. Sometimes several questions may be needed instead of one question. Also be sure that questions can be answered – respondents should have the necessary information to respond to the question. Lastly, how likely are respondents to answer your questions freely and accurately? Questions that ask respondents to


calculate quantities or tax their understanding (and literacy in some cases) will not be answered accurately. Asking sensitive questions also needs to be considered and carefully dealt with, as respondents may not be likely to answer these types of questions at all.

DETERMINE THE FORM OF RESPONSE FOR EACH QUESTION

Knowing how each question should be phrased requires some knowledge of the different types of questions possible. Open-ended response questions pose some problem or question and ask the respondent to answer in his or her own words. Fixed-alternative questions require less interviewer skill, take less time and are easier to answer. In fixed-alternative questions the respondent is given specific limited alternative responses and asked to choose the one closest to his or her own viewpoint. Standardised responses are easier to code, tabulate and interpret. Care must be taken to formulate the responses so that they do not overlap. Respondents whose answers do not fit any of the fixed alternatives may be forced to select alternatives that do not communicate what they really mean. Open-ended response questions are especially useful in exploratory research or at the beginning of a questionnaire. They make a questionnaire more expensive to analyse because of the uniqueness of the answers. Also, interviewer bias can influence the responses to such questions.

DETERMINE THE WORDING OF EACH QUESTION

Some guidelines for questionnaire construction have emerged from research experience. The language should be simple to allow for variations in educational level. Researchers should avoid leading or loaded questions, which suggest answers to the respondents, as well as questions that induce them to give socially desirable answers. Respondents have a bias against questions that suggest changes in the status quo. Their reluctance to answer personal questions can be reduced by explaining the need for the questions and by assuring respondents of the confidentiality of their replies. The researcher should carefully avoid ambiguity in questions. Another common problem is the double-barrelled question, which asks two questions at once.

DETERMINE QUESTION SEQUENCE

Question sequence can be very important to the success of a survey. The opening questions should be designed to capture respondents' interest and keep them involved. Personal questions should be postponed to the middle or end of the questionnaire. General questions should precede specific ones. In a series of attitude scales the first response may be used as an anchor for comparison with the other responses. The order of alternatives on closed questions can affect the results. Filter questions are useful for avoiding unnecessary questions that do not apply to a particular respondent. Such questions may be put into a flowchart for personal or telephone interviewing.

DETERMINE THE PHYSICAL CHARACTERISTICS OF THE QUESTIONNAIRE

The layout of a mail or other self-administered questionnaire can affect its response rate. An attractive questionnaire encourages a response, as does a carefully phrased title. Internet questionnaires present unique design issues. Decisions must be made about the use of colour, graphics, animation, sound and other special layout effects that the Internet makes possible.

RE-EXAMINE AND REVISE STEPS 1–7 IF NECESSARY

The previous steps should be repeated so that the best possible questionnaire is developed for pretesting. Poor questionnaires may invalidate a research study, so care should be taken with their preparation.

PRETEST THE QUESTIONNAIRE

Pretesting helps to reveal errors while they can still be corrected easily. It can also ensure that all the information necessary to be collected is included in the questionnaire. Pretesting is useful in identifying problems in questionnaire layout and wording before more expensive field work commences.

KEY TERMS AND CONCEPTS

back translation, branch question, check box, checklist question, counterbiasing statement, determinant-choice question, double-barrelled question, drop-down box, error trapping, filter question, fixed-alternative question, forced answering software, frequency-determination question, funnel technique, interactive help desk, leading question, loaded question, multiple-grid question, open-ended box, open-ended response question, order bias, pop-up boxes, preliminary tabulation, push button, radio button, simple-dichotomy (dichotomous-alternative) question, split-ballot technique, status bar, variable piping software


QUESTIONS FOR REVIEW AND CRITICAL THINKING

1 What potential flaw might have existed in the Roy Morgan survey discussed in the opening vignette of this chapter?

2 What are six critical questions for a researcher in designing a questionnaire?

3 Evaluate and comment on the following questions, taken from several questionnaires:
   a A university computer centre survey on SPSS usage: How often do you use Excel software? Please check one. Infrequently (once a semester) / Occasionally (once a month) / Frequently (once a week) / All the time (daily)
   b A survey of advertising agencies: Do you understand and like the Australian Competition and Consumer Commission's new corrective advertising policy? ❏ Yes ❏ No
   c A survey on a new, small electric car: Assuming 90 per cent of your driving is in town, would you buy this type of car? ❏ Yes ❏ No. If this type of electric car had the same initial cost as a current 'Big 3' full-sized, fully equipped car, but operated at one-half the cost over a five-year period, would you buy one? ❏ Yes ❏ No
   d A student survey: Since the beginning of this semester, approximately what per cent of the time do you get to campus using each of the forms of transportation available to you per week? Walk ❏ Bicycle ❏ Public transport ❏ Motor vehicle ❏
   e A survey of Apple resellers: Should the company continue its generous cooperative advertising program?
   f A survey of media use by farmers: Thinking about yesterday, put an X in the box below for each quarter-hour time period during which, so far as you can recall, you personally listened to radio. Do the same for television.
      [Grid: 6.00 to 10.00 a.m. by quarter-hours – 6.00–6.15, 6.15–6.30, 6.30–6.45, 6.45–7.00, 7.00–7.15, 7.15–7.30, 7.30–7.45, 7.45–8.00, 8.00–8.15, 8.15–8.30, 8.30–8.45, 8.45–9.00, 9.00–9.15, 9.15–9.30, 9.30–9.45, 9.45–10.00 – with one row of boxes for Radio and one for TV.]
      If you did not watch television any time yesterday, place an X here ❏
      If you did not listen to radio any time yesterday, place an X here ❏
   g A pro-modern art society's face-to-face survey of the general public: Australian art adds greatly to the quality of life in our community. Do you believe that more government support should be provided for Australian art? ❏ Yes ❏ No
   h A government survey of petrol retailers: Suppose the full-service pump selling price for regular petrol is 142.8 cents per litre on the first day of the month. Suppose on the 10th of the month the price is raised to 150.9 cents per litre, and on the 25th of the month it is reduced to 130.9 cents per litre. In order to provide the required data you should list the accumulator reading on the full-service regular petrol pump when the station opens on the 1st day, the 10th day and the 25th day of the month and when the station closes on the last day of the month.
   i An anti-gun control group's survey: Do you believe that private citizens have the right to own firearms to defend themselves, their families and their property from violent criminal attack? ❏ Yes ❏ No
   j A survey of the general public: In the next year, after accounting for inflation, do you think your real personal income will go up or down? i Up ii Stay the same iii Down iv Don't know


   k A survey of the general public: Some people say that companies should be required by law to label all chemicals and substances that the government states are potentially harmful. The label would tell what the chemical or substance is, what dangers it might pose and what safety procedures should be used in handling the substance. Other people say that such laws would be too strict. They say the law should require labels on only those chemicals and substances that the companies themselves decide are potentially harmful. Such a law, they say, would be less costly for the companies and would permit them to exclude those chemicals and substances they consider to be trade secrets. Which of these views is closest to your own?
      i Require labels on all chemicals and substances that the government states are potentially harmful.
      ii Don't know
      iii Require labels on only those chemicals and substances that companies decide are potentially harmful.
   l A survey of voters: Since agriculture is vital to our state's economy, how do you feel about the government's farm policies? Strongly favour / Somewhat favour / Somewhat oppose / Strongly oppose / Unsure
   m A paper and pencil study of students registering to use a career centre: Many of the firms that interview screen applicants based on GPA (or grade point average). What is your GPA? GPA ____  Comment ____

4 The following question was asked of a sample of television viewers:



   We are going to ask you to classify the type of fan you consider yourself to be for different sports and sports programs.
   • Diehard fan: Watch games, follow up on scores and sports news multiple times a day.
   • Avid fan: Watch games, follow up on scores and sports news once a day.
   • Casual fan: Watch games, follow up on scores and sports news occasionally.
   • Championship fan: Watch games, follow up on scores and sports news only during championships or finals.
   • Non-fan: Never watch games or follow up on scores.
   • Anti-fan: Dislike, oppose or object to a certain sport.
   Does this question do a good job of avoiding ambiguity?

5 How might the wording of a question about income influence respondents' answers?

6 What is the difference between a leading question and a loaded question?

7 Design one or more open-ended response questions to measure reactions to a magazine ad for a new MBA degree.

8 What are some good guidelines for avoiding common mistakes in writing questionnaire items?

9 Evaluate the layout of the filter question that follows:
   Are you employed either full time or part time? Mark (x) one.  ❏ Yes  ❏ No
   If yes: How many hours per week are you usually employed? Mark (x) one.  ❏ Less than 35  ❏ 35 or more
   What is the postcode at your usual place of work?

10 It has been said that surveys show that consumers hate advertising, but like specific ads. Comment.

11 The Apple assistance centre exists to solve problems for its users of MacBook, iPhone and iPad products. Develop a text message questionnaire that assesses users' satisfaction with the Apple assistance centre.

12 What advantages do Internet surveys offer in terms of order bias and survey flow?

13 Design a complete personal interview questionnaire for a zoo that wishes to determine who visits the zoo and how they evaluate it.

14 Design a complete self-administered questionnaire for a bank to give to customers immediately after they open new accounts.

15 What is back translation?

16 Define pretesting. Pretests cost time and money. How might a researcher decide if further pretesting is necessary?

17 Provide an example of when a filter question is needed to implement the survey.

18 A client tells a researcher that she wants a questionnaire that evaluates the importance of 30 product characteristics and rates her brand and 10 competing brands on these characteristics. The researcher believes that this questionnaire will induce respondent fatigue because it will be far too long. Should the researcher do exactly what the client says or risk losing the business by suggesting a different approach?

19 A lobbying organisation designs a short questionnaire about its political position. It also includes a membership solicitation with the questionnaire. Is this approach ethical?

20 A public figure who supports cost cutting in government asks the following question in a survey: 'Do you support the Australian Senate having the power to eliminate waste in government?' Is this question ethical?


ONGOING PROJECT DOING A QUESTIONNAIRE? CONSULT THE CHAPTER 9 PROJECT WORKSHEET FOR HELP

Questionnaire design is as much an art as it is a science. The nine steps shown in this chapter can be followed by consulting the

Chapter 9 project worksheet on questionnaire design from the CourseMate website. Make sure you pilot (test) the questionnaire on a small sample before using it to collect all your data.

COURSEMATE ONLINE STUDY TOOLS

Flip to the start of your textbook and use the tear-out card to log in to CourseMate for Marketing Research. There you can test your understanding and revise chapter concepts with:
☑ interactive quizzes
☑ flashcards
☑ Search me! activities
☑ online research activities
☑ videos.

ONGOING CASE STUDY

MOBILE PHONE SWITCHING AND BILL SHOCK

David and Steve have just designed a questionnaire for AusBargain. They have done extensive research on the factors that encourage consumers to switch mobile service providers. The client also included a number of additional questions and wanted all the information in the survey collected. Unfortunately the survey has some 90 items and will take 30 minutes for a respondent to complete via an online survey. On reading the survey, Leanne notes its length and says that she believes it will induce respondent fatigue. David and Steve return from her office feeling depressed. Can they really go through with this project?

QUESTIONS

1 Should David, Leanne and Steve do exactly what the client has asked for or risk losing the business from AusBargain?
2 What might be an alternative approach?
3 How many questionnaire items would be needed if David, Steve and Leanne moved forward as suggested?
4 What type of branching might be useful in a survey of this type?

WRITTEN CASE STUDY 9.1

MAY THE FORCE BE WITH YOU! THE WORLDWIDE GROWTH OF JEDI KNIGHTS AS A RELIGION

One would think that a census would provide the most accurate information about population. This approach assumes that people will answer truthfully and provide information that is correct and relevant to the nature of the study. It also depends on how the question is framed. The Australian Census in 2001 revealed the growth of a new religion, 'Jedi Knights', with some 70 000 respondents claiming to be disciples of the cult of George Lucas and Star Wars. In the United Kingdom, this nominated religion outnumbered the Jewish faith, with some 400 000 people claiming to be light-sabre-carrying members. The Jedi religion did not appear to be as popular in Canada; however, with some 20 000 members,28 it is showing potential as a new path to enlightenment. Clearly, some respondents are not being serious

in answering a question about religious faith, or are making a political point that it is inappropriate for governments to be collecting such information. Reacting to potentially bogus responses, in 2006 the Australian Bureau of Statistics (ABS) announced it would only recognise (code) responses of established religions that have a formal structure, and would code other philosophies (such as the Jedi Knights) as ‘other’. The question would also be optional to answer so that political concerns about the use of this information were addressed. The ABS also stressed that information about religious background was important for planning for hospitals, schools and retirement homes for faiths in Australia.


QUESTIONS

1 Why do you think respondents answered the question about religion in the Australian Census in this particular way?

2 How do you think the ABS could collect more accurate responses in the future?

WRITTEN CASE STUDY 9.2

MARKETING STRATEGIES IN CHINA

A graduate student at an Australian university wishes to examine the kinds of marketing strategies used by Chinese firms. She develops a questionnaire in English and has it translated into Chinese. A section of the Chinese version is shown here.

EXHIBIT 9.11 →  CHINESE MARKET ORIENTATION QUESTIONNAIRE

QUESTIONS

1 What is the typical process for developing questionnaires for markets where consumers speak a different language?
2 Find someone who speaks Chinese and have him or her back translate the questions that appear in Exhibit 9.11. Are these Chinese questions adequate?

NOTES

1 Roy Morgan Single Source (Australia), October 2014 – September 2015 (n=15,668), accessed at http://www.roymorgan.com/findings/6597-donation-nation-which-stateis-most-generous-201512092152 on 11 December 2015.
2 Beveridge, Richard (2002) 'Listen to your interviewers', Research News, November, accessed at http://www.amsrs.com.au/print.cfm?i=1019&e=84 on 22 May 2009.
3 Blair, Ed, Sudman, Seymour, Bradburn, Norman & Stocking, Carol (1977) 'How to ask questions about drinking and sex: Response effects in measuring consumer behavior', Journal of Marketing Research, 14(3), pp. 316–17, accessed from ABI/INFORM global database (document ID: 779732991) on 24 May 2009.
4 Sources: Craig, L. & Simminski, P. (2011), 'If men do more housework, do their wives have more babies?', Social Indicators Research, 101(2), pp. 255–58; Shellenbarger, S. (2009), 'Housework pays off between the sheets', Wall Street Journal, (October 21), D1–D3.
5 USA TODAY (1998) 'USA snapshots', 26 February, p. C1.
6 Business Wire (2012) 'Donors give more regularly to nonprofits that communicate impact and make it easy to donate', accessed from http://www.fativa.com on 30 January 2013.
7 Baker, Michael (2003) 'Data collection–questionnaire design', Marketing Review, vol. 3, pp. 343–70.
8 Morgan, Fred (1990) 'Judicial standards for survey research: An update and guidelines', Journal of Marketing, January, pp. 59–70.
9 Cox, Lisa & Towell, Noel (2012) 'Libs reject allegations of election push polling', Canberra Times, 28 September, p. 8.
10 Payne, Stanley (1951) The art of asking questions, Princeton, NJ: Princeton University Press, p. 185.
11 Newspoll News (2006) 'Support for desalination down the drain', 15(1), p. 1.
12 Other product attributes are relative advantage, compatibility, complexity and communicability.
13 Payne, Stanley (1951) The art of asking questions, Princeton, NJ: Princeton University Press, pp. 102–3. The reader who wants a more detailed account of question wording is referred to this classic book on that topic.
14 Sources: Peychev, A. (2009), 'Survey Breakoff', Public Opinion Quarterly, 71 (Spring), pp. 74–97; Weijters, B., Cabooter, E. & Schillewaert, N. (2010), 'The Effect of Rating Scale Format on Response Styles: The Number of Response Categories and Response Category Labels', International Journal of Research in Marketing, 27, pp. 236–47.
15 Dillman, Don (2000) Mail and Internet surveys: The tailored design method, New York: John Wiley and Sons, pp. 357–61.
16 Reprinted with permission from the Council of American Survey Research, http://www.casro.org.
17 Research Services, Inc. of Denver, CO and the United Bank of Boulder, CO.
18 Dillman, Don (2000) Mail and Internet surveys: The tailored design method, New York: John Wiley and Sons, pp. 357–61.
19 Malhotra, Neil (2008) 'Completion Time And Response Order Effects In Web Surveys', Public Opinion Quarterly, 22(8), pp. 914–934.
20 Young, Sarah & Ross, Craig (2000) 'Web questionnaires: A glimpse of survey research in the future', Parks & Recreation, 35(6), June, p. 30.
21 Accessed at www.decisionanalyst.com/online/surtech.htm on 6 February 2001.
22 Accessed at www.decisionanalyst.com/online/surtech.htm on 6 February 2001.
23 Michel, Matt (2001) 'Controversy Redux', CASRO Journal, accessed at http://www.decisionanalyst.com/publ_art/contredux.htm on 8 February 2001.
24 Adapted, with permission, from IBM Management Staff Booklet for Systems Design, 1975 (Z140-3008-2 U/M 001).
25 Cateora, Philip (1990) International marketing, Homewood, IL: Richard D. Irwin, pp. 387–9.
26 Cateora, Philip (1990) International marketing, Homewood, IL: Richard D. Irwin, pp. 387–9.
27 Jain, Subhash (1990) International marketing, Boston: PWS Kent, p. 338.
28 Canada.com (2006) 'Some 20,000 Canadians worship at the altar of Yoda', accessed at www.canada.com/national/features/census/story.html?id=A4623A62-51954B57-B40B-087D8F38CF6F on 3 July.

PART FOUR: PLANNING THE SAMPLE

10 » SAMPLING: SAMPLE DESIGN AND SAMPLE SIZE

[Part-opener flowchart – 'PART 4: Planning the sample': a decision tree for choosing a sample design. It begins with 'Define target population' and asks 'Do we have a list?'. If a list is available, a probability sample is used: a location-only list leads to a cluster sample, a list of population members with additional relevant information leads to a stratified random sample, and a list of population members without such information leads to a simple random sample. If no list is available, a non-probability sample is used: if most people are in the target population, a convenience sample; otherwise, if field staff can easily identify target respondents, a judgement sample; if not, a snowball sample.]

341

10 » WHAT YOU WILL LEARN IN THIS CHAPTER

» To define the terms ‘sample’, ‘population’, ‘population element’ and ‘census’.
» Why a sample rather than a complete census is useful.
» Issues concerning the identification of the target population and the selection of a sampling frame.
» Common forms of sampling frames and sampling frame error.
» To distinguish between sampling errors and systematic (non-sampling) errors.
» Types of systematic (non-sampling) errors that result from sample selection.
» Advantages and disadvantages of different types of probability and nonprobability samples.
» How to choose an appropriate sample design.
» Three factors that affect sample size.

SAMPLING: SAMPLE DESIGN AND SAMPLE SIZE

The pivotal role sampling plays in conducting good research

We use sampling almost every day. When you are cooking noodles, you taste just one piece to decide if they’re all cooked. When you are channel surfing, you usually watch less than a minute of a TV show before deciding to switch or continue watching. We often sample products and make accurate decisions based on a subset of information. Researchers often use samples to infer characteristics of the population without contacting or surveying every member of the population – we can often ask a few respondents and get results similar to those we’d get if the whole population was asked. For instance, in the run-up to an election, research firms such as Roy Morgan (http://www.roymorgan.com.au), Newspoll (http://www.newspoll.com.au), Nielsen (http://www.nielsen.com/au/en.html), Colmar Brunton (http://www.colmarbrunton.co.nz) and Galaxy Research (http://www.galaxyresearch.com.au) ask a relatively small sample of the population questions regarding their political preferences. Often sample sizes of slightly more than 1000 citizens provide accurate predictions about the outcome of elections. It is not economically feasible or timely to ask all citizens about their political preferences. Instead, a small sample of that

population can provide accurate and meaningful results that represent the views of the millions of citizens likely to vote. Many people, including researchers and lay-people, believe that a large sample size is important, but this is rarely the case. In fact, the quality of the data, the method by which people are contacted and how we communicate with them all have far more impact on the validity of a research study than does sample size. Sampling is just as important for qualitative research studies as it is for quantitative research studies. Sample size is less important for qualitative studies, but if the researcher intends to make inferences or develop theory about the larger population, then the quality and representativeness of the sample is crucial. How should we select our sample? How can we design our sampling procedure to provide the most accurate and reliable results? How many cases should be in the sample to provide us with accurate results, and still complete the research within budget and on time? The concept of sampling may seem simple and intuitive, but the actual process of sampling can be quite complicated. Sampling is a central aspect of marketing research, and it requires in-depth examination to understand how to conduct good quality market research. This chapter explains some ‘ideal types’ of sampling, and some compromises that are more often used when ideal conditions are not possible.

SAMPLING TERMINOLOGY
The process of sampling involves using a small number of items or parts of the population to make conclusions about the whole population. For instance, when we generalise about a certain group of people or things (for example, ‘Lectures are boring!’) we are doing so on the basis of a small sample of observations, not all observations. A sample is a subset, or some part, of a larger population. The purpose of sampling is to enable one to estimate some unknown characteristic of the population.

SURVEY THIS!

This survey asks a variety of questions of tertiary students. Suppose you were a university interested in studying the habits of university students in general. Consider the following questions:
1 How well do the results collected from this survey represent the market for potential undergraduate students?
2 How well do the results represent current Australian undergraduate students? [Hint: Compare the profile on the questions shown below with data showing typical characteristics of Australian undergraduate students.]
3 How well do the results represent Australian business students?
4 Can the data be stratified in a way that would allow them to represent more specific populations? Explain your answer.
Courtesy of Qualtrics.com

We have defined sampling in terms of the population to be studied. A population is any complete group – for example, of people, sales territories, shops or university students – that shares some common set of characteristics (don’t confuse this with the number of residents of a particular country!). The term population element refers to an individual member of the population. For example, an element within the population of coffee drinkers is someone who drinks coffee. A census is an investigation of all the individual elements that make up the population. For instance, if a small firm of 25 employees wants to know how satisfied employees are with current working conditions, then all of them may be contacted as the population is so small. Similarly, for planning purposes the Australian government conducts a census every five years, attempting to survey every member of the Australian population (http://www.abs.gov.au).

sample A subset, or some part, of a larger population used to make inferences about the larger population.
population Any complete group of entities that share some common set of characteristics.
population element An individual member of a population.
census An investigation of all the individual elements that make up a population.

WHY SAMPLE?
On a wine-tasting tour, guests all recognise the impossibility of doing anything but sampling. However, in a scientific study in which the objective is to estimate an unknown population value, why should a sample rather than a complete census be taken?

Cost and time
Market research projects have budget and time constraints. If GM Holden wished to take a census of past purchasers’ satisfaction ratings with their cars, then hundreds of thousands of buyers would have to be contacted. This would be costly and time-consuming, and there are obvious limitations with not having a list of these buyers and their current contact details. Some people would be inaccessible (for example, out of the country) and it would be impossible to contact everyone within a short time period, if at all. Remember, a key concern with any research project is determining whether or not the costs outweigh the benefits.

Accurate and reliable results
Most properly selected samples give sufficiently accurate results. If the elements of a population are quite similar, only a small sample is necessary to accurately portray the characteristic of interest. The example at the beginning of this chapter shows that, if done correctly, samples of 1000 can provide results indicative of surveying millions of people across many political electorates. Of course, accuracy depends on how homogeneous (similar) the population is. Samples of a highly homogeneous population can be smaller than samples of a more heterogeneous (dissimilar) population.

REAL WORLD SNAPSHOT

WILLIAM’S CENSUS WAS AHEAD OF ITS TIME1

After William the Conqueror took over England, after the Battle of Hastings in 1066, he wanted to see what his conquest had netted him. Monarchs back then lacked basic information such as the number of estates or livestock in the kingdom. So, in 1086, William ordered a unique survey. Every village and fief was counted to the last building, netting detailed data on how much land and wealth each person owned. He used the data to levy taxes. The result was Domesday Book, a corruption of ‘Doomsday’, the Day of Judgement. Today, governments conduct censuses, keeping our names and birthdates on file. But it’s not just to work out what can be taxed – more often census data are used to decide where schools and hospitals should be built, location of electoral boundaries, and the allocation of essential social services.

Destruction of test units
Manufacturing quality-control tests require the destruction of the items being tested. If Samsung wishes to find out just how high each smartphone can be dropped from, then there would be no smartphones left to sell! The same occurs in many marketing experiments. If we want to find out which advertising message is more persuasive, then a census removes the opportunity to make changes to the promotional campaign before we commit.

PRACTICAL SAMPLING CONCEPTS
Researchers must make several decisions before taking a sample. Exhibit 10.1 presents these decisions as a series of sequential stages, even though the order of the decisions does not always follow this sequence.

Defining the target population
The first question we should address when sampling is to define the target population. What is the relevant population? In some cases, this is not a difficult question. If a company’s sales force is the population of interest, there are few definitional problems. In other cases, the decision may be more difficult, particularly if there is no list to draw from. For example, how is a researcher to identify a
target population for Coca-Cola drinkers or wearers of Nike shoes? Similarly, a researcher interested in understanding how potential students choose a university might define the target population as ‘potential university students’. With such a definition, current university students would not be included. And we would also exclude anyone who may have an influence on the decision process, including friends and family. If the research question includes input from others on the behaviour of the individual, then the population definition would need to be changed. In other cases, the appropriate population element may be a collection of people rather than any individual person; for example, in determining consumer attitudes towards broadband usage and take-up, a researcher may seek information about households rather than the views of just one person in a household.

EXHIBIT 10.1 STAGES IN THE SELECTION OF A SAMPLE
1 Define the target population
2 Select a sampling frame
3 Determine if a probability or nonprobability sampling method will be chosen
4 Plan procedure for selecting sampling units
5 Determine sample size
6 Select actual sampling units
7 Conduct fieldwork

It is vital to carefully define the target population. What are the crucial characteristics of the population? Who is definitely in the population of interest? Who is definitely not in the population of interest? For example, if you want to survey fellow students on the quality of transport to and from campus then you need to ask, ‘Who do we want to talk to?’ It may be users, non-users, recent adopters, influencers and others. Ideally, you would want to identify these people before you made contact. So the next consideration is to identify tangible characteristics that define the population. For example, a baby-food manufacturer might define the population of interest as all women able to have children. A more specific operational definition might be women between the ages of 18 and 50 who have given birth in the last six months.

The sampling frame
You may have watched a nature documentary on television where a biologist wanders into an empty field with a big wire frame 1 metre by 1 metre square. He dumps it on the ground and then drops to his hands and knees to count the number of creatures of a particular species that are inside the frame.

Everything inside the frame is available for counting and anything outside the frame is excluded. This is the origin of the term ‘sampling frame’. It is a device that defines the boundaries of our sample and helps us to identify what should, and should not, be counted. In marketing research, we don’t throw a giant wire square over a crowd of people, of course. It is a metaphor. Our sampling frames need to be a little more sophisticated.

REAL WORLD SNAPSHOT

YOU CAN LEARN A LOT FROM A FEW: GEORGE GALLUP’S NATION OF NUMBERS2

Have you ever heard of the Gallup Poll? When at university in Iowa, USA, young George Gallup answered an advertisement for summer employment in St. Louis. The Post-Dispatch hired 50 students to survey the city, questioning readers about what they liked and didn’t like in the newspaper: each and every reader. The students were hired to go to every door in St. Louis – there were 55 000 homes in the city then – and ask the same questions. Gallup, one hot day, knocked on one door too many, got the same answers one time too many, and decided that there had to be a better way. George developed a better way through his PhD thesis entitled ‘A new technique for objective methods for measuring reader interest in newspapers’. Working with the Des Moines Register and Tribune and the 200-year-old statistical probability theory of the Swiss mathematician Jakob Bernoulli, Gallup developed ‘sampling’ techniques. He said that you didn’t have to talk to everybody, as long as you randomly selected interviews according to a sampling plan that took into account whatever diversity was relevant in the universe of potential respondents – geographic, ethnic and economic. These concepts formed the foundation of modern sampling theory and the basis of this chapter. Although not everybody understood or believed then – or now – this intellectual invention was a big deal. George tried to explain what he was talking about and doing. ‘Suppose there are 7000 white beans and 3000 black beans well churned in a barrel,’ he said, ‘If you scoop out 100 of them, you’ll get approximately 70 white beans and 30 black in your hand, and the range of your possible error can be computed mathematically. As long as the barrel contains many more beans than your handful, the proportion will remain within that margin of error 997 times out of 1000.’ Well, it seemed to work for newspapers, and George Gallup was in great demand around the country. He took senior academic posts and conducted readership surveys across the USA. In 1932 a new advertising agency, Young & Rubicam (Y&R), invited him to New York to create a research department and procedures for evaluating the effectiveness of advertising. He did that, too. One of his first Y&R surveys, based on newspaper experience, indicated that the number of readers of advertisements was proportional to the length of the paragraphs in a piece of copy. He even used the same ideas to help his mother-in-law, a Democrat, win a state election in a seat not held by the Democrats since the Civil War! With continued success, Gallup went out and formed the grandly titled American Institute of Public Opinion.

sampling frame A list of elements from which a sample may be drawn; also called working population. When a list is infeasible, then the sampling frame is a highly detailed explanation of how a representative subset of the target population will be contacted.

The sampling frame, ideally, is a list of all elements in the target population (although it is rarely possible to obtain one). For instance, if a researcher was interested in surveying practising doctors in Australia, a
sampling frame might be a list of all members of the Australian Medical Association. In practice, often it is not feasible to compile a list that does not exclude some members of the population. For example, if the student email list is taken as a sampling frame of your university’s student population, the sampling frame may exclude those students who registered late, or include other students who have just completed their degrees. The sampling frame is also called the working population because it provides the list for operational work. Therefore, it is the list that researchers use to contact the target population. The target population is a useful theoretical notion, but the sampling frame is a practical operationalisation of the target population. When a list of working

population members is impossible or impractical, then the sampling frame is a detailed explanation of how the researcher plans to contact each population member. We discuss this in the chapter sections called ‘Probability sampling’ and ‘Nonprobability sampling’. The discrepancy between the definition of the population and the sampling frame is the first potential source of error associated with sample selection. Some firms specialise in providing lists or databases that may include the names, addresses, phone numbers and email addresses of specific populations. Lists offered by companies such as these are compiled from subscriptions to professional journals, credit card applications, online product registrations and other sources. Most often, such lists are used by salespeople to target highly defined customer groups with a business offering, but the lists may also be used by marketing researchers to target specific populations. If you have ever entered a competition in a shopping mall then you are on a mailing list. If you have registered a new purchase online then you are probably on a mailing list, unless the company explicitly said otherwise, or you clicked the opt-out option. In some cases, lists will not be available at all or are very different from the target population. For example, if a researcher wants to contact all Coca-Cola drinkers, there may be a list of people who registered their email addresses in a recent competition, but certainly not all Coca-Cola drinkers will have registered. In such cases, the researcher must make a judgement – for the purposes of the research question, is the person who signs up for a competition likely to be different from the person who decides not to sign up? If the answer is ‘yes’, then marketers have to rely on more practical methods of sampling known as nonprobability sampling. The use of lists by market researchers and telemarketers is an important ethical issue in market research. Have you ever been contacted by telemarketers or market researchers by phone and wondered how they got your number? Think of all the forms and warranty cards that you fill out and the competitions you enter. While many companies offer strict privacy policies associated with your personal information, some don’t. (You may want to discuss the ethics of such information-gathering methods in class. Also, you may want to discuss whether an uninvited email from a marketing researcher is any different from sales spam.)

REAL WORLD SNAPSHOT

MOBILE TECHNOLOGY TRANSFORMS SURVEY TECHNIQUES AND LIVES3

In countries such as India with developing economies, fewer than 5 per cent of households had telephones or Internet access until about the year 2000. As such, telephone directories and other lists could not serve as sampling frames for the majority of market research studies. Now more than 75 per cent of people in developing countries have access to a mobile phone. According to a World Bank report in 2015, citizens of developing countries are increasingly using mobile phones to create new livelihoods and enhance their lifestyles, while governments are using them to improve service delivery and citizen feedback, and social marketing research mechanisms.

A sampling frame error occurs when certain sample elements are excluded or when the entire population is not accurately represented in the sampling frame. That is, some people are on the list who shouldn’t be, or some people are not on the list but should be. For example, some years ago it was popular to conduct a phone survey based on randomly selecting contacts in the White Pages telephone directory. Nowadays, the White Pages includes only ‘listed’ landlines while people with ‘unlisted’ numbers and mobile phones generally are not included. If the people who are listed in the White Pages are different from those who only use mobile phones or who have unlisted numbers, then the researchers have made a sampling frame error.


sampling frame error An error that occurs when certain sample elements are not listed or are not accurately represented in a sampling frame.


Population elements can also be over-represented in a sampling frame. For instance, if a bank defined its sampling frame as all of its savings accounts, then people who hold multiple accounts would be over-represented. A list of account holders’ names may be more useful in this case.

SAMPLING FRAMES FOR INTERNATIONAL MARKET RESEARCH
The availability of list sampling frames around the globe varies dramatically. Not every country’s government conducts a reliable census of population, so the researcher cannot compare a sample with known population characteristics. However, in Taiwan, Japan and other Asian countries, a researcher can build a sampling frame more easily because those governments release some census information about individuals. If a family changes households, updated census information must be reported to a centralised government agency before communal services (water, gas, electricity, education etc.) are made available. This keeps residents’ information up to date and increases accuracy.

Sampling units
The sampling unit is a single element or group of elements subject to selection in the sample. A sampling unit does not have to be a person – it can be some other entity such as a city, a suburb or a team. For example, if an airline wishes to sample passengers, it may take every twenty-fifth name on a complete list of passengers. In this case the sampling unit would be the same as the element. Alternatively, the airline could first select certain flights as the sampling unit, and then select certain passengers on each flight. In this case the sampling unit would contain many elements. In such cases we can further define these sampling units as primary sampling units (PSUs), secondary sampling units (if two stages of sampling are necessary) and tertiary sampling units (if three stages of sampling are necessary).

RANDOM SAMPLING AND NON-SAMPLING ERRORS
The world of research is not perfect. In all research projects some form of error is inevitable. Errors occurring in a research project can be categorised either as random sampling error or non-sampling error. Random sampling error results purely from chance variation in which respondents happen to be selected. These errors are beyond the researcher’s control because of their random nature, but they can be estimated through statistical testing (see Chapter 12). Non-sampling errors, on the other hand, are errors resulting from aspects of the research that can be controlled, otherwise known as human error. These types of error are discussed in more depth below.

sampling unit A single element or group of elements subject to selection in the sample.
random sampling error The difference between a sample result and the result of a census conducted using identical procedures; a statistical fluctuation that occurs because of chance variations in the elements selected for a sample.

Random sampling error is the difference between the result obtained from a sample and the result that we would obtain from a census using identical procedures (that is, the difference between the true value and the value obtained from the sample). For instance, suppose a firm has 1000 customers and knows, with certainty from customer records, that 10 per cent are males and 90 per cent are females. Now suppose that the researcher took a random sample of 100 of these same respondents, despite already knowing the composition of gender, and found within the sample that 30 per cent were males and 70 per cent were females. Using this hypothetical example, random sampling error is the difference between the true value (that is, 10:90) and the values obtained from the sample (that is, 30:70). Another simple example may also help to illustrate the point. Suppose our population of interest is currently enrolled Marketing Research students at your university. This might mean a population of 200 students. Suppose we know that median annual income of the 200 students is $20 000 (about
half the students earn less than $20K and half earn more than $20K). Now suppose that a researcher randomly drew a sample of 10 respondents and found, based on these 10 individuals, that median income was $10 000 (that is, $10 000 less than the true value of $20 000). Would you be surprised at this outcome? Probably not, because there are only 10 students in the sample. The random sample gave us 10 students who just happened to have lower incomes – this could have occurred by accident (that is, random error). A different sample of 10 students might give you a median income of $25 000, and another sample of 10 might give you a median income of $17 000. No single sample will be exactly the same as the true population value, but the average of all of those sample results will be pretty close. Assuming no other type of error, random sampling error is the difference between the true value from the population (which we are never likely to know, otherwise why would we sample in the first place?) and the value drawn from the sample.

Random sampling error is the basis of statistical inference (discussed in Chapter 12). Inferential statistics are tools for estimating, within bounds of certainty, the level of random error in a sample. To some extent, the researcher can exert some influence over the amount of random error by increasing the sample size. As sample size increases, random sampling error decreases – the more people we ask, the closer the sample result will be to the true result. Of course, the resources available will influence how large a sample may be taken, and, if done properly, quite small samples can provide very accurate results. Larger samples may be marginally more accurate than smaller samples, but many times more costly. Sample size is just one of the sources of error in sample-based measurement. Usually sample size is the least important cause of errors in research. Many researchers focus on sample size because it is one of the few things that can be properly measured in their designs. A large sample from the wrong people, a large sample that asks the wrong question, or a sample that asks questions in the wrong way will all give much greater error than anything arising from a smaller sample. These are known as systematic errors. For a poorly run study, a large sample size just gives the researcher more confidence in a wrong answer.

Systematic (non-sampling) errors result from non-sampling factors, primarily the nature of a study’s design and the correctness of execution. These errors are not due to chance fluctuations. For example, phone or in-home surveys conducted during the day are likely to be answered by people who are at home during the day – stay-at-home parents, shift-workers, students – who may provide answers different from those provided by a more representative sample. Sample biases such as these account for a large portion of errors in marketing research. The term ‘sample bias’ is somewhat unfortunate, because many forms of bias are not related to the selection of the sample. Errors due to sample selection problems, such as sampling frame errors, are systematic (non-sampling) errors and should not be classified as random sampling errors. That is, even though they are errors related to the incorrect sample, they are errors resulting from the researcher, not random influence.

systematic (non-sampling) error Error resulting from some imperfect aspect of the research design, such as mistakes in sample selection, sampling frame error or nonresponses from persons who were not contacted or refused to participate.
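To see random sampling error at work, the following minimal Python sketch (a hypothetical illustration only, with an artificial population) builds 200 made-up student incomes with a median near $20 000, repeatedly draws samples of different sizes and reports how far the sample medians typically stray from the true median.

import random
import statistics

random.seed(1)
# hypothetical population of 200 student incomes, median roughly $20 000
population = [random.lognormvariate(9.9, 0.5) for _ in range(200)]
true_median = statistics.median(population)

for n in (10, 50, 150):
    sample_medians = [statistics.median(random.sample(population, n)) for _ in range(1000)]
    typical_error = statistics.pstdev(sample_medians)   # typical distance from the centre
    print(f"sample size {n:3d}: sample medians typically stray about "
          f"${typical_error:,.0f} from the true median of ${true_median:,.0f}")

Each run gives slightly different numbers, but the pattern is always the same: no single sample hits the population value exactly, and the typical error shrinks as the sample size grows.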

Less than perfectly representative samples
Exhibit 10.2 illustrates random sampling error and two non-sampling errors (sampling frame error and nonresponse error) related to sample design. The total population is represented by the area of the largest square. Sampling frame errors eliminate some potential respondents. Random sampling error (due exclusively to random, chance fluctuation) may cause an imbalance in the representativeness of the group. There is nothing that can be done about that, but inferential statistics let us guess how much error there may be. Additional errors will occur if individuals refuse to be interviewed or cannot be contacted. Such nonresponse error may also cause the sample to be not representative of the population. Political pollsters have found that conservative voters are less likely to respond to political polls than are liberal-minded voters. Consumers who are busier are less likely to respond to a survey than those with more spare time. Often it may be just as important to understand perceptions of those who do not respond as those who do respond, because they may provide different ideas.

[EXHIBIT 10.2 ERRORS ASSOCIATED WITH SAMPLING4: a diagram of nested areas in which sampling frame error separates the sampling frame from the total population, random sampling error separates the planned sample from the sampling frame, and nonresponse error separates the respondents (actual sample) from the planned sample.]

nonresponse error The statistical differences between a survey that includes only those who responded and a perfect survey that would also include those who failed to respond.

PROBABILITY VERSUS NONPROBABILITY SAMPLING
There are several ways to take a sample. The main alternative sampling plans may be grouped into two categories: probability techniques and nonprobability techniques. In probability sampling, every element in the population has a known, non-zero probability of selection. In practice that means we need a list that captures all of the members of our population of interest. The simple random sample, in which each member of the population has an equal probability of being selected, is the best-known probability sample. In nonprobability sampling the probability of any particular member of the population being chosen is unknown. The selection of sampling units in nonprobability sampling is quite arbitrary, as researchers rely heavily on personal judgement. There are no appropriate statistical techniques for measuring random sampling error from a nonprobability sample. Technically, if random sampling error is the only issue of concern then projecting the data beyond the sample is statistically inappropriate. In practice, however, random sampling error is never the only issue of concern, and on most occasions nonprobability samples are best suited for the researcher’s purpose, and the results may be fairly projected beyond the sample to the population of interest. Generally, probability sampling methods are preferred to nonprobability sampling methods. Because of random selection, probability samples are likely to be less biased. However, they rely on the researcher having an accurate list for a sampling frame – something that is difficult to obtain for many market research studies. If there is no sampling frame or list then researchers have to use nonprobability methods. Nonprobability methods are often more practical. They are also cheaper, less time-consuming and easier to implement.

probability sampling A sampling technique in which every member of the population has a known, non-zero probability of selection.
nonprobability sampling A sampling technique in which units of the sample are selected on the basis of personal judgement or convenience; the probability of any particular member of the population being chosen is unknown.

WHAT WENT WRONG? POLITICAL POLLS IN THE 2001 AUSTRALIAN FEDERAL ELECTION

In 2001, Australia went to a federal election. John Howard, the incumbent prime minister and leader of the Coalition, was vying for a third term in office. His main opposition, Kim Beazley, leader of the Australian Labor Party, had lost against Howard once before and was trying for a second time to oust the current government. As usual, in the run-up to the election the main pollsters produced regular predictions about the outcome. On a two-party-preferred basis, the actual results and the predicted results by the three major public opinion research companies appear in the table.

                     Liberal-National Coalition (%)   Australian Labor Party (%)
Predicted results
  Nielsen            52                               48
  Newspoll           53                               47
  Roy Morgan         45.5                             55.5
Actual results       51                               49

Nielsen and Newspoll predictions were close to the actual results, using sample sizes of about 2000 to represent the intentions of the Australian voters. However, using a similar sample size, the results from Roy Morgan Research were not only incorrect in their prediction but out by 5.5 percentage points! At the time, this result opened up a good deal of controversy – Roy Morgan Research had great faith in its predictions right up until the last minute. What went wrong? Here are some options:
1 Sampling error – inaccurate selection of respondents as a result of chance. Unlikely: if we use the standard error calculations in Chapter 12 we can see that the probability that this result occurred from sampling error alone is less than 1 in 1000.
2 Non-sampling error – differences in methods. Morgan used face-to-face interviews while both Newspoll and Nielsen used random-digit dialling telephone interviews. People who live in similar suburbs are likely to have similar views and telephone interviews are more geographically dispersed. So it is possible that Morgan didn’t sample the range of views that would give a true measure. Questioning procedures also differed among the researchers.
3 Timing. The most persuasive explanation for the poll differences is a late change in the mood of electors. Three days before election day, the government was faced with a major crisis (or opportunity or scandal, depending on your views). A shipload of stranded asylum seekers was rescued at sea by the Norwegian cargo ship, MV Tampa, which precipitated a bitter debate on refugee policy. The debate and clever media manipulation swung the election towards the incumbent Liberal–National Party Coalition. Morgan had relied on survey results taken the weekend before the election (six days), whereas the competitor pollsters relied on telephone interviews taken in the evenings in the two days before the election. Morgan now makes its final prediction based on face-to-face interviews made the day before the election. In experimental design terminology, the Tampa incident is a history effect which caused a disparity between measured sentiment and manifest votes.
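A rough back-of-the-envelope check of option 1 (our own calculation, a sketch using the standard error of a proportion and the published figures above):

import math

n = 2000                      # approximate sample size used by the pollsters
p = 0.51                      # actual Coalition two-party-preferred share
standard_error = math.sqrt(p * (1 - p) / n)
gap = 0.51 - 0.455            # actual result minus Roy Morgan's prediction

print(f"standard error: {standard_error * 100:.1f} percentage points")
print(f"Roy Morgan's prediction sits {gap / standard_error:.1f} standard errors from the result")

The gap is roughly five standard errors, so chance variation alone is an extremely unlikely explanation, consistent with option 1 being ruled out.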

PROBABILITY SAMPLING
All probability sampling techniques are based on random sampling procedures: there is no chance of introducing bias due to human judgement. Consider the hypothetical sampling frame in Table 10.1, which shows potential respondents who vary in terms of gender and area in which they live. Half are male and half female; two-thirds live in the northern suburbs, one-sixth live in the southern suburbs and one-sixth live in the eastern suburbs. Of course, in reality the sampling frame would be much larger than this.


TABLE 10.1 » HYPOTHETICAL SAMPLING FRAME

ID   Name        Gender   Area
1    Carlos      Male     Northern suburbs
2    Sarah       Female   Southern suburbs
3    Lisa        Female   Northern suburbs
4    Ryder       Male     Northern suburbs
5    Steve       Male     Southern suburbs
6    Hasan       Male     Eastern suburbs
7    Marc        Male     Southern suburbs
8    Ally        Female   Northern suburbs
9    Kerstin     Female   Northern suburbs
10   Christine   Female   Northern suburbs
11   George      Male     Eastern suburbs
12   Mita        Female   Northern suburbs
13   Shu-lin     Female   Southern suburbs
14   Tony        Male     Northern suburbs
15   Lee         Female   Northern suburbs
16   Chrissie    Female   Eastern suburbs
17   April       Female   Northern suburbs
18   Tatiana     Female   Southern suburbs
19   Ram         Male     Northern suburbs
20   Matt        Male     Northern suburbs
21   Jim         Male     Eastern suburbs
22   Hume        Male     Northern suburbs
23   Margee      Female   Northern suburbs
24   Yasmin      Female   Northern suburbs
25   Kevin       Male     Northern suburbs
26   Umberto     Male     Northern suburbs
27   Michael     Male     Eastern suburbs
28   Julie       Female   Northern suburbs
29   Owen        Male     Northern suburbs
30   Jarmila     Female   Northern suburbs

(In the printed text, Table 10.1 also contains three further columns – Simple random sampling, Systematic sampling and Stratified sampling – in which ticks mark the respondents each procedure selects in the worked examples that follow.)

Simple random sampling
Simple random sampling is a sampling procedure that ensures that each element in the population will have an equal chance of being included in the sample. Selecting the winning raffle ticket from a large drum is a typical example of simple random sampling. If the raffle tickets are thoroughly stirred, each ticket should have an equal chance of being selected. In contrast to other, more complex types of probability sampling, this process is simple because it requires only one stage of sample selection (although, typically, researchers would use a computer program to generate a list of random numbers). Your favourite spreadsheet program can generate random numbers for you.

simple random sampling A sampling procedure that assures each element in the population of an equal chance of being included in the sample.

[Cartoon: ‘Ah – here comes a cross-section of the public now!’ Advertising Age, June 12, 1979, reproduced with permission.]

SELECTING A SIMPLE RANDOM SAMPLE
Using Table 10.1, suppose you wish to draw a sample of 15 respondents from the target population of 30. First, using the ID numbers given, you would generate 15 distinct random numbers between 1 and 30 in a computer program. The 15 numbers selected would be the random sample. Suppose those 15 numbers were the numbers given in Table 10.1 (that is, 2, 3, 5, 6, 7 … 28). Our sample would consist of nine females and six males (remember our sampling frame characteristics were 50 per cent female and 50 per cent male). Seven respondents were from the northern suburbs, five from the southern suburbs and three from the eastern suburbs. The sample characteristics are not quite in the right proportions to the sampling frame, but they are not too far off.
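A minimal Python sketch of this selection step, assuming the sampling frame is simply the 30 ID numbers from Table 10.1:

import random

frame_ids = list(range(1, 31))               # ID numbers 1-30 from Table 10.1
selected_ids = random.sample(frame_ids, 15)  # 15 distinct IDs, each equally likely
print(sorted(selected_ids))

Because random.sample draws without replacement, no respondent can be selected twice.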

Systematic sampling
If the list is randomly ordered (that is, not ordered alphabetically or in some other systematic arrangement) then the researcher can choose to perform systematic sampling. Again, suppose we want to take a sample of 15 respondents from the sampling frame. With a systematic sample the researcher chooses every nth name in the list. The value for n is determined by dividing the size of the sampling frame by the required sample size (30/15). This means we select every second name on the list (this is known as the sampling interval). But where do we start on the list? If we start with the first person then they will always get selected and this would not be random! Really we should determine a random starting point by randomly generating a number between 1 and 30. Say 23 was the random number generated; then respondent 23 would be the first person to be sampled, followed by respondents 25, 27, 29 and so on, continuing from the beginning of the list once we pass the last case, until we have reached our sample size requirement of 15 (these are the respondents ticked in the systematic sampling column of Table 10.1). If we examine the characteristics of those chosen, we can see that we selected nine males and six females. Furthermore, nine were from the northern suburbs, three from the southern suburbs and three from the eastern suburbs. Again, these characteristics are only a little different from the sampling frame characteristics. This is not a truly random sample, but the results appear to be random if there is no other systematic pattern to the list. The problem of periodicity occurs if a list has a systematic pattern – that is, if it is not random in character. Collecting retail sales information every seventh day would result in a distorted sample because there would be a systematic pattern of selecting sampling units – sales for only one day of the week (perhaps Monday) would be sampled. If the first 50 names on a list of contributors to a charity were extremely large donors, periodicity bias might occur in sampling every 200th name. Periodicity is rarely a problem for most sampling in marketing research, but researchers should be aware of the possibility. Sometimes a systematic sample is much better than a simple random sample. Consider a list of many thousands of customer names as a sampling frame, and consider the number of people with names such as Abbas, Brown, D’Alessandro, Kim, Smith, Srinivasan, Tan, Williams and Wong. Often, but not always, a family name is associated with a particular cultural heritage. Very popular names (e.g., Smith or Wong) may take many pages in the list, and less popular names less than a page. Now consider a systematic sampling procedure where we use a sampling interval of 100. We are almost guaranteed a representative sample of names, and thus a representative sample of different cultural groups.

systematic sampling A sampling procedure in which a starting point is selected by a random process and then every nth number on the list is selected.
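The same systematic procedure, again sketched with the frame reduced to ID numbers 1 to 30: a random start, an interval of two, and wrap-around past the end of the list.

import random

frame_size = 30
sample_size = 15
interval = frame_size // sample_size                  # sampling interval of 2
start = random.randint(1, frame_size)                 # random starting point, e.g. 23
selected_ids = [((start - 1 + i * interval) % frame_size) + 1 for i in range(sample_size)]
print(selected_ids)                                   # with start 23: 23, 25, 27, 29, 1, 3, ..., 21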

Stratified sampling
Simple random sampling and systematic sampling can provide us with samples fairly similar to the characteristics of the sampling frame. But sometimes they don’t. Selection is random and it’s possible that we would get a sample of all women or all from the northern suburbs. Unlikely, but possible. If we think males and females may respond differently, then we might want to make sure that we select an equal number of males and females. To do this, we divide our sampling frame into strata (layers) and then randomly sample within each stratum. This is called stratified sampling. So, if we have knowledge of a particular characteristic on which respondents may differ, we can divide our frame into subsections and randomly sample within the subsections, and thus reduce sampling error. As another example, suppose we are researching attitudes towards our delivery fees among our customers and we suspect that urban and rural residents have widely different attitudes. Random sampling error will be reduced with the use of stratified sampling. More technically, a smaller standard error may result from stratified sampling because the groups will be adequately represented when strata are combined. Using our sampling frame in Table 10.1, we can select a stratified sample by first dividing the list into males and females. We then perform a simple random sampling procedure within each stratum. That is, randomly select seven or eight females from the list using a random number generator, then do the same for males. This time we have a slight problem: we can’t select 7.5 males. Instead we may select seven or eight of each sex. The areas in which respondents live also seem to be reasonably well represented, with nine from the northern suburbs, three from the southern suburbs and three from the eastern suburbs.

stratified sampling A probability sampling procedure in which simple random subsamples that are more or less equal on some characteristics are drawn from within each stratum of the population.
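A minimal sketch of this stratified selection, using the ID and gender columns of Table 10.1 and taking eight respondents from one stratum and seven from the other:

import random

frame = [(i, "Female") for i in (2, 3, 8, 9, 10, 12, 13, 15, 16, 17, 18, 23, 24, 28, 30)] + \
        [(i, "Male") for i in (1, 4, 5, 6, 7, 11, 14, 19, 20, 21, 22, 25, 26, 27, 29)]

allocation = {"Female": 8, "Male": 7}        # an equal split of 15 is impossible
stratified_sample = []
for gender, n in allocation.items():
    stratum = [person for person in frame if person[1] == gender]
    stratified_sample += random.sample(stratum, n)   # simple random sample within the stratum

print(sorted(stratified_sample))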

Proportional versus disproportional sampling
If the number of sampling units drawn from each stratum is in proportion to the relative population size of the stratum, the sample is a proportional stratified sample. Sometimes, however, a disproportional stratified sample will be selected to ensure an adequate number of sampling units in every stratum. Sampling more heavily in a given stratum than its relative population size is not a problem if the primary purpose of the research is to estimate some characteristic separately for each stratum, and if researchers are concerned about assessing the differences among strata. Consider, however, the percentages of retail outlets presented in Exhibit 10.3. A proportional sample would have the same percentage as in the population. Although there is a small percentage of warehouse club stores, the average store size, in dollar volume, for the warehouse club-store stratum is quite large and varies substantially from the average store size for the smaller independent stores. To avoid over-representing the chain stores and independent stores (with smaller sales volume) in the sample, a disproportional sample is taken. In a disproportional stratified sample the sample size for each stratum is not allocated in proportion to the population size, but is dictated by analytical considerations, such as variability in store sales volume. The logic behind this procedure relates to the general argument for sample size: as variability increases, sample size must increase to provide accurate estimates. Thus, the strata that exhibit the greatest variability are sampled more heavily to increase sample efficiency – that is, produce smaller random sampling error. Complex formulae (beyond the scope of an introductory course in marketing research) have been developed to determine sample size for each stratum. For most marketing research tasks we don’t need them.

proportional stratified sample A stratified sample in which the number of sampling units drawn from each stratum is in proportion to the population size of that stratum.
disproportional stratified sample A stratified sample in which the sample size for each stratum is allocated according to analytical considerations.

EXHIBIT 10.3 DEMONSTRATION OF DISPROPORTIONAL SAMPLING5

                     Percentage in population   Proportional sample   Disproportional sample
Warehouse clubs      20%                        20%                   50%
Chain stores         57%                        57%                   38%
Small independents   23%                        23%                   12%
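As a hedged illustration of the two allocation rules, the sketch below uses the population shares from Exhibit 10.3 together with made-up standard deviations of store sales volume; the variability-based rule shown is the common Neyman-style allocation (share multiplied by standard deviation), not a formula prescribed in this chapter.

# population shares from Exhibit 10.3; sales-volume standard deviations are hypothetical
shares = {"Warehouse clubs": 0.20, "Chain stores": 0.57, "Small independents": 0.23}
sd_sales = {"Warehouse clubs": 9.0, "Chain stores": 2.0, "Small independents": 1.5}
n = 400                                                     # total stores to be sampled

proportional = {s: round(n * shares[s]) for s in shares}    # mirrors the population shares

weights = {s: shares[s] * sd_sales[s] for s in shares}      # share multiplied by variability
total_weight = sum(weights.values())
disproportional = {s: round(n * weights[s] / total_weight) for s in shares}

print("Proportional allocation:   ", proportional)
print("Disproportional allocation:", disproportional)       # the most variable stratum gets more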

Cluster sampling
Often we don’t have a list of names of potential respondents, but we do have a list of where those people are. The purpose of cluster sampling is to sample economically while retaining the characteristics of a probability sample. In a cluster sample, the primary sampling unit is no longer the individual element in the population (for example, people) but a larger cluster of elements located close to one another (for example, cities or areas within cities). The area sample is the most popular type of cluster sample. If we want to interview managers of grocery stores, for example, we may randomly choose several geographic areas as primary sampling units and then interview all of the grocery store managers within the geographic clusters. Interviews are confined to these clusters only. When Nielsen distributes its radio listening diaries around the country, the interviewers are given a randomly selected single block of households and then attempt to recruit every household in that block. The research company does not need a list of residents – it needs a list of residential blocks, and these are in every street directory. The great advantage of cluster sampling is that we do not need to have a list of people – we only need a list of the different places where those people are. We make a random selection of places (clusters) and then sample everyone within that cluster. Cluster samples frequently are used when lists of the sample population are not available. For example, researchers investigating satisfaction of travellers with an airline service did not want to risk violating the privacy of passengers by taking a list of all passengers. Instead, they took a random sample of flights on a stratified sample of routes on each day of a week, and then surveyed all passengers on each plane. This had the added advantage of capturing a proportionally representative sample of first-, business- and economy-class passengers. Some examples of clusters appear in Table 10.2. We cannot use our simple example in this case because we already have a list.

cluster sampling An economically efficient sampling technique in which the primary sampling unit is not the individual element in the population but a large cluster of elements; clusters are selected randomly.
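A minimal sketch of the two-stage airline example, with invented flights and passenger counts: stage one selects whole flights (the primary sampling units) at random; stage two surveys every passenger on the chosen flights, so no list of passengers is needed in advance.

import random

# the only list required is a list of clusters (flights), not of passengers
flights = {f"FL{100 + i}": [f"passenger_{i}_{p}" for p in range(random.randint(80, 180))]
           for i in range(30)}

chosen_flights = random.sample(sorted(flights), 5)          # stage 1: select clusters at random
respondents = [p for flight in chosen_flights for p in flights[flight]]   # stage 2: everyone on board

print(chosen_flights)
print(f"{len(respondents)} passengers to survey")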


ONGOING PROJECT

TABLE 10.2 » EXAMPLES OF CLUSTERS

Population element            Possible clusters in Australia
Australian adult population   States; Localities; Local government areas; Postcodes; Census collection districts; Households
University students           Residential colleges; Tutorial classes
Airline travellers            Airports; Planes
Sports fans                   Sports stadia; Recreation parks; Indoor sporting venues

NONPROBABILITY SAMPLING
So far we have concerned ourselves with probability samples. As we discussed, probability samples are theoretically more accurate because we remove biases in the selection process. That is, by drawing from a list of potential respondents, each respondent has a known, non-zero chance of being selected because they are selected randomly. Unfortunately, marketers rarely have an accurate list available from which to select respondents. Consider shopping mall intercept studies. These may appear random but they are not. There are certain people who won’t be in the shopping centre in the first place – their chance of selection is zero. If the interviewer is close to a major grocery chain (e.g., Coles or Woolworths), then consumers who prefer to buy from the chains are more likely to be selected than those who prefer to buy from independent grocery stores, and so on. Such samples rely on judgement from the researcher and are not random. However, they are cheaper and more practical alternatives. The researcher must ask, ‘For the purposes of the research, does it matter?’ The potential for bias should be acknowledged in the research report. We discuss possible options next.

Convenience sampling

convenience sampling The sampling procedure of obtaining those people or units that are most conveniently available.

Convenience sampling (also called haphazard or accidental sampling) refers to sampling by obtaining the people or units that are most conveniently available. It may be convenient and economical to set up an interviewing booth from which to intercept consumers at a shopping centre. Standing outside the university library and grabbing fellow students as they enter is clearly a convenience sample – you may not have any control of who will arrive next, but you are standing there at a time of your convenience. Is it possible that people who go to the library are different from students who don’t go to the library? Is it possible that students who visit the library at night are different from people who visit during the day?


Researchers generally use convenience samples to obtain a large number of completed questionnaires quickly and economically. Respondents may not be representative of the population of interest because some members of the population of interest are not available when or where the researcher chooses to recruit them.

Judgement sampling
Judgement or purposive sampling is a nonprobability sampling technique in which an experienced researcher selects the sample based on his or her judgement about some appropriate characteristics required of the sample member. Test market cities are often selected because they are viewed as typical cities whose demographic profiles closely match the national profile. Researchers select samples that satisfy their specific purposes, even if they are not fully representative of the total population. For example, if you wanted to survey international students at your university, you may decide to stand near the cafeteria and approach those students who ‘look’ like they are international students. You would get many correct, but you would also miss out on many others, and you would probably talk to some who were not in your target group. Obviously there is a great potential for bias in judgement sampling. A research report should always explain the reasons for making the sampling judgements so that the reader can decide whether the study is valid. A special form of judgement sampling is the expert interview, sometimes called key informant sampling. The researcher may decide that some people have special knowledge or expertise that may substitute for a much larger sample. Researchers seek out knowledgeable respondents and ask them about others’ behaviour, rather than their own. For example, company sales representatives are in touch with customers every day – they should be experts on customer sentiment. For many marketing research tasks, a manager may benefit more from talking with a small number of sales reps and saving the expense of a customer survey.

Quota sampling
Suppose a firm wishes to investigate consumers’ attitudes towards a proposed government policy using a mall intercept. Interviewers in a shopping centre approach shoppers and ask some screening questions such as age and what suburb they live in. Some people are thanked and the interview ends, while others are interviewed more extensively. Why? The interviewer has a quota to fill. She needs to interview a certain number of people from each of some specific categories. For example, on that day she and her interviewing team will need to interview 50 people: seven women aged 17 to 25, eight women aged 26 to 35 and 10 women aged over 35, as well as eight men aged 17 to 25, seven men aged 26 to 35 and 10 men aged over 35. Such numbers crudely conform to the proportions in the general Australian population. The purpose of quota sampling is to ensure that the various subgroups in a population are represented on pertinent sample characteristics to the exact extent that the investigators desire. Quota sampling is not the same as stratified sampling. In quota sampling the interviewer has a quota to achieve and is responsible for finding enough people to meet the quota. Aggregating the various interview quotas yields a sample that represents the desired proportion of each subgroup. The sample is still selected largely through convenience or judgemental procedures. Quota sampling differs from stratified sampling because there is no random selection process. Nonetheless, such a procedure is useful to ensure important population characteristics are adequately represented.

judgement (purposive) sampling A nonprobability sampling technique in which an experienced individual selects the sample based on personal judgement about some appropriate characteristic of the sample member.
expert interview A special form of judgement sampling where the researcher decides that a certain group of people have special knowledge or expertise that may substitute for a much larger sample of nonexpert respondents.
quota sampling A nonprobability sampling procedure that ensures that various subgroups of a population will be represented on pertinent characteristics to the exact extent that the investigator desires.
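The interviewer’s decision rule can be sketched as follows; the stream of shoppers is simulated here with random draws, whereas in the field it is simply whoever walks past next.

import random

quota = {("Female", "17-25"): 7, ("Female", "26-35"): 8, ("Female", "over 35"): 10,
         ("Male", "17-25"): 8, ("Male", "26-35"): 7, ("Male", "over 35"): 10}
completed = {cell: 0 for cell in quota}

def next_shopper():
    # stands in for the screening questions (gender and age group) asked in the mall
    return (random.choice(["Female", "Male"]), random.choice(["17-25", "26-35", "over 35"]))

while sum(completed.values()) < 50:
    cell = next_shopper()
    if completed[cell] < quota[cell]:
        completed[cell] += 1            # conduct the full interview
    # otherwise thank the shopper and end after the screening questions

print(completed)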


POSSIBLE SOURCES OF BIAS
The logic of classifying the population by pertinent subgroups is essentially sound. However, convenience sampling and human judgement may introduce bias. For example, a university lecturer hired some of his students to conduct a quota sample based on age. When analysing the data, the lecturer discovered that almost all the people in the ‘under 25 years’ category were university students. Interviewers, being human, tend to prefer to interview people who are similar to themselves. Quota samples tend to include people who are easily found, willing to be interviewed and middle class. Interviewers often concentrate their interviewing in areas with heavy pedestrian traffic such as CBDs, shopping malls and university campuses. Think about how you might tackle the job of interviewing several people ‘aged over 35’ in a shopping mall. Would you approach men and women who looked as if they were in their 30s, or would you approach those who were clearly grey haired? What about interviewing people ‘aged under 25’? Would you approach the young office worker or the high-school student? Most interviewers don’t want to waste time looking for people who could be in the quota – they want to find someone who is definitely in the quota. Thus, quota samples are often clumped in terms of age or other characteristics, instead of being truly representative.

ADVANTAGES OF QUOTA SAMPLING
Speed of data collection, lower costs and convenience are the major advantages of quota sampling over probability sampling. Although quota sampling has many problems, carefully supervised data collection may provide a representative sample of the various subgroups within a population. Quota sampling may be appropriate when the researcher knows that a certain demographic group is more likely to refuse to cooperate with a survey. For instance, if older men are more likely to refuse, a higher quota can be set for this group so that the proportion of each demographic category will be similar to the proportions in the population.

Sampling rare or hidden populations
Sometimes the population of interest is so hard to find in sufficient numbers within the broader community that normal probabilistic and non-probabilistic methods are infeasible. Anthropologists and sociologists who are interested in topics such as illicit drug use, unusual sexual practices, or political or religious extremism have developed methods that rely on initial information from knowledgeable experts and referrals. Researchers make contact with an initial respondent who is prepared to introduce the researcher to other potential respondents.

snowball sampling A sampling procedure in which initial respondents are found and additional respondents are obtained from information provided by the initial respondents.
time-location sampling A two-stage sampling procedure in which knowledgeable experts brief the researcher on locations and times that members of the target population may meet, followed by cluster sampling of venues and times and face-to-face contacts.

SNOWBALL SAMPLING
Snowball sampling refers to a variety of procedures in which initial respondents are selected by the researcher and additional respondents are obtained from information provided by the initial respondents. Generally, people who have particular interests know other people with the same interests. For instance, if a researcher is interested in habits and attitudes of philatelists (that is, stamp collectors), then speaking to one philatelist would likely provide contact details of others. Reduced sample sizes and costs are obvious advantages of snowball sampling. However, bias can enter into the study because a person suggested by someone also in the sample has a higher probability of being similar to the first person.

TIME-LOCATION SAMPLING6
It may seem obvious, but it helps to find out where and when members of a hard-to-reach population gather; the problem often is finding out exactly where and when that is. Time-location sampling therefore typically involves a two-stage process: first, a series of expert interviews to learn as much as possible about the population of interest, including ideas about the different venues and occasions when members of the target population may meet. Only then does the researcher go into those venues to make contact with people face-to-face.

RESPONDENT-DRIVEN SAMPLING7
Like snowball sampling, respondent-driven sampling is a chain-referral system. It differs from snowballing in two ways: (1) respondents receive a reward (usually financial) both for speaking with the interviewer and for recruiting others to be interviewed; and (2) each respondent takes responsibility for contacting other potential respondents, so the researcher does not know the names or contact details of potential respondents until a newly recruited respondent makes contact with the researcher (and then receives the same rewards as previous respondents). Much of the bias found in snowball sampling can be removed with these straightforward changes to methodology.
Respondent-driven sampling is popular among anthropologists and sociologists working in social health programs where the population of interest wants to hide (e.g. domestic violence victims, drug injectors in Vietnam, HIV/AIDS among refugees in South Africa).8 Innovations by marketing researchers have combined this technique with mobile phone technology so that target consumers introduce other consumers to the researcher in exchange for rewards of high perceived value (such as credits for downloadable ringtones). The contact and referral process is all controlled via a smartphone app, which exchanges SMS messages containing links to short, specific questionnaires.9
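The referral and reward book-keeping can be sketched as a simple coupon ledger; this is an illustrative simulation only, and the coupon codes, reward amounts and function names are invented rather than a description of any specific commercial app.

# Illustrative sketch of respondent-driven sampling book-keeping (hypothetical, simplified).
import secrets

INTERVIEW_REWARD = 10   # e.g. phone credit for completing an interview (assumed value)
REFERRAL_REWARD = 5     # credit when someone you referred completes an interview (assumed value)

coupons = {}            # coupon code -> id of the respondent who issued it
rewards = {}            # respondent id -> credits earned

def interview(respondent_id, redeemed_coupon=None, n_coupons=3):
    """Record a completed interview initiated by the respondent contacting the researcher."""
    rewards[respondent_id] = rewards.get(respondent_id, 0) + INTERVIEW_REWARD
    if redeemed_coupon in coupons:                       # credit the recruiter, if any
        recruiter = coupons.pop(redeemed_coupon)
        rewards[recruiter] = rewards.get(recruiter, 0) + REFERRAL_REWARD
    new_coupons = [secrets.token_hex(3) for _ in range(n_coupons)]
    for code in new_coupons:
        coupons[code] = respondent_id                    # respondent passes these on privately
    return new_coupons                                   # researcher never sees who receives them

# A seed respondent starts the chain; later respondents make contact themselves.
seed_coupons = interview('R001')
interview('R002', redeemed_coupon=seed_coupons[0])
print(rewards)   # {'R001': 15, 'R002': 10}

The key design point mirrors the method itself: the researcher records only anonymous coupon codes, so potential respondents remain unknown until they choose to make contact.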

WHAT IS THE APPROPRIATE SAMPLE DESIGN?
Table 10.3 summarises the advantages and disadvantages of each sampling technique. A researcher who must decide on the most appropriate sample design for a specific project will identify a number of sampling criteria and evaluate the relative importance of each criterion before selecting a sampling design. This section outlines and briefly discusses the most common criteria.10
This discussion of sampling methods has presented each technique one at a time. In practice, most researchers use a combination of techniques. For example, a media ratings company, such as Nielsen, may take a stratified sample of suburbs with different demographic characteristics in the major cities, and then draw a cluster sample of streets within each of those suburbs, and then finally question the households in order to ensure a quota sample with the desired cross-section of ages.
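As a rough illustration of how such stages might be chained together, the sketch below draws a stratified sample of suburbs and then a cluster sample of streets within each selected suburb. All suburb names, street lists and sample counts are invented, and the final door-step quota control (which would work like the earlier quota example) is left out.

# Sketch of a combined design: stratified selection of suburbs, then a cluster sample of
# streets within each selected suburb (hypothetical frame and counts).
import random

random.seed(1)

suburb_strata = {                       # stratum -> list of suburbs (assumed frame)
    'inner city': ['Newtown', 'Fitzroy', 'West End'],
    'middle ring': ['Essendon', 'Ryde', 'Mount Gravatt'],
    'outer suburban': ['Craigieburn', 'Penrith', 'Springfield'],
}
streets = {suburb: [f'{suburb} St {i}' for i in range(1, 21)]   # placeholder street lists
           for subs in suburb_strata.values() for suburb in subs}

sample_plan = {}
for stratum, suburbs in suburb_strata.items():
    for suburb in random.sample(suburbs, 2):                      # stage 1: random suburbs per stratum
        sample_plan[suburb] = random.sample(streets[suburb], 3)   # stage 2: cluster sample of streets

for suburb, chosen_streets in sample_plan.items():
    print(suburb, '->', chosen_streets)
# Interviewers would then work these streets, filling age and sex quotas at the door.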

Degree of accuracy
Selecting a representative sample is the ideal; representativeness is the reason samples are selected in the first place. However, the degree of accuracy required, or the researcher's tolerance for sampling and non-sampling error, may vary, especially when cost savings or another benefit may be traded off against a reduction in accuracy. For example, when the sample is being selected for an exploratory research project, a high priority may not be placed on accuracy because a highly representative sample may not be necessary. For other, more conclusive projects, the sample result must precisely represent a population's characteristics, and the researcher must be willing to spend the time and money needed to achieve accuracy.

respondent-driven sampling A respondent recruitment method where respondents are rewarded for their interviews and also for recruiting others to be interviewed. Newly-recruited respondents make contact with the researcher to earn the same rewards as the earlier respondent. The researcher does not know the respondent until the respondent makes contact.


TABLE 10.3 » COMPARISON OF SAMPLING TECHNIQUES

Nonprobability samples

1 Convenience: The researcher uses the most convenient or most economical sample units.
  Cost and degree of use: Very low cost; extensively used.
  Advantages: No need for list of population. Ideal for testing questions and questionnaires.

2 Judgement: An expert or experienced researcher selects the sample to fulfil a purpose, such as ensuring that all members have a certain characteristic.
  Cost and degree of use: Moderate cost; average use.
  Advantages: No need for list of population. Useful for certain types of forecasting; sample guaranteed to meet a specific objective.

3 Quota: The researcher classifies the population by pertinent properties, determines the desired proportion to sample from each class, and fixes quotas for each interviewer.
  Cost and degree of use: Moderate cost; very extensively used.
  Advantages: No need for list of population. Introduces some stratification of the population.

4 Snowball: Initial respondents are found; additional respondents are obtained by referral from initial respondents.
  Cost and degree of use: Low cost; used when the target population is rare in the larger population.
  Advantages: No need for list of population. Useful in locating members of rare populations.

5 Time-location: Expert interviews help determine locations and times when prospects meet; face-to-face interviews are attempted.
  Cost and degree of use: Moderate cost; used when populations are 'hidden' or rare.
  Advantages: No need for list of population. Useful in locating members of rare populations.

6 Respondent-driven: Respondents are rewarded both for being interviewed and for recruiting others, who then contact the researcher themselves.
  Cost and degree of use: Higher cost; used when populations are 'hidden' or rare.
  Advantages: Maintains privacy among potential respondents who may not want to be interviewed.

Probability samples

1 Simple random: The researcher assigns each member of the sampling frame a number, then selects sample units by a random method.
  Cost and degree of use: High cost; moderately used in practice (most common in random digit dialling and with computerised sampling frames).
  Advantages: Only minimal advance knowledge of population needed; easy to analyse data and compute error.

2 Systematic: The researcher uses the natural ordering or the order of the sampling frame, selects an arbitrary starting point, then selects items at a preselected interval.
  Cost and degree of use: Moderate cost; moderately used.
  Advantages: Simple to draw sample; easy to check.

3 Stratified: The researcher divides the population into groups and randomly selects subsamples from each group. Variations include proportional, disproportional and optimal allocation of subsample sizes.
  Cost and degree of use: High cost; moderately used.
  Advantages: Ensures representation of all groups in sample; characteristics of each stratum can be estimated and comparisons made; reduces variability for the same sample size.

4 Cluster: The researcher selects sampling units at random, then does a complete observation of all units or draws a probability sample within the group.
  Cost and degree of use: Low cost; frequently used.
  Advantages: If clusters are geographically defined, yields lowest field cost; requires listing of all clusters, but of individuals only within selected clusters; can estimate characteristics of clusters as well as of the population.


Resources
The costs associated with the different sampling techniques vary tremendously. If the researcher's financial and human resources are restricted, certain options will have to be eliminated. For a graduate student working on a Master's thesis, conducting a national survey is almost always out of the question because of limited resources. Managers concerned with the cost of the research versus the value of the information often will opt to save money by using a nonprobability sampling design, rather than make the decision to conduct no research at all.

Time
A researcher who needs to meet a deadline or complete a project quickly will be more likely to select a simple, less time-consuming sample design. An online survey that uses a sample drawn from a list of people who have previously agreed to be surveyed takes considerably less time than a survey that uses an elaborate disproportional stratified sample.

Advance knowledge of the population
Prior knowledge of population characteristics, such as the availability of lists of population members, or the proportions of different demographic groups, affects sampling design. Often no list of population elements will be available, especially when the population is defined by ownership of a particular product or brand, by experience in performing a specific job task or on a qualitative dimension. A lack of adequate lists automatically rules out most probabilistic sampling designs. Previous research, or national census data, can give clues about the likely numbers of respondents with particular characteristics, which helps with determining sample size quotas or location.

National versus local project
Geographic proximity of population elements will influence sample design. When population elements are unequally distributed geographically, a cluster sample may become much more attractive – particularly in a sparsely populated country such as Australia.

MOBILE DEVICES AND THE INTERNET CHANGE EVERYTHING
With advancements in Internet technology, Web surveys are the default research design for many organisations. However, there are some problems associated with Internet sampling. In particular, Internet surveys are more prone to nonresponse error; recall the discussion of nonresponse error earlier in the chapter.
Internet surveys allow researchers to reach a large sample rapidly. This is both an advantage and a disadvantage. Sample size requirements can be met overnight or, in some cases, almost instantaneously using SMS contacts.11 People in some populations are more likely to go online during the weekend than on a weekday. If the researcher can anticipate a day-of-the-week effect, the survey should be kept open long enough so that all sample units have the opportunity to participate in the research project.
A potential disadvantage of Internet surveys is the lack of computer ownership and Internet usage among certain segments of the population. A sample of Internet users is representative only of Internet users, and this tends to exclude those who are older and/or less educated. This is not to say that all Internet samples are unrepresentative of all target populations. For most consumer or social issues Internet sampling works very well.
Increasingly, researchers are making use of mobile communications technologies. Potential respondents, who have previously given permission to be contacted, are sent an SMS. Often a single response is needed and the respondent replies by return SMS. Otherwise the SMS is an invitation and link to a very brief questionnaire.

Website visitors
As noted earlier, many Internet surveys are conducted with volunteer respondents who visit an organisation's website intentionally or unintentionally. These unrestricted samples are clearly convenience samples. They may not be representative of the larger population of interest because people who arrive at a particular website may be different from those who do not visit the site – usually they are there to find information, and are already interested in the product or product category, which leads to self-selection bias.
A better technique for sampling website visitors is to randomly select sampling units. PeoplePulse, an Australian company that specialises in conducting Internet surveys, collects data by using its 'popup survey' software. The software selects Web visitors at random and 'pops up' a small JavaScript window asking the person if he or she wants to participate in an evaluation survey. Note that this is not random selection from the population – it is random selection from visitors. If the person clicks 'Yes', a new window containing the online survey opens up. The person can then browse the site at his or her own pace and switch to the survey at any time to express an opinion.12
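A minimal sketch of this kind of random invitation logic is shown below. The 5 per cent invitation rate and the function names are assumptions for illustration, not a description of PeoplePulse's actual software; the point is simply that selection is random among visitors, not among the broader population.

# Sketch of random selection of website visitors for a pop-up survey invitation (illustrative only).
import random

INVITE_RATE = 0.05          # assumed: invite roughly 1 in 20 visitors at random

def should_invite(visitor_id, already_invited):
    """Randomly select visitors for the pop-up, never re-inviting the same visitor."""
    if visitor_id in already_invited:
        return False
    if random.random() < INVITE_RATE:
        already_invited.add(visitor_id)
        return True      # the site would now open the survey window
    return False

invited = set()
for visitor in (f'v{i}' for i in range(1000)):   # simulate 1000 site visits
    should_invite(visitor, invited)
print(f'{len(invited)} of 1000 visitors invited')   # roughly 50 expected at a 5% rate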

EXPLORING RESEARCH ETHICS

MANIPULATING YOUR OUTCOME WITH SELECTIVE SAMPLING13

The sampling frame and the sample size are among the most easily manipulated areas of the research process. It can be embarrassingly easy to find a biased sample to survey in order to find results that suit one's cause. Imagine you want to survey the opinions of students about smoking on campus. If you took all of your surveys within designated smoking areas, you would be likely to find a much larger number who disapprove of a ban on smoking on campus. Results would clearly be very different from the truth.
The Australian Market & Social Research Society's Code of Professional Behaviour has very specific rules about how respondents and clients should be informed about the nature of sampling (see http://www.amsrs.com.au/professional-standards/amsrs-code-of-professional-behaviour).
(Rule 3) For respondents, researchers and those working on their behalf (e.g. interviewers) must not, in order to secure Respondents' cooperation, make statements or promises that are knowingly misleading or incorrect – for example, about the likely length of the interview or about the possibilities of being re-interviewed on a later occasion. Any such statements and assurances given to Respondents must be fully honoured. The source of the research sample (e.g. customer records, information or lists collected by Researchers, publicly available lists such as a telephone directory or electoral roll, random digit dialling, door knocking) must be revealed to respondents, or be able to be reasonably inferred by Respondents, except when the Researcher and Client decide there is a valid reason (e.g. methodological, legal) not to do so.
(Rule 25) For clients, the research practitioner should provide the client with a report which includes, among other things:
» a description of the intended and actual universe covered in the sample
» the size, nature and geographical distribution of the sample (both planned and achieved); and where relevant, the extent to which any of the data collected were obtained from only part of the sample
» details of the sampling method and any weighting methods used
» where technically relevant, a statement of response rates and a discussion of any possible bias due to non-response.


This type of random sampling is valid if the target population is defined as visitors to a particular website. Evaluation and analysis of visitors’ perceptions and experiences of the website would be a typical survey objective with this type of sample. Researchers who have broader interests may obtain Internet samples in a variety of other ways.

Panel samples
Drawing a probability sample from an established consumer panel or other pre-recruited membership panel is a popular, scientific and effective method for creating a sample of Internet users (though panels are not unique to Internet samples). Qualtrics, the company that hosts the survey involved in the 'Survey this!' features throughout this text, provides such services. These panels generally contain millions of potential respondents, which enables a panel to be obtained that matches practically any demographic profile imaginable. Typically, sampling from a panel yields a high response rate because panel members have already agreed to cooperate with the research organisation's email or Internet surveys. Often panel members are compensated for their time with a sweepstake ticket, a small cash incentive or a donation to a charity chosen by the respondent. Further, because the panel has already supplied demographic characteristics and other information from previous questionnaires, researchers are able to select panellists based on product ownership, lifestyle or other characteristics. A variety of sampling methods and data transformation techniques can be applied to assure that sample results are representative of the general public or a targeted population.14
In Australia, Nielsen operates a consumer panel known as Homescan. This is the largest consumer panel per capita in the world, covering 10 000 households. The panel is used to better understand consumer purchasing behaviour in buying packaged groceries and fresh produce. The sample is monitored and controlled to be representative of Australian households. Indeed, this is the major advantage of such panels: by controlling the sample, researchers can be surer of representativeness and higher response rates. Each household is given a handheld scanner device that records their purchases and links this back to a database that cross-references other important characteristics such as demographic information.15 Such methods of market research provide marketers with important and powerful information about consumer behaviour for use in marketing strategy and planning. Moreover, because panel members have been recruited beforehand and they trust the recruiting company, they generally tend to complete surveys honestly and diligently.16

Recruited ad hoc samples
Another means of obtaining an Internet sample is to obtain or create a sampling frame of email addresses on an ad hoc basis. Researchers may create the sampling frame offline or online. Databases containing email addresses can be compiled from many sources, including customer/client lists, advertising banners on pop-up windows that recruit survey participants, online sweepstakes, and registration forms that must be filled out in order to gain access to a particular website. For companies anticipating future Internet research, including email addresses in their customer relationship databases (by inviting customers to provide that information on product registration cards, in telephone interactions or through on-site registration) can provide a valuable database for sample recruitment.

Opt-in lists
Survey Sampling Incorporated (SSI) is a company that specialises in providing sampling frames and scientifically drawn samples. The company offers more than 3500 lists of high-quality, targeted email addresses of individuals who have given permission to receive email messages related to a particular topic of interest. SSI's database contains over seven million names of Internet users who have opted in for limited participation. An important feature of this database is that the company has each individual confirm and reconfirm interest in communicating about a topic before the person's email address is added to the company's database.17

opt in To give permission to receive selected email, such as questionnaires, from a company with an Internet presence.

By whatever technique the sampling frame is compiled, it is important not to send unauthorised email to respondents. If individuals do not opt in to receive email from a particular organisation, they may consider unsolicited survey requests to be 'spam'. A researcher cannot expect high response rates from individuals who have not agreed to be surveyed. Spamming is not tolerated by experienced Internet users, and it can easily backfire, creating a host of problems – the most extreme being complaints to the Internet service provider (ISP), which may shut down the survey site.


Sample size
We know that a larger sample leads to more accurate results, but sample size is not the only thing that affects accuracy – in fact it's often the least important issue. Many naïve researchers focus on sample size because it's one of the few things we can measure well. There are various formulae to help decide on a sample size, but these are only very crude guides.

Random error and sample size
When asked to evaluate a marketing research project, most people, even those with little marketing research training, begin by asking: 'How big was the sample?' Intuitively we know that the larger the sample, the more accurate the research. This is, in fact, a statistical truth – random sampling error varies with samples of different sizes. That means researchers can reduce sampling error just by asking more people. However, every project has a budget and the final sample size must have a limit. In most cases we can ask a certain number of people and still be confident of getting the same results had we asked many more people. In statistical terms, increasing the sample size means we can be more confident of the result that we get – it is more accurate. In fact, theoretically we can use a simple formula to calculate sample size. This formula is the formula for a confidence interval and is discussed in more depth in Chapter 12.
So, if we can be more accurate just by asking more people, does that mean we should ask as many people as possible? Not necessarily, because asking more people means spending more money and resources – as with many business decisions there is a trade-off to make. The formula also tells us that if we survey more people, we may reduce sampling error. However, we don't reduce sampling error by the same proportion that we increase the number of respondents. In other words, by asking more people, sampling error decreases, but at a decreasing rate. So, at some point, we can reduce error but only at a very substantial monetary cost. If we do the maths from Chapter 12 (that is, manipulating the confidence interval formula to make n the subject) it will become clear quickly that doubling a sample of 1000 will reduce random sampling error by about 1 percentage point, but doubling the sample from 2000 to 4000 will reduce random sampling error by only another half (0.5) of a percentage point. The return we get in terms of a reduction of sampling error is quickly outweighed by the cost of asking more people. More technically, random sampling error is inversely proportional to the square root of n. Exhibit 10.4 gives an approximation of the relationship between sample size and error.
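To see these diminishing returns in numbers, the short sketch below evaluates the confidence-interval error formula for a proportion, e = z√(p(1 − p)/n), which Chapter 12 develops, at a few sample sizes. It assumes p = 0.5 (the worst case) and the 95 per cent confidence level; the exact figures vary with those assumptions.

# Sketch: how random sampling error shrinks (at a decreasing rate) as the sample grows.
from math import sqrt

Z95 = 1.96   # critical z-value for 95 per cent confidence
p = 0.5      # assumed population proportion (worst case)

def sampling_error(n, z=Z95):
    return z * sqrt(p * (1 - p) / n)

for n in (1000, 2000, 4000, 8000):
    print(f'n = {n:5d}: expected error = +/- {sampling_error(n) * 100:.2f} percentage points')

# Output:
#   n =  1000: expected error = +/- 3.10 percentage points
#   n =  2000: expected error = +/- 2.19 percentage points
#   n =  4000: expected error = +/- 1.55 percentage points
#   n =  8000: expected error = +/- 1.10 percentage points
# Each doubling buys a smaller reduction, because error is proportional to 1/sqrt(n).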


EXHIBIT 10.4 RELATIONSHIP BETWEEN SAMPLE SIZE AND ERROR18
[Line graph: 'Percentage error at different sample sizes at 90, 95 and 99 per cent confidence levels'. Expected sampling error (vertical axis, roughly 0–35 per cent) is plotted against sample size (horizontal axis, from 5 up to about 295) for three curves, e.90, e.95 and e.99, corresponding to the 90, 95 and 99 per cent confidence levels. Error falls steeply at small sample sizes and flattens as the sample grows.]

Systematic error and sample size
Recall that systematic error is the error that is introduced into a research study because of bias in the sampling frame, or poorly worded questions, or by priming respondents to respond in a particular way, and so on. Systematic error has the effect of pushing your responses towards an incorrect answer. Sample size has no effect on systematic error. If there is bias in your study, then it cannot be repaired by simply gathering a larger sample. In social research, systematic error is usually much larger than random error. The problem is that we can very rarely measure systematic error, and the researcher often does not recognise that questions are biased (or that people are recruited in a way that favours some respondents and not others, and so on), so it goes undetected. In a poorly designed study, a larger sample just gives the researcher more confidence in a wrong answer.

Factors in determining sample size for questions involving means
Using the formulae from Chapter 12, we can see that three factors affect how large our sample should be: (1) the variance, or heterogeneity, of the population; (2) the magnitude of acceptable error; and (3) the confidence level.
Suppose a researcher wishes to find out whether nine-year-old boys are taller than four-year-old boys. Intuitively we know that even with a very small sample size, the correct answer probably will be obtained. (Yes, nine-year-olds are taller than four-year-olds.) This is because the determination of sample size depends on the research question and the variability within the sample. But a different question could be asked: how much taller are nine-year-old boys than four-year-old boys? That would require a much larger sample.
The variance, or heterogeneity, of the population is the first necessary bit of information. In statistical terms, this refers to the standard deviation of the population (see Chapter 12). Only a small sample is required if the population is homogeneous. For example, predicting the average age of first-year university students requires a smaller sample than predicting the average age of people who visit the zoo on a given Sunday afternoon. As heterogeneity increases, so must sample size.
The magnitude of error indicates how precise the estimate must be. It indicates a certain precision level. From a managerial perspective, the importance of the decision in terms of profitability will influence the researcher's specifications of the range of error. If, for example, favourable results from a test market sample will result in the construction of a new plant and unfavourable results will dictate not marketing the product, the acceptable range of error probably will be small; the cost of an error would be too great to allow much room for random sampling errors. In other cases, the estimate need not be extremely precise. Allowing an error of ±$10 000 in total family income instead of E = ±$1000 may be acceptable in most market segmentation studies.
The third factor of concern is the confidence level. We typically use the 95 per cent confidence level. However, this is an arbitrary decision based on convention; there is nothing sacred about the 0.05 chance level. Many researchers are quite comfortable with 90 per cent confidence.
These three main factors give us a simple appreciation for the different elements that affect the sample size required. In summary, the more heterogeneous the population, the greater the sample size required; the more important it is to be correct, the bigger the sample needed; and the more confident we wish to be in our results, the larger the sample should be. As discussed, there is, in fact, a formula that in theory can be used to calculate optimal sample size, and that incorporates the above three factors. It is discussed in Chapter 12.
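For readers who want a preview of how the three factors interact before reaching Chapter 12, the sketch below plugs invented values into the standard sample-size formula for estimating a mean, n = (zs/E)², where s is the estimated standard deviation and E the acceptable error; the dollar figures are purely illustrative.

# Illustration of the direction of each effect in the standard sample-size formula for a mean.
def required_n(z, s, E):
    return (z * s / E) ** 2

print(round(required_n(z=1.96, s=200, E=20)))   # baseline: about 384 respondents
print(round(required_n(z=1.96, s=400, E=20)))   # double the heterogeneity -> about 4x the sample (1537)
print(round(required_n(z=1.96, s=200, E=10)))   # halve the acceptable error -> about 4x the sample (1537)
print(round(required_n(z=2.58, s=200, E=20)))   # ask for 99% rather than 95% confidence -> about 666

The pattern matches the summary above: greater heterogeneity, tighter error tolerance or higher confidence each push the required sample size up, and halving the acceptable error quadruples the sample needed.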

The influence of population size on sample size
If we think back to the political polling firms mentioned towards the start of this chapter, one may wonder how it is possible to provide accurate forecasts for such a large population based on comparatively small samples. In fact, the size of the population rarely affects the required sample size; it is the variance of the population that has the largest effect. All inferential statistics assume that the true population is infinite in size, so any sized sample is tiny in comparison. If the population is not large (say, when we want to take a sample of employees in a company and make inferences only about the employees in that one company), then there are formulae for reducing the required sample size. This is unusual, however, and such formulae are rarely used. The main point is that population size is relatively unimportant.

Determining sample size on the basis of judgement
Just as sample units may be selected to suit the convenience or judgement of the researcher, sample size may also be determined on the basis of managerial judgement. Although required sample size can be estimated in theory, in many cases there are practical obstacles to using the formulae. Firstly, many studies involve a series of different measurements. If we are to estimate the variance or standard deviation, which variance or standard deviation are we estimating? Some researchers would suggest the one with the largest variance. In theory this makes sense, but the formulae in Chapter 12 further assume we know, before we gather our sample, what the variance is. How can we know the variance of a measurement that has not yet been measured? For similar studies that have been done many times before, we may use these as proxies. However, often research studies have not been done before – that's why we are doing them! So, while it's theoretically useful and intuitively appealing, estimating sample size with a formula is impractical for many studies. Instead, researchers may rely on their experience and use a sample size similar to those used in previous studies.
Researchers also face a budget restriction, of course. In many cases a researcher may decide to ask as many people as possible subject to budget constraints, because often budgets do not extend as far as the researcher would like.


In summary, sample size formulae can be used to determine sample size, yet in most cases it is impractical to calculate sample size precisely. Still, sample size formulae remain a useful way of understanding what factors affect the sample size required in different studies. In practice, the researcher is often governed by more practical restrictions in the determination of sample size, such as budget and time limitations. Thus, the sample size decision should not be taken lightly and researchers need to exercise significant judgement in making sample size decisions.
Finally, as we said earlier in the chapter, sample size is only one area of concern when determining sampling technique. Sample size affects sampling error but it cannot affect non-sampling error. Non-sampling error is always much bigger than sampling error. This is the error that comes from asking the wrong people (sampling frame), asking the wrong questions (measurement validity) or asking them in the wrong way (research design validity). If you as a professional researcher have to make a trade-off between (a) spending time and resources on testing and refining your questionnaire some more and (b) spending time and resources on gathering a larger sample, then choose (a): test and refine. Once more: in a poorly designed research study, a larger sample size just gives you more confidence in a wrong answer.

TIPS OF THE TRADE
»» Marketing research rarely requires a census.
»» Accurately defining the target population is critical in research involving forecasts of how that population will react to some event. Consider the following in defining the population:
• Who are we not interested in?
• What are the relevant market segment characteristics involved?
• Is region important in defining the target population?
• Is the issue being studied relevant to multiple populations?
• Is a list available that contains all members of the population?
»» Online panels are a practical reality in marketing research. A sample can be quickly measured that matches the demographic profiles of the target population.
• As with all panels, the researcher faces a risk that systematic error is introduced in some way. For example, this sample may be higher in willingness to give opinions or may be responding only for an incentive.
• The researcher should take extra steps, such as including more screening questions, to make sure the responses are representative of the target population.
»» Convenience samples do have appropriate uses in marketing research. Convenience samples are particularly appropriate when:
• Exploratory research is conducted.
• The researcher is primarily interested in internal validity (testing a hypothesis under any condition) rather than external validity (understanding how much the sample results project to a target population).
»» When cost and time constraints only allow a convenience sample:
• Researchers can think backwards and project the population to whom the results apply, based on the nature of the convenience sample.
»» The research report should address the adequacy of the sample. Researchers seldom have a perfectly representative sample. Thus, the report should qualify the generalisability of the results based on sample limitations.


SUMMARY

DEFINE THE TERMS 'SAMPLE', 'SAMPLING', 'POPULATION', 'POPULATION ELEMENT' AND 'CENSUS'
A sample is a subset, or some part, of a larger population. Sampling is a procedure that uses a small number of units of a given population as a basis for drawing conclusions about the whole population. Sampling enables researchers to understand a population by contacting only some of its members. A population is any complete group that shares some common set of characteristics. The term population element refers to an individual member of the population – in social research usually a person. A census is an investigation of all the individual elements that make up the population.

WHY A SAMPLE RATHER THAN A COMPLETE CENSUS MAY BE TAKEN

Sampling is necessary because it would be practically impossible to conduct a census to measure the characteristics of all units of a population; and even when a census is possible, it is often unnecessarily expensive. Samples also are needed in cases where measurement involves 'destruction' of the measured unit.

ISSUES CONCERNING THE IDENTIFICATION OF THE TARGET POPULATION AND THE SELECTION OF A SAMPLING FRAME

The first problem in sampling is to define the target population. Incorrect or vague definition of this population is likely to produce misleading results. A sampling frame is a method for defining the individual members of the overall population from which the sample is drawn. A sampling unit is a single element or group of elements subject to selection in the sample.

COMMON FORMS OF SAMPLING FRAMES AND SAMPLING FRAME ERROR

A sampling frame is a mechanism for defining as objectively as possible who may be included in the sample and who is not included. Ideally, it is a list of all members of the population of interest. In social research this is often not possible, so other mechanisms for sampling must be developed. A list is valuable because it permits probability sampling, whereas if there is no list then only non-probability sampling is possible.

DISTINGUISH BETWEEN RANDOM SAMPLING AND SYSTEMATIC (NON-SAMPLING) ERRORS

There are two sources of discrepancy between sample results and population parameters. One, random sampling error, arises from chance variations of the sample from the population. Random sampling error is a function of sample size and may be estimated using the central-limit theorem, discussed in Chapter 12. The level of random sampling error really can only be estimated accurately using probability sampling methods. Systematic, or non-sampling, error comes from sources such as sampling frame error, mistakes in recording responses, or nonresponses from persons who are not contacted or who refuse to participate. Non-sampling error is always bigger than sampling error, but it cannot be easily measured.

TYPES OF SYSTEMATIC (NON-SAMPLING) ERRORS THAT RESULT FROM SAMPLE SELECTION

Systematic errors come from errors in the execution of the sampling process, not from random variation in the sample or population. These can include anything that may cause some people to be contacted more than others, or cause some people to respond, or to respond differently from others. These may be the method of communication, time of day, incentives, question wording and so on.

ADVANTAGES AND DISADVANTAGES OF THE VARIOUS TYPES OF PROBABILITY AND NONPROBABILITY SAMPLES

The two major classes of sampling methods are probability and nonprobability techniques. Nonprobability techniques include convenience sampling, judgement sampling, quota sampling and variations of referral chains, such as snowball sampling. They are convenient to use, but there are no statistical techniques with which to measure their random sampling error. Probability samples are based on chance selection procedures. These include simple random sampling, systematic sampling, stratified sampling and cluster sampling. With these techniques, random sampling error can be accurately predicted. Probability samples can only be used if there is a complete list of all members of the population of interest.

HOW TO CHOOSE AN APPROPRIATE SAMPLE DESIGN

Appropriate sample design depends primarily on how accurate the results need to be, resources of time and money, and the geographic scope of the study. An exploratory study with the goal of learning what is the likely range of issues and values does not need to be as rigorous as a conclusive study, which needs very precise answers to very specific questions. Costs and time budgets force many marketing researchers to use simple convenience sampling methods, despite the risk of bias, because the information gained is better than gathering no information. Prior knowledge of the population, such as previous experience with that market and the availability of lists, will determine whether probability samples can be used and the contact methods used for recruitment. Finally, a widely dispersed population makes face-to-face contact difficult, and suggests cluster sampling or electronic contact methods.


THREE FACTORS THAT AFFECT SAMPLE SIZE

The final consideration a researcher has to make is the sample size. If we ask too few people we may end up with inaccurate results, but if we ask too many it will take a long time and cost a great deal of money; we have to make a trade-off. Three considerations affect sample size: (1) the level of variation in the population, usually measured by the variance, or standard deviation, of the key variable of interest; (2) the confidence interval, or the margin of error that the researcher is willing to accept (recognising that accuracy of plus or minus 1 per cent will cost 25 times more than accuracy of plus or minus 5 per cent); and (3) the confidence level, or the level of risk we're prepared to accept that the true scores may be outside the desired confidence interval. Populations with greater variability in the characteristics of interest require larger samples; likewise, if the population is relatively homogeneous then we generally need a smaller sample.


NON-STATISTICAL CONSIDERATIONS INFLUENCE THE DETERMINATION OF SAMPLE SIZE

A researcher who must determine the most appropriate sampling design for a specific project will identify a number of sampling criteria and evaluate the relative importance of each criterion before selecting a design. The most common criteria concern accuracy requirements, available resources, time constraints, knowledge availability and analytical requirements. Researchers can determine sample size through statistical formulae. However, such methods are often impractical because researchers are not fully informed about the necessary parameters. Instead, researchers often have to determine sample size based on budget requirements or based on experience. Often we focus on sample size simply because we have formulae, but more often sampling error is the least important problem. Non-sampling error – due to question wording, non-response bias and other problems – usually is greater than sampling error. For a poorly designed study, a larger sample size just gives more confidence in a wrong answer.

KEY TERMS AND CONCEPTS
census, cluster sampling, convenience sampling, disproportional stratified sample, expert interview, judgement (purposive) sampling, nonprobability sampling, nonresponse error, opt in, population, population element, probability sampling, proportional stratified sample, quota sampling, random sampling error, respondent-driven sampling, sample, sampling frame, sampling frame error, sampling unit, simple random sampling, snowball sampling, stratified sampling, systematic (non-sampling) error, systematic sampling, time-location sampling

QUESTIONS FOR REVIEW AND CRITICAL THINKING
1 If you decide whether you want to see a new film or television program on the basis of the 'coming attractions' or television commercial previews, are you using a sampling technique? A scientific sampling technique?
2 Describe possible sampling frames for these target populations:
a licensed plumbers
b practitioners of martial arts (karate, aikido, tae kwon do, etc.)
c dog breeders
d owners of large motorcycles
e home handymen and women
f businesses owned by Indian migrants
g men over 1.8 metres tall
3 Describe the difference between a probability sample and a nonprobability sample.
4 In what situations is a census more appropriate than sampling?

5 Comment on the following sampling designs:
a A university gym and sports centre interested in new members prints a questionnaire in the local newspaper. Readers return the questionnaires by mail.
b A department store that wishes to examine whether it is losing or gaining customers draws a sample from its list of loyalty-card holders by selecting every tenth name, starting at a random name.
c A motorcycle manufacturer decides to research consumer characteristics by sending 100 questionnaires to each of its dealers. The dealers will then mail the questionnaire to buyers of this brand using their sales records as a frame.
d An advertising executive suggests that advertising effectiveness be tested in the real world. A one-page ad is placed in a magazine. One-half of the space is used for the ad itself. On the other half, a short questionnaire requests that readers comment on the ad. An incentive will be given for the first 1000 responses.


e A research company obtains a sample for a focus group through a local sports club. The club is paid for securing respondents but the respondents themselves are not directly compensated.
f A banner ad on a business-oriented website reads: 'Are you a large company senior executive? Qualified execs receive $50 for less than 10 minutes of your time. Take the survey now!'
6 When would a researcher use a judgement, or purposive, sample?
7 A telephone interviewer asks: 'I would like to ask you about your preference for televisions. Which of the following televisions do you own: Panasonic, Sony, Samsung or other?' After the respondent replies, the interviewer says, 'We have conducted a large number of surveys with people who prefer that brand, and we do not need to question you further. Thank you for your cooperation.' What type of sampling was used?
8 In a research report, suppose a researcher states that they 'randomly' stopped passers-by in a shopping centre to get responses for their survey. Explain why this process is not truly random.
9 What are the benefits of stratified sampling? Contrast stratified sampling with quota sampling.
10 What geographic units within a metropolitan area are useful for sampling?
11 Marketers often are particularly interested in the subset of a market that contributes most to sales (for example, frequent fast-food buyers or large-volume retailers). What type of sampling might be best to use with such a subset? Why?
12 Outline the step-by-step procedure you would use to select the following:
a a simple random sample of 150 students at your university
b a quota sample of 50 light consumers and 50 heavy consumers of chocolate in a shopping mall intercept study
c a stratified sample of 50 mechanical engineers, 40 electrical engineers and 40 civil engineers from the subscriber list of an engineering journal.
13 To understand how sample size is conceptually related to random sampling error, costs and non-sampling errors, graph these relationships.
14 Suppose you are a political analyst and wish to be very confident that you will predict the outcome of an extremely close federal election. What should the sample size be for your poll? Are there any problems involved in determining the sample size for this election?
15 One sampling service obtained its listing of households with children from an ice-cream retailer who gave away free ice-cream cones on children's birthdays. (The children filled out cards with their names, addresses and birthdays, which the retailer then sold to the mailing list company. You may want to discuss the ethics of such an information-gathering method in class.)
16 A researcher wants to learn the extent of cheating at his university. Design a step-by-step protocol for using SMS to recruit respondents using a referral chain method.

ONGOING PROJECT
SAMPLING OR CONDUCTING A SURVEY IN YOUR RESEARCH STUDY? CONSULT THE CHAPTER 10 PROJECT WORKSHEET FOR HELP
Download the Chapter 10 project worksheet from the CourseMate website. It outlines a series of steps taken in this chapter to develop a good sample. Minimise error caused by sampling frame bias, and have some ideas of how confident you can be in your results.

COURSEMATE ONLINE STUDY TOOLS
Flip to the start of your textbook and use the tear-out card to log in to CourseMate for Marketing Research. There you can test your understanding and revise chapter concepts with:
☑ interactive quizzes
☑ flashcards
☑ case projects
☑ research activities
☑ online video activities.


WRITTEN CASE STUDY 10.1
THE VICTORIAN POPULATION HEALTH SURVEY 201419
The Victorian Population Health Survey has been conducted each year since 2001 and is based on a sample of 7500 adults, aged 18 years and over, randomly selected from households using random digit dialling and interviewed by computer-assisted telephone interviewing (CATI). Information in the report covers health and lifestyle, including physical activity levels, smoking prevalence, alcohol consumption, intake of fruit and vegetables, selected health and screening checks, body weight, asthma and diabetes prevalence, psychological distress and social networks.

QUESTIONS

1 Random digit dialling has been used from the inception of this annual study. Is it still appropriate today? What alternatives are there? If a new contact/interview method were used, how could this affect comparisons with previous years' results?
2 How would you respond if someone representing the state government phoned to ask you about your health habits? How would members of your family respond?
3 Are there any members of the population who are not represented in the study, who should be? Are there any members of the population who are represented in the study who should not be?
4 Suggest ways in which the sampling process could be improved.

WRITTEN CASE STUDY 10.2
THE AUSTRALIAN TAXATION OFFICE
Suppose you are a consultant talking to the Australian Taxation Office (ATO), which wishes to conduct a study on income tax cheating. The study has several objectives:
1 to identify the extent to which taxpayers cheat on their returns, their reasons for doing so and the approaches the ATO can take to deter this kind of behaviour
2 to determine taxpayers' experience of and satisfaction with various ATO services
3 to determine what services taxpayers need
4 to develop an accurate profile of taxpayers' behaviour as they prepare their income tax returns
5 to assess taxpayers' knowledge and opinions about various tax laws and procedures.
The federal government always wishes to be extremely accurate in its survey research. A survey of approximately 5000 individuals located throughout the country will provide the database for this study. The sample will be selected on a probability basis from all households in Australia. Eligible respondents will be adults over age 18. Within each household, an effort will be made to interview the individual who is most familiar with completing tax returns. When there is more than one taxpayer in the household, a random process will be used to select the member to be interviewed.

QUESTIONS

1 Is a single survey appropriate for this study? What research techniques are appropriate for each question?
2 How would you determine the appropriate sample size?
3 Design the sample for a personal, in-home interview. Design a suitable sample procedure for the Australian Taxation Office, explaining your reasons.

NOTES
1 Adapted with permission from 'Millennium milestones: William's census was ahead of its time', Tulsa World, 27 March 1999, p. A-3. © Associated Press.
2 Adapted with permission from 'George Gallup's nation of numbers', Esquire, December 1983, pp. 91–2.
3 Information and Communications for Development: Maximizing Mobile, World Bank: IC4D 2012, Online at: http://web.worldbank.org, accessed December 2015.
4 Adapted from Cox, Keith K. & Enis, Ben M. (1972) The marketing research process, Pacific Palisades, CA: Goodyear; Bellenger, Danny N. & Greenberg, Barnet A. (1978) Marketing research: A management information approach, Homewood, IL: Richard D. Irwin, pp. 154–5.
5 This is a hypothetical example.

6 Karon, J. M., & Wejnert, C. (2012) 'Statistical methods for the analysis of time-location sampling data', Journal of Urban Health, 89, 565–586. doi:10.1007/s11524-012-9676-8
7 See for example: Heckathorn, D. D. (1997) 'Respondent-driven sampling: A new approach to the study of hidden populations', Social Problems, 44(2), 174–199. doi:10.2307/3096941; Heckathorn, D. D. (2002) 'Respondent-driven sampling II: Deriving valid population estimates from chain-referral samples of hidden populations', Social Problems, 49(1), 11–34. doi:10.1525/sp.2002.49.1.11; Górny, A., & Napierała, J. (2015) 'Comparing the effectiveness of respondent-driven sampling and quota sampling in migration research', International Journal of Social Research Methodology, 1–17. doi:10.1080/13645579.2015.1077614


8 See for example: Wagner, J. & Lee, S. (2014) 'Sampling rare populations', in T. P. Johnson (ed.), Handbook of Health Survey Methods, Wiley: 77–103; Tran, H., Le, L.-V., Johnston, L., Nadol, P., Van Do, A., Tran, H., & Nguyen, T. (2015) 'Sampling males who inject drugs in Haiphong, Vietnam: Comparison of time-location and respondent-driven sampling methods', Journal of Urban Health, 92(4), 744–757. doi:10.1007/s11524-015-9966-z; Vearey, J. (2013) 'Sampling in an urban environment: Overcoming complexities and capturing differences', Journal of Refugee Studies, 26(1), 155–162. doi:10.1093/jrs/fes032
9 See for example: Grannis, R., Freedy, E., & Freedy, A. (2011, 8–10 February) 'Ultra-rapid Social Network Sampling in Cross-cultural Environments', paper presented at HSCB Focus 2011; Grannis, R. (2012) 'UCLA "Flash Study" to Verify the Ultra-Rapid Survey Methodology and Toolkit for Socio-Cultural Decisions', Working Paper, UCLA Department of Sociology.
10 ESOMAR (2015) ESOMAR/GRBN Online Research Guideline, Online at: https://www.esomar.org, accessed December 2015.
11 Brosnan, Kylie (2010) 'Online Panellists. Why they do what they do', paper presented at the AMSRS 2010 National Conference.

12 PeoplePulse, Online at: http://www.peoplepulse.com, accessed December 2015.
13 Nielsen Australia, 'Consumer Panels', Online at: http://www.nielsen.com/au/en/solutions/measurement/consumer-panels.html, accessed December 2015.
14 Fiedler, K. (2000) 'Beware of samples! A cognitive-ecological sampling approach to judgment biases', Psychological Review, 107(4), 659–76. doi:10.1037/0033-295X.107.4.659
15 Brosnan, Kylie (2010) 'Online Panellists. Why they do what they do', paper presented at the AMSRS 2010 National Conference.
16 Nielsen Australia, 'Consumer Panels', Online at: http://www.nielsen.com/au/en/solutions/measurement/consumer-panels.html, accessed December 2015.
17 Survey Sampling Online, Online at: http://www.surveysampling.com, accessed December 2015.
18 Based on the formula e = z√(p(1 − p))/√n, where e = error, z = critical z-value on the right tail of the standard normal distribution for the desired confidence level, p = proportion and n = sample size.
19 Department of Health, State of Victoria 2014, Victorian population health survey 2014, Online at: https://www2.health.vic.gov.au, accessed December 2015.

PART FIVE: COLLECTING THE DATA
11 » EDITING AND CODING: TRANSFORMING RAW DATA INTO INFORMATION

[Part-opening flow diagram of editing and coding decisions: Fixed-alternative or open-ended question? For fixed-alternative questions – Consistent answers? Complete answers? Amend, amend or leave as incomplete, or discard the questionnaire as unusable. For open-ended questions – Existing response theme? Define new numeric code for theme; apply code(s) for each distinct theme. Edited and coded responses then go to data entry.]


WHAT YOU WILL LEARN IN THIS CHAPTER

» To define and explain the terms 'editing' and 'coding'.
» To code fixed-alternative questions.
» To code open-ended questions.
» To understand the importance of proper data cleaning and editing.
» To understand how computerised data processing influences the coding process.



Average in one sense, large in another

In a managerial survey, respondents were asked: Relative to other companies in the industry, is your company

❏ One of the largest?
❏ About average in size?
❏ Small?
One respondent checked both the category 'one of the largest' and the category 'about average in size'. Next to the question the respondent wrote, 'average in retailing, one of the largest in pharmacy chains'. So what do you record in your database: large, average or small? With hindsight, we can say that the question should have been worded more clearly, and this should have been discovered during a pretest, but maybe you didn't have that choice. The editor must decide whether the industry should be categorised as 'retailing' or 'pharmacy chain' and edit in the appropriate category. A numerical code is then assigned to the answer so that researchers may analyse the data. This chapter discusses how editing and coding transform raw data into a format suitable for analysis.



STAGES OF DATA ANALYSIS
The process of analysis begins after the data have been collected. During the analysis stage several interrelated procedures are performed to summarise and rearrange the data. Exhibit 11.1 presents the research steps following data collection that are related to processing and analysis.
The goal of most research is to provide information. There is a difference between raw data and information. Information refers to a body of facts that are in a format suitable for decision-making, whereas data are simply recorded measures of certain phenomena. The raw data collected in the field must be transformed into information that will answer a decision-maker's questions. The conversion of data into information is analysis. Before that, however, most often raw data need to be edited and coded to be ready for analysis.
The best way to overcome problems with data analysis is to avoid them in the first place. That means anticipating where potential errors with data may occur and dealing with them before a problem arises. As we suggested in the opening vignette, the problem with coding 'large' or 'average' for that retailer could have been avoided with some good preliminary, qualitative research, and by pretesting the questionnaire. The following sections discuss the processes of ensuring the integrity of data before, during and after the data-gathering stages.

← EXHIBIT 11.1 OVERVIEW OF THE STAGES OF DATA ANALYSIS1
[Flow diagram of the stages of data analysis: Editing · Coding · Data entry (keyboarding) · Error checking and verification · Data analysis · Descriptive analysis · Univariate analysis · Bivariate analysis · Multivariate analysis · Interpretation]

Before the survey

Pretest. Pretest. Pretest. Ideally we should design survey questionnaires so that no editing is necessary and coding is automatic. This means that all possible answers and possible mistakes have been considered and dealt with before any data are gathered. As we shall see, editing often is done by hand and it can be extremely time-consuming. Try to avoid it.


As we suggested in Chapter 10 on sample size, non-sampling error is far greater than sampling error in questionnaire surveys. That means the people we ask, the questions we ask, how we ask them and how we process responses all cause much more error than does simply gathering a slightly smaller sample. If you have to make a choice between (a) spending time and effort checking the validity of a questionnaire or (b) spending time and effort gathering a larger sample, then always choose (a). Many researchers worry about sample size because that's the only thing that can be measured with confidence. Too often, a large sample size just gives more confidence in a wrong answer.

Before the questionnaire is distributed, make sure that all questions can be clearly understood and that all methods of response are easy to deal with by the target population. If questions are to be read out by interviewers, say face-to-face or over the phone, then practice interviews are always a good idea. Practise first among members of the research team, and then with several cooperative colleagues who are not familiar with the research. If anyone in your research team or in the pretest group understands a question, or even a word, differently from what you had in mind, then it is guaranteed that members of your sample will interpret those words differently too. That means they are answering different questions! Fix any problems with interpretation of questions and flow of questions before you go to the field.

During the survey

EDITING
Occasionally, an interviewer makes a mistake and records an improbable answer (for example, 'birth year: 1843') or interviews an ineligible respondent (such as someone too young to qualify). Sometimes answers are contradictory, such as 'No' to car ownership, but 'Yes' to purchase of car insurance. These and other problems must be dealt with before the data can be coded. Editing is the process of checking and adjusting the data for omissions, legibility and consistency, and readying them for coding and storage. Editing may be differentiated from coding, which is the assignment of numerical scores or classifying symbols to previously edited data. Careful editing makes the coding job easier. The editor's task is to check for errors and omissions on the questionnaires or other data collection forms. When the editor discovers a problem, he or she adjusts the data to make them more complete, consistent or readable.

editing: The process of checking and adjusting data for omissions, legibility and consistency, and readying them for coding and storage.
coding: The process of assigning a numerical score or other character symbol to previously edited data in preparation for analysis.
field editing: Preliminary editing on the same day as the interview to catch technical omissions, check legibility of handwriting and clarify responses that are logically or conceptually inconsistent.

Interviewers and field supervisors should conduct preliminary field editing immediately after the interview. The purpose of field editing is to check that the questionnaire has been answered as well as possible. Have any pages been missed accidentally? Can others read the handwriting? If necessary, it's often easier to speak with a respondent to clarify any issues. Online surveys should be monitored early also. The researcher should download and examine the survey responses on the first day and check that all questions are being answered, that no questions cause respondents to abandon the questionnaire prematurely and that the survey takes no longer to complete than promised.

Well-designed questionnaires require less editing and checking because they help the respondent to make accurate records. For example, Nielsen Media Research measures radio audiences by having respondents record their listening behaviour – time, station and place (home, work or car) – in diaries. These diaries have all radio stations in a region preprinted with time-slots marked in tidy 15-minute intervals. Respondents merely need to check the box for each interval that they are listening to a radio station. When the field officer collects the diary he or she then checks each page for the week to be sure that a record has been taken for each day of the week.


Editing requires checking for logically consistent responses. The editor must determine if the answers a respondent gave to one question are consistent with those for other, related questions. Many surveys use filter questions or skip questions that direct the sequence of questions according to the respondent’s answers. In pencil and paper questionnaires some respondents will answer questions that are not applicable. The editor should adjust these answers (usually to ‘no answer’ or ‘inapplicable’) so that the responses will be consistent. Online questionnaires make the task even easier because skip and branch questions can be built in to the questionnaire so that respondents never see any questions that are not applicable. As we learned in Chapter 5, item nonresponse is the technical term for an unanswered question on an otherwise complete questionnaire. In many situations, the decision rule is to do nothing with the unanswered question. The editor merely indicates an item nonresponse by providing a message instructing the coder to record ‘missing value’ or ‘blank’ as the response.

item nonresponse: The failure of a respondent to provide an answer to a survey question; the technical term for an unanswered question on an otherwise complete questionnaire.
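As a hedged illustration of the consistency and item-nonresponse checks described above (the column names, such as owns_car, are hypothetical, not from the text), an editor working in pandas might flag problem questionnaires like this:

```python
import pandas as pd
import numpy as np

# Hypothetical coded responses: 1 = yes, 2 = no; NaN = item nonresponse
responses = pd.DataFrame({
    'id':                [1, 2, 3, 4],
    'owns_car':          [2, 1, 2, 1],       # 'No' to car ownership ...
    'has_car_insurance': [1, 1, 2, np.nan],  # ... but 'Yes' to car insurance?
    'q5':                [3, np.nan, 4, 5],
})

# Flag contradictory answers for the editor to review (not to silently 'fix')
inconsistent = responses[(responses['owns_car'] == 2) &
                         (responses['has_car_insurance'] == 1)]
print('Check these questionnaires:\n', inconsistent[['id']])

# Item nonresponse: leave the value missing rather than inventing an answer
print('Missing answers per question:\n', responses.isna().sum())
```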

SURVEY THIS!

How are data entry, editing and coding made easier by using a Qualtrics-type survey approach compared with a paper and pencil survey approach? Do any of the questions in the survey present any particular coding problems? Are there questions that can be coded using dummy coding? If so, what are they? Have you noticed any inconsistent responses in the data? If so, how should they be addressed?


EDITING AND TABULATING 'DON'T KNOW' ANSWERS
In many situations, the respondent answers 'don't know'. On the surface, this response seems to indicate that the respondent is not familiar with the brand, product or situation, or is uncertain and has not formulated a clear-cut opinion. This legitimate 'don't know' means the same as 'no opinion'. However, there may be reasons for this response other than the legitimate 'don't know'. The reluctant 'don't know' is given when the respondent simply does not want to answer the question and wishes to stop the interviewer from asking more. For example, asking an individual about family income may elicit a 'don't know' answer meaning: 'This is personal, and I really do not want to answer the question.' Also, if the individual does not understand the question, he or she may give a confused 'I don't know' answer.

After the survey

When data have been gathered, typically they have to be cleaned up to prepare them for analysis. As with editing, data cleaning is a process used to determine inaccurate, incomplete or unreasonable data and then improve the quality through correction of detected errors and omissions. Historically, editing referred to checking individual questionnaires by hand, while cleaning refers to the same process done with the whole data set on a spreadsheet. The process may include completeness checks, reasonableness checks, limit checks and review of the data to identify outliers or other errors. Online respondents that abandoned the survey part way through may be eliminated. Responses that don't make sense should be examined and a decision made on whether to include the case or remove just the nonsensical answer.
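Purely as an illustration of the completeness, reasonableness and limit checks just listed (the variable names and cut-offs are assumptions, not the text's), such cleaning might look like this in pandas:

```python
import pandas as pd
import numpy as np

# Hypothetical coded responses (column names are illustrative only)
data = pd.DataFrame({
    'satisfaction': [4.0, 9.0, 2.0, 5.0],   # 9 is not a legal code on a 1-5 scale
    'birth_year':   [1986, 1843, 1990, 1972],
    'q10':          [1, np.nan, np.nan, 2],
    'q11':          [3, np.nan, np.nan, 4],
})

# Limit check: treat impossible codes as missing rather than leaving them in
bad_scale = data['satisfaction'].notna() & ~data['satisfaction'].isin([1, 2, 3, 4, 5])
data.loc[bad_scale, 'satisfaction'] = np.nan

# Reasonableness check: flag improbable answers (e.g. birth year 1843) for review
suspect = data[(data['birth_year'] < 1900) | (data['birth_year'] > 2010)]

# Completeness check: drop respondents who abandoned most of the questionnaire
share_answered = data.notna().mean(axis=1)
data = data[share_answered >= 0.5]
print(data, '\nFlagged for review:\n', suspect)
```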

EXPLORING RESEARCH ETHICS

DO YOU HAVE INTEGRITY?2

Data integrity is essential to successful research and decision-making. Sometimes, this is a question of ethics. Whereas data integrity can suffer when an interviewer or coder simply makes up data, other things can occur that limit data integrity. For instance, data with a large portion of nonresponse have lower integrity than data without so much missing data. However, if respondents have truly left questions blank, the editor should not feel compelled to just 'make up' responses.

Data integrity can also suffer simply because the data are edited or coded poorly. For example, the data coder should be aware that data may be used by other researchers in the future. Therefore, data must be coded consistently across different data sets. For example, if a coder sometimes uses a 1 for women and a 2 for men, while in another data set uses a 1 for men and 0 for women, the possibility exists that analyses using these categories will be confused. What happens if the data sets are combined? Who exactly are the men and who are the women? Confusion like this is particularly likely if the coder does not enter value labels for nominal variables.

Consider how important consistent coding is for companies that share or sell secondary data. Occupations need a common coding, as do product classes, industries and numerous other potential data values. Fortunately, industries have standard codes such as the Australian and New Zealand Standard Industrial Classification (ANZSIC). Without a standardised approach, analysts may never be quite sure what they are looking at from one data set to another. Thus, research firms need to carefully maintain information coding systems that help maximise data integrity.

codes: Rules for interpreting, classifying and recording data in the coding process; also, the actual numerical or other character symbols assigned to raw data.
field: A collection of characters that represents a single type of data, such as the answer to a question.
record: A collection of related fields – the answers to all questions by one respondent.
data matrix: A rectangular arrangement of data in rows and columns. Typically, records are in the rows and the variables are in the columns.
file: A collection of related records, with accompanying information about the nature of the data.

CODING

Codes generally are considered to be numbered symbols; however, they are more broadly defined as rules for interpreting, classifying and recording data. Researchers organise coded data into fields, records and files. A field is a collection of characters that represents a single type of data. In a spreadsheet, a cell is a field. A record is a collection of related fields. In a spreadsheet a record is a row on the sheet. Together these rows and columns create a data matrix. A file is a collection of related records, typically a spreadsheet (see Exhibit 11.2) with accompanying coding explanations and other metadata.

← EXHIBIT 11.2 A FILE, RECORDS AND FIELDS
[Diagram: a file is a stack of records (Record 1, Record 2, Record 3, Record 4, ...); each record is a row of fields. The example record shown contains – Questionnaire number field: 9999 | Name field: Mary Draper | Street field: Connors | City field: Highett | State field: VIC | Question 1 field: 2 | Question 2 field: 34]
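The file–record–field structure of Exhibit 11.2 maps directly onto a spreadsheet or data frame. A minimal sketch using the exhibit's example record (the Python representation is ours, not the text's):

```python
import pandas as pd

# A file is a collection of records; each record is a row of fields (cells)
survey_file = pd.DataFrame(
    [
        # questionnaire number, name, street, city, state, Q1, Q2
        [9999, 'Mary Draper', 'Connors', 'Highett', 'VIC', 2, 34],
    ],
    columns=['questionnaire_no', 'name', 'street', 'city', 'state', 'q1', 'q2'],
)

record = survey_file.iloc[0]    # one record (row)
field = record['q1']            # one field (cell) within that record
print(record, '\nQ1 field =', field)
```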


Code construction


ONGOING PROJECT

Exhibit 11.3 shows a typical survey question and its associated codes. When the question has a fixed-alternative (closed-ended) format, the number of categories that require codes is determined during the questionnaire design stage. The codes 8 and 9 conventionally are given to 'don't know' (DK) and 'no answer' (NA) responses, respectively. However, many computer programs recognise that a blank field indicates a missing value (no answer).

There are two basic rules for code construction. First, the coding categories should be exhaustive; that is, coding categories should be provided for all subjects, objects or responses. With a categorical variable such as gender, making categories exhaustive is not a problem. However, trouble may arise when the response represents a small number of subjects or when responses might be categorised into a class not typically found. For example, household size might be coded 1, 2, 3, 4 and 5 or more. The '5 or more' category assures all subjects of a place in a category. Second, the coding categories should be mutually exclusive and independent. This means that there should be no overlap among the categories, ensuring that a subject or response can be placed in only one category.

← EXHIBIT 11.3 CODING FOR AN ATTITUDE STATEMENT

Fixed-alternative question
In general, self-regulation by business itself is preferable to stricter control of business by the government.
1 Strongly agree
2 Mildly agree
3 Mildly disagree
4 Strongly disagree
8 Don't know
9 No answer
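As a hypothetical illustration of applying such a coding scheme, the value labels from Exhibit 11.3 might be stored and used like this, with the conventional 8 (DK) and 9 (NA) codes treated as missing when averaging; the data values are invented:

```python
import pandas as pd
import numpy as np

value_labels = {1: 'Strongly agree', 2: 'Mildly agree', 3: 'Mildly disagree',
                4: 'Strongly disagree', 8: "Don't know", 9: 'No answer'}

self_regulation = pd.Series([1, 4, 8, 2, 9, 3])    # hypothetical coded answers
print(self_regulation.map(value_labels))            # attach value labels

# For analysis, convert the DK/NA codes to missing values rather than numbers
analysable = self_regulation.replace({8: np.nan, 9: np.nan})
print('Mean (excluding DK/NA):', analysable.mean())
```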

Multiple-response questions
Consider the following two questions:

1 When hanging out with friends I usually eat at a fast-food restaurant.
  1 ❏ Strongly Agree   2 ❏ Somewhat Agree   3 ❏ Neither Agree nor Disagree   4 ❏ Somewhat Disagree   5 ❏ Strongly Disagree

2 Which of the following fast-food restaurants have you bought from in the last week? (Tick all that apply.)
  ❏ McDonald's   ❏ Hungry Jack's   ❏ Oporto   ❏ KFC   ❏ Red Rooster   ❏ Subway   ❏ Pizza Hut   ❏ Domino's   ❏ Eagle Boy's   ❏ Wok in a Box   ❏ Noodle Box


The first question is very straightforward. The code is a number 1 to 5. The second question is more complicated. It is almost impossible to create a single code for all possible choices. Really, it is the equivalent of 11 yes–no questions, which should be coded as dummy variables – '0' for no, '1' for yes. As such, multiple-response questions must be treated as multiple questions, with a field given to each option, as shown in Table 11.1.

TABLE 11.1 » SAMPLE TABULATION OF MULTIPLE-RESPONSE QUESTIONS

ID | Q1 | Q2a | Q2b | Q2c | Q2d | Q2e | Q2f | Q2g | Q2h | Q2i | Q2j | Q2k
01 |  2 |  0  |  1  |  0  |  0  |  0  |  1  |  0  |  0  |  0  |  0  |  0
02 |  5 |  1  |  0  |  0  |  0  |  1  |  0  |  0  |  1  |  0  |  0  |  0
03 |  3 |  1  |  0  |  1  |  0  |  0  |  1  |  0  |  1  |  0  |  0  |  1
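The dummy coding in Table 11.1 can be produced automatically once the ticks are in the data file. A minimal pandas sketch, assuming the 'tick all that apply' answers were captured as one delimited text field per respondent and that Q2a–Q2k follow the order the restaurants appear in the question (both the input format and the example ticks are our assumptions):

```python
import pandas as pd

raw = pd.DataFrame({
    'ID': ['01', '02', '03'],
    'Q1': [2, 5, 3],
    # Hypothetical capture of the tick-all-that-apply answers
    'Q2_ticks': ["Hungry Jack's;Subway",
                 "McDonald's;Red Rooster;Domino's",
                 "McDonald's;Oporto;Subway;Domino's;Noodle Box"],
})

# One dummy-coded field per restaurant: 1 = ticked, 0 = not ticked
dummies = raw['Q2_ticks'].str.get_dummies(sep=';')
coded = pd.concat([raw[['ID', 'Q1']], dummies], axis=1)
print(coded)
```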

Coding open-ended questions

The usual reason for using open-ended questions is that the researcher has no clear hypotheses regarding the answers, which will be numerous and varied. The purpose of coding such questions is to reduce the large number of individual responses to a few general categories of answers that can be assigned numerical codes.

Open-ended questions should be avoided in online surveys. Think about how likely you are to think about and then write a detailed comment about a topic that you don't care about very much. Never expect your respondents to regard your research as important as you do. Generally, then, open-ended questions are good for face-to-face and telephone interviews, and occasionally they work well with pencil and paper questionnaires when respondents feel invested in the outcome of the research.

Similar answers should be placed in a general category and assigned the same code. For example, individuals asked a projective question about why some people did not purchase a new microwaveable frozen food product might give the following answers:
→→ They don't buy frozen food very often.
→→ They like to prepare fresh food.
→→ Frozen foods are not as tasty as fresh foods.
→→ They don't like that freezer taste.
→→ They've tried that brand in the past and they didn't like it.
→→ They're happy with the brand they use now.
→→ They prefer the flavour of another brand.

The first four of these answers could be categorised under 'dislike frozen foods' and assigned the code 1. The last three are not related to frozen food as such but are about brands. One refers to disliking one brand and two answers say they prefer a competitor. The researcher could create just one code simply related to brand and assign it the code 2. Alternatively, code 2 may be used for 'dislike brand' and code 3 for 'prefer competitor'. Code construction in these situations necessarily must reflect the judgement of the researcher.

A major objective in the code-building process is to accurately transfer the meanings from written responses to numeric codes. Experienced researchers recognise that the key idea in this process is that code building is based on thoughts, not just words. For example, it is useful to show coders the test product or the mock-up of the test advertisement that was shown to respondents, so that they will better understand the answers to open-ended questions and thus better convey respondents' thoughts. The end result of code building should be a list, in an abbreviated and orderly form, of all comments and thoughts given in answers to the questions.


Differentiating categories of answers for coding is more difficult with open-ended questions than with fixed-alternative questions. Developing an appropriate code from the respondent's exact comments is something of an art. Researchers generally perform a test tabulation to identify verbatim responses from approximately 20 per cent of the completed questionnaires and then establish coding categories that reflect the judgement of the person constructing the codes. Test tabulation is the tallying of a small sample of the total number of replies to a particular question. Its purpose is preliminary identification of the stability and distribution of the answers that will determine how to set up a coding scheme. Exhibit 11.4 illustrates open-ended responses and preliminary open-ended codes generated for the question: 'Why does the soup you just tasted taste closer to homemade?'

test tabulation: Tallying of a small sample of the total number of replies to a particular question in order to construct coding categories.

During the coding procedure, the respondent's opinions are divided into mutually exclusive thought patterns. These separate divisions may consist of a single word, a phrase or a number of phrases, but in each case they represent only one thought. Each separate thought is coded once. When a thought is composed of more than one word or phrase, only the most specific word or phrase is coded. Some specialist software packages permit the researcher to extract verbatim comments from an open-ended question, then create new fields in which codes may be selected from a list and then further summarised into a third field. In this way, the coding scheme may be checked and rechecked by analysts as needed. Coding open-ended questions is a very complex issue.

You don’t get that much meat in canned soup. The vegetables are cooked just right – not too soft. It just (doesn’t look) like any canned soup I’ve had. I can see herbs; I’ve never seen it in any canned soup. It is not too spicy.

  1  Don’t get that much meat in a can   2  Vegetables are cooked just right   3  I can see herbs

It’s tasty – savoury flavour. It’s not (loaded with vegetables) – just enough vegetables. I can see the ingredients – I can pick out individual vegetables.

  5    6    7    8    9  10  11 

It is tasty Has just enough vegetables See ingredients See vegetables Fresh taste Canned is usually overcooked Not a lot of filler

12  13  14  15 

Not too many beans Not too hot, it’s mild Has enough spice Broth not watery

Tastes (fresh). The canned stuff is too (soft). Too overcooked usually. It doesn’t have a lot of filler such as pasta, and not too many beans. It’s not too spicy. It’s not too hot, it’s mild. Has enough spice to make it tastier. It seems to have a pretty good broth. Some are watery.

← EXHIBIT 11.4 CODING OPEN-ENDED QUESTIONS ABOUT CANNED SOUP3

  4  Not too spicy
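To connect the exhibit to a data file, here is an illustrative sketch (the respondent numbers and the code assignments per comment are ours, not the text's) of tallying how often each preliminary code has been applied – in effect a small test tabulation:

```python
import pandas as pd

# Hypothetical result of human coding: one row per comment, codes applied per thought
coded_comments = pd.DataFrame({
    'respondent': [101, 102, 103],
    'codes': [[1, 2, 3, 4],                  # meat, vegetables cooked right, herbs, not spicy
              [5, 6, 7, 8],                  # tasty, enough vegetables, see ingredients/vegetables
              [9, 10, 11, 12, 13, 14, 15]],  # fresh, overcooked, filler, beans, mild, spice, broth
})

# Tally the codes to see the distribution of themes so far
tally = coded_comments['codes'].explode().value_counts().sort_index()
print(tally)
```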

REAL WORLD SNAPSHOT

NATURALLY NEGATIVE4

Data coders often extract meaning from short open-ended comments written by research respondents. Coders may code these at different levels, beginning by grouping responses based on whether they are positive or negative. One thing that might surprise someone new to coding is how many of the responses are negative. Respondents seem very anxious to complain.

Many workplaces have a suggestion box for employees. Research suggests that dissatisfied employees are much more likely to express comments anonymously than are satisfied employees. The open-ended comments tend to look far more negative than the average employee satisfaction ratings. The length of comments also increases as employees are more dissatisfied.

People from Western cultures are not the only respondent group that focuses on the negative. A study comparing the business ethics of several countries with a particular focus on Japan asked Japanese consumers: 'What particular ethical or unethical behaviour did you personally experience or hear from others or the media?' The sample of Japanese consumers produced 201 ethical comments and 462 comments about unethical treatment from businesses. Are businesses in Japan really that unethical or do respondents focus on the negative? What would happen if the same question were asked of Australian, American or Singaporean consumers? In any case, the coder's job is not to draw a conclusion on the overall results, but simply to enter the appropriate response into a data file for each and every respondent – without complaining!

CODE BOOK

Up to this point, we have assumed that each code's position in the data matrix already has been determined. However, this plan generally is formed after the coding scheme has been designed for every question. The code book gives each variable in the study and its location in the data matrix. With the code book the researcher can identify any variable's description, code name and field. The code book may be as simple as a separate worksheet in your data file, where you note what the meaning of each question is and what the expected answers should be. For example, how is GENDER coded – 0–1, 1–2, and which is which? On a 1–7 rating scale, does 1 mean 'Strongly Agree' or 'Strongly Disagree'?

ONGOING PROJECT
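One lightweight way to keep such a code book alongside the data is a simple dictionary or a separate worksheet. The entries below are hypothetical and only illustrate the kind of information a code book records:

```python
# Hypothetical code book: variable name -> description, field position and value codes
code_book = {
    'GENDER': {
        'description': 'Respondent gender',
        'field': 'column 5',
        'codes': {0: 'male', 1: 'female'},
    },
    'VAR1': {
        'description': 'Attitude item, 1-7 rating scale',
        'field': 'column 6',
        'codes': {1: 'Strongly disagree', 7: 'Strongly agree', 9: 'No answer'},
    },
}

for name, entry in code_book.items():
    print(name, '-', entry['description'], '-', entry['codes'])
```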

RECODING

In a number of situations it is easier to enter the raw data into the computer using the precoding on the questionnaire and then program the computer to recode certain raw data. This often occurs when a researcher measures attitudes with a series of both positive and negative statements. Reversing the order of the codes for the negative statements so that the statements' codes reflect the same order of magnitude as the positive statements' codes requires only a simple data transformation. For instance, if a 7-point scale for variable 1 (VAR1) is to be recoded, the following programming statement might be used to subtract the original code score from 8:

NEWVAR1 = 8 − VAR1

Thus the data are recoded so that 1 becomes 7, 2 becomes 6, 3 becomes 5 and so on. Collapsing the number of categories or values of a variable or creating new variables also requires recoding. These topics, which are interrelated with data analysis, are discussed in Chapter 12.

code book: A book that identifies each variable in a study and gives the variable's description, code name and position in the data matrix.
recode: To use a computer to convert original codes used for raw data into codes that are more suitable for analysis.
data cleaning: The process used to determine inaccurate, incomplete or unreasonable data and then improving the quality through correction of detected errors and omissions.
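A sketch of the same reverse-coding transformation in pandas (the variable name VAR1 follows the text; the data values are made up):

```python
import pandas as pd

df = pd.DataFrame({'VAR1': [1, 2, 3, 7, 5]})   # original 1-7 codes (hypothetical values)

# Reverse-code the negatively worded item: NEWVAR1 = 8 - VAR1
df['NEWVAR1'] = 8 - df['VAR1']                 # 1 -> 7, 2 -> 6, 3 -> 5, ...
print(df)
```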

ERROR CHECKING

The final stage in the coding and data cleaning process is to ensure that all codes are legitimate. For example, 'gender' should have only two values. If a summary tabulation shows that there are more than two then you have a problem with the data – a field has been missed when typing in the raw data, questions have been missed, or worse. Create summary tabulations for all variables to ensure that only the acceptable values are present.
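A summary tabulation like this takes one line per variable in pandas; the data below are invented to show an illegitimate code being caught:

```python
import pandas as pd

df = pd.DataFrame({'gender': [0, 1, 1, 0, 3, 1]})   # 3 is not a legitimate code here

# Summary tabulation: frequency of every value actually present in the field
print(df['gender'].value_counts(dropna=False))

# Anything outside the code book's legitimate values needs to be investigated
legitimate = {0, 1}
print('Illegitimate codes found:', set(df['gender'].dropna().unique()) - legitimate)
```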


REAL WORLD SNAPSHOT

NEGATIVE VALUES ON A RATING SCALE?

Sometimes error checking is less obvious than it looks. It's much easier to make a mistake once data are hidden inside a data set than we'd like to admit. One of the authors of this text had the uncomfortable experience of being chastised for having poor ratings on a survey of teaching quality by recent graduates. Not personally. The whole faculty had earned an average mark of −5 on satisfaction. And we were not alone: the report cited serious problems with most departments in the university. 'Minus 5?' we asked. 'Yes', declared the representative from Central Administration, 'We often see negative scores.' The report was supposed to create discussion on how we may have caused our graduates to be so unhappy. But we asked how data were coded and analysed. Responses were recorded like this:
1  Strongly disagree
2  Disagree
3  Neither agree nor disagree
4  Agree
5  Strongly agree
−99  No answer
The option 'No answer' had been coded −99. In this survey, any question that was not answered had been coded as −99. In a spreadsheet we can see those missing values very easily. But we have to look first. What is the average of 90 responses that say 'Strongly agree' and 10 nonresponses? The answer should be 5. But if we wrongly include the nonresponse records then the answer is a nonsensical −5.4! In this case, the analyst had neglected to replace the −99 values with empty cells (not zeros). So a laudable result was incorrectly interpreted as a poor result. This author wrote an email as politely as he could to the official, pointing out the error and suggesting various methods of improving it. We never heard from her again.
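A sketch of the repair the snapshot describes, using the −99 'no answer' code from the example (the data are constructed to reproduce the 90-responses-plus-10-nonresponses arithmetic):

```python
import pandas as pd
import numpy as np

satisfaction = pd.Series([5] * 90 + [-99] * 10)    # 90 'Strongly agree', 10 no-answers

print('Wrong mean (nonresponses left as -99):', satisfaction.mean())    # -5.4

cleaned = satisfaction.replace(-99, np.nan)         # recode 'no answer' to missing
print('Correct mean (nonresponses excluded):', cleaned.mean())          # 5.0
```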

TIPS OF THE TRADE

»» Check responses for inconsistent answers.
»» Remove the data (the value becomes missing) when a contradiction cannot be resolved.
»» Replace the data with the correct value when the proper answer is clear.
»» Include check questions on questionnaires that are lengthy or for which respondents may have low involvement or may provide misleading responses.
»» Missing values can be a problem in statistical analysis, particularly when there is more than a trivial amount (5 per cent) and especially when the missing values are not randomly dispersed throughout the data. The best options for missing data:
• Replace missing values with an actual entry with caution and using the most appropriate imputation method.
• Pair-wise deletion is a good way to handle missing data in most applications as it preserves information and is very amenable to even advanced statistical procedures.
• Dummy variables should be used to represent categorical variables, particularly dichotomous (only two values possible) data, whenever possible. Dummy variables provide greater flexibility for dichotomous responses in later statistical analysis.
• Take advantage of technology. If a computerised interface can be used to collect data, strongly consider using it. Data entry errors (nonrespondent errors) are greatly reduced or eliminated, and respondent errors can also be mitigated.
• Use value labels for categorical responses. This greatly reduces the chance for later confusion in interpreting statistical results for these variables.
• Spreadsheets provide a good way to store data because not only can basic analyses be conducted with a spreadsheet program, but statistical packages like SAS and SPSS can also easily use data in spreadsheet or CSV format.

CODING AND ANALYSIS OF QUALITATIVE RESEARCH DATA

As with quantitative research, qualitative studies should be reproducible. That is, it should be possible for another researcher to conduct the research in much the same way as you did, and then draw the same conclusions. Also, the documentation of research procedures should be clear enough for anyone else to code and analyse your data and also draw the same conclusions. Qualitative data may be subjective in part, but the analysis of such data should not be.

Think about the reasons we might choose a qualitative research method. (Tip: 'Because quantitative methods and statistics scare me' is a really bad answer.) One reason for using a qualitative method is to investigate the reasons for something that was uncovered in a quantitative study. The usual reason for conducting interviews or observation studies or other such methods is that researchers want to develop theory. They want to learn what is going on, and possible reasons for a phenomenon. They want to learn the context in which events occur and what small interactions trigger an event or what stops an event. Few of these goals can be achieved with quantitative methods. Only after researchers have some good ideas about what is going on, and who is involved and why, can they go about the much simpler tasks of measuring how much it is going on, and the strength of any relationships that may be hypothesised, by using quantitative methods. If qualitative research is done well then often, for the purposes of a business decision, there is no need to carry out further quantitative research.

In a qualitative study coding has the role of organising and sorting your data. In the process of organising and sorting, the researcher must think carefully about events, motivations and contingencies. For much qualitative research, coding the data is analysing the data.

Qualitative data sources

In marketing research, we often confine our discussions of qualitative research to verbal data – the words that respondents give us over the course of an interview. Of course qualitative data can be much richer and more diverse than mere words. Some examples of such data may include:
→→ Video recordings of people's behaviour in a focus group, displaying emotion, gestures, eye focus, and interpersonal distance.
→→ Audio recordings of an interview, capturing tone of voice or time spent thinking about what to say.
→→ Text from local news stories about a particular issue over a period of time.

In all these cases the data offer more valuable information than the words alone. It may make a big difference when looking at what people say when you also consider that a statement is directed to a few rather than all people present, or that the tone of voice indicates that some words are intended to be sarcastic or ironic, which usually means the opposite of what the words mean on their own. In all cases, however, the overall method is the same. The researcher must make a record of each relevant aspect of the data and place it in context so that other people can also look at your results and interpretations and accept them as valid.

Table 11.2 gives some examples of the sort of phenomena that can be coded from interview or other qualitative data. Note that we code not just for specific words or concepts, but also for more abstract ideas such as motivations, strategies, interrelationships and more.


TABLE 11.2 » WHAT IS ABLE TO BE CODED FOR QUALITATIVE DATA ANALYSIS

No. | What can be coded | Examples
1 | Behaviours, specific acts | Seeking reassurance, bragging
2 | Events – stories about events or experiences | Travelling overseas for the first time, interviewing for a job
3 | Activities – involving other people within a particular setting | Going clubbing, attending night classes
4 | Strategies, practice or tactics | Flirting, gossiping to cause jealousy, staying late at work to get promotion
5 | States – mood or temperament in people or organisations | Hopelessness: 'I've never been good at maths'; Hubris: 'I understood everything in the lectures so the exam will be easy.'
6 | Meanings – the core of much qualitative analysis. Meanings and interpretations direct people's actions.
6a | What concepts do people use to understand their world? What norms, values and rules guide their actions? | In many cultures it is very important not to 'lose face', or to cause others to 'lose face'
6b | What meaning or significance do concepts within a context have for participants, how do they construe events, and what are their feelings? | Frustration: 'I just felt I'd worked so hard and I couldn't get the job I wanted.'
6c | What symbols do people use to understand their situation? What names do they use for objects, events, persons, roles, setting and equipment? | 'This degree says that I'm smart and capable, doesn't it?'
7 | Participation and adaptation to an environment | 'In my old share-house we were up all hours. But here we tend to be pretty quiet, at least during the work week.'
8 | Relationships or interaction | 'The shared facilities mean we have to work together. Sometimes there's conflict about dirty dishes, but mostly we get along.'
9 | Conditions or constraints | Financial difficulties, friends are back home
10 | Consequences | Positive attitude attracts opportunities
11 | Settings – the context under study | Student accommodation, work place
12 | Reflexive – researcher's role in the process, how intervention generated the data | Probing question: 'How did you feel when he said that?'

Coding qualitative data

Researchers usually design coding schemes with themes drawn from existing theory and also create new codes that appear in unexpected themes in the data.

A PRIORI CODES
A priori codes, or pre-existing codes, are established before examining the data. They come from:
→→ previous research or theory
→→ the research question
→→ topics covered in the interview.

a priori codes: Codes created before data have been gathered and examined, usually drawn from existing theory and current knowledge of the research domain.


EMERGENT CODES

emergent codes: Codes, sometimes called 'grounded codes', that appear in the process of reading and examining the data.

Emergent codes, sometimes called ‘grounded codes’, emerge from the data when you come across an idea or a theme that was not expected from theory. In studies guided by ‘grounded theory’ the researcher consciously tries to put aside presuppositions and previous knowledge of the subject area and tries to find new themes in the data. Some naïve new researchers like the idea of grounded theory because they think they can skip any background reading, or thinking. In fact, grounded theory is extremely difficult and time-consuming to do well, but it also is capable of uncovering some exciting and surprising ideas. In practice, most qualitative research coding begins with a priori codes and new codes are created as unexpected ideas or relationships are discovered in the data. Any one piece of text may have multiple codes. Most researchers create code hierarchies: For example, you might characterise a sentence as expressing an emotion (Code-type: Emotion) that is unpleasant (second-order-code: Negative), specifically frustration (third-order-code: Frustration). Other emotion sub-codes would be positive and the next order would include a list of appropriate positive emotions such as joy, satisfaction, etc.
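Purely as an illustration of the code hierarchy idea just described (no particular software package is implied), a hierarchical code applied to one interview segment could be recorded like this; the second, 'emergent' code is invented for the example:

```python
# Hypothetical hierarchical coding of one interview segment:
# code-type > second-order code > third-order code
segment = {
    'respondent': 'R07',
    'text': "I just felt I'd worked so hard and I couldn't get the job I wanted.",
    'codes': [('Emotion', 'Negative', 'Frustration')],   # a priori code hierarchy
}

# An emergent code can be added later when an unexpected theme appears in the data
segment['codes'].append(('Strategy', 'Job search', 'Persistence'))
print(segment['codes'])
```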

Coding strategies

The actual approach and depth to which you code and analyse your data obviously depends on the nature and volume of the data and the purposes of the research. The following is a list of activities and strategies that some researchers have found useful when working through their data.
→→ Word repetitions – look for commonly used words and words whose close repetition may indicate emotions (a small tally of word repetitions is sketched after this list).
→→ Indigenous categories (what the grounded theorists refer to as in-vivo codes) – terms used by respondents with a particular meaning and significance in their setting.
→→ Key-words-in-context – look for the range of uses of key terms in the phrases and sentences in which they occur.
→→ Compare and contrast – often called 'constant comparison'. Ask, 'What is this about?' and 'How does it differ from the preceding or following statements?'
→→ Social science queries – refer back to your understanding of consumer behaviour, psychology, sociology, organisational behaviour, or other social science theories, to explain the conditions, actions, interaction and consequences of phenomena.
→→ Searching for missing information – think about what is not being done or talked about that you would have expected to find.
→→ Metaphors and analogies – frequently used to indicate feelings and central beliefs about things.
→→ Transitions – taking turns in conversation as well as story points, such as moving from background to problem to resolution.
→→ Connectors – connections between terms such as causal ('since', 'because', 'as' etc.) or logical ('implies', 'means', 'is one of' etc.).
→→ Unmarked text – examine the text that has not been coded as a theme or even not at all.
→→ Pawing – physically handling the text: eyeballing and marking the text. Circle words, underline, use coloured highlighters, run coloured lines down the margins to indicate different meanings and coding. Then look for patterns and significances.
→→ Cutting and sorting – the traditional technique of making photocopies of all the transcripts, cutting them up and collecting all those coded the same way into piles, envelopes or folders or pasting them onto cards. Laying out all these scraps and re-reading them, together, is an essential part of the process of analysis.
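A minimal sketch of the word-repetition tally mentioned in the first strategy, using only Python's standard library on an invented fragment of transcript:

```python
import re
from collections import Counter

transcript = ("I was frustrated, really frustrated, because the job I wanted "
              "went to someone else, and I felt I had worked so hard for it.")

# Count word repetitions, ignoring very common function words
stopwords = {'i', 'the', 'to', 'and', 'for', 'it', 'so', 'had', 'was', 'because'}
words = re.findall(r"[a-z']+", transcript.lower())
counts = Counter(w for w in words if w not in stopwords)

print(counts.most_common(5))   # 'frustrated' appears twice - a possible emotion cue
```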

CONSTANT COMPARISON
Researchers who use a grounded theory approach advocate constant comparison – it's a valuable technique for all qualitative research. Constant comparison suggests that when every new piece of text, or idea or event, is coded, you should compare it with all other pieces of text that have been coded the same way, especially in other interviews. Additionally, the sentence (or video section) should be compared with preceding or following sections to provide context. This makes sure that your coding is consistent. If a text doesn't quite fit with what you have already coded, then this is a signal to use a different code, or to create a new code or sub-code. It might also alert you to the possibility of a new idea contained in those pieces of text you have already coded.

constant comparison: The process of routinely comparing each new piece of text with other similarly coded pieces, and their contexts, to ensure consistent coding and to help discover new themes.

MEMOS AND CODES
Keep written notes, or memos, during the coding process. Memos are notes to yourself and to team members to record definitions of, and reasons for, codes and to note any insights or theories you have about the importance or relationships among themes. If you are working in teams or over an extended period of time, then memos are essential. Have you ever opened an old spreadsheet file or set of notes and wondered what on earth it was about? When you're analysing abstract ideas and relationships in qualitative data then you guarantee confusion if you don't have notes to remind you where you were up to and what you were thinking.

SOFTWARE FOR QUALITATIVE RESEARCH
There are many software tools designed as aids for text-based and related qualitative research. Some are automatic text-mining software that do much of the heavy lifting and limit human biases. Others are data-base programs that help researchers keep track of their thinking.

Text-mining software

LEXIMANCER
Leximancer automatically searches for key words and concepts and measures their proximity to other key words in the text, and then displays the results as a conceptual map. It is a remarkably simple tool that can produce remarkably useful insights. This 'unsupervised coding' overcomes much of the potential bias in human textual analysis.8 The following is a snapshot of a very quick automatic analysis of a draft of this chapter using Leximancer. You can see some clear themes extracted. If we edit the automatically derived lexicon to create compound phrases as concepts, then more detailed themes and sub-themes may emerge.


Thematic summary of this chapter, showing key words9
[Leximancer concept map]

Data-base management software

MAXQDA
MAXQDA is a very efficient database system. The user interface comprises four windows. The Document window lists project documents. The Code window houses the coding schema. The Document Browser displays individual documents and the Retrieved Segments window displays coded data. The coding schema is hierarchical or non-hierarchical as needed. Changing a code in one window automatically updates corresponding codes in other windows and adds higher-order hierarchical codes. It is a very flexible and easy-to-use program.

MAXQDA interface finding common themes in different texts10
[Screenshot: MAXQDA – Distribution by VERBI GmbH]

MAXQDA sample outputs: word clouds and thematic relationships11
[Screenshot: MAXQDA – Distribution by VERBI GmbH]

NVIVO
NVivo is a little more complicated to learn and to use than MAXQDA. It is more flexible in handling data and in producing more specialised output. Using NVivo, the researcher carefully reads and codes text into themes, as in MAXQDA. Themes are linked as relationships. If the researcher already has some ideas of what themes and codes might be found, then NVivo is very easy to use to find similar words and phrases, and to create subthemes. NVivo also holds video and media files. A priori coding can be imported from a spreadsheet – for example, when multiple interviewees refer to the same topic, or the same people are interviewed several times.


Sample NVivo interface, analysing a Delphi study on the usefulness of MOOCs12
[Screenshot: QSR International]

SUMMARY

Sometimes editing and coding is a boring and time-consuming clerical task, but it is the first step in transforming data into information. For some surveys, it is the last time that a researcher actually handles the exact responses made by people. If it is not done properly then all subsequent analysis can be wrong.

DEFINE AND EXPLAIN THE TERMS 'EDITING' AND 'CODING'
Editing is the process of checking and adjusting the data for omissions, legibility and consistency, and readying them for coding and storage. An editor may make logical changes or additions to a survey form to ensure complete data (but not to alter the meaning of responses). Coding is the assignment of numerical scores or classifying symbols to previously edited data. It is the application of rules for symbolically representing data; for example, 0 = male, 1 = female and so on.

CODE FIXED-ALTERNATIVE QUESTIONS
The easiest questions to code are fixed-alternative questions because the coding system is worked out ahead of time. The researcher needs to know exactly all of the possible responses to a question.

CODE OPEN-ENDED QUESTIONS
Open-ended questions are easy to ask, but very difficult and time-consuming to code. The researcher must read comments made by respondents and then make a judgement on what the key theme or message is, so that it may be grouped with other similar comments and differentiated from other contrasting comments. Each time a new theme is found, a new code must be generated with appropriate rules for classifying and coding.


UNDERSTAND THE IMPORTANCE OF PROPER DATA CLEANING AND EDITING
The data-cleaning and editing process is about the last chance that the researcher has before analysing the data. Any missing or incorrect data will be used in subsequent analysis, and the people doing the analysis may never discover that their conclusions are invalid.

UNDERSTAND HOW COMPUTERISED DATA PROCESSING INFLUENCES THE CODING PROCESS
The whole coding process can be streamlined with computer-based and online questionnaires, where a researcher uses a computer to automatically check the completeness of answers.

KEY TERMS AND CONCEPTS
code book, codes, coding, data entry, data matrix, editing, field, field editing, file, item nonresponse, recode, record, test tabulation

QUESTIONS FOR REVIEW AND CRITICAL THINKING
1 What is the purpose of editing? Provide some examples of questions that might need editing.
2 Suppose respondents in a political survey were asked if they favoured or opposed the Agricultural Trade Act. Edit the following open-ended answers:
  a I don't know what it is, so I'll oppose it.
  b Favourable, though I don't really know what it is.
  c You caught me on that. I don't know, but from the sound of it I favour it.
3 What are the potential meanings of 'don't know' responses in a political poll to evaluate two candidates for a city's lord mayor?
4 Comment on the coding scheme for the following question: 'In which of these groups did your total family income, from all sources, fall last financial year before taxes? Just tell me the code number.' (Refer to the table.)

Response | Code
Under $9999 | 01
$10 000 to $14 999 | 02
$15 000 to $24 999 | 03
$25 000 to $39 999 | 04
$40 000 to $54 999 | 05
$55 000 to $69 999 | 06
$70 000 to $89 999 | 07
$90 000 to $99 999 | 08
$100 000 to $124 999 | 09
$125 000 or over | 10
Refused to answer | 11
Don't know | 98
No answer | 99

5 A frequently asked question on campus is: 'What is your major?' Suppose you wished to develop a coding scheme for this question. What would your scheme look like? How do you deal with people who are taking a double major?
6 A researcher asks you to help build a coding scheme for types (not brands) of coffee found in supermarkets. These might be regular, instant or decaffeinated. About how many codes might be needed?
7 A researcher investigated attitudes towards her company and noticed that one individual answered all image questions 5 on a 1–5 rating scale. Should she decline to use this questionnaire in the data analysis?


ONGOING PROJECT
SAMPLING OR CONDUCTING A SURVEY IN YOUR RESEARCH STUDY? CONSULT THE CHAPTER 11 PROJECT WORKSHEET FOR HELP

Download the Chapter 11 project worksheet from the CourseMate website. It outlines a series of steps taken in this chapter to ensure that your data are ready for analysis. Remove nonsense answers, check for missed questions and other missing data, and check that all responses have been properly coded.

COURSEMATE ONLINE STUDY TOOLS
Flip to the start of your textbook and use the tear-out card to log in to CourseMate for Marketing Research. There you can test your understanding and revise chapter concepts with:
☑ interactive quizzes
☑ flashcards
☑ crosswords on key concepts
☑ research activities
☑ videos.

WRITTEN CASE STUDY 11.1
QUESTIONNAIRE EDITING

A mail questionnaire that included a concept statement describing Wardlowe's Wafer Sandwich Cookies, a proposed new product, was sent to members of a consumer panel. Exhibit 11.5 presents an excerpt from a questionnaire that a respondent filled out.

QUESTION

1 Does this questionnaire need editing? Explain why or why not.

EXHIBIT 11.5 → RESPONDENT QUESTIONNAIRE (COMPLETED)

4a Now that you have read the description for Wardlowe's Wafer Sandwich Cookies, which statement best describes how likely you would be to buy Wardlowe's Wafer Sandwich Cookies if they were being sold in the stores where you normally shop? ('X' ONE BOX) (36)
   Definitely would buy ........... 1
   Probably would buy ........... 2
   Might or might not buy ........... 3
   Probably would not buy ........... 4
   Definitely would not buy ........... 5

4b Why do you feel that way about buying this product? (PLEASE BE AS SPECIFIC AS POSSIBLE) (37)–(46), 47–51R
   'I don't normally buy biscuits – I bake them myself. I feel Wardlowe's Wafers are pretty bland and have only used them in recipes that require them.'

5 Now, thinking of each different variety of Wardlowe's Wafer Sandwich Cookies, please indicate how likely you would be to buy each variety if it were available in the stores where you normally shop? ('X' ONE BOX FOR EACH VARIETY)
   Variety | Definitely would buy | Probably would buy | Might or might not buy | Probably would not buy | Definitely would not buy
   Vanilla Cream | 1 | 2 | 3 | 4 | 5 (52)
   Chocolate Fudge | 1 | 2 | 3 | 4 | 5 (53)
   Peanut Butter | 1 | 2 | 3 | 4 | 5 (54)
   Strawberry Jam | 1 | 2 | 3 | 4 | 5 (55) 56–62R

6 Compared to other products now on the market, would you expect this new product to be ...? ('X' ONE BOX) (63)
   Very different .... 1    Somewhat different .... x 2    Not at all different .... 3

7 Overall, how believable is the description of this product? ('X' ONE BOX) (64)
   Very believable .... 1    Somewhat believable .... 2    Not at all believable .... 3

8 Considering the price of this product, do you think this product would be ...? ('X' ONE BOX) (65)
   A very good value for the money .... 1
   A somewhat good value for the money .... 2
   An average value for the money .... 3
   A somewhat poor value for the money .... 4
   A very poor value for the money .... 5

9 Considering the price of this product, do you think the price is ...? ('X' ONE BOX) (66)
   Very expensive .... 1
   Somewhat expensive .... 2
   About average .... 3
   Somewhat inexpensive .... 4
   Very inexpensive .... 5

10 How often would you purchase this product? ('X' ONE BOX) (67)
   More than once a week .... 1    Once a week .... 2    3 times a month .... 3    2 times a month .... 4    Once a month .... 5    Once every 2–3 months .... 6    Less than once every 3 months .... 7    Would not purchase .... 8

NOTES
1 Sonquist, J. A. & Dunkelberg, W. C. (1977) Survey and Opinion Research: Procedures for Processing and Analysis, Englewood Cliffs, NJ: Prentice-Hall.
2 Sources: Dubberly, Hugh (2004) 'The information loop', CIO Insight, 43 (September), pp. 55–61; Shonerd, René (2003) 'Data integrity rules', Association Management, 55(9), p. 14.
3 Adapted from: Walker Research (1979) 'Coding open ends based on thoughts', The Marketing Researcher, December, pp. 1–3.
4 Sources: Tsalikas, J. & Seaton, B. (2008) 'The International Business Ethics Index: Japan', Journal of Business Ethics, 80, pp. 379–85; Poncheri, R. M., Lindberg, J. T., Thompson, L. F. & Surface, E. A. (2008) 'A comment on employee surveys: Negativity bias in open-ended responses', Organizational Research Methods, 11 (July), p. 614.
5 Adapted from: Taylor & Gibbs, G. R. (2010) 'How and what to code', online at: http://onlineqda.hud.ac.uk/Intro_QDA/how_what_to_code.php, accessed 16 March 2016.
6 Adapted from: Ryan, G. W. & Bernard, H. R. (2003) 'Techniques to Identify Themes', Field Methods, 15(1), pp. 85–109.
7 Strauss, Anselm & Corbin, Juliet (1990) Basics of Qualitative Research: Grounded Theory Procedures and Techniques, 2nd edn, Newbury Park, CA: Sage.
8 Smith, A. E. & Humphreys, M. S. (2006) 'Evaluation of unsupervised semantic mapping of natural language with Leximancer concept mapping', Behavior Research Methods, 38(2), pp. 262–279.
9 http://info.leximancer.com, accessed 17 June 2016.
10 http://www.maxqda.de/wp/wp-content/uploads/maxqda-12-user-interface21024x578.png, accessed 17 June 2016.
11 http://www.maxqda.com/wp/wp-content/uploads/sites/2/visualizations.png, accessed 17 June 2016.
12 http://www.qsrinternational.com; http://scalar.usc.edu/works/using-nvivo-an-unofficial-and-unauthorized-primer/media/IntelligenceTextSearchWordTree.jpg, accessed 17 June 2016.

PART SIX > ANALYSING THE DATA

12 » UNIVARIATE STATISTICAL ANALYSIS: A RECAP OF INFERENTIAL STATISTICS
13 » BIVARIATE STATISTICAL ANALYSIS: TESTS OF DIFFERENCES
14 » BIVARIATE STATISTICAL ANALYSIS: TESTS OF ASSOCIATION
15 » MULTIVARIATE STATISTICAL ANALYSIS

[Part-opening decision chart: HOW MANY VARIABLES AM I INTERESTED IN ANALYSING?
One variable (e.g. is the mean number of store visits to Wendy's higher than two per month?) → Univariate analysis (Chapter 12). What am I interested in examining? Comparing a sample mean with a population mean → one-sample t-test; comparing an observed frequency with an expected frequency → Chi-square test.
Two variables (e.g. does the mean number of store visits per month to Wendy's differ by gender?) → Bivariate analysis (Chapters 13 and 14). What am I interested in examining? Testing for differences between groups → independent samples t-test and univariate ANOVA; association between variables → Chi-square test, bivariate correlation and bivariate regression.
Three or more variables (e.g. is the mean number of store visits per month to Wendy's associated with gender, age and distance lived from a Wendy's store?) → Multivariate analysis (Chapter 15). What am I interested in examining? Dependence relationships between variables → partial correlation, multiple regression analysis, discriminant analysis, binary logistic regression and N-way ANOVA or N-way cross tabulation; interdependence relationships between variables → factor analysis, cluster analysis and multidimensional scaling.]

12 » UNIVARIATE STATISTICAL ANALYSIS: A RECAP OF INFERENTIAL STATISTICS

WHAT YOU WILL LEARN IN THIS CHAPTER

To explain the difference between descriptive and inferential statistics and discuss the purpose of inferential statistics in terms of population parameters and sample statistics.
To make data usable by organising and summarising them into frequency distributions, measures of central tendency and measures of dispersion.
To identify the characteristics and importance of the normal distribution, compute and use a standardised Z-value, and distinguish between population, sample and sampling distributions.
To explain the central-limit theorem for computing confidence interval estimates and testing simple univariate hypotheses.
To discuss the nature of the t-distribution, calculate a hypothesis test about a mean using a one-sample t-test, and to run a one-sample t-test and interpret the output in IBM SPSS Statistics (SPSS) and Microsoft Excel.
To explain the situations in which a univariate Chi-square test is appropriate, understand and be able to perform a Chi-square test, and to run the test and interpret the output in SPSS and Microsoft Excel.
To distinguish between parametric and nonparametric statistics.

Slaying the statistical dragon

The study of statistics is often the bane of university students’ studies. However, an understanding of statistics is a valuable skill to have within the world of marketing. A good understanding of statistics allows us to clearly and concisely present complex numerical information within reports and, perhaps more importantly as a buyer of market research information, helps us to understand and evaluate the efficacy of the information provided to us by others. Perhaps this is why Darrell Huff’s book How to Lie with Statistics1 has become one of the best-selling statistics books of all time. Likewise, a quote often paraphrased from writer H. G. Wells may be more relevant now than ever with the proliferation of information available to us: ‘Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write!’ As a new, keen marketing graduate, suppose you are interested in finding out what you are worth in the job market. You might choose to visit a job search website such as Adzuna (http://www.adzuna.com. au). At the time of writing the ‘average’ salary for a marketing graduate reported on this website was $52 976.2 But what does this mean exactly? What measure was used to arrive at this ‘average’? Often an average is the mean of a set of observations (add them all up and divide by the number of observations – see page 400). But sometimes very high or very low values skew a distribution and bias the mean, rendering this information less useful. In such cases when the distribution is not spread evenly one may use a different measure – a median – to reduce the bias from any outliers or extreme values. Digging deeper, let’s ask a bit more about this reported average salary for marketing graduates. In what city is the job located? Could it be that the PART SIX > ANALYSING THE DATA

figures for Sydney and Melbourne, where most marketing jobs are located, have skewed the results? The figures on this website are for the country as a whole, so could include jobs nationwide. What kind of role is it? The average for a brand management role ($76 587) seems to be substantially lower than the average for a market research role ($85 510), for example. Where do the data come from? Presumably the data come from jobs advertised on the website. But is this website representative of other such jobs websites? What about jobs advertised on different websites or using different keywords? Hopefully you can now see that when averages are reported they ought to be taken with a very large pinch of salt! To quote Benjamin Disraeli (attributed to him by Mark Twain), there are ‘Lies, damned lies and statistics’, so we need to be careful about how we interpret statistical information that is presented to us. A study on household water conservation behaviour in Australia3 surveyed 909 residents of a regional city to better understand their attitudes and perceptions towards conserving and consuming household water. Among a number of other findings, the authors reported that mean attitudes to conserving water were 3.79 on a 5-point Likert scale with 5 indicating more positive attitudes. As we recall from Chapter 10, where we discussed sampling, it would be neither feasible nor necessary to survey every member of the target population. Instead, researchers use a sample to take a snapshot of the population at a point in time – this sample can be used to estimate characteristics of the population. In the water conservation study, the researchers were interested in using the sample to infer characteristics about the larger population and found mean attitudes to water conservation were about 3.79. That is, the mean of 3.79 is an estimate of the population’s attitudes about water conservation. (We don’t know

what the ‘true’ mean is, which is why we’re estimating it from the sample). Recall also from Chapter 10 that researchers ought to take a random sample, if possible, in order to reduce sample selection biases. Computer-assisted telephone interviewing (CATI – see Chapter 5) was used in this study, so the researchers were able to take a random sample. However, could it be possible that the sample estimate of 3.79 is incorrect – that the true mean is higher or lower than this value? Yes. Purely by chance, through the random sampling process, our estimate might be wrong – we call this sampling error.

This chapter is about numerically summarising data so that we can understand them simply and present them in an appropriate way. It also deals with the introduction of inferential statistics – the field of study that tries to tell us how likely it is that our observations have occurred as a result of sampling error (that is, random error). Thus, inferential statistics allow us to calculate the probability that our estimate has occurred through sampling error (that is, the probability it is incorrect). This can be very useful to marketers when interpreting statistical information.

DESCRIPTIVE AND INFERENTIAL STATISTICS

The Australian Bureau of Statistics (http://www.abs.gov.au) provides a range of statistics of importance for future government planning based on the 2011 census.4 These are called descriptive statistics and describe characteristics of the Australian population based on locality. Inferential statistics, on the other hand, are used to make inferences about a population from a sample of that population. For example, when an organisation test markets a new product, it wishes to make an inference from these sample markets to predict what will happen when the product is released to the larger population – the results from the sample are inferred onto the population. Australia might be regarded as a good test market for products from other similar countries such as the US or the UK because of its cultural similarity, and because it is far enough away to reduce the chances of ideas being leaked and test markets being disrupted by competitors. Any statistics ascertained from a test market are then used to infer about the ultimate population. Take, for example, the launch of Guinness Black Lager where Australia was one of four initial test markets.5 Thus, there are two applications of statistics: (1) to describe characteristics of the population or sample; and (2) to generalise from the sample to the population.

SAMPLE STATISTICS AND POPULATION PARAMETERS

The primary purpose of inferential statistics is to make a judgement about the population, or the collection of all elements about which one seeks information. The sample is a subset or a relatively small fraction of the total number of elements in the population. It is useful to distinguish between the data computed in the sample and the data or variables in the population. The term sample statistics designates variables in the sample or measures computed from the sample data. The term population parameters designates variables or measured characteristics of the population. However, generally we do not know what these population parameters are and that is why we use samples – to make inferences about population parameters. In our notation we will generally use Greek lower-case letters (for example, μ or σ) to denote population parameters, and English letters (such as X̄ or S) to denote sample statistics.6

sample statistics Variables in a sample or measures computed from sample data.
population parameters Variables in a population or measured characteristics of the population.


MAKING DATA USABLE

Frequency distributions

Suppose a telephone survey has been conducted for a credit union. The data have been recorded on a large number of questionnaires. To make the data usable, this information must be organised and summarised. Constructing a frequency table, or frequency distribution table, is one of the most common means of summarising a set of data. The process begins with recording the number of times a particular value of a variable occurs. This is the frequency of that value. Using the survey example, Table 12.1 represents a frequency distribution of respondents’ answers to a question that asked how much customers had deposited in the credit union.

Constructing a distribution of relative frequency, or a percentage distribution, is also quite simple. In the third column of Table 12.1 the frequency of each value (second column) has been divided by the total number of observations and the result multiplied by 100, to give a frequency distribution of percentages (third column).

Probability is the long-run relative frequency with which an event will occur. Inferential statistics uses the concept of a probability distribution, which is conceptually the same as a percentage distribution except that the data are converted into probabilities (see the Probability column). We now examine some measures of central tendency, which help us to describe the centrality of the data.

frequency distribution A set of data organised by summarising the number of times a particular value of a variable occurs.
percentage distribution A frequency distribution organised into a table (or graph) that summarises percentage values associated with particular values of a variable.
probability The long-run relative frequency with which an event will occur.
mean A measure of central tendency; the arithmetic average.

TABLE 12.1 » FREQUENCY, PERCENTAGE AND PROBABILITY DISTRIBUTION OF DEPOSITS

Amount               Frequency   Percentage   Probability
Under $3000            499           16          0.16
$3000–$4999            530           17          0.17
$5000–$9999            562           18          0.18
$10 000–$14 999        718           23          0.23
$15 000 or more        811           26          0.26
Total                 3120          100          1.00

(Frequency = number of people who hold deposits in each range; Percentage = percentage of people who hold deposits in each range.)
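The arithmetic behind Table 12.1 can be sketched in a few lines of Python (our choice of language for illustration; the band labels and frequency counts are those shown in the table):

```python
# Sketch: deriving the percentage and probability columns of Table 12.1
# from the frequency counts alone.

deposits = {
    "Under $3000": 499,
    "$3000-$4999": 530,
    "$5000-$9999": 562,
    "$10 000-$14 999": 718,
    "$15 000 or more": 811,
}

total = sum(deposits.values())                 # 3120 respondents in total

for band, frequency in deposits.items():
    percentage = frequency / total * 100       # relative frequency as a percentage
    probability = frequency / total            # long-run relative frequency
    print(f"{band:<18} {frequency:>5} {percentage:>5.0f}% {probability:.2f}")

print(f"{'Total':<18} {total:>5}  100% 1.00")
```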

Measures of central tendency

On a typical day, a sales manager counts the number of sales calls each sales representative makes. He then may wish to inspect the data to find the centre, or middle area, of the frequency distribution. Central tendency can be measured in three ways – the mean, median or mode – each of which has a different meaning.

THE MEAN

We all have been exposed to the average known as the mean. The mean is simply the arithmetic average, and it is a common measure of central tendency. At this point it is appropriate to introduce the summation symbol, the capital Greek letter sigma (Σ). A typical use might look like this:

Σⁿᵢ₌₁ Xᵢ

which is a shorthand way to write the sum X₁ + X₂ + X₃ + X₄ + X₅ + … + Xₙ. Below the Σ is the initial value of an index, usually i, j or k, and above it is the final value, in this case n, the number of observations in Table 12.2. The shorthand expression says to replace i in the formula with the values from 1 to 8 and total the observations obtained. Without changing the basic formula, the initial and final index values may be replaced by other values to indicate different starting and stopping points.

Suppose a sales manager supervises the eight salespeople listed in Table 12.2. To express the sum of the salespeople’s calls in Σ notation, we just number the salespeople (this number becomes the index number) and associate subscripted variables with their numbers of calls. We then write an appropriate Σ formula and evaluate it:

Σ⁸ᵢ₌₁ Xᵢ = X₁ + X₂ + X₃ + X₄ + X₅ + X₆ + X₇ + X₈
        = 4 + 3 + 2 + 4 + 3 + 3 + 3 + 14 = 36

TABLE 12.2 » NUMBER OF SALES CALLS PER DAY BY SALESPEOPLE

Index   Salesperson   Variable   Number of calls
1       Mike          X₁         4
2       Mannos        X₂         3
3       Billie        X₃         2
4       Meb           X₄         4
5       John          X₅         3
6       Frank         X₆         3
7       Irena         X₇         3
8       Sam           X₈         14

Generally we try to estimate the population mean (μ) with the sample mean, X̄ (read ‘X-bar’), using the following formula:

X̄ = (ΣXᵢ) / n

where n = number of observations made in the sample. Substituting into the formula reveals X̄ = 4.5:

Mean = X̄ = 36 / 8 = 4.5

More likely than not, you already know how to calculate a mean. However, knowing how to distinguish among the symbols Σ, μ, X̄ and X is necessary to understand statistics. In this introductory discussion of the summation sign (Σ), we have used very detailed notation that included the subscript for the initial index value (i) and the final index value (n). However, from this point on, references to Σ will not include the initial index value (i) or the final index value (n), unless there is a unique reason to highlight these index values.
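As a quick illustration of the summation notation, here is a minimal Python sketch that reproduces ΣXᵢ = 36 and X̄ = 4.5 for the call data in Table 12.2:

```python
# Sketch: sigma notation and the sample mean for the eight salespeople
# in Table 12.2 (call counts taken from the table).

calls = [4, 3, 2, 4, 3, 3, 3, 14]   # X1 ... X8

total_calls = sum(calls)             # sum of X_i for i = 1 to 8
n = len(calls)
x_bar = total_calls / n              # X-bar = (sum of X_i) / n

print(total_calls, x_bar)            # 36 4.5
```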


THE MEDIAN

The next measure of central tendency, the median, is the midpoint of the distribution, or the 50th percentile. In other words, the median is the value below which half the values in the sample fall. In the sales manager example in Table 12.2, the median is 3 because half the observations are greater than 3 and half are less than 3. We would typically use a median as a measure of central tendency either when our data are ordinal scaled or when the data are significantly skewed. For example, consider the data in Table 12.2 again. If we calculate the mean, the number of calls Sam makes (14) significantly biases the average. The median is not affected by extreme values and therefore, with distributions skewed by potentially erroneous observations, the median is often used as an average instead (e.g., for house prices or salaries).

THE MODE

The mode is the measure of central tendency that identifies the value that occurs most often. In our previous example, Mannos, John, Frank and Irena each made three sales calls. The value 3 occurs most often, and thus 3 is the mode. The mode is determined by listing each possible value and noting the number of times each value occurs. In this example, given that 3 is such a commonly occurring value, the mode is a reasonable measure of central tendency. The mode is also often used when the data are ordinal or nominal scaled.

median A measure of central tendency that is the midpoint; the value below which half the values in a distribution fall.
mode A measure of central tendency; the value that occurs most often.
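A short sketch, using Python's standard statistics module, shows how the median and mode behave on the same call data and why they are unaffected by Sam's outlying 14 calls:

```python
# Sketch: mean, median and mode for the sales-call data in Table 12.2.
import statistics

calls = [4, 3, 2, 4, 3, 3, 3, 14]

print(statistics.mean(calls))    # 4.5 -- pulled upwards by the value 14
print(statistics.median(calls))  # 3.0 -- the midpoint, unaffected by the outlier
print(statistics.mode(calls))    # 3   -- the most frequently occurring value
```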

Measures of dispersion

The mean, median and mode summarise the central tendency of frequency distributions. Knowing the tendency of observations to depart from the central tendency is also important. Calculating the dispersion of the data, or how the observations vary from the mean, is another way to summarise the data. Consider, for instance, the 12-month sales patterns of the two products shown in Table 12.3. Both have a mean monthly sales volume of 200 units, but the dispersion of observations for product B is much greater than that for product A. There are several measures of dispersion to account for such differences.

TABLE 12.3 » SALES LEVELS FOR PRODUCTS A AND B (BOTH AVERAGE 200 UNITS)

Month       Units product A   Units product B
January     196               150
February    198               160
March       199               176
April       200               181
May         200               192
June        200               200
July        200               201
August      201               202
September   201               213
October     201               224
November    202               240
December    202               261


THE RANGE

The range is the simplest measure of dispersion. It is the distance between the smallest and the largest values of a frequency distribution. Thus, for product A the range is between 196 units and 202 units (six units), whereas for product B the range is between 150 units and 261 units (111 units). The range does not take into account all the observations: it merely tells us about the extreme values of the distribution.

range The distance between the smallest and the largest values of a frequency distribution.

Distributions may be fat or skinny. For example, for product A the observations are close together and reasonably close to the mean. While we do not expect all observations to be exactly like the mean, in a skinny distribution they will lie a short distance from the mean, while in a fat distribution they will be spread out. Exhibit 12.1 illustrates this concept graphically with two frequency distributions that have identical modes, medians and means, but different degrees of dispersion.
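A minimal sketch comparing the two products' dispersion (sales figures copied from Table 12.3):

```python
# Sketch: both products average 200 units per month, but their ranges differ.

product_a = [196, 198, 199, 200, 200, 200, 200, 201, 201, 201, 202, 202]
product_b = [150, 160, 176, 181, 192, 200, 201, 202, 213, 224, 240, 261]

for name, sales in [("A", product_a), ("B", product_b)]:
    mean = sum(sales) / len(sales)
    value_range = max(sales) - min(sales)
    print(f"Product {name}: mean = {mean:.0f}, range = {value_range}")

# Product A: mean = 200, range = 6
# Product B: mean = 200, range = 111
```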

THE WELL-CHOSEN AVERAGE

In Huff’s7 book, the author points out a variety of ways in which simple statistics can be used to ‘misinform’. This may or may not be deliberate, but one has to be discerning when it comes to interpreting statistics. Recruitment website Payscale.com claims the average salary for a marketing manager in Australia is $76 601 per year (http://www.payscale.com/research/AU/Job=Marketing_Manager/Salary).8 The average used here is the median, which means that the average is not going to be skewed by a few high-paying jobs (the mean possibly would be). Interestingly, the website presents a range too ($50 441 to $118 876) and various percentiles to show how these salaries are distributed. A further look at the website shows that one can drill down further and get more accurate averages based on the skills they possess, the city they are interested in living in, and their level of experience. This more comprehensive range of statistics helps the informed user to understand the data better and get a closer representation of their likely salary. Contrast this with information from Adzuna (http://www.adzuna.com.au) where, at the time of writing, the average salary was $95 132. Which estimate is correct? Which site provides the most comprehensive and correct information? How should averages be presented?

EXPLORING RESEARCH ETHICS

The interquartile range is another measure of dispersion and is a range that encompasses the middle 50 per cent of the observations – in other words, the range between the bottom quartile (lowest 25 per cent) and the top quartile (highest 25 per cent). This is also useful to show how data are distributed.

← EXHIBIT 12.1 LOW DISPERSION VERSUS HIGH DISPERSION (frequency plotted against value on a variable from 150 to 210; one distribution shows low dispersion, with observations bunched near the centre, the other shows high dispersion, with observations spread out)

DEVIATION SCORES

A method of calculating how far any observation is from the mean is to calculate individual deviation scores. To calculate a deviation from the mean, use the following formula:

dᵢ = Xᵢ − X̄


For the value of 150 units for product B for the month of January, the deviation score is −50; that is, 150 − 200 = −50. If the deviation scores are large, we will have a fat distribution because the distribution exhibits a broad spread.

WHY USE THE STANDARD DEVIATION?

Statisticians have derived several quantitative indexes to reflect a distribution’s spread, or variability. The standard deviation is perhaps the most valuable index of spread, or dispersion. Students often have difficulty understanding it. Learning about the standard deviation will be easier if we first look at several other measures of dispersion that may be used. Each of these has certain limitations that the standard deviation does not. Understanding each of these different measures will illustrate the rationale behind the standard deviation and its advantages.

First is the average deviation. We compute the average deviation by calculating the deviation score of each observation value (that is, its difference from the mean), summing these scores and then dividing by the sample size (n):

Average deviation = Σ(Xᵢ − X̄) / n

While this measure of spread seems interesting, it is never used. The positive deviation scores are always cancelled out by the negative scores, leaving a numerator of zero. Hence, the average deviation is useless as a measure of spread (try calculating it for any set of observations to see). One might correct for the disadvantage of the average deviation by computing the absolute values of the deviations. In other words, we ignore all the positive and negative signs and use only the absolute value of each deviation. The formula for the mean absolute deviation is:

Mean absolute deviation = Σ|Xᵢ − X̄| / n

While this procedure eliminates the problem of always having a zero score for the deviation measure, there are some technical mathematical problems that make it less valuable than some other measures; it is mathematically intractable.

VARIANCE

Another means of eliminating the sign problem caused by the negative deviations cancelling out the positive deviations is to square the deviation scores. The following formula gives the mean squared deviation:

Mean squared deviation = Σ(Xᵢ − X̄)² / n

This measure is useful for describing the sample variability. However, we typically wish to make an inference about a population from a sample, and so the divisor n − 1 is used rather than n in most pragmatic marketing research problems.9 This new measure of spread, called the variance, has the formula:

Variance = S² = Σ(Xᵢ − X̄)² / (n − 1)

variance A measure of variability or dispersion. Its square root is the standard deviation.

The variance is a very good index of the degree of dispersion. The variance, S2, will equal zero if (and only if) each and every observation in the distribution is the same as the mean. The variance will grow larger as the observations tend to differ increasingly from one another and from the mean.


Standard deviation

While the variance is frequently used in statistics, it has one major drawback. The variance reflects a unit of measurement that has been squared. For instance, if measures of sales in a territory are made in dollars, the mean number will be reflected in dollars, but the variance will be in squared dollars. Because of this, statisticians often take the square root of the variance. Using the square root of the variance for a distribution, called the standard deviation, eliminates the drawback of having the measure of dispersion in squared units rather than in the original measurement units. The formula for the standard deviation is:

S = √S² = √[ Σ(Xᵢ − X̄)² / (n − 1) ]

standard deviation A quantitative index of a distribution’s spread, or variability; the square root of the variance for a distribution.

Table 12.4 illustrates that the calculation of a standard deviation requires the researcher to first calculate the sample mean. In the example with the eight salespeople’s sales calls (Table 12.2), we calculated the sample mean as 4.5. Table 12.4 illustrates how to calculate the standard deviation for these data.

TABLE 12.4 » CALCULATING A STANDARD DEVIATION: NUMBER OF SALES CALLS PER DAY BY SALESPERSONS

Salesperson   Xᵢ    Xᵢ − X̄             (Xᵢ − X̄)²
Mike          4     (4 − 4.5) = −0.5    0.25
Mannos        3     (3 − 4.5) = −1.5    2.25
Billie        2     (2 − 4.5) = −2.5    6.25
Meb           4     (4 − 4.5) = −0.5    0.25
John          3     (3 − 4.5) = −1.5    2.25
Frank         3     (3 − 4.5) = −1.5    2.25
Irena         3     (3 − 4.5) = −1.5    2.25
Sam           14    (14 − 4.5) = 9.5   90.25

Σ(Xᵢ − X̄)² = 106

S = √[ Σ(Xᵢ − X̄)² / (n − 1) ] = √(106 / (8 − 1)) = √15.14 = 3.89
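The same calculation can be checked with a short Python sketch (the n − 1 divisor mirrors the variance formula above):

```python
# Sketch: variance and standard deviation for the call data in Table 12.4.
import math

calls = [4, 3, 2, 4, 3, 3, 3, 14]
n = len(calls)
x_bar = sum(calls) / n                            # 4.5

squared_deviations = [(x - x_bar) ** 2 for x in calls]
variance = sum(squared_deviations) / (n - 1)      # 106 / 7, about 15.14
std_dev = math.sqrt(variance)                     # about 3.89

print(sum(squared_deviations), round(variance, 2), round(std_dev, 2))
```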

At this point we can return to thinking about the original purpose for measures of dispersion. Indexes of central tendency, such as the mean, help us summarise and interpret the data. In addition we wish to calculate a measure of variability that will give us a quantitative index of the dispersion of the distribution. We have looked at several measures of dispersion to arrive at two very adequate means of measuring dispersion: the variance and the standard deviation.10 Remember, the student must learn the language of statistics to use it in a research project. If you do not understand the language at this point, please review this material now.

THE NORMAL DISTRIBUTION

One of the most useful probability distributions in statistics is the normal distribution, also called the normal curve or bell curve. This distribution describes the expected distribution of sample means and many other chance occurrences. The normal curve is bell shaped, and almost all (99 per cent) of its values are within ±3 standard deviations from its mean. An example of a normal curve, the distribution of IQ scores, appears in Exhibit 12.2. In this example, 1 standard deviation for IQ equals 15. We can identify the proportion of the curve by measuring a score’s distance (in this case, standard deviation) from the mean (100).

normal distribution A symmetrical, bell-shaped distribution that describes the expected probability distribution of many chance occurrences.

EXHIBIT 12.2 → THE NORMAL DISTRIBUTION: AN EXAMPLE OF THE DISTRIBUTION OF INTELLIGENCE QUOTIENT (IQ) SCORES

(Bell-shaped curve with IQ on the horizontal axis from 55 to 145 in steps of 15; the areas under the curve are 2.14% between 55 and 70, 13.59% between 70 and 85, 34.13% between 85 and 100, 34.13% between 100 and 115, 13.59% between 115 and 130, and 2.14% between 130 and 145.)

The standardised normal distribution is a specific normal curve, somewhat similar to a normal distribution, but which has several specific characteristics: (1) it is symmetrical about its mean; (2) the mean identifies the normal curve’s highest point (the mode) and the vertical line about which this normal curve is symmetrical; (3) the normal curve has an infinite number of cases (it is a continuous distribution) and the area under the curve has a probability density equal to 1.0; and (4) the standardised normal distribution has a mean of 0 and a standard deviation of 1. Exhibit 12.3 illustrates these properties.

standardised normal distribution A purely theoretical probability distribution that reflects a specific normal curve for the standardised value, Z.

The standardised normal distribution is a purely theoretical probability distribution, but it is the most useful distribution in inferential statistics because it helps us to find the probability of any portion of the area under the standardised normal distribution. All we have to do is transform, or convert, the data from other observed normal distributions to the standardised normal curve. Exhibit 12.4 illustrates how to do this.

EXHIBIT 12.3 → THE STANDARDISED NORMAL DISTRIBUTION (Pr(Z) plotted against Z from −3 to 3, with mean μ = 0 and standard deviation 1)


← EXHIBIT 12.4 LINEAR TRANSFORMATION OF ANY NORMAL VARIABLE INTO A STANDARDISED NORMAL VARIABLE11 (any normal variable X, with mean μ and standard deviation σ, is converted to the standardised variable Z = (X − μ)/σ; in the transformation the scale is sometimes stretched and sometimes shrunk)

Why do I need to know about the standardised normal distribution? This ability to transform normal variables has many pragmatic implications for the marketing researcher. The standardised normal table (see Table A.2 in Appendix A) allows us to evaluate the probability of the occurrence of many events without any difficulty. By transforming some value into a Z-value we can calculate the probability of that Z-value occurring in the Z distribution and infer this probability onto the original value based on the robustness of the normal distribution. Computing the standardised value, Z, of any measurement expressed in original units is simple: subtract the mean from the value to be transformed and divide by the standard deviation (all expressed in original units). The formula for this procedure and its verbal statement follow.12

Standardised value = Z = (Value to be transformed − Mean) / Standard deviation

Z = (X − μ) / σ

where μ = hypothesised or expected value of the mean.

A simple example may help to illustrate. Suppose a small business has historical data showing that mean website visits per month (μ) are 9000 and that the standard deviation of the data is 500. The entrepreneur wants to know whether website visits in the coming month will be between 7500 and 9625. We must first transform the distribution of website visits, X, into the standardised form using our simple formula. The following computation shows how to obtain the probability (Pr) of website visits being in this range:

Z = (X − μ) / σ = (7500 − 9000) / 500 = −3.00
Z = (X − μ) / σ = (9625 − 9000) / 500 = 1.25

Using Table A.2 in Appendix A, we find that:
When Z = −3.00, the area under the curve (probability) equals 0.499.
When Z = 1.25, the area under the curve (probability) equals 0.394.


Thus, the total area under the curve is 0.499 + 0.394 = 0.893. (The area under the curve corresponding to this computation is the total shaded area in Exhibit 12.5.) The entrepreneur, therefore, knows there is a 0.893 probability that website visits will be between 7500 and 9625. At this point, it is appropriate to repeat that to understand statistics one must understand the language that statisticians use, so please review now if necessary. Now that we have covered certain basic terminology, we will outline the technique of statistical inference. However, before we do so, three additional types of distributions must be defined: population distribution, sample distribution and sampling distribution.

EXHIBIT 12.5 → STANDARDISED DISTRIBUTION CURVE (Pr(Z) plotted against Z from −3 to 3, with the area between Z = −3.00 and Z = 1.25 shaded)
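If statistical software is available, the table lookup can be replaced by the standard normal cumulative distribution function. A minimal sketch, assuming the scipy library is installed:

```python
# Sketch: probability that monthly website visits (mean 9000, standard
# deviation 500, assumed normal) fall between 7500 and 9625.
from scipy.stats import norm

mu, sigma = 9000, 500
z_low = (7500 - mu) / sigma     # -3.00
z_high = (9625 - mu) / sigma    #  1.25

probability = norm.cdf(z_high) - norm.cdf(z_low)
print(round(probability, 3))    # about 0.893, matching the table-based result
```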

POPULATION DISTRIBUTION, SAMPLE DISTRIBUTION AND SAMPLING DISTRIBUTION

When conducting a research project or survey, the researcher’s purpose is not to describe the sample of respondents, but to make an inference about the population. As defined previously, a population, or universe, is the total set, or collection, of potential units for observation. The sample is a smaller subset of this population.

A frequency distribution of the population elements is called a population distribution. The mean and standard deviation of the population distribution are represented by the Greek letters μ and σ. A frequency distribution of a sample is called a sample distribution. The sample mean is designated X̄, and the sample standard deviation is designated S.

population distribution A frequency distribution of the elements of a population.
sample distribution A frequency distribution of a sample.

The concepts of population distribution and sample distribution are relatively simple. However, we must now introduce another distribution: the sampling distribution of the sample mean. Understanding the sampling distribution is the crux of understanding statistics. The sampling distribution is a theoretical probability distribution that in actual practice would never be calculated. Hence, practical, business-oriented students have difficulty understanding why the notion of the sampling distribution is important. Statisticians, with their mathematical curiosity, have asked themselves: ‘What would happen if we were to draw a large number of samples (say, 50 000), each having n elements, from a specified population?’ Assuming that the samples were randomly selected, the sample means could be arranged in a frequency distribution. Because different people or sample units would be selected in the different samples, the sample means would not be exactly equal. The


shape of the sampling distribution is of considerable importance to statisticians. If the sample size is sufficiently large and if the samples are randomly drawn, the sampling distribution of the mean will be approximately normally distributed.

To illustrate, suppose we wanted to estimate mean individual weekly taxable income for Brisbane CBD residents. We could access this information free from the ABS website (the median is $1732 per week – let’s call this the population mean, although the mean may differ13). For now, let’s suppose we want to estimate this and do not have access to this information. To estimate this population mean, we might take a random sample from a sampling frame such as the Brisbane telephone directory. Let’s say we surveyed 500 Brisbane residents. What sample mean do you think you would get? You could get a figure of $900. You could also get a figure of $2500. In fact, the sample mean could be anywhere from the mean income of the 500 lowest income earners to the mean income of the 500 highest income earners. It is quite intuitive, though, to expect a sample mean close to $1732 with a high degree of probability, and to expect sample means that deviate further from the population mean with decreasing likelihood. This relates to the central-limit theorem (discussed in the next section).

A sampling distribution is a theoretical probability distribution that shows the functional relation between the possible values of some summary characteristic of n cases drawn at random and the probability (density) associated with each value over all possible samples of size n from a particular population. The sampling distribution’s mean is called the expected value of the statistic. The expected value of the mean of the sampling distribution is equal to μ. The standard deviation of the sampling distribution of X̄ is called the standard error of the mean, S_X̄, and is approximately equal to:

S_X̄ = σ / √n

To review, there are three important distributions that we must know about to make an inference about a population from a sample: the population distribution, the sample distribution and the sampling distribution. They have the following characteristics:

                          Mean        Standard deviation
Population distribution   μ           σ
Sample distribution       X̄           S
Sampling distribution     μ_X̄ = μ     S_X̄

We now have much of the information we need to understand more about the concept of statistical inference. To clarify why the sampling distribution has the characteristics just described, we will elaborate on two concepts: the standard error of the mean and the central-limit theorem. You may be wondering why the standard error of the mean, S_X̄, is defined as:

S_X̄ = σ / √n

The reason is based on the notion that the variance or dispersion within the sampling distribution of the mean will be less if we have a larger sample size for independent samples. We can see intuitively that a larger sample size allows the researcher to be more confident that the sample mean is closer to the population mean. In actual practice, the standard error of the mean is estimated using the sample standard deviation. Thus, S_X̄ is estimated using S/√n.

sampling distribution A theoretical probability distribution of sample means for all possible samples of a certain size drawn from a particular population.
standard error of the mean The standard deviation of the sampling distribution.


Exhibit 12.6 shows the relationship among a population distribution, the sample distribution and three sampling distributions for varying sample sizes. In part (a), the population distribution is not a normal distribution. In part (b), the sample distribution resembles the distribution of the population; however, there may be some differences. In part (c), each sampling distribution is normally distributed and has the same mean. Note that as sample size increases, the spread of the sample means around μ decreases. Thus, with a larger sample size we will have a skinnier sampling distribution.

EXHIBIT 12.6 → SCHEMATIC OF THE THREE FUNDAMENTAL TYPES OF DISTRIBUTIONS14

(Schematic: (a) the population distribution, with mean μ, standard deviation σ and values X, provides data for (b) possible sample distributions, each with mean X̄, standard deviation S and values X, which in turn provide data for (c) the sampling distribution of the sample means, with mean μ_X̄ and standard deviation S_X̄; sampling distributions are shown for samples of size 100, 500 and 2500, becoming narrower as the sample size increases.)

WHAT WENT WRONG? HOW NORMAL ARE YOU?

Are you normal? A quiz at www.blogthings.com will give you an answer to that question (a site that offers many quizzes where people can compare themselves with others and find things out about themselves). It consists of twenty questions covering things like whether or not you change towels every day, whether you have closer to $40 or $100 on hand, whether you are comfortable using the bathroom with another person in the room, and so forth. Once the user finishes the quiz, the site provides him or her with a normal score by comparing the

responses to the overall distribution of responses. Nerdtest.com, Facebook and other websites have similar ‘normal’ tests. Millions have responded to these questions and one’s normalness is determined against that distribution. One of the authors of this book got 65 per cent on the first test and 90 per cent on the second test – which is quite normal, it seems. I suppose that brings us back to the concepts of reliability and validity! (See Chapter 8.)


CENTRAL-LIMIT THEOREM

Finding that the means of random samples of a sufficiently large size will be approximately normal in form, and that the mean of the sampling distribution will approach the population mean, is very useful. Mathematically, this is the assertion of the central-limit theorem, which states: As the sample size, n, increases, the distribution of the mean, X̄, of a random sample taken from practically any population approaches a normal distribution (with a mean, μ, and a standard deviation, σ/√n).15 The central-limit theorem works regardless of the shape of the original population distribution (see Exhibit 12.7).

A simple example will demonstrate the central-limit theorem. Assume that a consumer researcher is interested in the number of dollars children spend on toys each month. Assume further that the population the consumer researcher is investigating consists of eight-year-old children in a certain school. In this example the population consists of only six individuals. (This is a simple and perhaps somewhat unrealistic example; nevertheless, assume that the population size is only six elements.) Table 12.5 shows the population distribution of toy expenditures. Alice, a relatively deprived child, has only $1 per month, whereas Freddy, the rich kid, has $6 to spend. The average expenditure on toys each month is $3.50, so the population mean, μ, equals 3.5 (see Table 12.5).

TABLE 12.5 » HYPOTHETICAL POPULATION DISTRIBUTION OF TOY EXPENDITURES

Child    Toy expenditures ($)
Alice    1.00
Becky    2.00
Noah     3.00
Tobin    4.00
George   5.00
Freddy   6.00

ΣXᵢ = 21
μ = ΣXᵢ / n = 21 / 6 = 3.5

Now assume that we do not know everything about the population, and we wish to take a sample size of two, to be drawn randomly from the population of the six individuals. How many possible samples are there? The answer is 15, as follows:

(1, 2)  (1, 3)  (1, 4)  (1, 5)  (1, 6)
(2, 3)  (2, 4)  (2, 5)  (2, 6)
(3, 4)  (3, 5)  (3, 6)
(4, 5)  (4, 6)
(5, 6)

central-limit theorem The theory that, as sample size increases, the distribution of sample means of size n, randomly selected, approaches a normal distribution.


EXHIBIT 12.7 → DISTRIBUTION OF SAMPLE MEANS FOR SAMPLES OF VARIOUS SIZES AND POPULATION DISTRIBUTIONS12 (four differently shaped population distributions are shown; for each, the sampling distribution of X̄ is plotted for samples of size n = 2, n = 5 and n = 30, becoming increasingly normal and narrower as n increases)

Table 12.6 lists the sample mean for each of the possible 15 samples and the frequency distribution of these sample means with their appropriate probabilities. These sample means comprise a sampling distribution of the mean, and the distribution is approximately normal. If we increased the sample size to three, four or more, the distribution of sample means would more closely approximate a normal distribution. While this simple example is not a proof of the central-limit theorem, it should give you a better understanding of the nature of the sampling distribution of the mean.


TABLE 12.6 » ARITHMETIC MEANS OF SAMPLES AND FREQUENCY DISTRIBUTION OF SAMPLE MEANS

Sample means
Sample ($)   ΣX ($)   X̄ ($)   Probability
1, 2          3.00    1.50    1/15
1, 3          4.00    2.00    1/15
1, 4          5.00    2.50    1/15
1, 5          6.00    3.00    1/15
1, 6          7.00    3.50    1/15
2, 3          5.00    2.50    1/15
2, 4          6.00    3.00    1/15
2, 5          7.00    3.50    1/15
2, 6          8.00    4.00    1/15
3, 4          7.00    3.50    1/15
3, 5          8.00    4.00    1/15
3, 6          9.00    4.50    1/15
4, 5          9.00    4.50    1/15
4, 6         10.00    5.00    1/15
5, 6         11.00    5.50    1/15

Frequency distribution
Sample mean ($)   Frequency   Probability
1.50              1           1/15
2.00              1           1/15
2.50              2           2/15
3.00              2           2/15
3.50              3           3/15
4.00              2           2/15
4.50              2           2/15
5.00              1           1/15
5.50              1           1/15
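Because the toy population is so small, the entire sampling distribution in Table 12.6 can be enumerated directly. A minimal Python sketch (the population values are those in Table 12.5):

```python
# Sketch: enumerate all 15 samples of size two and tabulate the sample means.
from itertools import combinations
from collections import Counter
from statistics import mean

population = [1, 2, 3, 4, 5, 6]     # monthly toy expenditure in dollars

sample_means = [mean(pair) for pair in combinations(population, 2)]
distribution = Counter(sample_means)

for value in sorted(distribution):
    frequency = distribution[value]
    print(f"X-bar = {value:.2f}: frequency {frequency}, probability {frequency}/15")
```

Increasing the sample size (for example, taking combinations of three or four children) makes the resulting distribution of sample means look even more like a normal curve, which is the point of the central-limit theorem.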

ESTIMATION OF PARAMETERS

Point estimates

Our goal in using statistics is to make an estimate about population parameters such as the mean, μ, and the standard deviation, σ. In most instances of marketing research these parameters are unknown, so we are required to sample. As discussed, X̄ and S are random variables that will vary from sample to sample with a certain probability (sampling) distribution. Our previous example of statistical inference was somewhat unrealistic because the population had only six individuals. So consider the opening example in this chapter about attitudes towards household water conservation. When statistical inference is needed, the population mean, μ, is a constant but


unknown parameter. To estimate average attitudes towards water conservation, a survey of 909 residents in the area was undertaken. If the sample mean, X̄, equals 3.79, we might use this figure as a point estimate. This single value, 3.79, would be the best estimate of the population mean. However, we would be extremely lucky if the sample estimate were exactly the same as the population value. We can quantify how much sampling error to expect by calculating a confidence interval.

Confidence intervals

If we specify a range of numbers, or interval, within which the population mean should lie, we can be more confident that our inference is correct. A confidence interval estimate is based on the knowledge that μ = X̄ ± a small sampling error. After calculating an interval estimate, we can determine how probable it is that the population mean will fall within this range of statistical values. In the water saving example, the researcher, after setting up a confidence interval, would be able to make a statement such as: ‘With 95 per cent confidence, I think that average attitudes towards household water conservation are between 3.72 and 3.86.’ This information can be used to estimate predisposition towards conserving household water because the researcher has a certain level of confidence that the true population mean lies within this range of values.

One issue for the researcher is to determine how much random sampling error to tolerate. In other words, what should the confidence interval be? How much of a gamble should be taken that μ will be included in the range? Do we need to be 80 per cent, 90 per cent or 99 per cent sure? The confidence level is a percentage or decimal that indicates the long-run probability that the results will be correct. Traditionally, researchers have used the 95 per cent confidence level. While there is nothing magical about the 95 per cent confidence level, it is useful to select this confidence level in our examples and it seems to be the usual convention in business research. However, consider a medical researcher who uses a 95 per cent confidence level: if you were the one taking the drugs, you might want them to be a little more rigorous in their selection of this standard!

As mentioned, the point estimate gives no information about the possible magnitude of random sampling error. The confidence interval gives the estimated value of the population parameter, plus or minus an estimate of the error. We can express the idea of the confidence interval as follows:

μ = X̄ ± a small sampling error

point estimate An estimate of the population mean in the form of a single value, usually the sample mean.
confidence interval estimate A specified range of numbers within which a population mean is expected to lie; an estimate of the population mean based on the knowledge that it will be equal to the sample mean plus or minus a small sampling error.
confidence level A percentage or decimal value that shows how confident a researcher can be about being correct. It states the long-run percentage of confidence intervals that will include the true population mean.

More formally, assuming that the researchers select a large sample (more than 30 observations), the sampling error is given by:

Small sampling error = Z_c.l. S_X̄

where
Z_c.l. = value of Z, or the standardised normal variable, at a specified confidence level (c.l.)
S_X̄ = standard error of the mean

The precision of our estimate is indicated by the value of Z_c.l. S_X̄. It is useful to define the range of possible error, E, as follows:

E = Z_c.l. S_X̄

Thus,

μ = X̄ ± E

where
X̄ = sample mean
E = range of sampling error

or

μ = X̄ ± Z_c.l. S_X̄

The confidence interval ±E is always stated as one-half of the total confidence interval. The following step-by-step procedure can be used to calculate confidence intervals:
1 Calculate X̄ from the sample.
2 Assuming σ is unknown (which is usually the case), estimate the population standard deviation by finding S, the sample standard deviation.
3 Estimate the standard error of the mean, using the formula: S_X̄ = S / √n
4 Determine the Z-value associated with the desired confidence level. The confidence level should be divided by 2 to determine what percentage of the area under the curve to include on each side of the mean.
5 Calculate the confidence interval.

Let’s do this for the water conservation example above to show how the confidence intervals were arrived at. To summarise, the sample mean for attitudes (X̄) of 909 residents was 3.79, with a standard deviation (S) of 1.08.16 Knowing that it would be extremely coincidental if the point estimate from the sample were exactly the same as the population mean (μ), you decide to construct a confidence interval around the sample mean:
1 X̄ = 3.79
2 S = 1.08
3 S_X̄ = 1.08 / √909 = 0.036
4 Suppose you wish to be 95 per cent confident that the estimates from your sample will include the population parameter. Including 95 per cent of the area requires that 47.5 per cent (one-half of 95 per cent) of the distribution on each side be included. From the Z-table (Table A.2 in Appendix A), you find that 0.475 corresponds to the Z-value 1.96.
5 Substituting the values for Z_c.l. and S_X̄ into the confidence interval formula gives:

μ = 3.79 ± (1.96)(0.036) = 3.79 ± 0.071

You can thus expect that μ is contained in the range from 3.72 to 3.86. Intervals constructed in this manner will contain the true value of μ 95 per cent of the time. Step 3 can be eliminated by entering S and n directly into the confidence interval formula:

μ = X̄ ± Z_c.l. (S / √n)

Remember that S/√n represents the standard error of the mean, S_X̄. Its use is based on the central-limit theorem. If you wanted to increase the probability that the population mean will lie within the confidence interval, you could use the 99 per cent confidence level, with a Z-value of 2.57. You may want to calculate the 99 per cent confidence interval for the above example; you can expect that μ will be in the range between 3.70 and 3.88.
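The five-step procedure condenses to a few lines of code. A minimal sketch for the water-conservation figures (X̄ = 3.79, S = 1.08, n = 909, Z = 1.96 for 95 per cent confidence):

```python
# Sketch: 95 per cent confidence interval for mean attitudes to water conservation.
import math

x_bar, s, n = 3.79, 1.08, 909
z = 1.96                                 # Z-value for a 95% confidence level

standard_error = s / math.sqrt(n)        # about 0.036
margin_of_error = z * standard_error     # about 0.07

lower, upper = x_bar - margin_of_error, x_bar + margin_of_error
print(f"95% CI: {lower:.2f} to {upper:.2f}")   # roughly 3.72 to 3.86
```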


which is the difference between the survey results and the results of surveying the entire population. If you have a firm understanding of these basic terms and ideas, which are the essence of statistics, the remaining statistics concepts will be relatively simple for you.

DETERMINING SAMPLE SIZE

This theoretical knowledge about distributions can be used to estimate population parameters based upon samples. In some cases, with the required information, it can also be used to determine sample size. Determining sample size is looked at in Chapter 10. Though often done on the basis of judgement, we can be more scientific about how an appropriate sample size is determined. Specifically, if we recall the formula for a confidence interval around the mean:

μ = X̄ ± Z_c.l. (S / √n)

With some simple algebraic manipulation, we can rearrange this formula to make n the subject:

Acceptable error = E = Z (S / √n)
√n = ZS / E
n = (ZS / E)²

where
Z = standardised value that corresponds to the confidence level
S = sample standard deviation or estimate of the population standard deviation
E = acceptable magnitude of error, plus or minus error factor (range is one-half of the total confidence interval)

As discussed in Chapter 10, there are three main influences upon sample size: the confidence level, the variability in the sample and the acceptable error. We can now appreciate these mathematically. Some of these parameters are difficult to estimate accurately in reality, but by manipulating the formula to make n the subject we can quickly see what the factors are; if we have the information above, or can estimate it, we can calculate an appropriate sample size.
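A minimal sketch of the sample-size formula; the acceptable error E used here is an assumed figure for illustration, not one reported in the study:

```python
# Sketch: n = (Z * S / E)^2 using the water-conservation scale as an example.
import math

z = 1.96     # 95 per cent confidence level
s = 1.08     # estimated standard deviation of the attitude scale
e = 0.07     # acceptable error, assumed for illustration

n = math.ceil((z * s / e) ** 2)
print(n)     # about 915 respondents would be required
```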

TIPS OF THE TRADE

»» Inferential statistics involve making a guess (inference) about some aspect of a population based on results taken from a sample. »» Do not use inferential statistics when a census is available. When the entire population is measured, any observed difference is a real difference. »» Probability values are called p-values and the lower they get, generally, the more likely a researcher’s substantive hypothesis is supported. »» Low p-values of 0.05 or less generally support hypotheses about differences or relationships.


»» In almost all practical marketing research situations, a Z-test and t-test will yield the same result. Because of this and the fact that t-tests are more readily available in statistics packages, t-tests are more commonly used. »» Samples of 100 or more generally provide adequate power to identify significant relationships using regression analysis at the 0.05 level. »» A sample of 150 or more is generally sufficient for cross-tabulations (although this depends on the size of the cross-tabulation - see Chapter 14).

STATING A HYPOTHESIS

ONGOING PROJECT

Before moving on with our discussion of univariate statistical analysis, it will be useful to introduce some basic concepts in statistical testing. One of the fundamental building blocks in conducting scientific statistical testing is to construct a hypothesis for testing. This hypothesis helps the researcher to clarify what they want to find out from the statistical test.

What is a hypothesis? A hypothesis is an unproven proposition or supposition that tentatively explains certain facts or phenomena. In its simplest form a hypothesis is a guess or a statement about something. A sales manager may hypothesise that the salespeople who are highest in product knowledge will be the most productive. An advertising manager may hypothesise that if consumers’ attitudes towards a product change in a positive direction, there will be an increase in consumption of the product. Statistical techniques allow us to decide whether or not our theoretical hypothesis is empirically confirmed. In other words, they allow us to statistically test what we may presume.

Null and alternative hypotheses Because scientists should be bold in conjecturing but extremely cautious in testing, statistical hypotheses generally are stated in a null form. A null hypothesis is a statement about a status quo. It is a conservative statement that communicates the notion that any change from what has been thought to be true or observed in the past will be due entirely to random error. In fact, the true purpose of setting up the null hypothesis is to provide an opportunity for nullifying it. For example, suppose academic researchers expect that highly dogmatic (that is, closed-minded) consumers will be less likely to try a new product than will less dogmatic consumers. The researchers would generally formulate a conservative null hypothesis. The null hypothesis in this case would be that there is no difference between high dogmatics and low dogmatics in their willingness to trial a new product. The alternative hypothesis would be that there is a difference between high dogmatics and low dogmatics. It states the opposite of the null hypothesis.

HYPOTHESIS TESTING Generally we assign the symbol H0 to the null hypothesis and the symbol H1 to the alternative hypothesis. The purpose of hypothesis testing is to determine which of the two hypotheses is correct. The process of hypothesis testing is slightly more complicated than that of estimating parameters because the decision-maker must choose between the two hypotheses. However, the student need not worry because the mathematical calculations are no more difficult than those we have already made.

hypothesis An unproven proposition or supposition that tentatively explains certain facts or phenomena; a proposition that is empirically testable.
null hypothesis A statement about a status quo asserting that any change from what has been thought to be true will be due entirely to random sampling error.
alternative hypothesis A statement indicating the opposite of the null hypothesis.


The hypothesis-testing procedure

The process of hypothesis testing goes as follows. First, we determine a statistical hypothesis. We then imagine what the sampling distribution of the mean would be if this hypothesis were a true statement of the nature of the population. Next, we take an actual sample and calculate the sample mean (or appropriate statistic, if we are not concerned about the mean). We know from our previous discussions of the sampling distribution of the mean that obtaining a sample value that is exactly the same as the population parameter is highly unlikely; we expect some difference between the sample mean and the population mean. We then must determine if the deviation between the obtained value of the sample mean and its expected value (based on the statistical hypothesis) could have occurred by chance alone. In other words, we ask: ‘Is the sample mean significantly different enough from the population mean, such that this difference could have occurred by chance?’ Suppose we observe that the sample value differs from the expected value. Before we can conclude that these results are improbable (or even probable), we must have some standard, or decision rule, for determining if, in fact, we should reject the null hypothesis and accept the alternative hypothesis. Statisticians define this decision criterion as the significance level.

significance level The critical probability in choosing between the null and alternative hypotheses; the probability level that is too low to warrant support of the null hypothesis.

The significance level is the critical probability in choosing between the null hypothesis and the alternative hypothesis. The level of significance determines the probability level – say, 0.05 or 0.01 – that is to be considered too low to warrant support of the null hypothesis. In other words, it is the probability that the results obtained from the sample have occurred as a result of chance. Assuming the null hypothesis being tested is true, if the probability of occurrence of the observed data is smaller than the significance level, then the data suggest that the null hypothesis should be rejected; in other words, there is evidence contradicting the null hypothesis (that is, evidence supporting the alternative hypothesis). In discussing confidence intervals (the set of acceptable hypotheses), statisticians use the term confidence level to refer to the level of probability associated with an interval estimate. However, when discussing hypothesis testing, statisticians change their terminology and call this the significance level, α (the Greek letter alpha).

An example of hypothesis testing

Going back to the water conservation example, suppose a local water authority is concerned that residents’ attitudes towards conserving household water are too low and wants to test whether or not these attitudes, on a 5-point Likert scale, are positive (1 = very negative and 5 = very positive) in order to inform the nature of a social marketing campaign to reduce household water consumption. The scale is assumed to be an interval scale, and experience has shown that the previous distribution of this attitudinal measurement was approximately normal. On a 5-point scale, a mean of 3.0 would imply that respondents have neither a positive nor a negative attitude towards household water consumption. So if we wanted to test whether or not attitudes were positive, we might formulate a null hypothesis that the mean is equal to 3.0:

H0: μ = 3.0

The alternative hypothesis is that the mean does not equal 3.0:

H1: μ ≠ 3.0

Next, the researcher must decide on a region of rejection. Exhibit 12.8 shows a sampling distribution of the mean assuming the null hypothesis (that is, assuming μ = 3.0). The darkly shaded area shows the region of rejection when 0.025 of the area falls in each tail of the curve. In other words, the region of rejection


shows those values that are very unlikely to occur if the null hypothesis is true, but relatively probable if the alternative hypothesis is true. The values within the unshaded area are acceptable at the 95 per cent confidence level (or 5 per cent significance level, or 0.05 alpha level), and if we find that our sample mean lies within this region of acceptance, we conclude that the null hypothesis is true. More precisely, we fail to reject the null hypothesis. In other words, the range of acceptance: (1) identifies those acceptable values that reflect a difference from the hypothesised mean in the null hypothesis; and (2) shows the range within which any difference is so minuscule that we would conclude that this difference was due to random sampling error rather than to a false null hypothesis.

← EXHIBIT 12.8 A SAMPLING DISTRIBUTION OF THE MEAN ASSUMING μ = 3.0


In our example, based on the survey of 909 residents, the sample mean was 3.79. (If σ is known, it is used in the analysis; however, this is rarely true and was not true in this case.)17 The sample standard deviation was S = 1.08. Now we have enough information to test the hypothesis. The researcher has decided that the decision rule will be to set the significance level at the 0.05 level. This means that, in the long run, the probability of making an erroneous decision when H0 is true will be fewer than 5 times in 100 (0.05). From Table A.2 in Appendix A, the researcher finds that the Z score of 1.96 represents a probability of 0.025 that a sample mean will lie more than 1.96 standard errors above μ. Likewise, the table shows that about 0.025 of all sample means will fall more than 1.96 standard errors below μ. The values that lie exactly on the boundary of the region of rejection are called the critical values. Theoretically, the critical values are Z = −1.96 and +1.96. Now we must transform these critical Z-values to the sampling distribution of the mean for this study. The critical values are (note that this is a similar calculation to the one used earlier to calculate confidence intervals, but here the value we are using is the population mean, which has been defined a priori):

Critical value – lower limit = μ − Z·S_X̄ = μ − Z(S/√n) = 3 − 1.96(1.08/√909) = 3 − 1.96(0.036) = 2.93

Critical value – upper limit = μ + Z·S_X̄ = μ + Z(S/√n) = 3 + 1.96(1.08/√909) = 3 + 1.96(0.036) = 3.07

Based on the survey, X̄ = 3.79. In this case the sample mean falls in the region of rejection (see Exhibit 12.9). Since the sample mean is greater than the upper critical value, 3.07, the researcher says that the sample result is statistically significant beyond the 0.05 level. In other words, fewer than 5 of every 100 samples would show results that deviate this much from the hypothesised value when, in fact, H0 is actually true. In other words, it is highly likely that mean attitudes are different from 3.0. What does this mean to the management of the water authority? The results indicate that residents seem to be positive about conserving household water. It is unlikely (a probability of less than 5 in 100) that this result would occur because of random sampling error. This means that management may consider other factors to change water conservation behaviour (e.g., increasing the level of perceived control over changing that behaviour) because attitudes already seem to be positive.

critical values The values that lie exactly on the boundary of the region of rejection.

EXHIBIT 12.9 A HYPOTHESIS TEST USING THE SAMPLING DISTRIBUTION OF X̄ UNDER THE HYPOTHESIS μ = 3.0

[The exhibit shows the hypothesised mean (3.0), the lower and upper critical values (2.93 and 3.07), and the sample mean from the survey (3.79), which lies beyond the upper critical value.]

An alternative way to test the hypothesis is to formulate the decision rule in terms of the Z-statistic. Using the following formula, we can calculate the observed value of the Z-statistic given a certain sample mean, X̄:

Z_obs = (X̄ − μ)/S_X̄ = (3.79 − 3.0)/0.036 = 0.79/0.036 = 21.94

In this case, the observed Z-value is 21.94, which is far greater than the critical value of 1.96, so the criterion of statistical significance at the 0.05 level has been met.
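The arithmetic above can be checked with a few lines of code. The sketch below is a Python illustration only (the chapter itself relies on statistical tables, SPSS and Excel); it recomputes the standard error, the critical values and the observed Z-value for the water conservation example using the figures reported above.

import math

# Figures from the water conservation example above
n, sample_mean, s, mu0 = 909, 3.79, 1.08, 3.0
z_critical = 1.96                          # two-tailed critical Z at the 0.05 level

std_error = s / math.sqrt(n)               # ≈ 0.036
lower = mu0 - z_critical * std_error       # ≈ 2.93 (lower critical value)
upper = mu0 + z_critical * std_error       # ≈ 3.07 (upper critical value)
z_obs = (sample_mean - mu0) / std_error    # ≈ 22 (the text reports 21.94, using the rounded standard error of 0.036)

print(f"standard error       = {std_error:.3f}")
print(f"region of acceptance = ({lower:.2f}, {upper:.2f})")
print(f"observed Z           = {z_obs:.2f}")
print("Decision:", "reject H0" if abs(z_obs) > z_critical else "fail to reject H0")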

Type I and Type II errors
Hypothesis testing, as we previously stated, is based on probability theory. Because we cannot make any statement about a sample with complete certainty, there is always the chance that an error will be made. In fact, the researcher runs the risk of committing two types of errors. Table 12.7 summarises the state of affairs in the population and the nature of Type I and Type II errors. The four possible situations in the table arise because the null hypothesis can be either true or false and the statistical decision will be either to accept the null hypothesis or to reject it. Although lawyers and judges do not concern themselves with the statistical terminology of Type I and Type II errors, they do follow this logic. For example, our legal system is based on the concept that a person is innocent until proven guilty. In other words, someone can be found guilty and actually be guilty (correct decision); found innocent and actually be innocent (correct decision); found guilty and be innocent (incorrect decision); or found innocent and be guilty (incorrect decision). Assume that the null hypothesis is that the individual is innocent. If we make a Type I error, we will send an innocent person to prison. Our legal system takes many precautions to avoid Type I errors. A Type II error would occur if a guilty party were set free (the null hypothesis would have been accepted). Our society places such a high value on avoiding Type I errors that Type II errors are more likely to occur. If we use this analogy, the same can hold for statistical testing.

TABLE 12.7 » TYPE I AND TYPE II ERRORS IN HYPOTHESIS TESTING

State of null hypothesis in the population    Decision: Accept H0    Decision: Reject H0
H0 is true                                    Correct – no error     Type I error
H0 is false                                   Type II error          Correct – no error

If the decision is made to reject the null hypothesis and the null hypothesis is in fact true, we will make what is called a Type I error. A Type I error has the probability of alpha (α), the level of statistical significance that we have set up. Simply put, a Type I error occurs whenever the researcher concludes that there is a statistical difference when in reality one does not exist. If the alternative hypothesis is, in fact, true (and the null hypothesis is false) but we conclude that we should not reject the null hypothesis, we make what is called a Type II error. The probability of making this incorrect decision is called beta (β). No error will occur if the null hypothesis is true and we make the decision to accept it. We will also make a correct decision if the null hypothesis is false and the decision is made to reject the null hypothesis. Unfortunately, without increasing sample size the researcher cannot simultaneously reduce Type I and Type II errors, because there is an inverse relationship between the two. Thus, reducing the probability of a Type II error increases the probability of a Type I error. In marketing problems, Type I errors generally are more serious than Type II errors, and thus there is greater concern with determining the significance level, alpha, than with determining beta.18
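The trade-off between α and β can also be made concrete with a small simulation. The sketch below is a Python illustration only, not part of the chapter's SPSS/Excel toolkit, and the population values used are invented for the demonstration. It draws repeated samples and records how often a two-tailed Z-test wrongly rejects a true null hypothesis (a Type I error, which should happen at roughly the α rate) and how often it fails to reject a false one (a Type II error).

import math
import random

def z_test_rejects(sample, mu0=3.0, z_critical=1.96):
    """Two-tailed Z-test of H0: mu = mu0 using the sample standard deviation (large n)."""
    n = len(sample)
    mean = sum(sample) / n
    variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
    std_error = math.sqrt(variance / n)
    return abs((mean - mu0) / std_error) > z_critical

random.seed(1)
runs, n = 2000, 100

# Type I error: the population mean really is 3.0, so every rejection is a mistake.
type1_rate = sum(z_test_rejects([random.gauss(3.0, 1.0) for _ in range(n)])
                 for _ in range(runs)) / runs

# Type II error: the population mean is actually 3.2, so failing to reject is a mistake.
type2_rate = sum(not z_test_rejects([random.gauss(3.2, 1.0) for _ in range(n)])
                 for _ in range(runs)) / runs

print(f"estimated Type I error rate (alpha)  ≈ {type1_rate:.3f}  # close to the chosen 0.05")
print(f"estimated Type II error rate (beta)  ≈ {type2_rate:.3f}  # depends on the true mean, n and alpha")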

REAL WORLD SNAPSHOT
Research conducted in Australia used the Theory of Planned Behaviour (TPB) to understand how individuals reacted to a social marketing initiative to conserve water.19 The TPB is a theory from social psychology that says an individual’s volitional behaviour (household water conservation in this case) can be explained by that individual’s attitudes towards the behaviour, their level of perceived behavioural control over the behaviour and the views of significant others about that behaviour. The research concluded that the TPB was applicable in this context but could be augmented by considering other variables, including an individual’s perceived water right (the degree to which they feel it is their right to consume as much water as they wish) and an individual’s sentiment towards the water authority’s management of the water situation in the area. It also found that household water conservation intentions are higher for households with children, and for individuals who are 18–59 years old. The research was conducted using Computer Assisted Telephone Interviewing (see Chapter 5) with an overall sample size of 909 individuals from a regional city in Australia. How accurately do you think the results represent the Australian population?

Type I error An error caused by rejecting the null hypothesis when it is true. Type II error An error caused by failing to reject the null hypothesis when the alternative hypothesis is true.


ONGOING PROJECT

CHOOSING THE APPROPRIATE STATISTICAL TECHNIQUE
Now that one statistical technique for hypothesis testing has been illustrated, it should be noted that a number of appropriate statistical techniques are available to assist the researcher in interpreting data. The choice of the method of statistical analysis depends on: (1) the type of question to be answered, (2) the number of variables and (3) the scale of measurement. The various statistical techniques, and a simple tree diagram outlining which technique to use based upon these three questions, are presented at the beginning of Part 6 on page 397.

Type of question to be answered
A researcher may wish to compare sample means or may wish to compare the distribution of a variable. Comparison of two salespeople’s average monthly sales will require a t-test of two means (see Chapter 13), whereas a comparison of quarterly sales distributions will require a Chi-square test for independence (see Chapter 14). The researcher should consider the method of statistical analysis before choosing the research design and before determining the type of data to collect. Once the data have been collected, the initial orientation towards analysis of the problem will be reflected in the research design.

Number of variables
The number of variables that will be simultaneously investigated is a primary consideration in the choice of statistical technique. A researcher who is interested only in the average number of times a prospective home-buyer visits price comparison websites to shop for interest rates can concentrate on investigating only one variable at a time. To generalise from a sample about one variable at a time, the researcher conducts univariate statistical analysis. Statistically describing the relationship between two variables at one time, such as the relationship between advertising expenditures and sales volume, requires bivariate statistical analysis. Tests of differences and measures of association between variables are discussed in Chapters 13 and 14 and also relate to the type of question to be answered. Multivariate statistical analysis, discussed in Chapter 15, is the simultaneous investigation of more than two variables.

univariate statistical analysis A type of analysis that assesses the statistical significance of a hypothesis about a single variable.

Scale of measurement
The scale of measurement on which the data are based, or the type of measurement reflected in the data, determines the permissible statistical techniques and appropriate empirical operations. Testing a hypothesis about a mean, as we have just discussed, requires interval scaled or ratio scaled data (recall from Chapter 8 that we can only take a mean of ratio or interval scaled data). Suppose a researcher is working with a nominal scale that identifies which credit card consumers most frequently use (MasterCard, Visa or American Express). Because the scale is nominal, the researcher may use only the mode as a measure of central tendency. In other situations, where data are measured on an ordinal scale, the median may be used as the average or a percentile may be used as a measure of dispersion.


Parametric versus nonparametric hypothesis tests
At this point it is important to consider some new terms related to statistical techniques. So far, we have concerned ourselves with parametric statistics. When the data are interval or ratio scaled and the sample size is large, parametric statistical procedures are appropriate. These procedures are based on the assumption that the data in the study are drawn from a population with a normal (bell-shaped) distribution and/or normal sampling distribution. However, often the data do not satisfy these assumptions. In such cases we can use nonparametric statistics instead. Making the assumption that the population distribution or sampling distribution is normal generally is inappropriate when data are either ordinal or nominal. Thus, nonparametric statistics are referred to as distribution free. Data analysis of both nominal and ordinal scales typically uses nonparametric statistical tests. We confine ourselves to this brief discussion of parametric and nonparametric statistics at this stage, and expand on it with reference to more specific procedures below and in Chapters 13 and 14.

SURVEY THIS!
The section below asks students some questions about everyday behaviour. Let’s examine some of this behaviour and see what students spend their time doing. Test the following univariate hypotheses using the students from your university and/or from the entire sample:
H1 The typical student sends more than 25 text messages per day.
H2 The typical student makes more than 15 mobile phone calls each day.
H3 Students are evenly distributed in the time they spend watching television. That is, one-quarter watch less than 1 hour a week, one-quarter watch 1–2 hours per day, one-quarter watch 2–3 hours per day, and one-quarter watch more than 3 hours per day.
H4 Students average studying 2–3 hours per week.


Some practical univariate tests
So far we have discussed the theoretical concepts behind inferential statistics, with specific reference to the Z-distribution. However, the Z-distribution is largely theoretical, and in more practical marketing problems we typically have other constraints. For example, we may have small sample sizes of below 30, and typically we do not know the population standard deviation. Furthermore, our data may not meet the assumptions required for parametric tests. To overcome these limitations, we now discuss the t-test, a simple extension of the Z-test discussed earlier, and the Chi-square test for goodness of fit, a useful and simple test of differences in frequencies.

THE t-DISTRIBUTION
In a number of situations researchers wish to test hypotheses about population means with sample sizes that are not large enough to be approximated by the normal distribution. When the sample size is small (n < 30) and the population standard deviation is unknown, we use the t-distribution. The t-distribution, like the standardised normal curve, is a symmetrical, bell-shaped distribution with a mean of zero and a standard deviation close to 1 (its standard deviation approaches 1 as the sample size increases). When the sample size (n) is larger than 30, the t-distribution and the Z-distribution may be considered almost identical. This is why in SPSS, a leading statistical software package, there is only an option to do a t-test. Since the t-distribution is contingent on sample size, there is a family of t-distributions. More specifically, the shape of the t-distribution is influenced by its degrees of freedom. Exhibit 12.10 illustrates t-distributions for 1, 2, 5 and an infinite number of degrees of freedom. The number of degrees of freedom (d.f.) is equal to the number of observations minus the number of constraints or assumptions needed to calculate a statistical term. Another way to look at degrees of freedom is to think of adding four numbers together when you know their sum – for example:

4 + 2 + 1 + X = 10

The value of the fourth number has to be 3. In other words, there is freedom of choice for the first three digits, but the fourth value is not free to vary. In this example there are three degrees of freedom.

t-distribution A symmetrical, bell-shaped distribution that is contingent on sample size. It has a mean of zero and a standard deviation that approaches 1 as the sample size increases.
degrees of freedom The number of observations minus the number of constraints or assumptions needed to calculate a statistical term.

EXHIBIT 12.10 THE t-DISTRIBUTION FOR VARIOUS DEGREES OF FREEDOM
[The exhibit plots relative frequency against values of t for t-distributions with 1, 2 and 5 degrees of freedom, together with the normal distribution; as the degrees of freedom increase, the t-distribution approaches the normal curve.]

The calculation of t closely resembles the calculation of the Z-value. To calculate t, use the formula:

t = (X̄ − μ)/S_X̄

with n − 1 degrees of freedom. Because the t-distribution closely approximates the Z-distribution with sample sizes over 30, in most practical cases we would use the t-distribution, not the Z-distribution. (Indeed, in SPSS you can only perform a t-test.)
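The statement that the t-distribution and the Z-distribution become almost indistinguishable once the sample is reasonably large can be verified directly. The sketch below is a Python illustration only (it assumes the SciPy library is available; neither Python nor SciPy is used in the chapter itself) and prints the two-tailed 0.05 critical values of t for several degrees of freedom alongside the familiar Z critical value of 1.96.

from scipy import stats

z_critical = stats.norm.ppf(0.975)        # two-tailed Z critical value at the 0.05 level (≈ 1.96)
print(f"Z critical value: {z_critical:.3f}")

for df in (1, 2, 5, 10, 24, 30, 100):
    t_critical = stats.t.ppf(0.975, df)   # two-tailed t critical value at the 0.05 level
    print(f"d.f. = {df:>3}: t critical value = {t_critical:.3f}")

# As the degrees of freedom grow past about 30, the t critical value approaches 1.96,
# which is why the t-test is used in practice for both small and large samples.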


Calculating a confidence interval estimate using the t-distribution
Suppose you are an entrepreneur introducing a new taco kit to help people cook authentic Mexican food. You have designed some sample taco kits and given them to people to try, and you want to estimate purchase likelihood in the future. Therefore, with each taco kit you include a question on purchase likelihood using a probability scale from 0 per cent to 100 per cent. You will be satisfied with the test kit when average purchase likelihood is at least 70 per cent. You sent the taco kit to 25 people and found that the sample mean, X̄, is 76 and the sample standard deviation, S, is 14.07. On this basis the researcher wishes to estimate the population mean with 95 per cent confidence. To find the confidence interval estimate of the population mean for this small sample, we use the formula

μ = X̄ ± t_c.l.·S_X̄

or

Upper limit = X̄ + t_c.l.(S/√n)
Lower limit = X̄ − t_c.l.(S/√n)

where
μ = population mean
X̄ = sample mean
t_c.l. = critical value of t at a specified confidence level
S_X̄ = standard error of the mean
S = sample standard deviation
n = sample size

More specifically, the step-by-step procedure for calculating the confidence interval is as follows:
1 X̄, calculated from the sample, is 76.
2 Since σ is unknown, the sample standard deviation, S, was estimated from the sample and is 14.07.
3 We estimate the standard error of the mean using the formula S_X̄ = S/√n. Thus, S_X̄ = 14.07/√25 = 2.81.
4 We determine the t-value associated with the desired confidence level. To do this, we go to Table A.3 in Appendix A. Although the t-table provides information similar to that in the Z-table, it is somewhat different. The t-table format emphasises the chance of error, or significance level (α), rather than the 95 per cent chance of including the population mean in the estimate. Our example is a two-tailed test. Since a 95 per cent confidence level has been selected, the significance level equals 0.05 (1.00 − 0.95 = 0.05). Once this has been determined, all we have to do to find the t-value is look under the 0.05 column for two-tailed tests at the row in which the degrees of freedom (d.f.) equal the appropriate value (n − 1). With 24 degrees of freedom (n − 1 = 25 − 1 = 24), the t-value at the 95 per cent confidence level (0.05 level of significance) is t = 2.06 (see Table A.3 in Appendix A).
5 We calculate the confidence interval:
Upper limit = 76 + 2.06(14.07/√25) = 81.80
Lower limit = 76 − 2.06(14.07/√25) = 70.20

In our hypothetical example it may be concluded with 95 per cent confidence that the population mean for purchase intention of the taco kit is between 70.20 and 81.80.
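The same interval can be reproduced from the summary statistics alone. The sketch below is an illustrative Python version of the calculation just shown (it assumes SciPy is available for the critical t-value; the chapter itself uses Table A.3, SPSS and Excel).

import math
from scipy import stats

n, sample_mean, s = 25, 76.0, 14.07
confidence = 0.95

std_error = s / math.sqrt(n)                                # ≈ 2.81
t_critical = stats.t.ppf(1 - (1 - confidence) / 2, n - 1)   # ≈ 2.064 with 24 d.f.

lower = sample_mean - t_critical * std_error                # ≈ 70.2
upper = sample_mean + t_critical * std_error                # ≈ 81.8
print(f"95% confidence interval: ({lower:.2f}, {upper:.2f})")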

ONGOING PROJECT


one-sample t-test A hypothesis test that uses the t-distribution rather than the Z-distribution; it is used when testing a hypothesis about a mean with a small sample size and an unknown population standard deviation (σ).

Univariate hypothesis test using the t-distribution
Now, using the same information, let’s perform a univariate t-test, also known as a one-sample t-test. The step-by-step procedure for a one-sample t-test is conceptually similar to that for hypothesis testing with the Z-distribution. The first step is to state the null hypothesis and the alternative hypothesis:

H0: μ = 70
H1: μ ≠ 70

Recall that n = 25. Next, the researcher needs the sample mean, X̄ = 76, and the sample standard deviation, S = 14.07, and estimates the standard error of the mean (S_X̄). Recall from above that S_X̄ = 2.81. The researcher then finds the t-value associated with the desired level of statistical significance. If a 95 per cent confidence level is desired, the significance level is 0.05. Next, the researcher must formulate a decision rule to specify the critical values within which the sample mean should fall, by computing the upper and lower limits of the confidence interval to define the regions of rejection. Recall from above that the value of t for 24 degrees of freedom (n − 1 = 25 − 1) is 2.06. The critical values are:

Lower limit = μ − t_c.l.·S_X̄ = 70 − 2.06(14.07/√25) = 70 − (2.06 × 2.81) = 64.20
Upper limit = μ + t_c.l.·S_X̄ = 70 + 2.06(14.07/√25) = 70 + (2.06 × 2.81) = 75.80

Finally, the researcher makes the statistical decision by determining whether the sample mean falls between the critical limits. For the taco kit sample, X̄ = 76. In this case, the sample mean is within the region of rejection. Because the sample result is more than the critical value at the upper limit, the null hypothesis can be rejected. In other words, the entrepreneur should be satisfied that mean purchase likelihood is higher than 70 per cent. As with the Z-test, there is an alternative way to test a hypothesis with the t-statistic. This is by using the formula:

t_obs = (X̄ − μ)/S_X̄ = (76 − 70)/2.81 = 6/2.81 = 2.14

We can see that the observed t-value is greater than the critical t-value of 2.064 at the 0.05 level with 25 − 1 = 24 degrees of freedom. This means there is sufficient evidence to suggest that mean purchase likelihood is greater than the desired average of 70, and we reject H0. Therefore, this finding implies that consumer acceptance of the new product is beyond what the entrepreneur deems to be an acceptable level and provides evidence that acceptance in the marketplace will be high. Of course this is a small sample, and to gain greater confidence the entrepreneur ought to conduct more comprehensive testing. However, the evidence here suggests that consumers are likely to purchase this new product. In a report it would be important to state the sample mean and, in parentheses, the corresponding t-value, the degrees of freedom and the significance level.
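As a cross-check on the manual calculation, the sketch below (again an illustrative Python version assuming SciPy is available; it anticipates the exact significance level that SPSS reports in the next section) computes the observed t-value and the two-tailed p-value from the same summary statistics.

import math
from scipy import stats

n, sample_mean, s, mu0 = 25, 76.0, 14.07, 70.0

std_error = s / math.sqrt(n)                         # ≈ 2.81
t_obs = (sample_mean - mu0) / std_error              # ≈ 2.13
p_two_tailed = 2 * stats.t.sf(abs(t_obs), n - 1)     # exact two-tailed significance ≈ 0.043

print(f"t = {t_obs:.3f}, d.f. = {n - 1}, p (two-tailed) = {p_two_tailed:.3f}")
# p < 0.05, so H0 is rejected: mean purchase likelihood appears to exceed 70 per cent.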


In summary, the one-sample t-test is an easy-to-use univariate test that helps researchers to understand whether or not a sample mean is statistically different from some population parameter or some known standard. Such tests are used in a variety of applications and are conceptually easy to understand. For instance, a researcher might wish to test if mean attitudes are higher than some set value – say, 5 on a 7-point scale. Likewise, a researcher may wish to test whether or not the number of customer complaints one month is higher than usual. In such cases a researcher would use a one-sample t-test to see if the sample mean is statistically different from the known value.

Conducting a one-sample t-test in SPSS

SPSS

We now demonstrate using SPSS. Access the file ‘purcintent.sav’ from the CourseMate website and perform the following click-through sequence: Analyse > Compare Means > One-Sample T-Test. Select ‘likelihood’ as the Test Variable and input ‘70’ as the Test Value, as shown in Exhibit 12.11. Click OK. This produces the results shown in Exhibit 12.12. From Exhibit 12.12 we can see that the sample mean, X̄, is 76, and the standard error of the mean, S_X̄, is 2.81. SPSS calculates the t-value as 2.132 based on these results (try the calculation yourself manually – all the information is there). However, instead of simply determining whether the t-value is greater or less than the critical t-value, as we would with a manual calculation, SPSS provides an exact significance level. In this case, the two-tailed significance value is 0.043. This means there is a 4.3 per cent probability that the observed sample mean would have occurred as a result of chance if the null hypothesis were true. In other words, because the significance value is below the 5 per cent level, we can say that it is unlikely that these results have occurred by chance. It would appear that mean purchase likelihood is greater than 70, as required by our budding young entrepreneur.

EXHIBIT 12.11 DOING A ONE-SAMPLE t-TEST IN SPSS

ONE-SAMPLE STATISTICS

                      N    Mean      Std. deviation   Std. error mean
Purchase likelihood   25   76.0000   14.06829         2.81366

ONE-SAMPLE TEST (Test value = 70)

                      t       d.f.   Sig. (2-tailed)   Mean difference   95% confidence interval of the difference
                                                                          Lower     Upper
Purchase likelihood   2.132   24     0.043             6.00000           0.1929    11.8071

EXHIBIT 12.12 OUTPUT FROM ONE-SAMPLE t-TEST IN SPSS


EXCEL

Conducting a one-sample t-test in Excel
If you haven’t already done so, make sure you install the Analysis ToolPak in Excel (use Excel’s Help menu to find out how to do this). Excel does not automatically perform a one-sample t-test. Instead, the researcher has to perform some calculations beforehand and can use Excel to assist. Open up the data file purcintent.xlsx from the MarketingCentral website. In the first column (from A1 to A25), each of the 25 values from our sample has been included. Beside these values you will see calculations for the sample mean, the sample standard deviation, the sample size and the standard error. The first three are computed using basic functions in Excel (see Exhibit 12.13 for those formulae). Once we know the sample mean, the sample standard deviation and the sample size, we can then easily calculate the standard error. The desired mean is simply the mean or standard against which we are judging and comparing our sample mean – defined by us as 70 in this example. Once we know all these values we can then calculate the t-value. Knowledge of the t-value and other parameters, such as the degrees of freedom, then enables us to calculate a two-tailed significance level – which leads us to a value of 0.0434, as with SPSS. The same interpretation applies. You can use this spreadsheet to perform one-sample t-tests for up to 100 observations as long as the formulae in the other cells are not changed! Simply replace the current sample with your own numbers and replace the desired mean.

EXHIBIT 12.13 →  DOING A ONE-SAMPLE t-TEST IN EXCEL


The Chi-square test for goodness of fit


ONGOING PROJECT

Going back to the taco kit example, suppose the entrepreneur wanted to test preference for their taco kit (Tasty Tacos) by doing a simple taste test, comparing it against the nearest competitor (Old El Perro) and asking only a small sample of 90 respondents. The results in Table 12.8 show the observed preferences.

TABLE 12.8 » ONE-WAY FREQUENCY TABLE FOR TACO KIT PREFERENCE

Taco kit        Frequency (preference for each brand)
Tasty Tacos     55
Old El Perro    35
Total           90

The Chi-square (χ²) test allows us to test for significance in the analysis of frequency distributions. Thus, categorical data on variables such as gender, education or dichotomous answers may be statistically analysed. Suppose, for example, that we wish to test the null hypothesis that the number of people who prefer Tasty Tacos is equal to the number of people who prefer Old El Perro. This would lead to the following hypotheses:

H0: Preference for Old El Perro is equal to preference for Tasty Tacos
H1: Preference for Old El Perro is not equal to preference for Tasty Tacos

The logic inherent in the χ² test allows us to compare the observed frequencies (Oi) with the expected frequencies (Ei) based on our theoretical ideas about the population distribution or our presupposed proportions. In other words, the technique tests whether the data come from a certain probability distribution. It tests the ‘goodness of fit’ of the observed distribution with the expected distribution. If we expect the frequencies for preference to be equal and asked a random sample of 90 people, then we would expect that 45 people would prefer Old El Perro and 45 people would prefer Tasty Tacos. In that case, does our observed distribution (that is, 55:35) differ significantly from the expected frequency distribution (45:45)? To determine this, we calculate the Chi-square statistic. Calculation of the Chi-square statistic allows us to determine whether the difference between the observed frequency distribution and the expected frequency distribution can be attributed to sampling variation. The steps in this process are as follows:
1 Formulate the null hypothesis and determine the expected frequency of each answer.
2 Determine the appropriate significance level.
3 Calculate the χ² value, using the observed frequencies from the sample and the expected frequencies.
4 Make the statistical decision by comparing the calculated χ² value with the critical χ² value.

To analyse the taco kit preference data in Table 12.8, start with a null hypothesis that suggests that preference is equal for the two taco kits (we may have different expectations, and may therefore form a different hypothesis, but for simplicity let’s just assume we want to test against the hypothesis that they are equal). Thus, the expected probability of each answer (preference for Old El Perro or preference for Tasty Tacos) is 0.5. In a random sample of 90, 45 people would be expected to prefer Old El Perro and 45 people would be expected to prefer Tasty Tacos. Once we have the observed frequencies and have calculated the expected frequencies, the Chi-square statistic may be calculated.

Chi-square (χ²) test A hypothesis test that allows for investigation of statistical significance in the analysis of a frequency distribution.


To calculate the Chi-square statistic, use the following formula:

χ² = ∑ (Oi − Ei)² / Ei

where
χ² = Chi-square statistic
Oi = observed frequency in the ith cell
Ei = expected frequency in the ith cell

Sum the squared differences:

χ² = (O1 − E1)²/E1 + (O2 − E2)²/E2

Thus, we determine that the Chi-square value equals 4.44 (see Table 12.9 for a detailed calculation):

χ² = (55 − 45)²/45 + (35 − 45)²/45 = 4.44

TABLE 12.9 » CALCULATING THE CHI-SQUARE STATISTIC

Preference for taco kits        Observed frequency (Oi)   Expected probability   Expected frequency (Ei)   (Oi − Ei)
Preference for Tasty Tacos      55                        0.5                    45                        10
Preference for Old El Perro     35                        0.5                    45                        −10
Total                           90                        1.0                    90                        χ² = 4.44

Like many other probability distributions, the χ² distribution is not a single probability curve but a family of curves. These curves, although similar, vary according to the number of degrees of freedom (k − 1). Thus, we must calculate the number of degrees of freedom. (Remember, degrees of freedom refers to the number of observations that can be varied without changing the constraints or assumptions associated with a numerical system.) We do this as follows:

d.f. = k − 1

where k = the number of cells associated with column or row data.20 In the taco kit example there are only two categorical responses. Thus, the degrees of freedom equal 1 (d.f. = 2 − 1 = 1). Now the computed Chi-square value needs to be compared with the critical Chi-square value associated with the 0.05 probability level and 1 degree of freedom. In Table A.4 of Appendix A, the critical Chi-square value is 3.84. Since the calculated Chi-square is larger than the critical Chi-square value, the null hypothesis – that the observed values are comparable to the expected values – is rejected.21 Since we reject the null hypothesis, this means that there is evidence to suggest that Tasty Tacos may be preferred to Old El Perro in a simple taste test. However, clearly the sample size is relatively small and a researcher would be wise to consider conducting further research on different samples before any firm conclusions are drawn. Note that you may wish to use different expected frequencies (they do not have to be equal), but these ought to be informed by other evidence rather than guesswork (previous surveys, qualitative research etc.). These results can be reported quite parsimoniously in a table such as Table 12.9. The table might be included in the body of the report with expected frequencies in brackets next to the observed frequencies. Brief reference could be made to the corresponding Chi-square value and the level of significance, with appropriate implications drawn. Chi-square tests are useful when a researcher wants to examine whether or not there is a statistical difference between a sample frequency distribution and a known frequency distribution. For example, a researcher may wish to test whether or not awareness of a new product is greater than 10 out of 80 people. Conceptually, univariate Chi-square tests are similar to one-sample t-tests; the difference is that we are comparing a sample frequency distribution to a known frequency distribution, rather than comparing a sample mean to a known mean or value. We discuss the Chi-square test further in Chapters 14 and 15, as it is also frequently used to analyse association between two or more nonmetric variables in a cross-tabulation.
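The goodness-of-fit calculation can also be verified programmatically. The sketch below is an illustrative Python version (assuming SciPy is available; the chapter itself demonstrates the test in SPSS and Excel) that reproduces the taco kit example both directly from the formula and with SciPy's built-in chisquare function.

from scipy import stats

observed = [55, 35]    # observed preferences: Tasty Tacos, Old El Perro
expected = [45, 45]    # expected frequencies under the null hypothesis of equal preference

# Direct application of the chi-square formula
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(f"chi-square (by formula) = {chi_square:.2f}")                                   # ≈ 4.44

# The same test using SciPy, which also returns the exact significance level
statistic, p_value = stats.chisquare(f_obs=observed, f_exp=expected)
print(f"chi-square = {statistic:.3f}, d.f. = {len(observed) - 1}, p = {p_value:.3f}")  # ≈ 0.035
# p < 0.05, so the null hypothesis of equal preference is rejected.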

Conducting a Chi-square test in SPSS

SPSS

To conduct the test in SPSS, open up the data file ‘taco kit preference.sav’ from the CourseMate website and go through the following click-through sequence: Analyse > Nonparametric Tests > Legacy Dialogs > Chi-Square. As in Exhibit 12.14, select ‘preference’ as the Test Variable and make sure that for the Expected Values you have selected ‘All categories are equal’. If there are different a priori assumptions about how preferences should be distributed, then these can be taken into account, too. Click OK and you should see the output in Exhibit 12.15.

EXHIBIT 12.14 DOING A CHI-SQUARE TEST FOR GOODNESS OF FIT IN SPSS


EXHIBIT 12.15 SPSS OUTPUT FROM THE CHI-SQUARE GOODNESS OF FIT TEST

PREFERENCE
               Observed N   Expected N   Residual
Tasty Tacos    55           45.0         10.0
Old El Perro   35           45.0         −10.0
Total          90

TEST STATISTICS
               Preference
Chi-square     4.444a
d.f.           1
Asymp. sig.    0.035

a 0 cells (0.0%) have expected frequencies less than 5. The minimum expected cell frequency is 45.0.

Based on these results it appears that Tasty Tacos is the more preferred brand. However, we only asked 90 people in our random taste test. How likely is it that these results have occurred by chance? From the output, we can see that SPSS has calculated the expected values based upon equal preference. The calculation was performed using the Chi-square statistic (try this calculation yourself) and the results report a Chi-square value of 4.44 and a significance value of 0.035. Again, SPSS has calculated the exact significance value and we do not need to consult the tables in the back of the book. Therefore, the results suggest that there is a 3.5 per cent probability that the observed frequencies have occurred as a result of chance. In other words, because the significance value is below the 5 per cent level, we can say that it is unlikely that these results have occurred by chance. So, it would appear that Tasty Tacos is preferred, based on these results.

EXCEL

Conducting a Chi-square test in Excel
To conduct a Chi-square test in Excel, open up the data file ‘taco kit preference.xlsx’ from the MarketingCentral website. First, notice how the data are arranged in Exhibit 12.16. In Excel you need to first input the actual preferences (as in cells C3 and D3). To conduct the Chi-square test, these need to be matched against the expected frequencies. If we expected equal preferences, then this means cells C6 and D6 must both be 45. You can then calculate the significance level associated with the Chi-square value by using the CHIDIST function (see the formula in cell C8). Based on the calculation, the Chi-square test reveals a significance level of 0.035, as in the SPSS example. This can be interpreted in the same way and suggests that Tasty Tacos is preferred, because the probability that the results have occurred by sampling error is less than 0.05.

TIPS OF THE TRADE

»» Before running your analysis, always clean your data first. There may be important tasks to do first, such as reverse coding variables, combining data, coding missing values, and checking the coding and labels to ensure they are correct.
»» Once the data are cleaned, use the Explore function to ‘eyeball’ the data. What are they telling you? Do they make sense? Do you see any patterns? Is there anything you have forgotten to clean (e.g., is there an inconsistent value in there – 77 on a 1 to 7 scale – or did you forget to designate the missing values)?
»» Always try to understand what you are doing when using SPSS or Excel. These packages will do almost anything that you tell them, but that doesn’t make it right. Understanding the process is at least as important as the process itself.
»» If you get stuck, use the help menu. Both packages have useful help menus to assist you with commands.
»» Clearly save different versions of your output – you can quickly get swamped in data if you are not careful!
»» Be aware of the limitations of your data and the analyses you have performed.
»» . . . but most importantly, enjoy finding out new things and translating this information into actionable insights.

← EXHIBIT 12.16 DOING A CHI-SQUARE TEST FOR GOODNESS OF FIT IN EXCEL


A REMINDER ABOUT STATISTICS
Learning the terms and symbols defined in this chapter will provide you with the basics of the language of statisticians and researchers. As you learn more about the pragmatic use of statistics in marketing research, do not forget these concepts. The speller who forgets that ‘i comes before e, except after c’ will have trouble every time he or she must tackle the spelling of a word with the ie or ei combination. The same is true for the student who forgets the basics of the ‘foreign language’ of statistics.

SUMMARY

TO EXPLAIN THE DIFFERENCE BETWEEN DESCRIPTIVE AND INFERENTIAL STATISTICS AND DISCUSS THE PURPOSE OF INFERENTIAL STATISTICS IN TERMS OF POPULATION PARAMETERS AND SAMPLE STATISTICS

Statistics is the language of the researcher, and this chapter introduced its vocabulary. Descriptive statistics describe characteristics of a population or sample. Inferential statistics investigate samples to draw conclusions about entire populations. We use inferential statistics to make inferences about a population’s characteristics based on a sample. In other words, the sample provides us with an estimate (that is, a sample mean) that we use to infer characteristics of the population (that is, the population mean).

MAKE DATA USABLE BY ORGANISING AND SUMMARISING THEM INTO FREQUENCY DISTRIBUTIONS, MEASURES OF CENTRAL TENDENCY AND MEASURES OF DISPERSION

There are several ways in which we can summarise data. A frequency distribution summarises data by showing how frequently each response or classification occurs. Three measures of central tendency are commonly used: the mean (or arithmetic average), the median (halfway value) and the mode (most frequently observed value). These three values may differ, and care must be taken to understand distortions that may arise from using the wrong measure of central tendency. Measures of dispersion along with measures of central tendency can describe a distribution. The range is the difference between the largest and smallest values observed. The variance and standard deviation are the most useful measures of dispersion.

IDENTIFY THE CHARACTERISTICS AND IMPORTANCE OF THE NORMAL DISTRIBUTION, COMPUTE AND USE A STANDARDISED Z-VALUE, AND DISTINGUISH BETWEEN POPULATION, SAMPLE AND SAMPLING DISTRIBUTIONS

The normal distribution fits many observed distributions. It is symmetrical about its mean, with equal mean, median and mode. Almost the entire area of the normal distribution lies within ±3 standard deviations of the mean. Any normal distribution can easily be compared with the standardised normal, or Z, distribution, whose mean is 0 and standard deviation is 1. This allows easy evaluation of the probabilities of many occurrences. The researcher can estimate the probability of different occurrences by calculating a Z-value (a standardised value). The Z-value is then used to estimate the probability of that value occurring by comparing it with the Z-distribution in Table A.2 in Appendix A. The techniques of statistical inference are based on the relationship among the population distribution, the sample distribution and the sampling distribution. The population distribution is a frequency distribution of the population elements and a sample distribution is a frequency distribution of a sample. A sampling distribution is a theoretical probability distribution that shows the functional relation between the possible values of some summary characteristic of n cases drawn at random, and the probability (density) associated with each value over all possible samples of size n.

TO EXPLAIN THE CENTRAL-LIMIT THEOREM FOR THE PURPOSE OF COMPUTING CONFIDENCE INTERVAL ESTIMATES AND TESTING SIMPLE UNIVARIATE HYPOTHESES

These ideas are based around the central-limit theorem, which states: as sample size, n, increases, the distribution of the sample mean taken from a random sample approaches a normal distribution. The central-limit theorem is robust and works regardless of the shape of the original population distribution. The central-limit theorem allows us to determine confidence intervals around sample estimates. Estimating a population mean with a single value gives a point estimate. A range of numbers, within which the researcher is confident that the population mean will lie, is a confidence interval estimate. The confidence level is a percentage that indicates the long-run probability that the confidence interval estimate will be correct. Having reviewed the concepts behind inferential statistics, this chapter discussed univariate statistical procedures for hypothesis testing. A hypothesis is a statement or conjecture made by the researcher. A null hypothesis is a statement about the status quo. The alternative hypothesis is a statement that indicates the opposite of the null hypothesis. In hypothesis testing, a researcher states a null hypothesis about a population mean and then attempts to disprove it. Based on a significance level, the Z-test defines critical values on the standardised normal distribution beyond which it is unlikely that the null hypothesis is true. If a sample mean is contained in the region of rejection, the null hypothesis is rejected.

TO DISCUSS THE NATURE OF THE t-DISTRIBUTION, CALCULATE A HYPOTHESIS TEST ABOUT A MEAN USING A ONE-SAMPLE t-TEST AND TO RUN A ONE-SAMPLE t-TEST AND INTERPRET THE OUTPUT IN IBM SPSS STATISTICS AND MICROSOFT EXCEL

The t-distribution is used for hypothesis testing with small samples when the population standard deviation is unknown. The t-test is analogous to the Z-test. Conceptually, the hypothesis test of a proportion is similar to the Z-test for a mean. This chapter presented the technique for using t-distributions to estimate confidence intervals for the mean. Calculation of the confidence interval requires use of the central-limit theorem to estimate a range around the sample mean, which should contain the population mean. It also showed how to perform these tests and interpret the output in two commonly used statistical packages, SPSS and Excel.


TO EXPLAIN THE SITUATIONS IN WHICH A UNIVARIATE CHI-SQUARE TEST IS APPROPRIATE, UNDERSTAND AND BE ABLE TO PERFORM A CHI-SQUARE TEST AND TO RUN THE TEST AND INTERPRET THE OUTPUT IN SPSS AND MICROSOFT EXCEL

The Chi-square test allows testing of statistical significance in the analysis of frequency distributions. To conduct a Chi-square test, an observed distribution of categorical data from a sample is compared with an expected distribution for goodness of fit. This chapter illustrated how to conduct Chi-square tests in SPSS and Microsoft Excel and interpret the output. Guidance was also given in relation to how to interpret the output from these packages and the inclusion of pertinent information from the output generated.

DISTINGUISH BETWEEN PARAMETRIC AND NONPARAMETRIC STATISTICS

When the data are interval or ratio scaled and the sample size is large, parametric statistical procedures are appropriate. These procedures are based on the assumption that the data in the study are drawn from a population with a normal (bell-shaped) distribution and/or normal sampling distribution. However, in marketing studies, often the data do not satisfy these assumptions. If this is the case then we would use nonparametric statistics instead.

KEY TERMS AND CONCEPTS
alternative hypothesis, central-limit theorem, Chi-square (χ²) test, confidence interval estimate, confidence level, critical values, degrees of freedom, frequency distribution, hypothesis, mean, median, mode, normal distribution, null hypothesis, one-sample t-test, percentage distribution, point estimate, population distribution, population parameters, probability, range, sample distribution, sample statistics, sampling distribution, significance level, standard deviation, standard error of the mean, standardised normal distribution, t-distribution, Type I error, Type II error, univariate statistical analysis, variance

QUESTIONS FOR REVIEW AND CRITICAL THINKING
1 What is the difference between descriptive and inferential statistics?
2 In your own words, define and explain the terms ‘central tendency’ and ‘measures of dispersion’. For each measure of central tendency and dispersion, give an example of a situation where each is most appropriate.


3 Number of tackles and goals for AFL clubs in the 2015 season are shown in the table below, obtained from http://www.afl.com.au/stats.

TABLE 12.10 » AFL 2015 TEAM STATISTICS ON TACKLES AND GOALS

Club                 Tackles   Goals
Adelaide Crows       1004      220
Brisbane Lions       1086      156
Carlton              970       170
Collingwood          1233      223
Essendon             1154      175
Fremantle            1112      210
Geelong Cats         1111      205
Gold Coast Suns      1146      180
GWS Giants           1161      204
Hawthorn             1089      282
Melbourne            1196      173
North Melbourne      1104      232
Port Adelaide        1245      225
Richmond             1004      201
St Kilda             1195      200
Sydney Swans         1237      211
West Coast Eagles    1029      261
Western Bulldogs     1190      221

Source: afl.com.au



Calculate the mean and median for these data. Why is the mode not very useful? Which measure of central tendency is most appropriate? Would a frequency distribution be useful to represent the data?

4 For the following situations select whether the mean, median or mode is the most appropriate measure of central tendency.
   a Average customer satisfaction (based on a 1 to 7 Likert scale) among your target segment of customers.
   b Average yearly sales (based on $000s per annum) for the eight members of your sales team.
   c Average sales price for equivalent quality used cars in your sales area.
   d Average gross salary (including commission and bonuses) for sales staff working in equivalent sales roles.
   e Preference for competing brands as indicated by selecting the most preferred of six possible car brands.
5 Describe in your own words the purpose of a standard deviation. Why is a standard deviation preferable to the average deviation when measuring dispersion?
6 Calculate the standard deviation for the data in Question 3.
7 Why do we need to determine the level of dispersion? (Hint: draw three distributions that have the same mean value but different standard deviation values. Draw three distributions that have the same standard deviation value but different mean values.) Why might a standard deviation be useful in assessing the data in Question 3?
8 For a national retail chain an analyst noticed that in the Victorian region average service quality, on a 7-point scale (where 1 indicates low levels of service quality and 7 indicates high levels of service quality), was 5.8 (mean) and the standard deviation was 1.7. However, in the New South Wales region average service quality was 5.3 (mean), with a standard deviation of 3.3. What do these statistics tell us about these two sales regions and their levels of customer satisfaction?
9 What is the sampling distribution? How does it differ from the sample distribution?
10 What would happen to the sampling distribution of the mean if we increased sample size from 5 to 25?
11 Suppose a fast-food restaurant wishes to estimate average sales volume for a new menu item. The restaurant has analysed the sales of the item at a similar outlet and observed the following results:
   X̄ = 500 (mean daily sales)
   S = 100 (standard deviation of sample)
   n = 25 (sample size)
   The restaurant manager wants to know into what range the mean daily sales should fall 95 per cent of the time. Perform this calculation.

12 What is the purpose of a statistical hypothesis?
13 What is a significance level? How does a researcher choose a significance level?
14 Distinguish between a Type I and Type II error. (Hint: you may wish to illustrate using a judge’s guilty/not guilty verdict to help you.)
15 What are the factors that determine the choice of the appropriate statistical technique?
16 A market research company and a television network are trying to see if seriously short breaks (shorter, more concise advertisement breaks) are more effective in generating recall (measured as percentage of customers recalling the ad) and liking for the advertised products and services (measured on a 1 to 7 Likert scale, with lower numbers meaning less positive evaluations) than more traditional advertising breaks. State the various null and alternative hypotheses.
17 Assume you have the following data: H0: μ = 6.0, S = 1.2, n = 48 and X̄ = 5.5. Conduct a one-sample t-test at the 0.05 significance level to see if the sample mean of 5.5 is different from the population mean of 6.0.
18 Assume you have the following data: H0: μ = 2450, S = 400, n = 100 and X̄ = 2300. Conduct a hypothesis test at the 0.01 significance level.
19 If the data in Question 18 had been generated with a sample of 25 (n = 25), in theory what statistical test would be appropriate?
20 How does the t-distribution differ from the Z-distribution?
21 The answers to a researcher’s question will be nominally scaled. What statistical test is appropriate for comparing the sample data with hypothesised population data?
22 A researcher plans to ask employees whether they favour, oppose or are indifferent about a change in the company’s superannuation provider. Formulate a null hypothesis for a Chi-square test, and determine the expected frequency for each answer.
23 Suppose the observed responses to the proposed survey in Question 22 were: favour (30), oppose (15) and indifferent (25). Perform a Chi-square test based on these observed responses and the expected frequencies provided in Question 22.
24 Give an example in which a Type I error may be more serious than a Type II error.
25 Refer to the Tasty Tacos χ² data on page 429. What statistical decisions could be made if the 0.01 significance level were selected rather than the 0.05 level?
26 The following is a summary of typical types of data that might be received from a questionnaire to discover the local population’s agreement and disagreement with the success of new neighbourhood initiatives designed to enhance wellbeing. Determine a statistical hypothesis and perform a Chi-square test on the data.
   a Public transport facilities in the neighbourhood have improved over the last year:
      Agree 20, Neutral 36, Disagree 28 (total 84)
   b Generally speaking there is less rubbish lying around the neighbourhood now than a year ago:
      Agree 48, Neutral 22, Disagree 12 (total 82)
27 Suppose a researcher is interested in measuring residents’ attitudes towards the establishment of a new local tavern with 40 gaming machines. Based on a series of questions, it was established that the sample’s attitude towards the venue being established with the gaming machines was 2.24 on a scale of 1 (strongly disagree) to 5 (strongly agree). One hundred and twenty-two people were surveyed and the sample standard deviation was 0.63. The local government responsible for deciding whether or not the venue’s licence should be approved wants to use this information to assist in its decision about whether or not plans for the venue should go ahead. Based on this information should the venue go ahead? (Hint: conduct a univariate hypothesis test.)
28 Suppose an entrepreneur is interested in judging how his new product, curry kits, will be taken up in the marketplace. The entrepreneur selects a random sample of 98 people and finds that 62 people would try the new curry kit. Assume that there must be at least 70 people who will try the curry kit. The entrepreneur also finds that mean adoption intention, on a scale of 1 to 7, was 5.5. The entrepreneur will go ahead with the concept if mean adoption intention is higher than 5.2. Suppose the sample standard deviation is 1.23. What tests would you use to help the entrepreneur and what would your conclusions be based on?

ONGOING PROJECT RUNNING SOME UNIVARIATE OR BIVARIATE STATISTICS? CONSULT THE PROJECT WORKSHEETS FOR CHAPTERS 12, 13 AND 14 FOR HELP

Selecting a test to use is based on answering some simple questions (see the flowchart at the beginning of Part 6) about the hypotheses you are testing, how the variables are measured

and how many variables to include. You then need to be able to interpret the output from the test. These steps can be followed by using the project worksheets for Chapters 12, 13 and 14 available from the CourseMate website. It’s a good idea to know what tests you are going to use before you collect the data (then you can collect the right data).


COURSEMATE ONLINE STUDY TOOLS
Flip to the start of your textbook and use the tear-out card to log in to CourseMate for Marketing Research. There you can test your understanding and revise chapter concepts with:
☑ interactive quizzes
☑ flashcards
☑ datasets
☑ research activities
☑ videos.

WRITTEN CASE STUDY 12.1 ATTITUDES TOWARDS WATER CONSERVATION IN A DROUGHT-STRICKEN AREA OF AUSTRALIA
In light of Australia’s dry weather conditions, managers and policy makers have had to try new ways of reducing the population’s water consumption. Policy tools such as price increases are politically contentious, and other mechanisms such as increasing supply through desalination plants and new dams are expensive and have environmental consequences. One way of reducing household water consumption is to change people’s attitudes towards consuming and conserving water.22 Behaviour change through social marketing may also be a cheaper way to do it. As a market researcher, suppose you want to collect some primary data to try to understand:
1 what residents’ intentions are in terms of conserving water in the next 12 months (measured using two items anchored from 1 to 7, where 7 indicates higher intention)
2 what their attitudes are to conserving water (measured using three items anchored from 1 to 7, where 7 indicates a more positive attitude)
3 the degree to which they perceive they have the control to save water over the next 12 months (measured using two items anchored from 1 to 7, where 7 indicates higher perceived behavioural control)
4 the importance of the views of significant others (such as friends and family) in influencing their decision to conserve water (measured using three items anchored from 1 to 7, where 7 indicates a greater influence from significant others).
This will help you to create more efficiently targeted marketing communications.

QUESTIONS

1 Outline a proposed study to find the answers to these questions. Specify the kinds of things you ought to measure and how you would collect the data to help you answer these questions.
2 Specifically, how might you summarise the key data (e.g., attitudes, intentions, perceived behavioural control and social norms) to get an overview of what residents are saying?
3 Are there any hypotheses that you might generate? If so, specify what they would be in relation to the proposal.
4 From the MarketingCentral website open up the data file ‘water.xlsx’ or ‘water.sav’, depending on whether you prefer to do the analysis in Excel or SPSS. (Note: this is hypothetical data based around a real study.) Are attitudes, social norms, perceived behavioural control and intentions positive? Run some one-sample t-tests based on test values you determine, and comment on the results. Consider how you would communicate these results if presenting them in a report.

WRITTEN CASE STUDY 12.2 GAMBLING IN YOUR COMMUNITY: WHAT DO THE LOCALS THINK?
When a new pub or tavern with electronic gaming machines (EGMs) opens in your community, its impact upon the community often has to be evaluated by the local council. Suppose, in one such case, the local council ran a survey. Specifically, the council wanted to assess community awareness about the application for gaming machines in the development, and what community attitudes were towards the new tavern with its EGMs. The findings from the research were intended to provide the council with a detailed understanding of community views towards EGMs. Several questions were asked to ascertain residents' attitudes and perceptions of changes to wellbeing as a result of the EGMs. Hypothetical response data can be found in 'gambling.xlsx' or 'gambling.sav', depending on whether you prefer to do the analysis in Excel or SPSS. Some questions covered community awareness of the application to install gaming machines and the degree of support that residents had for the initiative, including:
1 Whether or not residents were aware of the application to install gaming machines at the new development ('Were you previously aware of the application for a licence to install gaming machines at the proposed tavern?'). This is Q2 in the dataset.
2 Whether or not residents were aware that the application was for 40 gaming machines at the new development ('Were you previously aware that the application is for a licence to install 40 gaming machines at the proposed tavern?'). This is Q3 in the dataset.
3 Whether or not they would support or oppose the tavern going ahead in light of it having 40 EGMs ('To what extent do you support or oppose the proposal to install 40 gaming machines at the proposed tavern?'). This is Q6 in the dataset.
4 Whether or not they would support or oppose the tavern going ahead if it did not have the EGMs ('To what extent would you support or oppose the proposal to build the tavern if it did not have gaming machines?'). This is Q7 in the dataset.

QUESTIONS

1 Specify some hypotheses that you might want to test to ascertain (i) the degree to which residents were aware of the application for gaming machines, and (ii) whether or not residents support the proposed development going ahead.
2 What test(s) would you run to test your hypotheses?
3 Specifically, access the data and perform the following tests (which you may have already done based on your answers to the above). What do the results tell us?
a Run Chi-square tests to determine if awareness (based on Q2 and Q3) is evenly distributed.
b Run one-sample t-tests to determine if support for the installation of gaming machines (based on Q6 and Q7) is different from 3.
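A possible Python translation of question 3 is sketched below. Scipy is not a tool used in this book, and it is an assumption that the columns in 'gambling.xlsx' are actually labelled Q2, Q3, Q6 and Q7, so adjust the names to match the file you download.

import pandas as pd
from scipy import stats

gambling = pd.read_excel("gambling.xlsx")   # or pd.read_spss("gambling.sav"), which needs pyreadstat

# (a) Chi-square goodness-of-fit tests: is awareness evenly split between yes and no?
for question in ["Q2", "Q3"]:
    counts = gambling[question].value_counts()
    chi2, p = stats.chisquare(counts)
    print(f"{question}: chi-square = {chi2:.2f}, p = {p:.3f}")

# (b) One-sample t-tests: does support differ from the scale midpoint of 3?
for question in ["Q6", "Q7"]:
    t, p = stats.ttest_1samp(gambling[question].dropna(), popmean=3)
    print(f"{question}: t = {t:.2f}, p = {p:.3f}")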

WRITTEN CASE STUDY 12.3 VICTORIA'S LEGAL OUTSOURCING SYSTEM AND CUSTOMER SATISFACTION
According to a news article in The Australian, Victoria's legal services are outsourced to a panel of 34 organisations. These organisations are subject to client satisfaction surveys measuring elements of client satisfaction such as quality, timeliness and value for money. Since 2004, 830 client satisfaction surveys have been carried out on the 34 organisations in the panel. These surveys are used to inform the Victorian Government's decisions about how legal services are made available. As such, understanding and measuring customer satisfaction is crucial to evaluating the effectiveness of these services. But how should users of these surveys evaluate the results? Answer the questions below with regard to what should be done with the survey results and how they should be used. The full story can be accessed at: http://www.theaustralian.com.au/business/legal-affairs/victoria-overhauls-outsourcing-system/story-e6frg97x-1111116617652.

QUESTIONS

1 Describe a procedure for statistically evaluating the panel's satisfaction scores.
2 How would you know if the panel was achieving its objectives in terms of customer satisfaction?
3 Is a sample size of 830 sufficient to draw conclusions about levels of client satisfaction?
4 Suppose a score of 3.9 out of 5 was found and the panel's objective was to maintain a score of at least 4 out of 5. A bureaucrat sees this and decides satisfaction is below what it should be. What would you say to them?

ONGOING CASE STUDY MOBILE PHONE SWITCHING AND BILL SHOCK
The team has now collected the data but is confronted by the usual mass of raw numbers from a large sample survey. What does it all mean and where should they begin? David, Leanne and Steve need to get a feel for the data and understand their characteristics before performing any more advanced analysis.

QUESTIONS

1 Describe some initial descriptive statistics that might be useful to get a better feel for the data and the variables that might be explored.

2 Develop some hypotheses that could subsequently be explored.
3 The team wants to understand the bill shock variable in particular. How would David, Leanne and Steve go about estimating a confidence interval around its mean and in what ways might this be useful?
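For question 3, one way David, Leanne and Steve could compute a 95 per cent confidence interval around the mean of the bill shock variable is sketched below in Python. The file name 'billshock.xlsx' and the column name 'bill_shock' are hypothetical placeholders, since the case does not specify them.

import pandas as pd
from scipy import stats

survey = pd.read_excel("billshock.xlsx")        # hypothetical file name
x = survey["bill_shock"].dropna()               # hypothetical column name

mean, se = x.mean(), stats.sem(x)               # sample mean and its standard error
low, high = stats.t.interval(0.95, len(x) - 1, loc=mean, scale=se)
print(f"95% CI for mean bill shock: {low:.2f} to {high:.2f}")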



NOTES
1 Huff, Darrell & Geis, Irving (1954) How to lie with statistics, New York: W. W. Norton, p. 33.
2 Accessed at http://www.adzuna.com.au/search?q=marketing%20graduate&w=Australia on 10 August 2015.
3 Lowe, B., Lynch, D. & Lowe, J. (2015) 'Reducing household water consumption: A social marketing approach', Journal of Marketing Management, 31 (3–4), 378–408.
4 See, for example, the QuickStats link: http://www.abs.gov.au/websitedbs/censushome.nsf/home/quickstats?opendocument&navpos=220, accessed on 10 August 2015.
5 http://www.theshout.com.au/2010/10/25/article/WA-first-to-try-Guinness-BlackLager/PRZKXYSSNH.html, accessed on 10 August 2015.
6 Most of the statistical material in this book assumes that the population parameters are unknown, which is the typical situation in most applied research projects.
7 Huff, Darrell & Geis, Irving (1954) How to lie with statistics, New York: W. W. Norton, p. 33.
8 http://www.payscale.com/research/AU/Job=Marketing_Manager/Salary, accessed on 30 July 2015.
9 The reasons for this are related to the concept of degrees of freedom, which will be explained later. At this point, disregard the intuitive notion of division by n, because it produces a biased estimate of the population variance.
10 Note that the formula for the population standard deviation, σ, has not been given. Nevertheless, you should understand that σ measures the dispersion in the population and S measures the dispersion in the sample. Thus, we use S to estimate σ. These concepts are crucial to understanding statistics.
11 Wonnacott, Thomas H. & Wonnacott, Ronald J. (1969) Introductory statistics, New York: John Wiley & Sons, p. 70. Reprinted with permission.
12 In practice, most survey researchers will not use this exact formula. A modification using the sample standard deviation in an adjusted form, $Z = \frac{\bar{X} - \mu}{S_{\bar{X}}}$, is frequently used.
13 Accessed at http://www.censusdata.abs.gov.au/census_services/getproduct/census/2011/quickstat/CED304?opendocument&navpos=220 on 10 August 2015.
14 Adapted from Sanders, D. H., Murphy, A. F. & Eng, R. J. (1980) Statistics: A fresh approach, New York: McGraw-Hill, p. 123.
15 Selvanathan, Anthony, Selvanathan, Saroja & Keller, Gerald (2013) Business Statistics: Australia/New Zealand, 6th edn, Melbourne: Cengage Learning.
16 All these data are available in the appendix of the following article: Lowe, B., Lynch, D. & Lowe, J. (2015) 'Reducing household water consumption: A social marketing approach', Journal of Marketing Management, 31 (3–4), 378–408.
17 Technically the t-distribution should be used when the population variance is unknown and the standard deviation is estimated from sample data. However, with large samples it is convenient to use the Z-distribution, because the t-distribution approximates the Z-distribution.
18 A complete discussion of this topic is beyond the scope of this book. See almost any statistics textbook for a more detailed discussion of Type I and Type II errors.
19 Lowe, B., Lynch, D. & Lowe, J. (2015) 'Reducing household water consumption: A social marketing approach', Journal of Marketing Management, 31 (3–4), 378–408.
20 The reader with an extensive statistics background will recognise that there are a few rare cases in which the degrees of freedom do not equal k − 1. However, these cases will not often be encountered by readers of a book at this level, and to present them would only confuse the discussion here.
21 An example of how to use the Chi-square table is given in Appendix A, Table A.4.
22 Lowe, B., Lynch, D. & Lowe, J. (2015) 'Reducing household water consumption: A social marketing approach', Journal of Marketing Management, 31 (3–4), 378–408; Lowe, B., Lynch, D. & Lowe, J. (2014) 'The role and application of social marketing in managing water consumption: A case study', International Journal of Nonprofit and Voluntary Sector Marketing, 19 (1), 14–26.

13 » BIVARIATE STATISTICAL ANALYSIS: TESTS OF DIFFERENCES

WHAT YOU WILL LEARN IN THIS CHAPTER
» To understand the reasons for conducting tests of differences, the tests of difference that are available to the researcher, and how measurement scale influences the analysis technique used.
» To state a null hypothesis in a test of differences among two or three means and to be able to test that hypothesis by calculating a t-test for two independent samples, a t-test for two paired/related samples, or an ANOVA for three or more samples.
» To use SPSS and Microsoft Excel to conduct tests of differences with independent samples t-tests, paired samples t-tests and ANOVAs, and to be able to interpret the output from these statistical packages.
» To understand the assumptions underlying t-tests and ANOVAs and understand that there are alternative tests that can be used when these assumptions do not hold.
» To understand the difference between statistical and practical significance for tests of differences.

‘Wow, that’s expensive!’ ‘No it’s not – it costs the same as back home – but it’s in yen!’

Consumers often react to different stimuli in different ways. Understanding how and why this happens is often the preserve of experimentation (see Chapter 7). Consider, for example, the ‘face value’ effect.1 Research in international consumer marketing shows that consumers who are used to low denomination currencies (like the Australian dollar, the euro and the British pound) tend to view prices presented in higher denomination currencies (like the Japanese yen) as more expensive even when they are aware of the exchange rate. It seems our decisions as consumers are often anchored by higher nominal values and this may have implications for consumer behaviour in international retail settings. To test this, researchers might create an experiment whereby one group of consumers is exposed to a product in Australian dollars and another group of consumers is exposed to exactly the same product, but in a higher denomination currency like Japanese yen. After reading about these products and viewing their marketing communications these consumers may rate each product in terms of their willingness to pay, value perceptions and purchase intentions. Specifically, this research would likely find that consumers exposed to the product in a high denomination currency are more likely to perceive that product as expensive, poorer value and consequently exhibit lower purchase intentions. Similar evidence of this effect based on a pseudo-experiment2 was found when Germany switched from deutschmarks to euros. This is what we mean by testing for differences; that is whether or not some group exhibits more or less of a characteristic than another group. Testing for differences between groups is an important aspect of marketing research and is

intuitive and simple to understand. Such analysis techniques are a core element of the market researcher's toolkit because they are effective and easily understood by non-research specialists. They remain important in areas of marketing like segmentation, as such tests help us to understand how different consumer segments behave. Consider, for example, one study which tried to understand the reasons individuals play computer games.3 Using an existing model from social psychology called the Theory of Planned Behaviour, the study created a survey to better understand how attitudes, social norms and perceived behavioural control affect intentions to play computer games. However, the study further segmented consumers behaviourally, into non-gamers, casual gamers and hard-core gamers. In doing so it was possible to understand what reasons influence game-playing behaviour. Specifically, hard-core gamers exhibited more positive attitudes to game playing, and were more likely to believe that game playing was fun, enabled them to be creative, created challenges for them and gave them flexibility (amongst other beliefs). Knowing the characteristics of such segments through testing for differences can be useful to marketers in communicating with different customer segments. To address issues such as this, researchers use statistical techniques such as t-tests and ANOVAs, which allow the researcher to compare means between specific groups. Recall from Chapter 12 that we discussed the notion of inferential statistics: we generally collect a sample of responses, rather than a census, and this enables us to infer characteristics of the population to a certain degree of confidence. Such tests might be used to tell the researcher how likely it is that the observed means have occurred as a result of sampling error (see Chapter 12). Thus, we are not so interested in mathematical differences as in statistical differences. We now explain these tests in more detail.





ONGOING PROJECT

WHAT IS THE APPROPRIATE TEST OF DIFFERENCE?
One of the most frequently tested hypotheses states that two groups differ with respect to some behaviour, characteristic or attitude (see the opening vignette, for example). In the classical experimental design, the researcher tests differences between subjects assigned to the experimental group and subjects assigned to the control group. For instance, a group of respondents in one area of a city might be exposed to a marketing campaign designed to get them to donate more blood, and attitudes towards blood donation and subsequent behaviour might be compared to a separate (and hopefully similar) group of respondents who were not exposed to the marketing campaign (these samples might be identified through cluster sampling to achieve similarity – see Chapter 10). If attitudes and levels of blood donation are different (higher) among the experimental group then one might conclude that the marketing campaign has been successful in changing attitudes and behaviours.4 Such tests are bivariate tests of differences. Don't get confused here: bivariate refers to the number of different variables involved in the analysis. When there are comparisons between two independent groups (e.g., those who do and those who do not donate blood), this is known as an independent samples t-test for difference of means.

test of differences: An investigation of a hypothesis stating that two (or more) groups differ with respect to measures on a variable.
independent samples t-test for difference of means: A technique used to test the hypothesis that the mean scores on some interval- or ratio-scaled variables are significantly different for two independent samples or groups.

On the other hand, given the limitations of experimental designs with experimental groups and control groups (e.g., group heterogeneity), researchers may wish to use some kind of pretest–posttest design (see Chapter 7). This would enable a comparison to be made between two related samples, which would then overcome some issues with the first design. Therefore, an alternative experiment might involve randomly selecting a group of respondents (say, through cluster sampling again) and first surveying them about their attitudes and behaviours with respect to blood donation, then exposing these respondents to the marketing campaign designed to get them to donate more blood, and once again surveying those same respondents about their attitudes and behaviours towards blood donation. In this case the two samples are related (also known as paired) rather than independent, and comparing means between two related (paired) samples is known as a paired samples t-test.

paired samples t-test: A technique used to test the hypothesis that mean scores differ on some interval- or ratio-scaled variable between related or paired samples.

Sometimes researchers may wish to test for differences among three or more groups. For instance, suppose the same blood donation organisation wanted to know which of three separate blood donation campaigns would be most successful in leading to increased blood donations (in comparison with a control group). This time a t-test is not as useful, as the means of four groups are being analysed. In this case the researcher would use an analysis of variance (ANOVA), which tests for differences in three or more means. We can see the differences between these tests in Exhibit 13.1.

THE INDEPENDENT SAMPLES t-TEST FOR DIFFERENCES OF MEANS
The t-test may be used to test a hypothesis stating that the mean scores on a variable will be significantly different for two independent samples or groups. Specifically, this type of t-test is known as an independent samples t-test. Remember, from Chapter 8, we can only take the mean of an interval- or ratio-scaled variable, so one variable must be interval or ratio scaled and the other variable must be nominal scaled with two groups. In the prior example, the blood service wanted to determine if those exposed to a social marketing campaign donated more blood than those who were not exposed to it. Intention to donate blood and/or number of blood donations would be ratio scaled and exposure to the campaign (i.e., experimental group versus control group) would be a dichotomous nominal variable.


EXHIBIT 13.1 CHOOSING A TEST TO COMPARE DIFFERENCES IN MEANS
[Flowchart] How many variables am I interested in analysing? Two variables (e.g., does mean number of store visits per month to Wendy's differ by gender?) → bivariate analysis (Chapters 13 and 14). What am I interested in examining? Association between variables → see Chapter 14. Differences in means → how are the variables measured? One metric variable, one nonmetric with two groups → independent samples t-test. One metric variable, one nonmetric with three or more groups → one-way ANOVA. Two metric variables from paired samples → paired samples t-test.
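The decision flow in Exhibit 13.1 can also be expressed as a few lines of code. The sketch below is only an illustrative translation of that logic in Python, not something from the textbook or its companion files:

def choose_difference_test(metric_variables, groups=None, paired=False):
    """Rough translation of the Exhibit 13.1 decision flow for tests of differences."""
    if metric_variables == 2 and paired:
        return "paired samples t-test"
    if metric_variables == 1 and groups == 2:
        return "independent samples t-test"
    if metric_variables == 1 and groups is not None and groups >= 3:
        return "one-way ANOVA"
    return "see Chapter 14 (tests of association)"

print(choose_difference_test(metric_variables=1, groups=3))   # one-way ANOVA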

In theory, a t-test is used when the number of observations (sample size) in one or both groups is small (less than 30) and the population standard deviation is unknown. If the population standard deviation is known or if the sample size is greater than 30, we would use a Z-test.5 In practice, we very rarely know the population standard deviation, and with sample sizes above 30 the t-test is a close approximation to the Z-test anyway. Therefore, more often than not, researchers use a t-test to compare differences in means between two groups. To use the independent samples t-test for difference of means, we assume that the two samples are drawn from normal distributions and that the variances of the two populations or groups are equal. The test for normality can be conducted in SPSS, or some other statistical package, and the output from a t-test generally reports a statistic to determine equality of variances.
The null hypothesis about differences between independent groups is normally stated as follows: μ1 = μ2, or μ1 − μ2 = 0. Therefore, if the null hypothesis were true there would be no difference in the two population means. The alternative (or rival) hypothesis is usually something such as μ1 ≠ μ2, or μ1 − μ2 ≠ 0. Researchers sometimes specify the nature of this difference (e.g., by speculating as to which mean – μ1 or μ2 – is bigger or smaller). If this was the case the alternative hypothesis could be specified as μ1 > μ2, or μ1 − μ2 > 0 (or substitute with < if the nature of the difference is proposed to be the other way around). In most cases comparisons are between two sample means (X̄1 − X̄2). A verbal expression of the formula for t is:

$$t = \frac{\text{Mean 1} - \text{Mean 2}}{\text{Variability of random means}}$$




Thus, the t-value is a ratio with information about the difference between means (provided by the sample) in the numerator and the standard error in the denominator. The question is whether the observed differences have occurred by chance alone. To calculate t, we use the following formula:

$$t = \frac{\bar{X}_1 - \bar{X}_2}{S_{\bar{X}_1 - \bar{X}_2}}$$

where
X̄1 = mean for group 1
X̄2 = mean for group 2
$S_{\bar{X}_1 - \bar{X}_2}$ = pooled, or combined, standard error of difference between means

pooled estimate of the standard error: An estimate of the standard error for a t-test of independent means that assumes the variances of both groups are equal.

To calculate the pooled estimate of the standard error of the difference between means of independent samples, we use the following formula:

$$S_{\bar{X}_1 - \bar{X}_2} = \sqrt{\left(\frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}\right)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

where
S1² = variance of group 1
S2² = variance of group 2
n1 = sample size of group 1
n2 = sample size of group 2

Let's return to the blood donation example to see how an independent samples t-test would be conducted. First the researchers might identify two groups of respondents with similar characteristics (as mentioned, one way to do this might be through cluster sampling). One group (the experimental group) would be exposed to the social marketing campaign to donate more blood and the other group (the control group) would not be exposed to it. Some time after the campaign had run its course those in both groups may be surveyed (e.g., based on attitudes and intentions to donate blood) and their blood donation behaviour may be recorded (e.g., number of times donated). An application of the independent samples t-test would be to test differences in blood donation attitudes and/or behaviour between the two groups.

Suppose attitude towards blood donation was measured by three items anchored by 1 to 5, and subsequently averaged for ease of reporting; that is, the total attitude scale could also range between 1 (very negative) and 5 (very positive). The results of the survey are shown in Table 13.1. A higher score indicates a more favourable attitude towards blood donation. The null hypothesis is that there is no difference in attitudes towards donating blood (as indicated by mean scores) between the two groups, where the experimental group are denoted by 'eg' and the control group are denoted by 'cg'. More formally this can be expressed as μcg = μeg or μcg − μeg = 0 for the null hypothesis, and μcg ≠ μeg or μcg − μeg ≠ 0 for the alternative hypothesis, as indicated earlier.

To conduct the independent samples t-test, we first calculate the pooled estimate of the standard error based on the data in Table 13.1:

$$S_{\bar{X}_1 - \bar{X}_2} = \sqrt{\left(\frac{(48 - 1)1.38^2 + (49 - 1)1.33^2}{48 + 49 - 2}\right)\left(\frac{1}{48} + \frac{1}{49}\right)} = 0.275$$


TABLE 13.1 » ATTITUDE TOWARDS BLOOD DONATION BY GROUP

                     Control group   Experimental group
Mean                 X̄1 = 3.27       X̄2 = 4.22
Standard deviation   S1 = 1.38       S2 = 1.33
Sample size          n1 = 48         n2 = 49

The calculation of the t-statistic is

$$t = \frac{\bar{X}_1 - \bar{X}_2}{S_{\bar{X}_1 - \bar{X}_2}} = \frac{3.27 - 4.22}{0.275} = -3.45$$

In a test of two means, degrees of freedom are calculated as follows:

d.f. = n − k

where
n = n1 + n2
k = number of groups

In our example d.f. equals 95. If the 0.05 level of significance is selected, reference to Table A.3 in Appendix A yields the critical t-value.6 The critical t-value of 1.98 must be surpassed by the observed t-value if the hypothesis test is to be statistically significant at the 0.05 level.7 The absolute value of the calculated value of t, 3.45, exceeds the critical value of t for statistical significance (here we use the absolute value of the calculated t; the negative sign indicates the direction of the difference), so it is significant at α = 0.05. In other words, because the calculated absolute value of t (3.45) is greater than the critical t-value (1.98) we would reject the null hypothesis, and thus there would be evidence to show that the experimental group have significantly more positive attitudes towards blood donation than the control group. In other words, the difference observed between our samples is unlikely to have occurred by chance. Practically speaking, the results here suggest that those exposed to the social marketing campaign hold more positive attitudes towards blood donation than those who were not. Therefore, if marketers were aiming to increase blood donation behaviours they might begin to scale up the campaign and see if it worked on other samples (of course, we may also want to perform a similar test on blood donation behaviour given attitudes are not always linked to behaviour as strongly as we may hope8). To see how we can use the independent samples t-test in practice, see the example in the 'Real world snapshot' box.
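If you want to verify the hand calculation above without the raw data, scipy (not a package used in this book) can compute the same pooled-variance t-test directly from the summary statistics in Table 13.1; a minimal sketch:

from scipy import stats

# Summary statistics from Table 13.1: control group vs experimental group
t, p = stats.ttest_ind_from_stats(
    mean1=3.27, std1=1.38, nobs1=48,
    mean2=4.22, std2=1.33, nobs2=49,
    equal_var=True)
print(round(t, 2), round(p, 4))   # roughly t = -3.45 on 95 d.f., p well below 0.05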


REAL WORLD SNAPSHOT

CREDIT WHERE CREDIT IS DUE

The use of credit cards as opposed to cash for purchases has often been associated with increased spending (e.g., if you use a credit card as opposed to cash you are likely to spend more and purchase more frequently). Researchers wanted to determine how the way you pay for something (i.e., using a credit card or using cash) changes the way you perceive products. Specifically, the researchers proposed that people paying with a credit card would place more attention on a product's benefits than people paying with cash, and that people paying with cash would place more attention on a product's costs than people paying with a credit card. If this were true then people paying with a credit card should evaluate a product's benefits more quickly than those paying with cash, and people paying with cash should evaluate a product's costs more quickly than those paying with a credit card. In a consumer experiment based on a sample of 134 respondents, the researchers primed half the sample as if they were using a credit card to pay for something, and the other half of the sample as if they were using cash. They were then asked to perform a task in order to evaluate a product's costs and benefits and were timed. It was found using a t-test that respondents who paid by credit card indeed spent more time evaluating the benefits than those that paid with cash (t132 = 6.51; p < 0.0001). These findings have important public policy implications: 1) consumers should be encouraged to deliberate before spending, even more so in a world of one-click transactions; 2) ease of payment may lead to over-indulgence and sub-optimal choices. The moral of the story? It pays not to spend!9

Conducting an independent samples t-test in SPSS
Access the SPSS data file 'attitudes.sav' from the CourseMate website. To run an independent samples t-test in SPSS do the following: Analyse > Compare Means > Independent-Samples T-Test. Select 'totalatt' as the Test Variable and 'treatment' as the Grouping Variable, click OK and then you will see the output appear in a different window.

EXHIBIT 13.2 DOING AN INDEPENDENT SAMPLES t-TEST IN SPSS



Running an independent samples t-test in SPSS would produce the results shown in Exhibits 13.3 and 13.4. From Exhibit 13.3 we can see that the mean attitude towards donating blood for the control group was 3.27 and the mean attitude towards donating blood for the experimental group was 4.22. This suggests that those exposed to the social marketing campaign have more favourable attitudes than those who were not exposed to it. But is this a real difference or did it occur by chance? (Recall we only sampled 97 respondents so the results could have occurred due to random error.)

EXHIBIT 13.3 SAMPLE STATISTICS OF ATTITUDE TOWARDS BLOOD DONATION BY TREATMENT

GROUP STATISTICS
           Treatment            N    Mean     Std. deviation   Std. error mean
totalatt   Control Group        48   3.2708   1.38492          0.19990
           Experimental Group   49   4.2177   1.33429          0.19061

To determine this, we examine Exhibit 13.4. In Exhibit 13.4 we first look at the section called 'Levene's test for equality of variances'. Recall from our initial discussion of t-tests that an assumption underlying this technique was that the variances of the two samples were equal. For this, we look at the significance value (Sig. value) for Levene's statistic. The significance value is 0.967, which is above the 0.05 critical significance value. (It is interpreted in the same way as a t-test. Using a 5 per cent significance level, if the significance value is below 0.05 then we say there is a statistical difference and if above 0.05 we say there is no statistical difference.) Therefore, we can assume the variances are equal for these two samples, which satisfies our assumption. Reading across the table in the row 'Equal variances assumed' we see the t-value is −3.43 and the significance value for the difference in means is 0.001 (if we found the variances were different we would read from the bottom row 'Equal variances not assumed'). This value is below the 0.05 threshold, suggesting there is a 0.001 probability that the differences have occurred due to chance. This is a relatively small probability and it is below the 0.05 cut-off. Therefore, we can say that those exposed to the social marketing campaign have more positive attitudes than those who were not.

EXHIBIT 13.4 SAMPLE OUTPUT FROM INDEPENDENT SAMPLES t-TESTS

INDEPENDENT SAMPLES TEST
                                         Levene's test for equality of variances   t-test for equality of means
                                         F        Sig.                             t         d.f.     Sig. (two-tailed)
totalatt   Equal variances assumed       .002     .967                             −3.429    95       .001
           Equal variances not assumed                                             −3.428    94.681   .001

Conducting an independent samples t-test in Excel
Access the Excel data file 'attitudes.xlsx' from the CourseMate website. Notice how the data in Excel must be in separate columns based on exposure to the treatment. First we might want to calculate the means, standard deviations and sample sizes as in Exhibit 13.5. To calculate the mean for the control group select a cell and type in:
=average(A3:A50)
Do the same for the experimental group, but with different cell references based on the data (that is, B3:B51). To calculate the standard deviation for the control group, select a separate cell and type in:
=stdev(A3:A50)

EXHIBIT 13.5 DOING AN INDEPENDENT SAMPLES t-TEST IN EXCEL

Again, do the same for the experimental group but with different cell references based on the respective data. To calculate the sample size for the control group select a separate cell and type in:
=count(A3:A50)
Do the same for the experimental group, but with different cell references based on those data. Now, to calculate the Sig. value we need to select another cell and type in:
=TTEST(A3:A50,B3:B51,2,2)
In the brackets the first values refer to the control group, the second values refer to the experimental group, the third value refers to whether we wish to run a one-tailed or two-tailed test (to be consistent with the SPSS example we have selected a two-tailed test) and the fourth value refers to the type of test we run. We have selected a '2', which refers to an independent samples t-test with equal variances assumed. If you want to assume unequal variances then you would type in a '3'. For further information you can use the Excel help menu here and type in 'ttest'. When you click Enter, the cell will report an exact Sig. value. The Sig. value reported is 0.000892, which when rounded is 0.001, as in the SPSS example. The interpretation is the same, of course. Therefore, using the same rationale as in the SPSS example, we can say that the experimental group seems to exhibit more favourable attitudes towards blood donation than the control group because the Sig. value is below the 0.05 threshold, suggesting there is a 0.001 probability that the differences have occurred due to chance. This is a relatively small probability and below the 0.05 cut-off.
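The same analysis can be scripted outside SPSS or Excel. The sketch below uses Python and scipy (not tools used in this book) and assumes, purely for illustration, that the control and experimental scores sit in columns named 'control' and 'experimental'; the real layout of 'attitudes.xlsx' may differ.

import pandas as pd
from scipy import stats

attitudes = pd.read_excel("attitudes.xlsx")
control = attitudes["control"].dropna()
experimental = attitudes["experimental"].dropna()

# Levene's test for equality of variances, mirroring the SPSS output
lev_stat, lev_p = stats.levene(control, experimental)

# Independent samples t-test; pool the variances only if Levene's test is non-significant
t, p = stats.ttest_ind(control, experimental, equal_var=(lev_p > 0.05))
print(f"Levene p = {lev_p:.3f}, t = {t:.2f}, p = {p:.3f}")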

PAIRED-SAMPLES t-TEST

ONGOING PROJECT

What happens when means need to be compared that are not from independent samples? Such might be the case when the same respondent is measured on two separate occasions; for instance, when the respondent is asked to rate both how much he or she likes shopping on the Internet and how much he or she likes shopping in traditional shops. Since the liking scores are both provided by the same person, the assumption that they are independent is not realistic. Additionally, if one compares the prices the same retailers charge in their shops with the prices they charge on their websites, the samples cannot be considered independent because each pair of observations is from the same sampling unit. In these situations a researcher should use a paired-samples t-test. The idea behind the paired-samples t-test can be seen in the following computation:

$$t = \frac{\bar{d}}{s_d / \sqrt{n}}$$

where d̄ is the average difference between means in the paired samples, sd is the standard deviation of the observed differences between means, and n is the number of observed differences between means (that is, the sample size). The test has degrees of freedom equal to the total number of paired differences minus one. We can compute a paired samples t-test in SPSS and Excel.
To go back to our earlier example on attitudes towards donating blood, to overcome the issues of an experimental design with a control group we could run a pretest–posttest design with one sample whose attitudes and behaviours were measured prior to and subsequent to the social marketing campaign. Suppose the researcher recontacted the original sample of 97 respondents who were asked about their attitudes towards donating blood. This time, because of respondent attrition, only 68 of the respondents were able to be contacted and were subsequently willing to complete the survey for a second time, after the campaign. Therefore, the researcher would now have information on attitudes before the campaign and information on attitudes after the campaign. The null and alternative hypotheses could be stated in the same way as for an independent samples t-test. Therefore, if the null hypothesis was true there would be no difference in the population means before and after exposure to the campaign. The alternative hypothesis would be that the two population means are different (e.g., μ1 ≠ μ2 or μ1 − μ2 ≠ 0). To find out if attitudes improved we can run a paired samples t-test in SPSS.

Conducting a paired samples t-test in SPSS
To conduct a paired samples t-test in SPSS, open up 'attitudest1t2.sav' from the MarketingCentral website and perform the following procedure: Analyse > Compare Means > Paired-Samples T-Test. Select 'totalatt_t1' and 'totalatt_t2' as the Paired Variables and click OK as shown in Exhibit 13.6. This should produce the output in Exhibit 13.7 and Exhibit 13.8.

EXHIBIT 13.6 DOING A PAIRED SAMPLES t-TEST IN SPSS

In Exhibit 13.7 we can see that mean attitudes before the campaign were 3.83 and after the campaign mean attitudes increased to 4.28. This suggests that mean attitudes have increased as a result of the campaign. However, these differences could have occurred as a result of sampling error. To determine the probability of this we can refer to Exhibit 13.8. In particular we could use the t-value and the degrees of freedom to calculate whether or not the observed results are statistically significant, but SPSS does this for us; apart from examining the sample statistics, we only really need to examine the Sig. value. The Sig. value is 0.000, which is below the 0.05 threshold, suggesting the probability that the differences in attitudes have occurred due to chance is extremely small (SPSS reports .000, meaning less than 0.0005), well below the 0.05 cut-off. Therefore, we can say that there is evidence to suggest the social marketing campaign changed people's attitudes towards donating blood.

EXHIBIT 13.7 SAMPLE STATISTICS OF ATTITUDE BEFORE AND AFTER THE CAMPAIGN

PAIRED SAMPLES STATISTICS
                       Mean     N    Std. deviation   Std. error mean
Pair 1   totalatt_t1   3.8284   68   1.41947          .17214
         totalatt_t2   4.2843   68   1.37167          .16634

EXHIBIT 13.8 SPSS OUTPUT FROM PAIRED-SAMPLES t-TEST

PAIRED SAMPLES TEST
                                      Paired differences
                                      Mean      Std. deviation   Std. error mean   t        d.f.   Sig. (two-tailed)
Pair 1   totalatt_t1 – totalatt_t2    −.45588   .52753           .06397            −7.126   67     .000

Conducting a paired-samples t-test in Excel
Access the Excel data file 'attitudest1t2.xlsx' from the CourseMate website. Again the data in Excel must be in separate columns to reflect the repeated/paired samples (as in Exhibit 13.9). First, we might want to calculate the means, standard deviations and sample sizes of attitudes before and after the campaign (that is, the sample in time period 1 and the sample in time period 2 – T1 and T2). To calculate the mean for T1, select a cell and type in:
=average(A3:A70)
Do the same for the T2 data, but with different cell references based on the T2 data (that is, B3:B70). To calculate the standard deviation for the T1 data, select a separate cell and type in:
=stdev(A3:A70)



EXHIBIT 13.9 DOING A PAIRED SAMPLES t-TEST IN EXCEL




Again, do the same for the T2 data, but with different cell references based on the T2 data. To calculate the sample size for T1 select a separate cell and type in:
=count(A3:A70)
Do the same for T2, but with different cell references based on the T2 data. Now, to calculate the Sig. value we need to select another cell and type in:
=TTEST(A3:A70,B3:B70,2,1)
In the brackets, the first values refer to the data before respondents were exposed to the campaign, the second values refer to the data after respondents were exposed to the campaign, the third value refers to whether we wish to run a one-tailed or two-tailed test (to be consistent with the SPSS example we have selected a two-tailed test) and the fourth value refers to the type of test we run. This time we want to run a paired-samples t-test so we have selected a '1'. For further information you can use the Excel help menu and type in 'ttest'. When you click Enter, the cell will report an exact Sig. value. The interpretation is the same as in SPSS, of course. The Sig. value reported is 0.000, as in the SPSS example, and this is below the 0.05 threshold, suggesting the probability that the difference in attitudes has occurred due to chance is extremely small. Therefore, we can say that there is evidence to suggest the social marketing campaign changed people's attitudes towards blood donation. Based on the results we would reject the null hypothesis because the Sig. value is less than 0.05 (this is the equivalent of saying that the calculated t-value is greater than the critical t-value). If we reject the null hypothesis then there is evidence that attitudes have increased as a result of the campaign. In other words, the difference observed in our sample is unlikely to have occurred by chance. If this were the case then it would imply that the campaign was effective in increasing attitudes towards blood donation. Therefore, such results would give credence to managers wishing to roll out the campaign to a larger audience for the purpose of improving attitudes.
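For completeness, the equivalent paired test in Python would look something like the sketch below. Scipy is not a tool used in this book, and it is an assumption that the columns in 'attitudest1t2.xlsx' carry the same names as the SPSS variables.

import pandas as pd
from scipy import stats

paired = pd.read_excel("attitudest1t2.xlsx")
t, p = stats.ttest_rel(paired["totalatt_t1"], paired["totalatt_t2"])
print(f"t = {t:.3f}, p = {p:.4f}")   # should be close to the SPSS result (t = -7.126)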

SURVEY THIS!

Are men or women more preoccupied with their mobile phones and Internet networking opportunities? You may be able to answer this question by looking at the data from the student survey. Test the following hypotheses using data obtained from the survey, either from your class only or using data obtained from all users (either from the website or from your instructor):
H1: Women are more likely to text message than men.
H2: Men are more likely to use more than one email address.
H3: Women check their email more often than do men.
H4: Men spend more time online daily than do women.

Courtesy of Qualtrics.com
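As a sketch of how a hypothesis like H4 could be tested outside SPSS or Excel, the Python snippet below assumes (hypothetically) that the survey data contain a 'gender' column and a 'minutes_online' column; recent versions of scipy also allow a one-tailed alternative, which suits the directional wording of H4.

import pandas as pd
from scipy import stats

survey = pd.read_excel("student_survey.xlsx")            # hypothetical file name
men = survey.loc[survey["gender"] == "Male", "minutes_online"].dropna()
women = survey.loc[survey["gender"] == "Female", "minutes_online"].dropna()

# H4: men spend more time online daily than women (one-tailed independent samples t-test)
t, p = stats.ttest_ind(men, women, equal_var=False, alternative="greater")
print(f"t = {t:.2f}, p = {p:.3f}")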


ANALYSIS OF VARIANCE (ANOVA)


ONGOING PROJECT

Earlier in this chapter we looked at the independent samples t-test to test for differences in means between two independent groups. However, this technique becomes impractical when we wish to test differences in more than two means. In this case we use a powerful technique called one-way Analysis of Variance (ANOVA, derived from ANalysis Of VAriance). ANOVA becomes even more powerful because we can take account of multiple factors to explain variation in a response variable (this is called a two-way or n-way ANOVA and is discussed in Chapter 15 when we examine multivariate statistics). Recall from earlier in the chapter that when the means of more than two groups or populations are to be compared, one-way analysis of variance (ANOVA) is the appropriate statistical tool. This bivariate statistical technique is referred to as 'one-way' because there is only one independent variable (even though there may be several levels of that variable).10

analysis of variance (ANOVA): Analysis involving the investigation of the effects of one treatment variable on an interval-scaled dependent variable; a hypothesis-testing technique to determine whether statistically significant differences in means occur between three or more groups.

We can see an example of ANOVA occurring in the new product development process. New product development is a risky business for marketers and product failure rates can be high.11 However, the benefits of a successful new product launch are huge – consider the success of Apple's iPad and iPhone. What factors contribute to a successful new product launch? There are many factors that marketers and managers need to take into account when launching a new product. Ultimately, by understanding the target customer better, marketers can provide a product that matches customer needs more precisely. One way of doing this is through rigorous market testing. Many companies invest millions of dollars test marketing products and 'tweaking' them to further refine aspects of the product, such as price, promotions, product characteristics and distribution. For instance, in 2010 Diageo test marketed Guinness Black Lager in Western Australia.12 Test marketing enables marketers to test consumer reactions in the marketplace and refine elements of the marketing mix without the cost of failure with a full product rollout. Marketers of Guinness may have been interested in the price they could charge for this new product, the market segments attracted to it or the best way to promote and distribute the product, and they could have tested various manifestations of the marketing mix to see how consumers reacted. Suppose, for example, that the beer was launched in different pubs at slightly different prices. For instance, in one group of pubs it might cost $6.50, in another group of pubs it might cost $6.00 and in another group of pubs it might cost $5.50. How can we test the extent to which sales differ by each of these prices? Here there is one independent variable: price. This variable is said to have three levels: $6.50, $6.00 and $5.50. Because there are three groups or levels, a t-test cannot be used. We need another test instead. Don't be confused – even though the term ANOVA stands for analysis of variance, we are using variances to allow us to compare means, not simply comparing variances – this will be explained shortly in the section on F-tests. First, just be sure that you understand that we use an ANOVA when we are comparing means between three or more groups. If we have three groups or levels of the independent variable, the null hypothesis (H0) is stated as follows:

μ1 = μ2 = μ3

Therefore, if the null hypothesis were true there would be no difference in the three population means; that is, all the means are equal. The alternative hypothesis (H1) would be that not all the means are equal, or at least one of the μi's is different (e.g., it could be that μ1 ≠ μ2 ≠ μ3, or μ1 ≠ μ2). In the test-marketing example, we are concerned with average sales differing based on the price of the new beer. As the term 'analysis of variance' suggests, the problem requires comparing variances to make inferences about the means. The logic of this technique goes as follows. If the grouping variable (that is, price) is responsible for differences in sales for the beer, then the variation in responses between each of the three groups will be comparatively larger than the variation in responses within each of the groups (that is, it is the grouping variable that is causing differences in responses). Likewise, if the grouping variable is not responsible for differences in sales, then the variation in responses between groups will be comparatively smaller than the variation within groups. We can understand this more intuitively with another simple example. Suppose you are the sales manager of a company and are interested in examining sales figures for three different sales reps. You might first plot a chart that shows sales for all three sales reps at the same time (see Exhibit 13.10). From the chart you can ascertain that most contracts appeared to be around the $25 000 mark, with some lower and some higher. You may then wish to examine these same sales figures by sales rep, as shown in Exhibit 13.11.

EXHIBIT 13.10 TOTAL SALES [histogram: count of contracts by sales value ('000s)]

From the chart we can see that Mark has made most of the sales up to $25 000, Fred has made most of the sales between $25 000 and $30 000, and Angie has made the majority of sales above $35 000. For the moment, disregard other possible reasons for the differences (such as differences in sales territory). This sample would suggest that Angie has made more sales than Fred, who has made more sales than Mark. The distributions within each salesperson's sales seem relatively similar (that is, the variation within the groups is the same); however, because of the differences between members of the sales force, the data suggest a comparatively higher variance between salespeople, relative to the variance within the group. Now suppose the variation between salespeople was smaller, as in Exhibit 13.12. Because the variation between each of the salespeople is smaller, the ratio of variance between groups to variance within groups will be smaller. We are less likely to conclude that there is a difference in salespeople because the differences are smaller and could more conceivably have occurred by chance. We now explain how we measure this mathematically using a statistic called the F-test.

EXHIBIT 13.11 LARGE VARIATION IN SALES BETWEEN SALES REPS [bar chart: per cent of contracts by sales value ('000s), split by sales representative – Mark, Fred and Angie]

EXHIBIT 13.12 SMALL VARIATION IN SALES BETWEEN SALES REPS [bar chart: per cent of contracts by sales value ('000s), split by sales representative – Mark, Fred and Angie]



TIPS OF THE TRADE

At this stage the astute student might be asking: 'Why can't we use several separate t-tests to make multiple comparisons?' In a sense we can – we could compare each pair of groups using a separate t-test and this would tell us which pairs differ. In other words, we could create all possible combinations of the price treatments (that is, $6.50, $6.00 and $5.50) and run several t-tests (you would need to perform three t-tests if there were three different treatments, as in the test marketing example above). In principle you could do this, but it creates other problems. In particular, performing several t-tests increases the chance of a Type I error (see Chapter 12 to review this concept – recall a Type I error is the error involved in rejecting a null hypothesis when in fact the null hypothesis is correct). With one comparison between two sample means we set the Type I error rate at 0.05 or 5 per cent because we said that a 5 per cent chance of error is a sufficient risk for us to accept for this kind of study. However, that was only for one test. If there is a 5 per cent chance of Type I error doing one test and a 5 per cent chance of Type I error if we perform another test, then we can quickly see that the probability of committing a Type I error increases the more tests we do. More specifically, in our example, if we wished to compare all the different pairs using several t-tests, we would have to conduct three13 different t-tests and each of these tests would have a Type I error of 0.05. Using the standard significance level of 0.05, this would mean the probability of committing at least one Type I error is 0.1426 or 14.3 per cent. This is called the experimentwise Type I error rate:14

$$P(\text{Type I error}) = \alpha_E = 1 - (1 - \alpha)^C = 1 - (1 - 0.05)^3$$

So the chance of rejecting the null hypothesis when in fact it is correct is over 14 per cent when we run three tests! As the number of groups increases this number quickly becomes very large. Fortunately, one such solution to this issue is the one-way ANOVA, which allows us to compare groups simultaneously using the F-test.
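The experimentwise error rate in the formula above is easy to verify with a couple of lines of Python (just a quick check of the figure quoted in the text, not material from the book):

# Experimentwise Type I error rate for C comparisons, each tested at alpha = 0.05
alpha, comparisons = 0.05, 3
experimentwise_error = 1 - (1 - alpha) ** comparisons
print(round(experimentwise_error, 4))   # 0.1426, i.e. about 14.3 per cent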

The F-test
We can examine this ratio of variances using an ANOVA, for which we use a statistical test called the F-test for comparing one sample variance (the variance between groups) with another sample variance (the variance within groups). The F-test determines whether there is more variability in the scores of one sample than in the scores of another sample. The key question is whether the two sample variances are different from each other or whether they are from the same population. To obtain the F-statistic (or F-ratio), the variance between groups is divided by the variance within groups. To test the null hypothesis of no difference between the sample variances, a table of the F-distribution is necessary. Using Table A.5 or A.6 in Appendix A is much like using the tables of the Z- and t-distributions that we have previously examined. These tables portray the F-distribution, which is a probability distribution of the ratios of sample variances. These tables indicate that the distribution of F is actually a family of distributions that change quite drastically with changes in sample sizes. Thus, degrees of freedom must be specified. Inspection of an F-table allows the researcher to determine the probability of finding an F as large as the calculated F.

F-test: A procedure used to determine whether there is more variability in the scores of one sample than in the scores of another sample.
total variance: The sum of within-group variance and between-group variance.

IDENTIFYING AND PARTITIONING THE TOTAL AMOUNT OF VARIATION
In an analysis of variance, the basic consideration for the F-test is identifying the total amount of variation. There will be two forms of variation: (1) variation of scores within groups, due to individual differences; and (2) systematic variation of scores between groups, due to manipulation of the independent variable or characteristics of the independent variable. Thus, we can partition total variance into within-group variance and between-group variance.


The F-distribution is the ratio of these two sources of variance; that is:

$$F = \frac{\text{Variance between groups}}{\text{Variance within groups}}$$

A larger ratio of variance between groups to variance within groups implies a greater value of F. If the F-value is large, the results are likely to be statistically significant; in other words, the larger the value of F, the more likely it is that the differences in means have occurred as a result of the grouping variable (variance between groups is comparatively higher than variance within groups).

Calculating the F-ratio
The data in Table 13.2 are from a hypothetical packaged-goods company's test market experiment on pricing. Three pricing treatments were administered in four separate areas (12 test areas, A to L, were required). These data will be used to illustrate ANOVA.

TABLE 13.2 » A TEST MARKET EXPERIMENT ON PRICING

Sales in units ('000s)
                        Regular price, $0.99   Reduced price, $0.89   Cents-off coupon, regular price
Test market A, B or C   130                    145                    153
Test market D, E or F   118                    143                    129
Test market G, H or I   87                     120                    96
Test market J, K or L   84                     131                    99
Mean                    X̄1 = 104.75            X̄2 = 134.75            X̄3 = 119.25
Grand mean              X̄ = 119.58

Terminology for the variance estimates is derived from the calculation procedures, so an explanation of the terms used to calculate the F-ratio should clarify the meaning of the analysis of variance technique. The calculation of the F-ratio requires that we partition the total variation into two parts:

Total sum of squares = Within-group sum of squares + Between-group sum of squares

or

$$SS_{\text{total}} = SS_{\text{within}} + SS_{\text{between}}$$

The total sum of squares (SStotal) is computed by squaring the deviation of each score from the grand mean and summing these squares:15

$$SS_{\text{total}} = \sum_{i=1}^{n}\sum_{j=1}^{c}\left(X_{ij} - \bar{X}\right)^2$$

where
Xij = individual score (that is, the ith observation or test unit in the jth group)
X̄ = grand mean
n = number of all observations or test units in a group
c = number of jth groups (or columns)




In our example:

$$SS_{\text{total}} = (130 - 119.58)^2 + (118 - 119.58)^2 + (87 - 119.58)^2 + (84 - 119.58)^2 + (145 - 119.58)^2 + (143 - 119.58)^2 + (120 - 119.58)^2 + (131 - 119.58)^2 + (153 - 119.58)^2 + (129 - 119.58)^2 + (96 - 119.58)^2 + (99 - 119.58)^2 = 5948.93$$

SSwithin (the variability that we observe within each group) is calculated by squaring the deviation of each score from its group mean and summing these scores:

$$SS_{\text{within}} = \sum_{i=1}^{n}\sum_{j=1}^{c}\left(X_{ij} - \bar{X}_j\right)^2$$

where
Xij = individual score
X̄j = group mean for the jth group
n = number of observations in a group
c = number of jth groups

In our example:

$$SS_{\text{within}} = (130 - 104.75)^2 + (118 - 104.75)^2 + (87 - 104.75)^2 + (84 - 104.75)^2 + (145 - 134.75)^2 + (143 - 134.75)^2 + (120 - 134.75)^2 + (131 - 134.75)^2 + (153 - 119.25)^2 + (129 - 119.25)^2 + (96 - 119.25)^2 + (99 - 119.25)^2 = 4148.25$$

SSbetween (the variability of the group means about the grand mean) is calculated by squaring the deviation of each group mean from the grand mean, multiplying by the number of items in the group and summing these scores:

$$SS_{\text{between}} = \sum_{j=1}^{c} n_j\left(\bar{X}_j - \bar{X}\right)^2$$

where
X̄j = group mean for the jth group
X̄ = grand mean
nj = number of items in the jth group

In our example:

$$SS_{\text{between}} = 4(104.75 - 119.58)^2 + 4(134.75 - 119.58)^2 + 4(119.25 - 119.58)^2 = 1800.68$$

The next calculation requires dividing the various sums of squares by their appropriate degrees of freedom. These divisions produce the variances, or mean squares. To obtain the mean square between groups, we divide SSbetween by c − 1 degrees of freedom:

$$MS_{\text{between}} = \frac{SS_{\text{between}}}{c - 1}$$

In our example:

$$MS_{\text{between}} = \frac{1800.68}{3 - 1} = \frac{1800.68}{2} = 900.34$$

To obtain the mean square within groups, we divide SSwithin by cn − c degrees of freedom:

$$MS_{\text{within}} = \frac{SS_{\text{within}}}{cn - c}$$

In our example:

$$MS_{\text{within}} = \frac{4148.25}{12 - 3} = \frac{4148.25}{9} = 460.91$$

Finally, the F-ratio is calculated by taking the ratio of the mean square between groups to the mean square within groups. The between-groups mean square is the numerator and the within-groups mean square is the denominator:

F = MSbetween / MSwithin

In our example:

F = 900.34 / 460.91 = 1.95

There will be c - 1 degrees of freedom in the numerator and cn - c degrees of freedom in the denominator:

numerator: c - 1 = 3 - 1 = 2
denominator: cn - c = 3(4) - 3 = 9
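Hand calculations like these are easy to check with software. The following minimal sketch reproduces the F-ratio and looks up the critical value in Python; Python is not used elsewhere in this book, and the NumPy and SciPy libraries are assumed to be available.

    import numpy as np
    from scipy import stats

    # Sales ('000s) for the three pricing treatments in Table 13.2
    regular = np.array([130, 118, 87, 84])    # regular price, $0.99
    reduced = np.array([145, 143, 120, 131])  # reduced price, $0.89
    coupon = np.array([153, 129, 96, 99])     # cents-off coupon, regular price

    f_ratio, p_value = stats.f_oneway(regular, reduced, coupon)
    critical_f = stats.f.ppf(0.95, dfn=2, dfd=9)  # 5 per cent level, 2 and 9 d.f.

    print(round(f_ratio, 3), round(p_value, 3), round(critical_f, 2))
    # Prints approximately: 1.953 0.197 4.26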

Using the 5 per cent level of significance, Table A.5 in Appendix A shows that the critical value of F for 2 and 9 degrees of freedom is 4.26. Our calculated F of 1.95 is smaller than this critical value, so the difference is not statistically significant and there is no evidence to reject the null hypothesis. (You could be excused for thinking this means we 'accept' the null hypothesis, but our findings only provide a degree of evidence, so we can never truly be sure; we would typically say 'there is no evidence to reject the null hypothesis'.) In other words, any difference we observed in the sample means is probably not a 'real' difference and could have occurred by chance. Practically speaking, this means there was no real difference in consumer response to the three pricing treatments. Even though the $0.89 treatment and the coupon treatment were cheaper than the regular price, the absence of statistical differences could indicate that such small price reductions were not noticed by consumers. This would be consistent with pricing research that has observed a just noticeable difference threshold for price discounts: the size of the discount must be noticeable to consumers, usually around 5–10 per cent in fast-moving consumer goods categories.16 Therefore, based on these results, managers should either stick to the regular price strategy (rather than use the tested sales promotions) or promote the products more heavily. The information produced from an analysis of variance is traditionally summarised in table form. Tables 13.3 and 13.4 summarise the formulas and data from our example.17


TABLE 13.3 » ANOVA SUMMARY TABLE

Source of variation   Sum of squares                                  d.f.     Mean square                     F-ratio
Between groups        SSbetween = Σ(j=1 to c) nj (X̄j - X̄)²            c - 1    MSbetween = SSbetween/(c - 1)   F = MSbetween/MSwithin
Within groups         SSwithin = Σ(i=1 to n) Σ(j=1 to c) (Xij - X̄j)²   cn - c   MSwithin = SSwithin/(cn - c)
Total                 SStotal = Σ(i=1 to n) Σ(j=1 to c) (Xij - X̄)²     cn - 1

where c = number of groups and n = number of observations in a group

TABLE 13.4 » PRICING EXPERIMENT ANOVA TABLE

Source of variation   Sum of squares   d.f.   Mean square   F-ratio
Between groups           1 800.68        2       900.34      1.953
Within groups            4 148.25        9       460.91
Total                    5 948.93       11

EXPLORING RESEARCH ETHICS






LOCATION, LOCATION, LOCATION . . . AND PRODUCT PLACEMENT!

Product placement is an increasingly used marketing communications tool and an alternative way for marketers to reach consumers. It typically involves the planned integration of brands within media content such as films, computer games and radio. So next time you see that Apple logo flash up when an actor is using their laptop it probably wasn’t an accident! The ethics of product placement has been questioned as marketers can surreptitiously gain access to consumers when their guard is down. Research, using tests of differences such as ANOVA, makes some interesting observations about how we as consumers process product placements.18 One finding is that more prominent placements lead to lower brand attitudes and purchase intention because individuals perceive their freedom from marketing communications has been encroached upon in this otherwise safe environment – interesting, given marketers may pay more for placements which are more prominent. Furthermore, if the presence of product placements within a movie is disclosed to consumers at the start of the viewing then, curiously, this makes no difference to viewers’ evaluations of placed brands. Given some regulators would like to see greater prior disclosure this provides a useful and thought-provoking finding for policy makers.

SPSS

Conducting an ANOVA in SPSS

We now conduct an ANOVA in SPSS. Open up 'testmkt.sav' from the CourseMate website. The numbers in this data file come from the data presented in Table 13.2. To find out if mean sales differ based on the three different price treatments (regular price $0.99, reduced price $0.89, cents-off coupon and regular price), perform the following procedure: Analyse > Compare Means > One-way ANOVA. Select 'sales' as the Dependent List and 'price' as the Factor. Then click on Options and select Descriptives and Homogeneity of variance test. Click Continue and then OK as in Exhibit 13.13.


You should see the sample statistics in Exhibit 13.14 and the ANOVA results in Exhibit 13.15. Eyeballing the data, there seems to be some difference in means at first glance. For instance, mean sales for the regular price treatment are 104.75, mean sales for the reduced price treatment are 134.75 and mean sales for the coupon treatment are 119.25. There appears to be a large difference between the reduced price treatment and the regular price treatment. However, recall that we have taken these figures from only a small number of stores (that is, four in each treatment). Could these results have occurred as a result of sampling error? Based on the displayed Sig. value, which is above 0.05, we cannot conclude the means are different; whatever price is charged, and whether or not there is a promotion, sales do not appear to be affected. In other words, there is no statistical difference between the sample means generated from the test market.

← EXHIBIT 13.13 DOING AN ANOVA IN SPSS

EXHIBIT 13.14 → SALES BY PRICE TREATMENT

Descriptives: Sales
                     N     Mean       Std. deviation   Std. error   Minimum   Maximum
Regular price         4    104.7500      22.79437       11.39719     84.00     130.00
Reduced price         4    134.7500      11.61536        5.80768    120.00     145.00
Cents-off coupon      4    119.2500      26.98611       13.49305     96.00     153.00
Total                12    119.5833      23.25534        6.71324     84.00     153.00

EXHIBIT 13.15 → ANOVA RESULTS FOR DIFFERENCES IN SALES BY PRICE TREATMENT

ANOVA: Sales
                  Sum of squares   d.f.   Mean square     F       Sig.
Between groups       1 800.667       2      900.333      1.953    0.197
Within groups        4 148.250       9      460.917
Total                5 948.917      11


EXCEL

Conducting an ANOVA in Excel

Open the file 'testmkt.xlsx' from the CourseMate website. Notice how in Excel the data need to be entered differently from in SPSS if we want to conduct an ANOVA (see Exhibit 13.16). In SPSS the sales data are one variable (that is, one column) and the test market is another variable (that is, one column), and SPSS distinguishes differences in sales based on differences in the test-market grouping variable. In Excel the data for each price treatment need to be included in a separate column. Make sure you have installed the Analysis ToolPak in Excel before continuing (to do this, search for Analysis ToolPak in Excel's Help menu). Perform the following procedure: Data > Data Analysis. Select ANOVA: Single Factor. Select the checkbox 'Labels in first row' if you want to see the labels for each group (that is, the labels for the price treatments). Select the input range, which means selecting all the data involved in the analysis (you may include the column headings too if you selected the checkbox 'Labels in first row'). Select your Output Range or alternatively have the output go to a new worksheet. Click OK and you should see the sample statistics and the accompanying ANOVA table as in Exhibit 13.17. The tables are interpreted in the same way as we have discussed already; again we can see the Sig. value (labelled as a p-value in Excel) is 0.197, which is higher than the 0.05 cut-off and suggests there is no evidence that mean sales differ based on the price treatment.

EXHIBIT 13.16 →  DOING AN ANOVA IN EXCEL

EXHIBIT 13.17 →  ANOVA OUTPUT FROM EXCEL
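For readers working in Python rather than SPSS or Excel, the same distinction between the 'long' layout (one sales column plus a grouping column, as in SPSS) and the 'wide' layout (one column per treatment, as required by Excel's single-factor ANOVA tool) can be sketched with the pandas library; the column names used here are only illustrative.

    import pandas as pd

    # 'Long' layout, as in testmkt.sav: one row per test area
    long_df = pd.DataFrame({
        'price': ['regular'] * 4 + ['reduced'] * 4 + ['coupon'] * 4,
        'sales': [130, 118, 87, 84, 145, 143, 120, 131, 153, 129, 96, 99],
    })

    # 'Wide' layout, as in testmkt.xlsx: one column per price treatment
    wide_df = pd.DataFrame({group: frame['sales'].to_numpy()
                            for group, frame in long_df.groupby('price')})

    print(long_df.groupby('price')['sales'].mean())
    # coupon 119.25, reduced 134.75, regular 104.75 (groups listed alphabetically)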


WHAT WENT RIGHT?

WE LIKE WHAT WE HEAR!19

Using a controlled experiment in Australia, some researchers were interested, among other things, in determining how environmental factors such as song familiarity affected how long respondents perceived they had been waiting. The idea is that one can affect perceived waiting times (after all ‘perception is reality’, as they say) regardless of whether or not actual waiting times can be changed. So, in a simulated waiting room respondents were played songs that varied in familiarity and other factors that could be varied by retailers to ultimately


influence perceived waiting time. Respondents were then asked to estimate how long they had waited. Interestingly, respondents reported longer waiting times for unfamiliar music than for familiar music, suggesting that retailers can influence perceived waiting time by careful selection of music. Rather than just looking at the means to see if they were different, the researchers used an ANOVA to determine the extent to which any differences in perceived waiting time could have occurred by chance.20

NONPARAMETRIC STATISTICS FOR TESTS OF DIFFERENCES

For the statistical tests you have learned to use so far, it has been necessary to assume that the population (or sampling distribution) is normally distributed. If it is normal, the error associated with making inferences about the population from sample data can be estimated. If it is not normal, however, the error may be large and cannot be estimated in the same way. Nonparametric tests are useful in such situations, and also when the data do not have the properties these tests require. For instance, in Chapter 8 we discussed the differences between ratio, interval, ordinal and nominal data. We established that we can take means for ratio and interval data, but it is not feasible to take means for ordinal data and nonsensical to take means for nominal data. Previously, in order to test differences in means between two or more groups, we needed to make sure the variable whose mean we are taking is either interval or ratio scaled. Nonparametric tests overcome these limitations without placing so many constraints on the data. Nonparametric tests have many advantages: they avoid the error caused by assuming that a population is normally distributed when it is not, which is often the case for marketing data that frequently use ordinal scales. For example, if consumers from one segment of the market are asked to rank brands in order of preference and these rankings are compared against those from another segment of the market (this might be the case if a researcher feels Likert scales have limitations and thinks it easier for consumers to rank multiple brands rather than provide an individual rating for each brand), then a nonparametric statistical test may be used to see if these rankings differ between the two samples. Though such tests are often valuable to the marketing researcher and more pragmatic in many situations, we simply want to highlight them here as an alternative to what has been discussed so far. Students wishing to go further should consult a more in-depth statistics book.21
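As a hedged illustration only (the data below are hypothetical and this particular test is not covered in this book), the Mann–Whitney U test is one widely used nonparametric counterpart to the independent samples t-test: it compares two independent samples when the outcome is only ordinal or clearly non-normal. SciPy provides it directly:

    from scipy import stats

    # Hypothetical ordinal preference scores (1 = least preferred, 5 = most preferred)
    segment_a = [1, 2, 2, 3, 3, 4, 2, 1, 3, 2]
    segment_b = [3, 4, 5, 4, 3, 5, 4, 4, 2, 5]

    u_stat, p_value = stats.mannwhitneyu(segment_a, segment_b, alternative='two-sided')
    print(u_stat, round(p_value, 4))
    # A small p-value would suggest the two segments' preference scores differ;
    # the Kruskal-Wallis test (stats.kruskal) extends the same idea to three or more groups.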

STATISTICAL AND PRACTICAL SIGNIFICANCE FOR TESTS OF DIFFERENCES

So far, when interpreting statistical tests, we have generally been concerned with determining a p-value or significance value to tell us how likely it is that the results we have observed have occurred as a result of sampling error. For instance, in the ANOVA SPSS example we found a significance value of


0.197, which suggests that the differences in means are likely to have occurred by chance. But suppose we found a significance value of 0.014 instead, and concluded that there was a 1.4 per cent probability that the results observed had occurred by chance. Does that really mean the means were different from a practical perspective? The answer to this question is often subject to interpretation and is what we term practical significance. There was a reasonably small difference in mean sales across the three sales promotion treatments. However, recall we were interested in determining which sales promotion would be most effective. Suppose the market was very small and the 10 cents price reduction led to only a minor change in demand. Would this minor change in demand, for the relatively large change in cost to the company, be worthwhile? Would a significantly large number of consumers purchase the product at the lower price, relative to the higher price? Perhaps not. Only small differences were seen, so even if a statistical difference was observed, practically the difference might be very small – this is what we mean by substantive significance. Keep this in mind when interpreting statistical output. We can ascertain practical significance with another type of statistic called the effect size. Effect size tells us how large the difference in means is. A more detailed discussion of effect size is beyond the scope of this book and interested readers are referred to a statistics text for further information.22
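As one hedged illustration, eta squared is a commonly reported effect-size measure for a one-way ANOVA (it is not covered further in this book); it can be computed directly from the sums of squares already shown in Table 13.4:

    # Eta squared = between-group sum of squares / total sum of squares (values from Table 13.4)
    ss_between = 1800.68
    ss_total = 5948.93

    eta_squared = ss_between / ss_total
    print(round(eta_squared, 2))
    # About 0.30: in this sample, roughly 30 per cent of the variation in sales
    # is associated with the price treatment, regardless of statistical significance.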

TIPS OF THE TRADE

A t-test is used to compare means between two groups.
»» An independent samples t-test predicts a continuous (interval or ratio) dependent variable with a categorical (nominal or ordinal) independent variable.
»» A paired samples t-test compares means from two different responses from the same sampling unit. Therefore, the sampling is dependent.
A one-way ANOVA extends the concept of an independent samples t-test to more than two groups.
»» Don't be fooled by the fact that it involves an F-test instead of a t-test. They are mathematically related and, in fact, an F-value is the square of a t-value that would result from the same analysis (see the short check after this list).
»» Statistics packages usually have an ANOVA package or a one-way ANOVA package. However, general linear model procedures can also conduct these tests and offer more flexibility, as we will see in later chapters.
»» Simple hand calculations can be useful in learning what statistical procedures actually do. However, in conducting actual tests, take advantage of computer software whenever permissible.
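The point about the F-test being the square of the t-value can be checked quickly in Python (a sketch only, reusing two of the treatment groups from Table 13.2 and assuming SciPy is available):

    import numpy as np
    from scipy import stats

    regular = np.array([130, 118, 87, 84])
    reduced = np.array([145, 143, 120, 131])

    t_stat, t_p = stats.ttest_ind(regular, reduced)  # independent samples t-test (equal variances assumed)
    f_stat, f_p = stats.f_oneway(regular, reduced)   # one-way ANOVA on the same two groups

    print(round(t_stat ** 2, 4), round(f_stat, 4))   # the squared t-value equals the F-value
    print(round(t_p, 4), round(f_p, 4))              # and the p-values are identical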

SUMMARY

TO UNDERSTAND THE REASONS FOR CONDUCTING TESTS OF DIFFERENCES, THE TESTS OF DIFFERENCE THAT ARE AVAILABLE TO THE RESEARCHER, AND HOW MEASUREMENT SCALE INFLUENCES THE ANALYSIS TECHNIQUE USED

Tests of difference are used when the researcher wants to investigate hypotheses stating that two (or more) groups differ with respect to a certain behaviour, characteristic or attitude. This


chapter covered two commonly used tests of differences: t-tests and ANOVAs. Bivariate statistical techniques analyse scores on two variables at a time. Both the type of measurement and the number of groups to be compared influence researchers’ choice of the type of statistical test of differences. Specifically we perform a t-test (ANOVA) when we want to examine differences in the means of an interval/ratio scaled variable between two (three or more) groups.


TO STATE A NULL HYPOTHESIS IN A TEST OF DIFFERENCES AMONG TWO OR THREE MEANS AND TO BE ABLE TO TEST THAT HYPOTHESIS BY CALCULATING A t-TEST FOR TWO INDEPENDENT SAMPLES, A t-TEST FOR TWO PAIRED/RELATED SAMPLES, OR AN ANOVA FOR THREE OR MORE SAMPLES

The null hypothesis about differences between two groups is stated as follows:

μ1 = μ2  or  μ1 - μ2 = 0

To calculate a t-test for two independent samples we use the formula displayed on page 444. To calculate a t-test for two paired/related samples we use the procedure outlined on page 448. The one-way ANOVA is appropriate when the researcher is comparing means between three or more different groups. The total variance in the observations is partitioned into two parts: that from within-group variation and that from between-group variation. The ratio of the variance between groups to the variance within groups gives an F-statistic. The F-distribution is a measure used to determine whether the variability of two samples differs significantly. If we have three groups or levels of the independent variable, the null hypothesis is stated as follows:

μ1 = μ2 = μ3 (that is, all three means are the same)

If we have more than three groups or levels of the independent variable, the null hypothesis is stated as follows:

μ1 = μ2 = … = μk (that is, all 'k' means are the same)

To calculate an F-test for three independent samples we use the procedure on page 457.

TO USE SPSS AND MICROSOFT EXCEL TO CONDUCT TESTS OF DIFFERENCES WITH INDEPENDENT SAMPLES t-TESTS, PAIRED SAMPLES t-TESTS AND ANOVAS, AND TO BE ABLE


TO INTERPRET THE OUTPUT FROM THESE STATISTICAL PACKAGES

The tests covered in this chapter used SPSS and Excel, two commonly used and powerful statistical packages. SPSS and Excel display a variety of different statistics and output when a t-test or ANOVA is conducted. In both cases the researcher ought to examine the descriptive statistics (that is, the means, standard deviations and number of respondents in each group). However, to ascertain the probability of sampling error the researcher must examine the Sig. value.

TO UNDERSTAND THE ASSUMPTIONS UNDERLYING t-TESTS AND ANOVAS AND UNDERSTAND THAT THERE ARE ALTERNATIVE TESTS THAT CAN BE USED WHEN THESE ASSUMPTIONS DO NOT HOLD

The t-test is appropriate when the researcher is comparing means between two different groups. However, the t-test and ANOVA are based on certain, often restrictive assumptions about the scales used and the sampling distribution. When the data are not normally distributed, or the scale used is not interval or ratio, researchers can turn to another class of techniques called nonparametric techniques.

UNDERSTAND THE DIFFERENCE BETWEEN STATISTICAL AND PRACTICAL SIGNIFICANCE FOR TESTS OF DIFFERENCES

Finally, when conducting tests of difference, be sure to consider mathematical differences and statistical differences. Especially with large samples, statistical differences can often be seen. However, they may be small and not very meaningful. If this is the case, we say there is no substantive significance.

KEY TERMS AND CONCEPTS

analysis of variance (ANOVA)
F-test
independent samples t-test for difference of means
paired samples t-test
pooled estimate of the standard error
test of differences
total variance

QUESTIONS FOR REVIEW AND CRITICAL THINKING

1 What tests of difference are appropriate in the following situations?
  a A manager wishes to see if a new service training program has increased service quality perceptions by measuring service quality before and after the training program on the same sample of customers.
  b Advertising and brand managers have responded 'yes', 'no' or 'maybe' to a survey about increasing their use of social media over the next six months. Their answers need to be compared.
  c A market researcher wants to test which of five incentives for a mail survey works best in relation to increasing the response rate.
  d A grocery retailer believes that shoppers given a healthy product to taste (e.g., an apple) when entering the store will purchase more fruit than shoppers given a less healthy product to taste (e.g., a chocolate brownie).
  e After launching a new product in four separate test markets a market researcher wants to compare whether purchase likelihood differs among the four sites.
  f A market researcher wants to see if the addition of a new artificial sweetener (stevia) changes consumer taste perceptions of a soft drink by asking a group of consumers to taste the original soft drink and then the new soft drink with the artificial sweetener.

2 Test to see if there is a difference in mean intentions to conserve water depending on whether or not the individual had been exposed to a social marketing campaign designed to reduce water consumption.

   Those exposed to the social marketing campaign   Those not exposed to the social marketing campaign
   X̄1 = 4.1                                         X̄2 = 3.8
   S1² = 0.4                                        S2² = 0.8
   n1 = 122                                         n2 = 104

3 Marketing managers can present consumers with many different sales promotions to stimulate consumption. One kind of sales promotion is a 'BOGOF' or buy-one-get-one-free. Another type is a larger package size (for example, a 1-litre bottle of clothes washing detergent might get turned into a 2-litre bottle for the same price, as a temporary offer). There is some research in marketing which shows that if given bigger package sizes, people use more. Suppose you are a retailer and you want to test whether or not this is the case. You contact a small sample of 44 respondents. Half are given two 1-litre bottles and the other half are given one 2-litre bottle. You get them to use the washing detergent over a two-week period and measure how much has been used in millilitres. Perform a t-test on the data below to see if package size increases consumption.

   Usage in mL (respondents with two 1-litre bottles)   Usage in mL (respondents with one 2-litre bottle)
   535                                                  654
   342                                                  621
   300                                                  423
   422                                                  211
   354                                                  289
   269                                                  465
   312                                                  325
   415                                                  541
   212                                                  465
   220                                                  421
   285                                                  333
   329                                                  454
   145                                                  248
   267                                                  453
   128                                                  324
   222                                                  254
   239                                                  332
   321                                                  453
   245                                                  378
   269                                                  474
   211                                                  345
   220                                                  218

4 Given the following data, is there a difference between means?

                           Sample 1   Sample 2
   Sample mean (X̄)           4.8        5.4
   Sample variance (S²)      1.1        1.3
   Sample size (n)           34         38




   How about if the standard deviation changes as in the table below? What does this mean?

                           Sample 1   Sample 2
   Sample mean (X̄)           4.8        5.4
   Sample variance (S²)      1.5        1.4
   Sample size (n)           34         38

5 A sales force (n = 67) received some management-by-objectives training. Are the before/after mean scores for salespeople's job performance statistically significant at the 0.05 level?

   Skill                 Before   After     t
   Planning ability       4.84     5.43    4.88
   Territory coverage     5.24     5.51    1.89
   Activity reporting     5.37     5.42    0.27

6 To better understand the characteristics of potential adopters of a new product the characteristics of adopters and non-adopters were compared based on a small survey.

                                                   Adopters   Non-adopters
   Average income                                  $48 271      $33 783
   Attitudes towards healthy food consumption         6.1          4.3
   Readiness to adopt new technologies                5.8          5.5

   Income was measured as a dollar amount and the two attitudinal scales (attitudes towards healthy food consumption and readiness to adopt new technologies) were measured on a 1 to 7 rating scale with 7 reflecting more positive attitudes. What is the most appropriate test to perform?

7 Suppose a researcher has one nominal-scaled variable with two categories (e.g., gender) and one ordinal-scaled variable (e.g., preference ranking for car models) and wishes to use a t-test. Is this appropriate? What other options does the researcher have?

8 Explain in your own words the circumstances under which an independent samples t-test would be used and the circumstances under which an ANOVA would be used. Give examples.

9 What is the purpose of a paired samples t-test and how does it differ from an independent samples t-test? Which test would be appropriate if a market researcher wanted to see if mean


customer satisfaction was different in one time period from another time period for the same sample of respondents? And if they wanted to test if customer satisfaction was different between heavy users and light users?

10 In an experiment with wholesalers, a researcher manipulated perception of task difficulty and measured level of aspiration for performing the task a second time (1 = no aspiration, 10 = very high aspiration). Group 1 was told the task was very difficult, group 2 was told the task was somewhat difficult but attainable and group 3 was told the task was easy. Perform an ANOVA on the resulting data:

   Level of aspiration (10-point scale)

   Subjects     Group 1 Very difficult   Group 2 Somewhat difficult but attainable   Group 3 Easy
   1                     7                              4                                 3
   2                     6                              5                                 5
   3                     7                              7                                 6
   4                     9                              7                                10
   5                     9                              6                                 9
   6                     9                              6                                 8
   7                    10                              7                                 9
   8                     9                              7                                 8
   Cases (n)             8                              8                                 8

11 From a recent survey about savings behaviour a market researcher finds that mean savings in bank accounts are $1378, at credit unions they are $2962 and at other financial institutions they are $698. What test would the researcher use to determine if statistical differences exist in savings amounts between customers at the three types of financial institutions?

12 In a t-test, the t-statistic is calculated by subtracting one group mean from the other group mean. However, with an ANOVA, a different process is used that compares variances to find differences in means. Explain why a different process is used and explain the basic idea behind this process.

13 After a survey of 2000 people about their attitudes towards saving for retirement, suppose a researcher found a statistically significant difference in three group means at the 5% level of significance, using an ANOVA. Sample mean 1 = 5.21, sample mean 2 = 5.25 and sample mean 3 = 5.28. Use this


information to explain the difference between statistical and substantive significance.

14 When would a researcher use nonparametric statistics and when would a researcher use parametric statistics?

15 Suppose you are a marketing manager and have responsibility for the success of a new product launch (a new kind of pouring yoghurt drink). To gain more information on consumer perceptions of the product you need to determine: (a) which of three price points will be the most acceptable to consumers; (b) whether or not consumers prefer the new pouring yoghurt to its nearest substitute (say, a normal premium brand yoghurt); and (c) whether or not consumers will continue to purchase the new pouring yoghurt or if they initially purchase it out of interest, rather than a keen desire for the product. Explain how you would go about doing this.

16 An Excel spreadsheet is reprinted in Exhibit 13.18 below. Interpret the t-test results. Are they statistically significant?

EXHIBIT 13.18 → EXCEL OUTPUT FOR PAIRED SAMPLES t-TEST FOR TWO SAMPLE MEANS

ONGOING PROJECT RUNNING SOME UNIVARIATE OR BIVARIATE STATISTICS? CONSULT THE PROJECT WORKSHEETS FOR CHAPTERS 12, 13 AND 14 FOR HELP

Selecting a test to use is based on answering some simple questions (see the flowchart at the beginning of Part 6) about the hypotheses you are testing, how the variables are measured and how many

variables to include. You then need to be able to interpret the output from the test. These steps can be followed by using the project worksheets for Chapters 12, 13 and 14 available from the CourseMate website. It’s a good idea to know what tests you are going to use before you collect the data (then you can collect the right data).

COURSEMATE ONLINE STUDY TOOLS

Flip to the start of your textbook and use the tear-out card to log in to CourseMate for Marketing Research. There you can test your understanding and revise chapter concepts with:
☑ interactive quizzes
☑ online video activities
☑ datasets
☑ case projects
☑ videos.

WRITTEN CASE STUDY 13.1

DO SOCIAL MARKETING INITIATIVES WORK? THE NOT SO DARK SIDE OF MARKETING

Revisit and familiarise yourself with the context and data surrounding Written case study 12.1 on page 438. Using the same data please answer the following questions.
1 Do those who participate in the social marketing initiative intend to use less water than those who do not participate?
2 Does intention to conserve water differ by age group?

3 Given the nature of the data, please comment on any other limitations. In your answers clearly outline what the hypotheses are, specify the relevant variables and the tests that will be used. Write up the results as you would in a marketing research report. Please also use the data to run any other tests using the analysis techniques learned in this chapter.


WRITTEN CASE STUDY 13.2

GAMBLING IN YOUR COMMUNITY: HOW DO PERCEPTIONS VARY BY DIFFERENT CONSUMERS?

Revisit and familiarise yourself with the context and data surrounding Written case study 12.2 on page 438. Several questions were asked to ascertain residents' attitudes and perceptions of changes to wellbeing as a result of the EGMs. One series of questions asked about respondents' changes to wellbeing (Q_29_a–i to Q_29_h–i) as a consequence of the introduction of 40 EGMs at the proposed tavern. Other demographic questions included variables such as income, age

group and gender. Do residents’ perceptions of the influence of the gaming machines vary by age, income and gender? Given the nature of the data please comment on any other limitations. In your answer clearly outline what the hypotheses are, specify the relevant variables and the tests that will be used. Write up the results as you would in a marketing research report. Please also use the data to run any other tests using the analysis techniques learned in this chapter.

ONGOING CASE STUDY

MOBILE PHONE SWITCHING AND BILL SHOCK

Now David, Leanne and Steve have a much better understanding of the data and what they are saying. However, management at AusBargain are particularly interested in trying to develop a better understanding of the different segments that exist among its customer base.

QUESTIONS

1 Define some hypotheses that could be developed around the bill shock variable which may show how different segments respond differently to bill shock. For example, is bill shock higher among younger age groups? Does it differ by different professions? How about based on psychographic dimensions such as activities, interests and opinions?
2 Develop some hypotheses that could subsequently be explored.
3 Specify what variables you would use and clearly specify an appropriate test to use.

NOTES

1 Lowe, B., Barnes, B.R. & Rugimbana, R. (2012) 'Discounting in International Markets and the Face Value Effect: A Double Edged Sword? Currencies,' Psychology & Marketing, 29 (3), 144–156.
2 Raghubir, P. & Srivastava, J. (2002) 'Effects of Face Value on Product Valuation in Foreign Currencies', Journal of Consumer Research, 39 (3), 335–347.
3 Gaston-Breton, C. (2006) 'The impact of the euro on the consumer decision process: theoretical explanation and empirical evidence', Journal of Product and Brand Management, 15, 272–279.
4 Prugsamatz, P., Lowe, B. & Alpert, F. (2010) 'Modelling Consumer Entertainment Software Choice: An Exploratory Examination of Key Attributes, and Differences by Gamer Segment,' Journal of Consumer Behaviour, 9 (5), 381–392.
5 Remember, in true experimental designs respondents ought to be randomised to experimental and treatment groups to design out any biases (e.g., what if these areas of the city (not) exposed to the campaign had vastly different characteristics?). In practice this is not always possible. Researchers may either try to create groups that are as similar as possible through some kind of cluster sampling approach or alternatively can apply sophisticated statistical techniques like propensity score matching to statistically remove heterogeneity in the data. For a good review on this topic see Rubin, D.B. and Waterman, R.P. (2006), 'Estimating the Causal Effects of Marketing Interventions Using Propensity Score Methodology,' Statistical Science, 21 (2), 206–222.
6 Selvanathan, Anthony, Selvanathan, Saroja, & Keller, Gerald (2013) Business Statistics: Australia/New Zealand, 6th edn, Cengage Learning: Melbourne.
7 Recall from Chapter 12 that the 5 per cent level of significance is not the only level of significance we can use – it depends on how confident we want to be about the likelihood of these results occurring as a result of sampling error. However, as a rule of thumb, the 5 per cent level is often used. Our d.f. value was 95, which is not represented in the table because the difference in t-values starts to become very small at high d.f. levels. For a more stringent test we should select the value with the higher degrees of freedom; that is, when d.f. is 120. In reality it won't make a difference to the end result in this case.
8 Carrington, M.J., Neville, B.A. & Whitwell, G.J. (2010) 'Why ethical consumers don't walk their walk: Towards a framework for understanding the gap between the ethical purchase intentions and actual buying behaviour of ethically minded consumers', Journal of Business Ethics, 97, 139–158. DOI: 10.1007/s10551-010-0501-6
9 Source: Chatterjee, P. & Rose, R. L. (2012) 'Do payment mechanisms change the way consumers perceive products?', Journal of Consumer Research, 38 (April), 1129–1139.
10 A one-way analysis of variance may also be referred to as a single-factor analysis of variance, because only one variable (factor) is manipulated.
11 Castellion, George and Markham, Stephen K. (2012) 'New Product Failure Rates: Influence of Argumentum ad Populum and Self-Interest', Journal of Product Innovation Management, 30 (5), 976–979.
12 Sources: http://www.theshout.com.au/2010/10/25/article/WA-first-to-try-GuinnessBlack-Lager/PRZKXYSSNH.html; http://www.couriermail.com.au/ipad/seeking-aguinness-pig-for-a-black-lager/story-fn6ck8la-1225999604627.
13 We can calculate this by hand, given there are only three groups. However, in general the formula to calculate the number of pairwise comparisons among r means is C(r, 2) = r! / (2!(r - 2)!). In our example, making pairwise comparisons of three means would lead to C(3, 2) = 3! / (2!(3 - 2)!) = 3 comparisons.
14 If each test is independent, in general the probability of no Type I errors is P(No Type I errors) = (1 - α)^c, where c is the number of tests conducted. As such the probability of committing at least one Type I error, or our experimentwise error rate (αE), is 1 - (1 - α)^c.
15 At first, the formula looks complicated. Our example shows that the procedure is not difficult, but it does require that the squared deviations for all the observations within a group (n) be summed and then these totals be summed for all groups (c). Sum of squares is an abbreviated term for 'sum of the squared deviation scores'.
16 Gupta, S. & Cooper, G. (1992) 'The discounting of discounts and promotion thresholds', Journal of Consumer Research, 19, 401–11.
17 You may be wondering at this stage whether or not you can use an ANOVA to find a difference in means between two groups. In fact using an ANOVA to detect a difference between two groups should provide you with the same result. This is no accident since the F and t are mathematical functions of one another.
18 Chan, F.Y., Petrovici, D. & Lowe, B. (2016) 'Antecedents of product placement effectiveness across cultures,' International Marketing Review, 33 (1), 5–24.
19 Bailey, Nicole & Areni, Charles S. (2006) 'When a few minutes sound like a lifetime: Does atmospheric music expand or contract perceived time?', Journal of Retailing, 82 (3), 189–202.
20 Though the summary presented here shows that the researchers were interested in comparing means between two groups (suggesting the use of an independent samples t-test), the researchers were also examining the influence of several other variables and used an n-way ANOVA (see Chapter 15 for further details).
21 Selvanathan, Anthony, Selvanathan, Saroja, & Keller, Gerald (2013) Business Statistics: Australia/New Zealand, 6th edn, Cengage Learning: Melbourne.
22 Selvanathan, Anthony, Selvanathan, Saroja, & Keller, Gerald (2013) Business Statistics: Australia/New Zealand, 6th edn, Cengage Learning: Melbourne.

14 » BIVARIATE STATISTICAL ANALYSIS: TESTS OF ASSOCIATION

WHAT YOU WILL LEARN IN THIS CHAPTER

To give examples of the types of marketing questions that may be answered by analysing the association between two variables and to be able to list and apply the various techniques to do so.
To calculate a simple correlation coefficient, a coefficient of determination and interpret a correlation matrix.
To explain the concept of bivariate linear regression and to be able to interpret the output from a bivariate linear regression.
To be able to design and interpret a crosstabulation and to test for association using a Chi-square test for independence.
To understand how to run correlation, regression and Chi-square analysis in SPSS and Microsoft Excel, and to be able to interpret the output.
To understand the difference between practical and statistical significance when testing for association.

Associate with a biscuit? (Answer: Hobnob)

In the past, if marketers wanted to predict which customers were most likely to buy their products they could have conducted a survey with questions about purchase likelihood and consumer characteristics, and ascertained which characteristics were most likely to be associated with a high purchase likelihood. However, survey data may not always be a reliable predictor of actual purchase1 because of the typical errors involved with survey research (see Chapter 5). Increasingly, companies are looking to their own transaction data based on actual purchases and linking this to data about customer characteristics (e.g., past purchases, socio-economic profile and social media habits) to more reliably predict what consumers will actually buy and to inform decisions about the marketing mix. Research problems of this nature are about testing for association between variables. In marketing, there are many other examples of information needs concerning association. If we know how two variables relate to each other, we can make better predictions about outcomes. In a nutshell, 'association' is a word to describe the nature of the relationship between two or more variables. If advertising expenditure increases, will this increase market share? Which shares the biggest association with increased sales – advertising or sales promotions? How is sales productivity associated with pay incentives? Such questions can be answered by statistically investigating the relationships between relevant variables. This chapter is concerned with understanding associations between variables (using correlation and Chi-square tests) and predicting values of one variable based on values of another variable (using bivariate regression). We now discuss the concept of association and show how researchers can measure associations.

iStock.com/rgbdigital

THE BASICS

In marketing, the number of sales is often the dependent variable we wish to predict. The independent variables found to be associated with the dependent variable sales volume may be aspects of the marketing mix (such as price, number of salespeople or amount of advertising) and/or uncontrollable variables (such as population or gross national product). For example, most managers would not be surprised to find that sales of baby strollers are associated with the number of babies born a few months prior to the sales period. In this case the dependent variable is the sales volume of baby strollers, and the independent variable is the number of babies born. The mathematical symbol X is commonly used for the independent variable, and Y typically denotes the dependent variable. It is appropriate to label dependent and independent variables only when it is assumed that the independent variable caused the dependent variable. For simple associations between two interval or ratio variables, we can test association using correlation or regression analysis. For instance, we might want to test whether or not there is an association between sales and advertising. However, the variables may not always be interval or ratio and we may need to use nonparametric tests of association. For instance, we might want to test whether or not gender (a dichotomous variable) is related to brand awareness (low versus high). For two such nominal variables, we would use a Chi-square test. Likewise, if we wanted to test whether or not two ordinal variables were associated, we would use a Spearman's rank correlation coefficient; for instance, if we wanted to test the association between rank preference of shopping centres and a ranking of convenience for shopping centres. We now discuss these tests of association in greater depth. Exhibit 14.1 shows that the type of measure used will influence the choice of the proper statistical tests of association (see Chapter 8 to recap on measurement – understanding the way variables are measured is important in understanding which technique is appropriate and its underlying assumptions).

tests of association A general term that refers to a number of bivariate statistical techniques used to measure whether or not two variables are associated with each other.

EXHIBIT 14.1 →    BIVARIATE ANALYSIS – COMMON PROCEDURES FOR TESTING ASSOCIATION

How many variables am I interested in analysing? Two variables (e.g., does mean number of store visits per month to Wendy's differ by gender?) → bivariate analysis (Chapters 13 and 14).
What am I interested in examining?
• Testing for differences between groups → see Chapter 13.
• Testing for association → how are the variables measured?
  – Two metric variables → Pearson's correlation/bivariate regression
  – Two ordinal variables → Spearman's correlation
  – Two nonmetric/nominal or categorical variables → Chi-square
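For readers who analyse data in Python rather than SPSS or Excel, the choices in Exhibit 14.1 map onto functions in SciPy. The sketch below is illustrative only: the variable names and numbers are hypothetical, and the tests themselves are explained later in this chapter.

    import numpy as np
    from scipy import stats

    # Two metric (interval/ratio) variables -> Pearson's correlation
    advertising = np.array([1.2, 2.4, 3.1, 4.0, 5.5])    # hypothetical spend ($m)
    sales = np.array([10.0, 13.5, 14.2, 17.8, 21.0])     # hypothetical sales ('000s)
    print(stats.pearsonr(advertising, sales))            # returns r and its p-value

    # Two ordinal variables -> Spearman's rank correlation
    preference_rank = np.array([1, 2, 3, 4, 5])          # preference ranking of five shopping centres
    convenience_rank = np.array([2, 1, 3, 5, 4])         # convenience ranking of the same centres
    print(stats.spearmanr(preference_rank, convenience_rank))

    # Two nominal variables -> Chi-square test of independence on a crosstabulation
    crosstab = np.array([[30, 20],                       # hypothetical counts: gender (rows)
                         [25, 45]])                      # by brand awareness low/high (columns)
    chi2, p, dof, expected = stats.chi2_contingency(crosstab)
    print(chi2, p)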


PEARSON’S CORRELATION COEFFICIENT


ONGOING PROJECT

The most popular technique for indicating the relationship of one variable to another is simple correlation analysis. Just remember that both variables should be interval or ratio scaled.2 The Pearson's correlation coefficient is a statistical measure of the covariation, or association, between two variables. The correlation coefficient, r, ranges from +1.0 to -1.0. If the value of r equals +1.0, there is a perfect positive linear (straight-line) relationship. If the value of r equals -1.0, there is a perfect negative linear relationship, or a perfect inverse relationship. No correlation exists if r equals 0; the two variables would be regarded as independent. A correlation coefficient indicates both the magnitude of the linear relationship and the direction of that relationship. For example, if we find that r = -0.92, we know we have a very strong inverse relationship – that is, the greater the value measured by variable X, the lower the value measured by variable Y. The formula for calculating Pearson's correlation coefficient for two variables X and Y is as follows:

rxy = ryx = Σ(Xi - X̄)(Yi - Ȳ) / √[Σ(Xi - X̄)² Σ(Yi - Ȳ)²]

where the symbols X̄ and Ȳ represent the sample averages of X and Y, respectively. An alternative way to express the correlation formula is:

rxy = ryx = sxy / √(sx² sy²)

where
sx² = variance of X
sy² = variance of Y
sxy = covariance of X and Y, with

sxy = Σ(Xi - X̄)(Yi - Ȳ) / N

If associated values of Xi and Yi differ from their means in the same direction, their covariance will be positive. If the values of Xi and Yi tend to deviate in opposite directions, their covariance will be negative. Pearson's correlation coefficient actually is a standardised measure of covariance. In the formula, the numerator represents covariance and the denominator is the square root of the product of the sample variances. To understand why researchers are more likely to use a correlation rather than a covariance, consider the association between height and weight. Table 14.1 shows the heights, measured in centimetres (X1) or inches (X2) and the weights of 10 different people (1 inch = 2.54 cm). Suppose a researcher is measuring the degree of association between height and weight by calculating a covariance. If height was measured in centimetres, then the covariance between height and weight would be 101.8. The units are centimetres times kilograms. But, if height was measured in inches, then the covariance would be 40.1. This time the units would be inches times kilograms. Therefore, with a covariance, the same data, depending on how they are measured, will give a different covariance. However, a correlation coefficient standardises the covariance, effectively removing the units of measurement. So, researchers find the correlation coefficient useful because they can compare two correlations without regard for the amount of variance exhibited by each variable separately.

Pearson’s correlation coefficient A statistical measure of the covariation, or association, between two variables.


TABLE 14.1 » COVARIANCE BETWEEN HEIGHT AND WEIGHT

X1 height (cm)   X2 height (in)   Y weight (kg)
180                  70.9               90
170                  66.9               89
160                  63.0               90
160                  63.0               85
155                  61.0               70
157                  61.8               72
160                  63.0               73
180                  70.9               95
170                  66.9               85
150                  59.1               50

Covar(X1, Y) = 101.8     Covar(X2, Y) = 40.1
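The point about units can be verified with the data in Table 14.1. Here is a minimal Python sketch (assuming NumPy is available; bias=True asks for the population covariance, dividing by N as in the formula above):

    import numpy as np

    height_cm = np.array([180, 170, 160, 160, 155, 157, 160, 180, 170, 150])
    height_in = height_cm / 2.54
    weight_kg = np.array([90, 89, 90, 85, 70, 72, 73, 95, 85, 50])

    cov_cm = np.cov(height_cm, weight_kg, bias=True)[0, 1]   # about 101.8 (cm x kg)
    cov_in = np.cov(height_in, weight_kg, bias=True)[0, 1]   # about 40.1 (inch x kg)

    r_cm = np.corrcoef(height_cm, weight_kg)[0, 1]
    r_in = np.corrcoef(height_in, weight_kg)[0, 1]           # identical: the units cancel out

    print(round(cov_cm, 1), round(cov_in, 1), round(r_cm, 3), round(r_in, 3))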

As a rule of thumb, the strength of a correlation coefficient can be summarised as follows:

±0.81 to ±1      Very strong
±0.61 to ±0.8    Strong
±0.41 to ±0.6    Moderate
±0.21 to ±0.4    Weak
±0.01 to ±0.2    Very weak
0                No relationship – the two variables are independent

Exhibit 14.2 illustrates the correlation coefficients and scatter diagrams for several sets of data.

EXHIBIT 14.2 →    SCATTER DIAGRAM TO ILLUSTRATE CORRELATION PATTERNS
(Six scatter diagrams of Y against X: r = +1.0, perfect positive correlation; r = 0.80, high positive correlation; r = 0.30, low positive correlation; r = 0, no correlation; r = -0.60, moderate negative correlation; r = -1.0, perfect negative correlation)


An example

As an illustration of the calculation of the correlation coefficient, consider an investigation made to determine whether store customer satisfaction is associated with average staff absences per month. To determine whether the two variables in Table 14.2 are associated, a correlation analysis of the data is carried out. Some of the calculations are summarised in Table 14.2 also.

X̄ = 4.99
Ȳ = 40.31
Σ(Xi - X̄)² = 17.84
Σ(Yi - Ȳ)² = 5.59
Σ(Xi - X̄)(Yi - Ȳ) = -6.34

r = Σ(Xi - X̄)(Yi - Ȳ) / √[Σ(Xi - X̄)² Σ(Yi - Ȳ)²] = -6.34 / √[(17.84)(5.59)] = -6.34 / √99.71 = -0.635

The correlation between the two variables is -0.635, which indicates an inverse relationship. Thus, when the number of staff absences is higher, customer satisfaction is lower. This makes intuitive sense. If absences are higher then staff spend less time dealing with customers which may lead to greater dissatisfaction.

TABLE 14.2 » CORRELATION ANALYSIS OF STORE CUSTOMER SATISFACTION WITH AVERAGE MONTHLY STORE STAFF ABSENCES

Store customer       Average monthly store
satisfaction (Xi)    staff absences (Yi)     Xi - X̄    (Xi - X̄)²   Yi - Ȳ    (Yi - Ȳ)²   (Xi - X̄)(Yi - Ȳ)
5.5                  39.6                     0.51      0.2601     -0.71      0.5041       -0.3621
4.4                  40.7                    -0.59      0.3481      0.39      0.1521       -0.2301
4.1                  40.4                    -0.89      0.7921      0.09      0.0081       -0.0801
4.3                  39.8                    -0.69      0.4761     -0.51      0.2601        0.3519
6.8                  39.2                     1.81      3.2761     -1.11      1.2321       -2.0091
5.5                  40.3                     0.51      0.2601     -0.01      0.0001       -0.0051
5.5                  39.7                     0.51      0.2601     -0.61      0.3721       -0.3111
6.7                  39.8                     1.71      2.9241     -0.51      0.2601       -0.8721
5.5                  40.4                     0.51      0.2601      0.09      0.0081        0.0459
5.7                  40.5                     0.71      0.5041      0.19      0.0361        0.1349
5.2                  40.7                     0.21      0.0441      0.39      0.1521        0.0819
4.5                  41.2                    -0.49      0.2401      0.89      0.7921       -0.4361
3.8                  41.3                    -1.19      1.4161      0.99      0.9801       -1.1781
3.8                  40.6                    -1.19      1.4161      0.29      0.0841       -0.3451
3.6                  40.7                    -1.39      1.9321      0.39      0.1521       -0.5421
3.5                  40.6                    -1.49      2.2201      0.29      0.0841       -0.4321
4.9                  39.8                    -0.09      0.0081     -0.51      0.2601        0.0459
5.9                  39.9                     0.91      0.8281     -0.41      0.1681       -0.3731
5.6                  40.6                     0.61      0.3721      0.29      0.0841        0.1769


Another question that researchers ask concerns statistical significance. Statistical significance is assessed with a t-test of the significance of a correlation coefficient.3 The logic behind the test is similar to that for the significance tests already considered in Chapters 12 and 13; for further details on the testing procedure, interested students should consult a more advanced statistics book.4 Typically, the null hypothesis (H0) being tested is that the population correlation coefficient (ρ, or 'rho') is equal to zero. Therefore, the null hypothesis for a correlation analysis would be ρ = 0. The alternative hypothesis is usually something such as ρ ≠ 0. As with the t-test hypotheses discussed in Chapter 13, researchers sometimes specify the nature of this correlation coefficient (e.g., whether the correlation coefficient is negative – that is, < 0 – or positive – that is, > 0). If this was the case the alternative hypothesis might be ρ < 0 or ρ > 0, depending on prior expectations about the direction of the relationship. We will confine our discussion to cases where the alternative hypothesis is specified as being not equal to zero, rather than greater than or less than zero (that is, positive or negative). For the correlation coefficient calculated from the data in Table 14.2 the p-value is 0.0035 (this can be calculated by hand using the formula at the end of the chapter).5 Therefore, based on the results we would reject the null hypothesis because the significance value (Sig. value) is less than 0.05 (again, this is the equivalent of saying that the calculated t-value is greater than the critical t-value). Having rejected the null hypothesis, there is evidence that the relationship observed in our sample between customer satisfaction and staff absences is statistically significant (that is, different from zero). In other words, the correlation coefficient observed is unlikely to have occurred by chance. These findings could have implications for marketers and human resource managers keen to understand how absences may be reduced.
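The same result can be confirmed in Python (a sketch assuming SciPy; the two arrays repeat the Xi and Yi columns of Table 14.2):

    import numpy as np
    from scipy import stats

    satisfaction = np.array([5.5, 4.4, 4.1, 4.3, 6.8, 5.5, 5.5, 6.7, 5.5, 5.7,
                             5.2, 4.5, 3.8, 3.8, 3.6, 3.5, 4.9, 5.9, 5.6])
    absences = np.array([39.6, 40.7, 40.4, 39.8, 39.2, 40.3, 39.7, 39.8, 40.4, 40.5,
                         40.7, 41.2, 41.3, 40.6, 40.7, 40.6, 39.8, 39.9, 40.6])

    r, p_value = stats.pearsonr(satisfaction, absences)
    print(round(r, 3), round(p_value, 4))
    # Prints approximately: -0.635 0.0035, matching the hand calculation and the p-value quoted above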

Coefficient of determination

We can use the correlation coefficient further to provide other useful statistical information. For instance, if we wish to know the proportion of variance in Y that is explained by X (or vice versa), we can calculate the coefficient of determination by squaring the correlation coefficient:

r² = Explained variance / Total variance

coefficient of determination (r²) A measure obtained by squaring the correlation coefficient; that proportion of the total variance of a variable that is accounted for by knowing the value of another variable.

The coefficient of determination, r², measures that part of the total variance of Y that is accounted for by knowing the value of X. In the example about store customer satisfaction and average monthly store staff absences, r = -0.635; therefore, r² = 0.403. About 40 per cent of the variance in store customer satisfaction can be explained by staff absences, and vice versa. In other words, we can tell how much of an influence one variable might have upon another. This is quite useful because it further tells us how we may or may not be able to influence the variation in one variable. Of course, to do so, we have to make an assumption about causality (see the Tips of the trade and What went right boxes in this chapter).

TIPS OF THE TRADE

CORRELATION IS NOT CAUSATION

Years ago in a US newspaper, Sir Ronald Fisher, the father of experimental design, was quoted on the subject of cigarette smoking and cancer. Fisher pointed out that the only way to establish a causal connection between the two would be to randomly assign a large sample of newborn babies to two groups: those from whom cigarettes would be withheld and those who would be forced to smoke them. Some 70 or 80 years later we might have conclusive evidence of the true effects of smoking on death by various causes.


Observing a consistent relationship between two variables over time, or over cases at one point in time, does not prove that one causes the other. This is stated in the simple slogan: 'Correlation is not causation.' As the statistician said when he quit smoking: 'I know that correlation is not causation, but in this case I'm willing to take a chance.' He put in a nutshell exactly what we do whenever we put a causal interpretation on any association we observe – we take a chance. Even though two variables are not causally related, they can be statistically related because both are caused by a third factor or other factors. When this is so, the variables are said to be spuriously related. A range of other very quirky spurious correlations can be found at tylervigen.com6 (e.g., per capita cheese consumption seems to be associated with the number of people who die by becoming tangled in their bed sheets!). Therefore, be careful what you infer from a correlation coefficient!

Correlation matrix

A correlation matrix is the standard form for reporting correlational results. It is similar to a between-city distance table, except that the research variables are substituted for cities and a coefficient of correlation is substituted for the number of kilometres. Table 14.3 shows a correlation matrix about

correlation matrix The standard form for reporting correlational results.

'sweethearting' frequency and its antecedents and consequences. Sweethearting is a term used to describe the behaviour of service workers who give unauthorised products and services to conspiring customers.7 In this article the authors report on a study which tries to explain the antecedents of this behaviour and its consequences. The constructs were all measured based on multi-item scales from a survey of service employees.

TABLE 14.3 » PEARSON'S CORRELATION COEFFICIENT MATRIX FOR THE ANTECEDENTS AND CONSEQUENCES OF 'SERVICE SWEETHEARTING'

Variables                            1        2        3        4        5        6        7        8        9        10       11
Sweethearting frequency (SF)         1.00
Risk-seeking propensity (RSP)        0.34*    1.00
Personal ethics (PE)                -0.27*   -0.21*    1.00
Need for social approval (NSA)       0.16*    0.00     0.28*    1.00
Financial gain (FG)                  0.44*    0.22*   -0.30*   -0.04     1.00
Reciprocity (Re)                     0.52*    0.38*   -0.31*    0.04     0.48*    1.00
Job satisfaction (JS)               -0.03    -0.08     0.33*    0.18*   -0.19*   -0.14     1.00
Organisational commitment (OC)      -0.12    -0.20*   -0.03    -0.24*   -0.06    -0.07     0.29*    1.00
Deviant work group norms (DWGN)      0.53*    0.31*   -0.21*    0.09     0.33*    0.45*   -0.06     0.01     1.00
Job control (JC)                     0.20*    0.15*    0.19*    0.16*    0.28*    0.08     0.15*    0.18*   -0.06     1.00
Punishment severity (PS)            -0.04     0.02    -0.04    -0.06     0.08     0.00    -0.17*    0.24*   -0.10    -0.07     1.00
Punishment certainty                -0.13    -0.04     0.22*    0.04     0.06     0.35*   -0.04    -0.01     0.04     0.07    -0.15*

Notes: * p < .05

You will encounter this type of matrix on many occasions. Note that the main diagonal consists of correlations of 1.00; this will always be the case when a variable is correlated with itself. Sweethearting frequency (SF) has a positive correlation of 0.34 with risk-seeking propensity (RSP). So, the service employees with a higher propensity for risk seeking are more likely to engage in higher sweethearting behaviours. On the other hand, sweethearting frequency has a negative correlation of -0.27 with a service employee's personal ethics. So, as you might expect, service employees with a lower level of personal ethics are more likely to be involved in higher sweethearting behaviours. Without going over each paired relationship in the correlation matrix you can quickly see that an understanding of such associations can be very useful to marketers. In a large correlation matrix such as this it is customary to footnote each statistically significant correlation coefficient, rather than reporting the exact p-value. In summary, when we calculate a correlation coefficient we can obtain information about the direction of an association (that is, whether or not the coefficient is positive or negative), the strength of the association (that is, how close the coefficient is to ±1) and the statistical significance of the association. This is summarised in Table 14.3; correlation can be compared against a similar technique, regression, as in Table 14.4.

SPSS

Running a correlation in SPSS

Health information sites like WebMD are some of the most visited websites on the Internet. Suppose a survey for a range of such websites has been conducted for people with serious health conditions and assesses satisfaction with a website and a range of different possible antecedents to website satisfaction, including respondents' perceptions of the emotional support the website provides and their perceptions of the level of solution support it provides.8 To see how emotional support and solution support are associated with website satisfaction one might construct a correlation matrix. Again, the null hypothesis being tested for each correlation is ρ = 0 and the alternative hypothesis is ρ ≠ 0. To explore the correlation between these variables in SPSS open up 'health communities.sav'. First you may want to produce an initial scattergraph. To get a feel for the data perform the following click-through operation: Graphs > Legacy Dialogs > Scatter/Dot… Select 'Simple Scatter', then select 'Define'. Select 'satisfaction' for the Y axis and 'emotional' for the X axis and then press 'OK' as in Exhibit 14.3. You should see the scattergraph shown in Exhibit 14.4 (you can also produce two other scattergraphs with these variables). As you might expect, we can see from the scattergraph that the higher the perceived emotional support, the higher the level of satisfaction. We can quantify this relationship, and the others, by calculating a correlation coefficient in SPSS. To do this, perform the following click-through procedure: Analyse > Correlate > Bivariate. Select 'satisfaction', 'emotional' and 'solution', ensuring that the box 'Pearson' is selected, and click OK. The correlation coefficient is shown in Exhibit 14.5. By examining the correlation matrix we can see that the correlation between satisfaction and emotional support is 0.547 and the correlation between satisfaction and solution support is 0.352. This suggests a moderate-to-strong positive relationship between satisfaction and emotional support and a weak-to-moderate relationship between satisfaction and solution support. Underneath the coefficients we see the 'Sig.' value, representing the significance level or p-value. This is 0.000 for both correlations, which is, of course, well below the standard 0.05 cut-off. Therefore, we reject the null hypothesis and say there is strong evidence that the associations have not occurred by chance. The manager may be able to influence website satisfaction through enhancing emotional support and solution support.


← EXHIBIT 14.3 CREATING A SCATTERGRAPH IN SPSS

← EXHIBIT 14.4 SCATTERGRAPH OF SATISFACTION AND EMOTIONAL SUPPORT (satisfaction on the Y axis, emotional support on the X axis)


EXHIBIT 14.5 → RUNNING A CORRELATION IN SPSS

CORRELATIONS
                                    solution    emotional    satisfaction
solution       Pearson correlation  1           .574**       .352**
               Sig. (2-tailed)                  .000         .000
               N                    222         222          222
emotional      Pearson correlation  .574**      1            .547**
               Sig. (2-tailed)      .000                     .000
               N                    222         222          222
satisfaction   Pearson correlation  .352**      .547**       1
               Sig. (2-tailed)      .000        .000
               N                    222         222          222
** Correlation is significant at the 0.01 level (two-tailed).

EXCEL

Running a correlation in Excel

Open up the datafile ‘health communities.xlsx’. In the Excel file, note that the data are arranged in the same way as in SPSS: each variable is in a column. Before calculating the correlation coefficient, we may first want to create a scattergraph as we did in SPSS. To do this, simply select the two columns of data that you are interested in correlating (that is, A2:B223). Then perform the following click-through procedure: Insert > Scatter. Repeat as needed for further charts.

EXHIBIT 14.6 → DOING A CORRELATION IN EXCEL

Select the appropriate scattergraph and the chart should appear in the open worksheet. To calculate the correlation coefficient, select a cell where you want the coefficient to appear, and in that cell simply type: =CORREL(A2:A223, B2:B223). The cell references A2:A223 and B2:B223 simply refer to where the X variable and Y variable have been input in the spreadsheet. Excel automatically calculates Pearson’s correlation coefficient rather


than any other type. Unfortunately, Excel does not provide a significance test for the correlation coefficient that is calculated (a simple workaround is sketched below). Repeat the procedure for further correlations, say between satisfaction and solution support. Based on these results (and specifically those from SPSS, where a Sig. value is reported) we would reject the null hypothesis. It appears there is evidence to suggest that the correlations obtained from our sample are different from zero, and that emotional support and solution support are indeed associated with website satisfaction. When reporting the results, remember to include the direction, strength and statistical significance of the correlation coefficient. We now discuss another type of correlation coefficient, which can be used with fewer underlying assumptions about the data.
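
Because Excel reports only the coefficient, the significance test has to be done separately. One standard way (not specific to this dataset) is to convert r into a t statistic, t = r√(n − 2)/√(1 − r²), with n − 2 degrees of freedom. The Python sketch below applies this textbook formula to the coefficients reported above; it is a minimal illustration rather than part of the Excel procedure.

from math import sqrt
from scipy import stats

def correlation_p_value(r, n):
    # Two-tailed p-value for a Pearson correlation r computed from n pairs
    t = r * sqrt(n - 2) / sqrt(1 - r ** 2)     # t statistic with n - 2 d.f.
    return 2 * stats.t.sf(abs(t), df=n - 2)    # two-tailed probability

print(correlation_p_value(0.547, 222))   # satisfaction and emotional support
print(correlation_p_value(0.352, 222))   # satisfaction and solution support

Both p-values come out far below 0.05, which is consistent with the ‘Sig.’ values of 0.000 reported by SPSS.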

REAL WORLD SNAPSHOT

WHAT MAKES SOMEONE ATTRACTIVE?

What are the things that make someone attractive? Many fashion marketers might be interested in this question. The correlation matrix below was computed with SPSS. The correlations show how different characteristics relate to each other. Variables include a measure of fit (meaning how well the person matches a fashion retail concept), attractiveness, weight (how overweight someone appears), age, manner of dress (how modern) and personality (warm–cold). A sample of consumers rated a female model shown in a photograph on these characteristics. The results reveal the following (Pearson correlations, with two-tailed Sig. in parentheses; N = 62 throughout):

Correlations
           Fit         Attract     Weight      Age        Modern      Cold
Fit        1           0.831**     –0.267*     0.108      –0.447**    –0.583**
                       (0.000)     (0.036)     (0.404)    (0.000)     (0.000)
Attract    0.831**     1           –0.275*     0.039      –0.428**    –0.610**
           (0.000)                 (0.030)     (0.766)    (0.001)     (0.000)
Weight     –0.267*     –0.275*     1           0.082      0.262*      0.058
           (0.036)     (0.030)                 (0.528)    (0.040)     (0.653)
Age        0.108       0.039       0.082       1          –0.019      0.104
           (0.404)     (0.766)     (0.528)                (0.882)     (0.423)
Modern     –0.447**    –0.428**    0.262*      –0.019     1           0.603**
           (0.000)     (0.001)     (0.040)     (0.882)                (0.000)
Cold       –0.583**    –0.610**    0.058       0.104      0.603**     1
           (0.000)     (0.000)     (0.653)     (0.423)    (0.000)

* Correlation is significant at the 0.05 level (two-tailed).
** Correlation is significant at the 0.01 level (two-tailed).


Thus, if the model seems to ‘fit’ the store concept, she is seen as attractive. If she appears overweight, she is seen as less attractive. Age is unrelated to attractiveness or fit. More modern dress and a colder personality are also associated with lower attractiveness. Using these correlations, a retailer can help determine what employees should look like!

Nonparametric correlation

In situations in which the data are ordinal, such as ranking data (see Chapter 8 to recap on measurement), a nonparametric correlation technique may be substituted for the Pearson correlation technique. One commonly used technique is Spearman’s rank-order correlation coefficient. Suppose two groups of consumers – say, time pressed and non-time pressed – are asked to rank, in order of preference, the brands of a product class such as microwave meals. The researcher then wishes to determine the agreement, or correlation, between the two groups. Because the data (their preferences based on their rankings) are not interval or ratio, a Spearman’s rank-order correlation coefficient is calculated. The Spearman rank coefficient is computed as follows:

rs = 1 − (6 Σ di²) / (n³ − n)

where the sum runs over the n brands and di is the difference between the ranks given to the ith brand by each group. Thus, if brand B were ranked first by group 1 and sixth by group 2, dB² would be equal to (1 − 6)², or 25. In some cases two brands may be given equal scores by a group or be tied for a certain rank. If the number of ties is not large, their effect is small, and we simply assign the average of the ranks that would have been assigned had no ties occurred. We then calculate rs as before. If the number of ties is large, however, we can introduce a correction factor to offset their effect on rs. This test is also easily run in SPSS by selecting ‘Spearman’ in the dialogue box for running a correlation, as seen in Exhibit 14.7, which uses our earlier example.

EXHIBIT 14.7 → RUNNING A SPEARMAN’S RANK-ORDER CORRELATION IN SPSS


The output is displayed and interpreted in exactly the same manner as a Pearson correlation coefficient. We may also wish to use Spearman’s coefficient if our variables are interval scaled but we feel that the distances between the scale points are not equal.
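
Outside SPSS, the same coefficient can be obtained with scipy.stats.spearmanr, which ranks the data internally (averaging ranks for ties, as described above) and reports a two-tailed p-value. The rankings below are invented purely to illustrate the microwave-meal example; they are not data from the text.

from scipy import stats

# Hypothetical rankings of six microwave-meal brands (1 = most preferred)
time_pressed     = [1, 2, 3, 4, 5, 6]
non_time_pressed = [2, 1, 4, 3, 6, 5]

rho, p = stats.spearmanr(time_pressed, non_time_pressed)
print(f"Spearman's rho = {rho:.3f}, p = {p:.3f}")

A rho close to +1 would indicate strong agreement between the two groups’ preference orderings.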

SURVEY THIS!

What factors are associated with how many text messages a typical student sends? This is a question that can be addressed with correlation or simple regression analysis. Use the number of text messages sent and one other variable of your choice to explore this issue. Use a variable that is interval or ratio scaled. Conduct a correlation analysis, interpret the results and draw an appropriate conclusion.

Courtesy of Qualtrics.com

REGRESSION ANALYSIS

ONGOING PROJECT

Regression is another technique for measuring the linear association between a dependent and an independent variable. One way to think about regression is to view it as correlation’s ‘bigger brother’. A regression provides exactly the same information as a correlation coefficient, but it enables us to do one more thing. In Table 14.4 we can see that correlation allows us to determine the direction of the association, the strength of the association and the statistical significance of the association. Regression does all this, but it also allows us to predict values of one variable based upon values of another variable. This basic concept is very useful to managers.

TABLE 14.4 » COMPARING CORRELATION AND REGRESSION

                                                                      Correlation   Regression
Does it tell us the direction of the relationship?                    Yes           Yes
Does it tell us the strength of the relationship?                     Yes           Yes
Does it tell us the statistical significance of the relationship?     Yes           Yes
Does it allow us to predict values of one variable based on
values of another variable?                                           No            Yes

For instance, consider Exhibit 14.8. It shows a scattergraph of store customer satisfaction against average monthly staff absences, based on the example earlier in the chapter (see Table 14.2). Notice that, as one might predict (and as would be reflected by a correlation coefficient), as absences decrease, store customer satisfaction increases. But suppose we wanted to predict the level at which absences would need to be held to maintain a particular level of customer satisfaction. To do this we might use regression.

regression scattergraph


EXHIBIT 14.8 → A DIAGRAMMATIC REPRESENTATION OF REGRESSION (scattergraph of store customer satisfaction against average monthly store staff absences, with the fitted line y = 50.7 – 1.13x; R² Linear = 0.403)

Regression allows us to make such predictions by drawing a ‘line of best fit’ through the data, so that we can predict values of one variable based on values of another variable. There will be some error, as we cannot possibly draw a straight line through all points on the scattergraph; however, it gives us a useful approximation. We will consider how we derive this line shortly. At this stage, it is important that you understand the purpose of regression – to estimate the slope of a line drawn through a series of data points representing values of two variables.

We now explain the underpinnings of regression. Although regression and correlation are mathematically related, regression assumes that the dependent variable (Y) is predictively, or ‘causally’, linked to the independent variable (X). We know from Chapter 7 on experimentation that we cannot be sure one variable causes changes in another variable; however, in regression we make this inference. This underscores the importance of using a solid theory to inform our regression models and other data analysis techniques, rather than simply letting the data do the talking. Regression analysis attempts to predict the values of a ratio- or interval-scaled dependent variable from specific values of ratio- or interval-scaled independent variables. Although regression analysis has numerous applications, forecasting sales is a very common objective in marketing.

bivariate linear regression A measure of linear association that investigates straight-line relationships of the type Y = a + bX, where Y is the dependent variable, X is the independent variable, and a and b are two constants to be estimated.

The discussion here concerns bivariate linear regression. This form of regression investigates a straight-line relationship of the type Y = a + bX, as shown in Exhibit 14.9, where Y is the dependent variable, X is the independent variable, and a and b are two constants to be estimated. Recall from high school maths the formula for a straight line; you probably learned it expressed as y = mx + c. The principle is the same – a straight line is some constant plus the slope of that line multiplied by X. The symbol a represents the Y intercept, and b is the slope coefficient. The slope b is the change in Y due to a corresponding change of one unit in X. The slope may also be thought of as rise over run – the rise in units on the Y axis divided by the run in units along the X axis (Δ is the notation for ‘a change in’). Suppose a researcher is interested in forecasting sales for a construction distributor (wholesaler) in Brisbane. The distributor believes a reasonable association exists between sales and building permits issued by city councils. Using bivariate linear regression on the data in Table 14.5, the researcher


will be able to estimate sales potential (Y) in various city councils based on the number of building permits (X). To better illustrate the data in Table 14.5, we have plotted them on a scatter diagram in Exhibit 14.10. In the diagram the vertical axis indicates the value of the dependent variable (Y), and the horizontal axis indicates the value of the independent variable (X). Each point in the diagram represents an observation of X and Y at a given point in time – that is, paired values of X and Y. One way to determine the relationship between X and Y is to ‘eyeball’ it, as shown in Exhibit 14.10 – that is, draw a straight line through the points in the exhibit. However, this procedure is subject to human error: as shown in Exhibit 14.10, two researchers might draw different lines to describe the same data. As such, we need a more systematic approach to drawing the ‘best’ line.

TABLE 14.5 » RELATIONSHIP OF SALES POTENTIAL TO BUILDING PERMITS ISSUED

Dealer   Y dealer's sales volume (thousands)   X building permits
 1        77                                    86
 2        79                                    93
 3        80                                    95
 4        83                                   104
 5       101                                   139
 6       117                                   180
 7       129                                   165
 8       120                                   147
 9        97                                   119
10       106                                   132
11        99                                   126
12       121                                   156
13       103                                   129
14        86                                    96
15        99                                   108

← EXHIBIT 14.9 ESTIMATING A STRAIGHT LINE (a line Y = a + bX plotted against the X and Y axes, where a is the constant or intercept and the slope b = ΔY/ΔX)


EXHIBIT 14.10 → SCATTER DIAGRAM AND EYEBALL FORECAST (scatter diagram of Y against X with two different hand-drawn lines, ‘My line’ and ‘Your line’, fitted by eye to the same data)

Least-squares method of regression analysis

The task of the researcher is to find the best means of fitting a straight line to the data. The least-squares method is a relatively simple mathematical technique for ensuring that the straight line most closely represents the relationship between X and Y. The logic behind the least-squares technique is as follows: no straight line can completely represent every dot in the scatter diagram; there will be a discrepancy between most of the actual scores (the dots) and the predicted scores based on the regression line. Simply stated, any straight line drawn will generate errors. The least-squares method uses the criterion of making the least amount of total error in predicting Y from X. More technically, the procedure used in the least-squares method generates a straight line that minimises the sum of squared deviations of the actual values from the predicted regression line. With the symbol e representing the deviations of the dots from the line, the least-squares criterion is:

Σ ei² (summed over i = 1 to n) is a minimum

where
ei = Yi − Ŷi (the residual)
Yi = actual value of the dependent variable
Ŷi = estimated value of the dependent variable
n = number of observations
i = number of the particular observation

In a nutshell, the goal of least-squares regression is to estimate a straight line that minimises the error between the predicted values and the actual values: the line of best fit. Doing this enables us to test a key hypothesis about the data; namely, that there is


a linear relationship between the two variables being tested. Because we have estimated the slope of the line of best fit, we can test the null hypothesis that the slope of the line (b) is equal to zero against the alternative hypothesis that the slope is not equal to zero. Therefore, H0 is b = 0 and H1 is b ≠ 0. If the slope is zero, the line is flat and there is no relationship between the two variables; if it is not zero, there is a relationship between the two variables. The general equation for a straight line is Y = a + bX. A more appropriate estimating equation includes an allowance for error (hence the inclusion of e in the equation):

Y = â + b̂X + e

WHAT WENT RIGHT? THE DANGERS OF INFERRING CAUSALITY BASED ON CORRELATIONS

Suppose a social marketing initiative had been run to try to reduce residents’ water consumption behaviour. See, for example, an Australian study in the Journal of Marketing Management.9 Based upon a multiple regression analysis, the research found that residents who participated in the initiative were more likely to exhibit intentions to conserve water in the future, but only if they felt they had a high degree of perceived behavioural control over their water consumption decisions (e.g. they had water-saving devices and didn’t have leaking pipes and taps that were costly to repair). Based on these results, one would be tempted to assume that the social marketing initiative was successful and caused residents to reduce their water consumption. In an ideal world the researchers would have constructed an experiment and randomly allocated some respondents to the treatment (or, even better, matched them to the treatment – the social marketing initiative – on salient characteristics that are likely to affect water consumption). However, it is almost impossible to randomly allocate residents to such an initiative. Therefore, if respondents can effectively ‘self-select’ and determine whether or not they want to be involved in the initiative, it might be the case that those who sign up also have a greater interest in water conservation and the environment in general. Therefore, again, researchers should beware of generalising too much about causation based on association techniques such as correlation, regression and Chi-square analysis.

The symbols â and b̂ are used when the equation is a regression estimate of the line. To compute the estimated values of a and b, we use the following formulae:

b̂ = [n(ΣXY) − (ΣX)(ΣY)] / [n(ΣX²) − (ΣX)²]

and

â = Ȳ − b̂X̄

where
b̂ = estimated slope of the line (the regression coefficient)
â = estimated intercept of the Y axis
Y = dependent variable
Ȳ = mean of the dependent variable
X = independent variable
X̄ = mean of the independent variable
n = number of observations

With simple arithmetic we can solve these equations for the data in Table 14.5 (see Table 14.6). By doing this we would be able to test the hypothesis that b = 0 (that is, that there is no relationship between the distributor’s sales to a dealer and the number of building permits). We could also test the direction of this relationship if we anticipated that the relationship would be positive or negative. However, for


simplicity we will continue to use a two-tailed test and test whether or not the slope is different from zero, rather than specifying the direction. To estimate this relationship we perform the following calculations:

b̂ = [n(ΣXY) − (ΣX)(ΣY)] / [n(ΣX²) − (ΣX)²]
  = [15(193 345) − (1 875)(1 497)] / [15(245 759) − (1 875)²]
  = (2 900 175 − 2 806 875) / (3 686 385 − 3 515 625)
  = 93 300 / 170 760
  = 0.54638

â = Ȳ − b̂X̄ = 99.8 − 0.54638(125) = 99.8 − 68.3 = 31.5

TABLE 14.6 » LEAST-SQUARES COMPUTATION

Dealer      Y        Y²         X        X²         XY
 1          77       5 929      86       7 396      6 622
 2          79       6 241      93       8 649      7 347
 3          80       6 400      95       9 025      7 600
 4          83       6 889     104      10 816      8 632
 5         101      10 201     139      19 321     14 039
 6         117      13 689     180      32 400     21 060
 7         129      16 641     165      27 225     21 285
 8         120      14 400     147      21 609     17 640
 9          97       9 409     119      14 161     11 543
10         106      11 236     132      17 424     13 992
11          99       9 801     126      15 876     12 474
12         121      14 641     156      24 336     18 876
13         103      10 609     129      16 641     13 287
14          86       7 396      96       9 216      8 256
15          99       9 801     108      11 664     10 692

ΣY = 1 497   ΣY² = 153 283   ΣX = 1 875   ΣX² = 245 759   ΣXY = 193 345
Ȳ = 99.8                     X̄ = 125

The formula Ŷ = 31.5 + 0.546X is the regression equation used for the prediction of the dependent variable. Suppose the wholesaler is considering opening a new dealership in an area where the number of building permits equals 89. Sales in this area may be forecast as:

Ŷ = 31.5 + 0.546(89) = 31.5 + 48.6 = 80.1

Thus, the distributor may expect sales of 80.1 (that is, $80 100) in this new area.10 Calculation of the correlation coefficient gives an indication of how accurate the predictions are. In this example the correlation coefficient is r = 0.9356 and the coefficient of determination is r² = 0.8754. This means the number of building permits issued by city councils explains 87.5 per cent


of the variation in sales. In summary, this is an excellent fit – the independent variable explains a large degree of variation in the dependent variable.

Drawing a regression line

To draw a regression line on the scatter diagram, only two predicted values of Y need to be plotted.

Dealer 7 (actual Y value = 129): Ŷ7 = 31.5 + 0.546(165) = 121.6
Dealer 3 (actual Y value = 80):  Ŷ3 = 31.5 + 0.546(95) = 83.4

Using the data for dealer 7 and dealer 3, we can draw a straight line connecting the points 121.6 and 83.4. Exhibit 14.11 shows the regression line.

← EXHIBIT 14.11 LEAST-SQUARES REGRESSION LINE (scatter diagram of Y against X with the fitted regression line drawn through the predicted values for dealers 3 and 7; the actual Y values for those dealers are also marked)

To determine the error (residual) of any observation, the predicted value of Y is first calculated. The predicted value is then subtracted from the actual value. For example, the actual observation for dealer 9 is 97, and the predicted value is 96.5; thus only a small error, e = 0.5, is involved for this observation:

e9 = Y9 − Ŷ9 = 97 − 96.5 = 0.5

where Ŷ9 = 31.5 + 0.546(119). Again, this shows that the estimated straight line is an excellent fit to the data, because the observed value is only slightly discrepant from the predicted value.
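
The hand calculations in this section can be checked programmatically. The Python sketch below simply applies the least-squares formulae given above to the Table 14.5 data (numpy is used only for the sums; nothing here goes beyond the formulae already shown).

import numpy as np

# Table 14.5: X = building permits issued, Y = dealer's sales volume (thousands)
X = np.array([86, 93, 95, 104, 139, 180, 165, 147, 119, 132, 126, 156, 129, 96, 108])
Y = np.array([77, 79, 80, 83, 101, 117, 129, 120, 97, 106, 99, 121, 103, 86, 99])
n = len(X)

# Least-squares estimates of the slope and intercept
b = (n * np.sum(X * Y) - np.sum(X) * np.sum(Y)) / (n * np.sum(X ** 2) - np.sum(X) ** 2)
a = Y.mean() - b * X.mean()
print(f'b = {b:.5f}, a = {a:.1f}')             # roughly b = 0.54638, a = 31.5

# Forecast for an area with 89 building permits
print(f'forecast = {a + b * 89:.1f}')          # roughly 80.1, i.e. about $80 100

# Coefficient of determination from the sums of squares
Y_hat = a + b * X
ss_total = np.sum((Y - Y.mean()) ** 2)
ss_error = np.sum((Y - Y_hat) ** 2)
print(f'r squared = {1 - ss_error / ss_total:.3f}')   # roughly 0.875

The residual for any dealer can be checked the same way, for example Y[8] - Y_hat[8] for dealer 9 (Python indexes from zero).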


Tests of statistical significance

Now that we have considered the error term, a more detailed look at explained and unexplained variation is possible. Exhibit 14.12 shows the fitted regression line. If a researcher wished to predict any dealer’s sales volume (Y) without knowing the number of building permits (X), the best prediction would be the average sales volume (Ȳ) of all dealers. Suppose, for example, that a researcher wished to predict dealer 8’s sales without knowing the value of X. The best estimate would be 99.8 (Ȳ = 99.8). Exhibit 14.12 shows that there would be a large error, because dealer 8’s actual sales were 120. Once the regression line has been fitted, this error can be reduced. With the regression equation, dealer 8’s sales are predicted to be 111.8, reducing the error from 20.2 (Yi − Ȳ = 120 − 99.8) to 8.2 (Yi − Ŷi = 120 − 111.8). Simply stated, error is reduced by using Yi − Ŷi rather than Yi − Ȳ. The reduction in the error is the deviation explained by the regression; the smaller number, 8.2, is the deviation not explained by the regression.

EXHIBIT 14.12 → SCATTER DIAGRAM OF EXPLAINED AND UNEXPLAINED VARIATION (for dealer 8’s actual sales, the total deviation Yi − Ȳ is split into the deviation explained by the regression, Ŷi − Ȳ, and the deviation not explained by the regression, Yi − Ŷi, around the fitted line Y = a + bX with slope b = ΔY/ΔX)

Thus, the total deviation, Yi − Ȳ, can be partitioned into two parts:

Total deviation for each observation (Yi − Ȳ) = Deviation explained by the regression (Ŷi − Ȳ) + Deviation unexplained by the regression (residual error) (Yi − Ŷi)

where
Ȳ = mean of the total group
Ŷi = value predicted with the regression equation
Yi = actual value

For dealer 8, the total deviation is 120 − 99.8 = 20.2, the deviation explained by the regression is 111.8 − 99.8 = 12, and the deviation unexplained by the regression is 120 − 111.8 = 8.2. If these values are summed over all values of Yi (that is, all observations) and squared, these summed deviations will


provide an estimate of the variation of Y explained by the regression and that unexplained by the regression:

Total variation (SSt) = Explained variation (SSr) + Unexplained variation (SSe)
Σ(Yi − Ȳ)² = Σ(Ŷi − Ȳ)² + Σ(Yi − Ŷi)²

We have thus partitioned the total sum of squares, SSt, into two parts: the regression sum of squares, SSr, and the error sum of squares, SSe:

SSt = SSr + SSe

An F-test, or an analysis of variance, can be applied to a regression to test the relative magnitudes of SSr and SSe with their appropriate degrees of freedom. Table 14.7 shows the technique for conducting the F-test.

TABLE 14.7 » ANALYSIS OF VARIANCE TABLE FOR BIVARIATE REGRESSION

Source of variation                     Sum of squares       d.f.    Mean square (variance)
Explained by regression                 SSr = Σ(Ŷi − Ȳ)²     k − 1   SSr / (k − 1)
Unexplained by regression (error)       SSe = Σ(Yi − Ŷi)²    n − k   SSe / (n − k)

where k = number of estimated constants (variables)
      n = number of observations

For the example on sales forecasting, the analysis of variance summary table, which shows the relative magnitudes of the mean squares, is presented in Table 14.8. From Table A.6 in Appendix A we find that the calculated F-value of 91.3, with 1 degree of freedom in the numerator and 13 degrees of freedom in the denominator, exceeds the critical value at the 0.01 probability level.

TABLE 14.8 » ANALYSIS OF VARIANCE SUMMARY TABLE FOR REGRESSION OF SALES ON BUILDING PERMITS

Source of variation                     Sum of squares    d.f.    Mean square
Explained by regression                 3 389.49            1     3 389.49
Unexplained by regression (error)         483.91           13        37.22
Total                                   3 873.40           14

The coefficient of determination, r², reflects the proportion of variance explained by the regression line. The formula for calculating r² is:

r² = SSr / SSt = 1 − SSe / SSt

For our example, r² is 0.875:

r² = 3 389.49 / 3 873.40 = 0.875

F-test (regression) A procedure to determine whether more variability is explained by the regression or unexplained by the regression.


The coefficient of determination may be interpreted to mean that 87.5 per cent of the variation in sales is explained by the association with building permits. Recall also that a key objective of simple regression analysis was to test whether or not the slope of the line is flat. To do this we start by specifying the hypotheses to be tested:

H0: b = 0
H1: b ≠ 0

To test the hypothesis we can use a procedure similar to the t-test procedure used in Chapter 12. The test statistic is:

t = (b̂ − b) / Sb̂

In this equation b̂ is the sample coefficient obtained from our regression (b̂ = 0.546) and we compare this with the slope that would hold if the null hypothesis were true (that is, b = 0). So in this case we are comparing the sample coefficient with a population coefficient of zero. Sb̂ is the standard error of the sample coefficient, which in this example is 0.057.11 Therefore:

t = 0.546 / 0.057 = 9.58

From Table A.3 in Appendix A the critical t-value for n − 2 (d.f. = 15 − 2 = 13) degrees of freedom and a 5 per cent level of significance is 2.160. The calculated t-value of 9.58 is higher than the critical t-value of 2.160, so there is evidence at the 5 per cent level of significance that the slope coefficient is different from zero. This means we can reject the null hypothesis: there is evidence to suggest that the number of building permits shares a linear relationship with a dealer’s sales volume. Practically speaking, this means the construction distributor could gauge sales potential within a city based upon the level of building permits issued and changes in this variable.

SPSS

Running a regression in SPSS

Recall that the only difference in application between correlation and regression is that regression allows us to estimate and draw a straight line through the data, which enables us to make predictions about the dependent variable based on values of the independent variable. Taking the earlier example about health communities, suppose you wish to estimate a line of best fit to predict website satisfaction based on levels of emotional support. To do this we need to perform a regression to estimate the slope and intercept of the straight line – the line of best fit (see Exhibit 14.16). Start by opening up the dataset ‘health communities.sav’. Again, the relevant hypotheses are H0: b = 0 and H1: b ≠ 0. Perform the following click-through procedure: Analyse > Regression > Linear, as in Exhibit 14.13. Select ‘satisfaction’ as the Dependent Variable and ‘emotional’ as the Independent Variable. Ensure that the Method is set to ‘Enter’. Click OK and you should see some output appear. The regression generates the output in Exhibit 14.14. We can visually examine this by drawing the line of best fit in a scattergraph. Again create the scattergraph between ‘satisfaction’ and ‘emotional’.
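
For readers working in Python rather than SPSS, scipy.stats.linregress produces the equivalent bivariate output – slope, intercept, r, the two-tailed p-value for the slope and its standard error. The sketch below reuses the hypothetical CSV export described earlier (file and column names are assumptions for illustration):

import pandas as pd
from scipy import stats

df = pd.read_csv('health_communities.csv')      # hypothetical export of 'health communities.sav'

result = stats.linregress(df['emotional'], df['satisfaction'])
print(f'slope b = {result.slope:.3f}')          # compare with the SPSS coefficient of 0.937
print(f'intercept a = {result.intercept:.3f}')  # compare with the SPSS constant of 2.976
print(f'r squared = {result.rvalue ** 2:.3f}')  # compare with the SPSS R square of 0.299
print(f'p-value for the slope = {result.pvalue:.4f}')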


← EXHIBIT 14.13 DOING A REGRESSION IN SPSS

← EXHIBIT 14.14 REGRESSION OUTPUT FOR WEBSITE SATISFACTION AND EMOTIONAL SUPPORT

Model Summary
Model   R       R square   Adjusted R square   Std. error of the estimate
1       .547a   .299       .296                1.55470
a. Predictors: (Constant), emotional

ANOVAa
Model 1        Sum of squares   d.f.   Mean square   F        Sig.
Regression     226.579            1    226.579       93.740   .000b
Residual       531.760          220      2.417
Total          758.339          221
a. Dependent variable: satisfaction
b. Predictors: (Constant), emotional

Coefficientsa
                Unstandardised coefficients      Standardised coefficients
Model 1         B        Std. error              Beta                        t       Sig.
(Constant)      2.976    .548                                                5.434   .000
emotional        .937    .097                    .547                        9.682   .000
a. Dependent variable: satisfaction


With the scattergraph that is produced, edit the chart by double-clicking on it. Click on the box in the chart editor menu that allows you to ‘Add fit line at total’, as in Exhibit 14.15.

EXHIBIT 14.15 → CREATING A SCATTERGRAPH AND LINE OF BEST FIT IN SPSS

In Exhibit 14.16 we can see that the estimated line of best fit slopes upward, suggesting a positive relationship (which is what our correlation analysis told us earlier). We can also see from the scattergraph that the line seems to fit the data reasonably well.

For a more rigorous analysis of the estimated line, we now refer back to the regression output in Exhibit 14.14. The first piece of output tells us how well the model fits the data. This is important and is where we find summary statistics such as r². The r² value is 0.299. This is the coefficient of determination, and it tells us that emotional support explains 29.9 per cent of the variation in website satisfaction. That is, almost one-third of virtual community website satisfaction can be accounted for by the emotional support provided by the website. Notice also the value of R, which is 0.547 and is the same as the absolute value of the correlation coefficient in Exhibit 14.5 (R will be the same as the correlation coefficient for a bivariate regression, but not for a multivariate regression – see Chapter 15 for further discussion of multiple regression). This is an important point. As mentioned above, bivariate regression might be classified as correlation’s ‘bigger brother’. Conceptually and mathematically, the rationales for the two techniques share significant similarities – bivariate regression just goes a bit further by allowing us to estimate a straight line. We will see in Chapter 15 how the results differ when we extend bivariate regression to analyse the impact of multiple independent variables.

Before examining the coefficients, we first need to examine the ANOVA table in Exhibit 14.14 to further assess model fit. We determined, through examining the r² value, that the model provides a relatively good fit to the data. However, as with the other tests conducted so far, we still do not know whether or not these results have occurred by chance. To assess this we refer to the significance value in the ANOVA output. The significance value of 0.000, which is well below the 0.05 cut-off, tells us we can be highly confident that the results observed so far have not occurred as a result of sampling error. With this information, we can now examine the regression coefficients to estimate our line of best fit.


← EXHIBIT 14.16 LINE OF BEST FIT IN SPSS

In the output the ‘constant’ is the intercept, so a = 2.98. Examining the regression coefficient for emotional support, we can see that b = 0.937. There is also another value called a ‘standardised coefficient’. This is the equivalent of the correlation coefficient in Exhibit 14.5 and is simply the standardised regression coefficient (again, this equivalence holds only for bivariate regression; it will be discussed further in Chapter 15 when we look at the effect of multiple independent variables on the dependent variable). Before we draw the line, we must also examine the significance level of the regression coefficient. It is 0.000, so we can be highly confident that this regression coefficient has not occurred as a result of chance, and that there is evidence of a linear relationship between these two variables. Note that this is the same outcome as interpreting the r² value and the ANOVA above. The individual coefficients will differ for a multiple regression, so this process will provide differing results when interpreting a multiple regression. With these results we can now estimate Y = a + bX:

Y = 2.98 + 0.937X

Therefore, for every one-unit increase in emotional support (that is, X), website satisfaction will increase by 0.937 units. Then, using this equation, what will website satisfaction be if emotional support is, say, 6?

Y = 2.98 + 0.937 × 6 = 8.6


EXCEL

Running a regression in Excel

To run the same regression in Excel, open up the data file ‘health communities reg.xlsx’. To produce the line of best fit, create a scattergraph in Excel by selecting the data array and headings, then selecting Insert > Scatter. Then, once the scattergraph has been produced, under the Design tab make the selection shown in Exhibit 14.17.

EXHIBIT 14.17 → CREATING A SCATTERGRAPH AND LINE OF BEST FIT IN EXCEL

We can now estimate the slope of this line by running the regression.12 To do this, perform the following click-through procedure: Data > Data Analysis. Select ‘Regression’ and click OK. Define the Input Y Range and the Input X Range (including the variable headings if you want these to appear in the accompanying output; if so, also select the Labels checkbox). Select the output to appear in a new worksheet, which is the default (or select another location as appropriate), click OK and you should see the output below. The output can be interpreted in the same way as the SPSS output. We can see that the r² is 0.299, and the model’s fit is statistically significant because the significance of the F-value is below 0.05. The coefficient for ‘emotional’ is 0.936 and it, too, is statistically significant, based on a p-value below 0.05.

Therefore, summarising the results from SPSS and Excel, we can see that the Sig. value/p-value is less than 0.05. This suggests there is evidence to reject the null hypothesis, and that there is a linear relationship between website satisfaction and emotional support. When reporting these results we should be careful to report the regression coefficient, the intercept and the Sig. values/p-values of the coefficients. We should also report the r² value in order to assess the model’s explanatory power, and report what the model means in practical terms. Here the results suggest there is a positive relationship between satisfaction and emotional support, and that emotional support explains 29.9 per cent of the variation in website satisfaction. The conclusions drawn here are essentially the same as for the correlation analysis earlier, which employed the same variables. However, we can also interpret the regression equation by pointing out that a one-unit increase in emotional support will increase website satisfaction by 0.936 units.


← EXHIBIT 14.18 DOING A REGRESSION IN EXCEL

← EXHIBIT 14.19 REGRESSION OUTPUT IN EXCEL

REAL WORLD SNAPSHOT

Size and weight Most OECD countries seem obsessed with weight control. Being thin seems to stay in and the fight to get thin is a multibillion dollar business. Recall that in an earlier ‘Real world snapshot’ box we discussed correlations between factors related to attractiveness. What if the following hypothesis was tested? H1: Perceptions that a female model is overweight are related negatively to perceptions of attractiveness.


Using the scales from the earlier box, this can be tested with a simple regression. The results can be summarised as shown here:

ANOVA
Model 1        Sum of squares   d.f.   Mean square   F       Sig.
Regression       9.228            1    9.227         4.914   0.03
Residual       112.66            60    1.877
Total          121.887           61

Coefficients
                Unstandardised coefficients      Standardised coefficients
Model 1         B         Std. error             Beta                         t        Sig.
(Constant)      4.413     0.952                                               4.636    0.00002
Weight         –0.582     0.262                  –0.275                      –2.216    0.03

The results support the hypothesis. The coefficient (β = –0.275) is both in the expected direction (negative) and significant (p < 0.05). Therefore, if respondents perceived someone as ‘too big’, they likewise saw the person as less attractive.

TIPS OF THE TRADE

»» A correlation coefficient is the most basic way of depicting the extent of a bivariate relationship between two continuous (better than ordinal) variables.
»» A correlation matrix is a good way of quickly summarising the individual bivariate relationships that exist between a set of variables.
»» Simple regression also depicts a bivariate relationship, but distinguishes an independent variable from a dependent variable.
»» Regression analysis can at best establish evidence only of concomitant variation between the two variables.
»» Regression coefficients can be either the actual slope coefficient, scaled consistently with the measures used, or standardised.
»» Raw slope coefficients (b1) are most appropriate when:
• All variables use the same numeric scale, particularly when the dependent variable is a monetary unit and the independent variable is the same monetary unit, as would be the case when predicting the selling price of a product at auction from the initial bid price ($/$).
• The researcher is interested in prediction only, meaning obtaining values of Y.
»» Standardised regression coefficients (β1) are most appropriate when:
• The researcher needs to compare coefficients with each other.
• The researcher’s emphasis is on explanation as opposed to prediction.
»» When a regression returns a standardised β coefficient less than −1.0 or greater than +1.0, the researcher should examine the data for a problem.
»» When testing a hypothesis using regression analysis:
• Check the statistical significance of the t-test for a parameter estimate.
• Check the sign of the parameter estimate.
»» Be careful not to infer causality when interpreting a correlation or regression coefficient, unless backed up by strong theoretical arguments.


CROSS-TABULATIONS: THE CHI-SQUARE TEST FOR GOODNESS OF FIT


ONGOING PROJECT

So far, we have explained how to test for association between interval or ratio variables. As mentioned towards the start of the chapter, there are many situations in which we wish to test whether or not two nominal variables are associated. Because the data do not satisfy the normal assumptions behind methods such as correlation and regression, researchers commonly use a nonparametric statistical test to calculate the association between two nominal variables. This test is usually coupled with a very practical and often-used table known as a cross-tabulation. A cross-tabulation, or contingency table, is a joint frequency distribution of observations on two or more sets of variables. The Chi-square distribution provides a means for testing the statistical significance of cross-tabulations. It allows us to test for differences between two groups’ distributions across categories. If differences exist then there is an association between the two nominal variables. As mentioned in Chapter 12, the Chi-square test involves comparing the observed frequencies (Oi ) with the expected frequencies (Ei ). It tests the goodness of fit of the observed distribution with the expected distribution. Table 14.9 shows sales staff satisfaction with their job. It suggests that the majority of sales staff

Chi-square test for independence A test that statistically analyses significance in a joint frequency distribution.

(80 per cent) are satisfied with their job. However, if we analyse the data by subgroups based on how fairly compensated sales staff feel they are, then we can see that cross-tabulating against another variable can help us to understand why staff are answering the question in a certain way. Inspection of Table 14.10 suggests that agreement about job satisfaction is more likely if an individual feels fairly compensated and less likely if an individual does not feel fairly compensated. Thus, from our simple analysis there appears to be an association between job satisfaction and how fairly compensated an individual feels in that role.

TABLE 14.9 » ONE-WAY FREQUENCY TABLE FOR SALES STAFF JOB SATISFACTION

How satisfied are you with your job?   Frequency (%)
Very satisfied                         100 (50%)
Somewhat satisfied                      60 (30%)
Not satisfied                           40 (20%)
Total                                  200

TABLE 14.10 » CONTINGENCY TABLE (CROSS-TABULATION) FOR SATISFACTION WITH JOB AND PERCEIVED FAIRNESS OF COMPENSATION

                                           I feel I am fairly compensated
How satisfied are you with your job?       Agree        Disagree     Total
Very satisfied                             93 (62%)      7 (14%)     100
Somewhat satisfied                         48 (32%)     12 (24%)      60
Not satisfied                              10 (7%)      30 (61%)      40
Total                                     151           49           200

contingency table Chi-square


So far we have not discussed the notion of statistical significance. Is the observed association between job satisfaction and perceived fairness of compensation the result of chance variation due to random sampling? The Chi-square test allows us to conduct tests for significance in the analysis of the R × C contingency table (where R = row and C = column). The formula for the Chi-square statistic is the same as that for one-way frequency tables (see Chapter 12):

χ² = Σ (Oi − Ei)² / Ei

where
χ² = Chi-square statistic
Oi = observed frequency in the ith cell
Ei = expected frequency in the ith cell

Again, as in the univariate Chi-square test, a frequency count of data that nominally identify or categorically rank groups is acceptable for the Chi-square test for a contingency table. Both variables in the contingency table will be categorical variables rather than interval- or ratio-scaled continuous variables.

We begin, as in all hypothesis-testing procedures, by formulating the null hypothesis and selecting the level of statistical significance for the particular problem. In managerial terms, the researchers ask whether employees with different levels of job satisfaction have different levels of agreement about the fairness of their compensation. Translated into a statistical question, the problem is: ‘Is job satisfaction independent of perceived fairness of compensation in the role?’ Therefore, the null hypothesis is that there is no association between the two variables under examination (that is, that they are independent of each other), and the alternative hypothesis is that the two variables are associated (that is, not independent of each other). For the example here, which tests whether there is an association between job satisfaction and perceived fairness of compensation, the null hypothesis is that there is no association between job satisfaction and perceived fairness of compensation, and the alternative hypothesis is that there is such an association. As usual we would specify a 5 per cent level of statistical significance.

Table 14.10 is a 3 × 2 (R × C) contingency table because it has three rows and two columns. To compute the Chi-square value for the 3 × 2 contingency table (Table 14.10), the researcher must first identify an expected distribution for that table. Under the null hypothesis, perceived fairness of compensation should be equal, as a proportion, across the different levels of job satisfaction. (This is one reason why we calculate column percentages – to be able to quickly ‘eyeball’ the data.) There is an easy way to calculate the expected frequencies for the cells in a cross-tabulation. To compute an expected number for each cell, we need to first think about what our null hypothesis is. For a Chi-square test for association, our null hypothesis is that job satisfaction and perceived fairness of compensation are not associated, or are independent. The probability of two independent events can be defined by:

P(A ∩ B) = P(A) × P(B)

The symbol ∩ simply refers to the word ‘and’; in other words, P(A ∩ B) means the probability of A and B. This means, in the above example, the probability of being very satisfied (in this sample) and agreeing that you are fairly compensated (in this sample) is:

P(A ∩ B) = P(very satisfied ∩ agree) = (100/200) × (151/200) = 0.3775


To calculate the expected value in that cell we simply multiply the calculated probability (0.3775) by the sample size (200). Thus, the expected value is 75.5. We follow the same procedure for each of the other cells. More simply, this formula can be expressed as:

Eij = (Ri × Cj) / n

where
Ri = total observed frequency in the ith row
Cj = total observed frequency in the jth column
n = sample size

A calculation of the expected values does not utilise the actual observed numbers of respondents in each individual cell; only the total column and total row values are used in this calculation. The expected cell frequencies are calculated as shown in Table 14.11.

TABLE 14.11 » CALCULATION OF OBSERVED VERSUS EXPECTED FREQUENCIES

                                           I feel I am fairly compensated
How satisfied are you with your job?       Agree          Disagree      Total
Very satisfied                             93 (75.5)       7 (24.5)     100
Somewhat satisfied                         48 (45.3)      12 (14.7)      60
Not satisfied                              10 (30.2)      30 (9.8)       40
Total                                     151             49            200

Note: Expected frequencies are in parentheses. They were calculated as follows:
E(Very satisfied, Agree) = (151 × 100)/200 = 75.5        E(Very satisfied, Disagree) = (49 × 100)/200 = 24.5
E(Somewhat satisfied, Agree) = (151 × 60)/200 = 45.3     E(Somewhat satisfied, Disagree) = (49 × 60)/200 = 14.7
E(Not satisfied, Agree) = (151 × 40)/200 = 30.2          E(Not satisfied, Disagree) = (49 × 40)/200 = 9.8

To compute the Chi-square statistic, use the same formula as before, but calculate the degrees of freedom as the number of rows minus one (R − 1) times the number of columns minus one (C − 1):

χ² = Σ (Oi − Ei)² / Ei, with (R − 1)(C − 1) degrees of freedom

Table 14.11 shows the observed versus the expected frequencies for the job satisfaction question. Using the data in Table 14.11, we calculate the Chi-square statistic as follows:

χ² = (93 − 75.5)²/75.5 + (7 − 24.5)²/24.5 + (48 − 45.3)²/45.3 + (12 − 14.7)²/14.7 + (10 − 30.2)²/30.2 + (30 − 9.8)²/9.8
   = 4.06 + 12.5 + 0.16 + 0.50 + 13.51 + 41.63
   = 72.36

The number of degrees of freedom equals 2:

(R − 1)(C − 1) = (3 − 1)(2 − 1) = 2

From Table A.4 in Appendix A, we see that the critical χ² value at the 0.05 probability level with 2 d.f. is 5.99. Thus, the null hypothesis is rejected – there is evidence to suggest that job satisfaction and perceived fairness of compensation are associated. This means sales staff are more likely to be


satisfied in their roles if they perceive they are being fairly compensated. In a report it is important to illustrate the nature of such relationships with a relevant cross-tabulation that is clearly labelled. Remember to calculate column percentages to make interpretation of the table simpler (that is, if the independent variable is in the columns and the dependent variable is in the rows, which is the convention). After presenting the cross-tabulation it would also be useful to include the relevant Chisquare statistic and an indication as to whether or not it is statistically significant (though this may not be emphasised in a managerial report). Proper use of the Chi-square test requires that each expected cell frequency (Eij ) have a value of at least 5. If this sample size requirement is not met, the researcher should take a larger sample or combine (collapse) response categories.
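
If you want to check this result programmatically, scipy.stats.chi2_contingency reproduces the whole procedure – expected frequencies, the χ² statistic, the degrees of freedom and the p-value – from the observed counts in Table 14.10. This is only a verification sketch; correction=False is passed so that no continuity correction is applied, matching the hand calculation above.

import numpy as np
from scipy.stats import chi2_contingency

# Observed frequencies from Table 14.10
# rows: very satisfied, somewhat satisfied, not satisfied; columns: agree, disagree
observed = np.array([[93,  7],
                     [48, 12],
                     [10, 30]])

chi2, p, dof, expected = chi2_contingency(observed, correction=False)
print(f'chi-square = {chi2:.2f}, d.f. = {dof}, p = {p:.2e}')   # roughly 72.36 with 2 d.f.
print(expected)   # matches the expected frequencies shown in Table 14.11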

SPSS

Cross-tabulation and Chi-square tests in SPSS Suppose you are interested in who is more likely to drink Coca-Cola and who is more likely to drink Pepsi. You might have a hunch that Coca-Cola drinkers are likely to be older than Pepsi drinkers. Cola preference is of course a nominal (nonmetric) variable and if the age variable is a grouped variable (that is, under 40 and over 40) then that, too, would be a nominal (nonmetric) variable. If age was measured as a ratio variable, then we may have to consider some other analysis technique. Therefore, we believe age group and soft drink preference are associated. We could examine this quite easily in a cross-tabulation generated from SPSS, and shown in Exhibit 14.20.

EXHIBIT 14.20 → DOING A CROSS-TABULATION IN SPSS

To construct this cross-tabulation in SPSS, open up the datafile ‘cola preference.sav’. Perform the following click-through procedure: Analyse > Descriptive Statistics > Crosstabs. As in Exhibit 14.20, insert ‘branpref’ in the rows and ‘agegroup’ in the columns. Click on Cells and select Column Percentages, because a different number of people were surveyed in each age group. This enables us to compare ‘apples with apples’ and ‘oranges with oranges’ when analysing the cross-tabulation: any differences in brand preference can be attributed to the age variable more easily if we compare them as percentages within the columns. Click Continue and then click OK and the output should appear.


← EXHIBIT 14.21 A CROSS-TABULATION OF SOFT DRINK PREFERENCE BY AGE GROUP FROM SPSS

BRANPREF = BRAND PREFERENCE * AGEGROUP = AGE GROUP CROSS-TABULATION

                                                        Agegroup = Age group
                                                        >40         <40         Total
branpref =         Pepsi        Count                    28          45          73
brand preference                % within age group       30.4%       45.9%       38.4%
                   Coca-Cola    Count                    64          53         117
                                % within age group       69.6%       54.1%       61.6%
Total                           Count                    92          98         190
                                % within age group      100.0%      100.0%      100.0%

The results shown in Exhibit 14.21 tell us a number of things. First, respondents in this sample tend to prefer Coca-Cola to Pepsi overall, because the percentage preferring Coca-Cola is higher in each age group. However, we can also see from the cross-tabulation that respondents under 40 are more likely to prefer Pepsi than respondents over 40. Why? Because the percentage who prefer Pepsi (Coca-Cola) is lower (higher) for the older group of consumers. Therefore, the results of the cross-tabulation support our prediction: younger consumers are more likely than older consumers to prefer Pepsi, and vice versa. However, we only sampled 190 people. Could these results have occurred as a result of sampling error? Possibly, so to determine the likelihood of sampling error we can run a Chi-square test. To run a Chi-square test you need to go back to the Crosstabs dialogue box – there is a separate option within the dialogue box that allows you to do this. Therefore, perform the following click-through procedure again: Analyse > Descriptive Statistics > Crosstabs.

← EXHIBIT 14.22 DOING A CROSS-TABULATION AND CHI-SQUARE TEST IN SPSS

This time, before running the Crosstab, select the Statistics box and select Chi-square. Click Continue and then click OK. From the Chi-square test output (Exhibit 14.23) we will use the top row, ‘Pearson chi-square’. For 2 × 2 cross-tabulations we can, in fact, use one of the other statistics displayed, but for the purposes of simplicity we will stick to what we have done so far. The calculated Chi-square value is 4.808 and the significance level of this statistic is 0.028. This significance level is well below the 0.05 cut-off, providing evidence that the results have not occurred as a result of sampling error.


EXHIBIT 14.23 → A CHI-SQUARE TEST FOR SOFT DRINK PREFERENCE BY AGE GROUP

CHI-SQUARE TESTS
                                  Value     d.f.   Asymp. sig.    Exact sig.    Exact sig.
                                                   (2-sided)      (2-sided)     (1-sided)
Pearson chi-square                4.808a     1     0.028
Continuity correctionb            4.176      1     0.041
Likelihood ratio                  4.842      1     0.028
Fisher's exact test                                               0.037         0.020
Linear-by-linear association      4.783      1     0.029
N of valid cases                  190
a. 0 cells (0.0%) have expected count less than 5. The minimum expected count is 35.35.
b. Computed only for a 2 × 2 table.

EXCEL

Cross-tabulation and Chi-square tests in Excel

To perform the same test in Excel, open up ‘cola preference.xlsx’. Excel will not automatically construct a cross-tabulation for us as SPSS does. Instead, we have to create the crosstab using Excel’s ‘PivotTable’ function. Notice how the data are arranged. As well as having ‘branpref’ and ‘agegroup’ as in SPSS, there is also a separate variable called ‘ID’. This is the respondent’s ID and it needs to be included to enable us to count the joint frequencies of the two variables.

First, select ‘PivotTable’ from the Insert menu bar in Excel. Select all the data, from A1:C191 (including the column headings), in the Table/Range box, as shown in Exhibit 14.24. This tells Excel where to obtain the data. You can either have the PivotTable placed in the existing worksheet or created in a new worksheet. For this example, we have selected the PivotTable to appear in the existing worksheet – select a cell where you want it to appear. Click OK. The format for the PivotTable is drawn up with the PivotTable Fields alongside. Drag ‘branpref’ from the PivotTable Fields into the Rows and drag ‘agegroup’ from the PivotTable Fields into the Columns, as in Exhibit 14.25. This begins to create the format of the crosstab based on the number of categories for each variable (that is, in this case a 2 × 2 crosstab). Select ID by ticking it, or just drag ID into the Values box.

Initially the table displays very large numbers – it has summed all the IDs for each of the joint frequencies! Of course, this is meaningless to us, so we need the table to count them rather than sum them. To do this, right-click on the top left-hand corner of the table where it says Sum of ID and ask Excel to Summarise Values By > Count, as in Exhibit 14.26. You should then see the joint frequencies for each of the variables. Remember, it is often more informative to view our frequencies as column percentages. We can do this too: right-click on Count of ID and select Show Values As, then select % of Column Total, as in Exhibit 14.27. Now the joint frequencies will be displayed as column percentages. The results are, of course, the same as they are in SPSS. But we still have not examined the statistical significance of these data with a Chi-square test. Again, Excel does not automatically do this for us, but we have our observed frequencies from the PivotTable, so we just need to calculate the expected frequencies. Open up ‘cola preference – Chi-square.xlsx’. These are the same data and spreadsheet as before, but we have left in the PivotTable as we need the observed values to calculate the Chi-square statistic.
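
The same pivot-table logic is available in Python through pandas.crosstab, which builds the joint frequency table directly from respondent-level data and can report column percentages. The sketch assumes a hypothetical CSV export of the cola data with branpref and agegroup columns (names assumed for illustration):

import pandas as pd

df = pd.read_csv('cola_preference.csv')   # hypothetical export of 'cola preference.xlsx'

# Joint frequency counts (rows = brand preference, columns = age group)
counts = pd.crosstab(df['branpref'], df['agegroup'])
print(counts)

# Column percentages, analogous to '% of Column Total' in the pivot table
print(pd.crosstab(df['branpref'], df['agegroup'], normalize='columns') * 100)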


← EXHIBIT 14.24 DOING A CROSS-TABULATION AND CHI-SQUARE TEST IN EXCEL (USING THE PIVOTTABLE FUNCTION IN EXCEL)

← EXHIBIT 14.25 DOING A CROSS-TABULATION AND CHI-SQUARE TEST IN EXCEL (ASSIGNING VARIABLES TO THE ROWS AND COLUMNS)


EXHIBIT 14.26 → DOING A CROSS-TABULATION AND CHI-SQUARE TEST IN EXCEL (CREATING THE OBSERVED VALUES FOR EACH CELL)

In order to calculate the Chi-square statistic, we need the observed frequencies (which we have from the PivotTable) and the expected frequencies. We don't have the expected frequencies, but they are easy to calculate. Recall the discussion from earlier in the chapter about how to calculate expected frequencies (see the section 'Cross tabulations: The Chi-square test for goodness of fit'). Remember that to calculate the expected values based on the observed information, we can use the following formula:

E_ij = (R_i × C_j) / n

where
E_ij = expected frequency in the ith row and jth column
R_i = total observed frequency in the ith row
C_j = total observed frequency in the jth column
n = sample size

This is simple enough to do in Excel, so we can set up an expected values table to calculate the expected values. (Hint: You can check if you have done it correctly because the column and row totals should be the same as in the observed table.) See Exhibit 14.28 to see how to calculate the expected values for a cross-tabulation in Excel.

To conduct the Chi-square test, we simply create a formula as in Exhibit 14.28, which tells Excel to return the significance value by comparing the observed values with the expected values – that is, =CHITEST(F4:G5,G11:H12). This returns a significance level of 0.0283, which is the same as what we found in SPSS. In other words, this significance level is well below the 0.05 cut-off, providing evidence that the results have not occurred as a result of sampling error.
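The same calculation can be checked outside Excel. The sketch below (again assuming Python with pandas and SciPy, which the chapter does not use) rebuilds the observed table and lets scipy.stats.chi2_contingency return the expected frequencies implied by the R_i × C_j / n formula, along with the significance value. Setting correction=False matches CHITEST, which applies no continuity correction.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Rebuild the observed 2 x 2 table from the same workbook
# (column names follow the text; reading with pandas is an assumption).
df = pd.read_excel('cola preference.xlsx')
observed = pd.crosstab(df['brandpref'], df['agegroup'])

# correction=False mirrors Excel's CHITEST (no Yates' correction)
chi2, p_value, dof, expected = chi2_contingency(observed, correction=False)

print('Expected frequencies (row total x column total / n):')
print(expected)
print(f'Chi-square = {chi2:.3f}, d.f. = {dof}, Sig. = {p_value:.4f}')
```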


EXHIBIT 14.27 DOING A CROSS-TABULATION AND CHI-SQUARE TEST IN EXCEL (CREATING COLUMN PERCENTAGES)

EXHIBIT 14.28 CELL REFERENCES FOR A CROSS-TABULATION AND CHI-SQUARE TEST IN EXCEL


EXPLORING RESEARCH ETHICS

SOME WISE WORDS ABOUT PREDICTION AND FORECASTING

What we covered in this chapter forms the basis for prediction and forecasting – that is, using historical data to predict and forecast the future. However, be careful how you apply this newfound knowledge, because using historical data to predict future events has its issues!
»» 'Prediction is very difficult, especially if it's about the future.' (Niels Bohr)
»» 'An economist is an expert who will know tomorrow why the things he predicted yesterday didn't happen today.' (Evan Esar)
»» 'I always avoid prophesying beforehand because it is much better to prophesy after the event has already taken place.' (Winston Churchill)
»» 'Prophecy is a good line of business but it is full of risks.' (Mark Twain in Following the Equator)
»» 'It is far better to foresee even without certainty than not to foresee at all.' (Henri Poincaré in The Foundations of Science)

STATISTICAL AND PRACTICAL SIGNIFICANCE FOR TESTS OF ASSOCIATION

In Chapter 12, we discussed differences between practical and statistical significance for tests of differences. We need to make the same distinction here for tests of association. For example, earlier we saw that a correlation coefficient could range from +1 to −1. The closer the coefficient is to ±1, the stronger the correlation or association. If we get a large coefficient such as +0.9, recall that the coefficient of determination is 0.9² = 0.81. That is, 81 per cent of the variation in one variable is explained by variation in the other variable. Intuitively, this means the two variables are strongly associated and therefore practically significant. Suppose also that the significance level is below 0.05. Then not only is the correlation coefficient large, but it is also unlikely to have occurred by chance.

However, in another situation we may get a correlation of, say, +0.1. Suppose this coefficient is also statistically significant, with a significance value less than 0.05. Here the coefficient of determination would be 0.1² = 0.01, or 1 per cent. Therefore, even though the coefficient is statistically significant and has not occurred by chance, the two variables do not appear to be meaningfully associated. Thus, while there may be high statistical significance, practically the result may not be very meaningful. The same applies to regression: no matter how highly statistically significant the coefficient of the independent variable is, it matters little if the coefficient is small and only affects the dependent variable marginally.

We can also discuss practical significance in the case of a Chi-square test and cross-tabulation. Referring to Table 14.10, recall that once percentages had been calculated for ease of comparison, 62 per cent of those who agreed to being fairly compensated were very satisfied with their job, and 14 per cent of those who disagreed they were fairly compensated were very satisfied with their job. Intuitively, this suggests an association between job satisfaction and perceived fairness of compensation. The Chi-square test was also highly statistically significant, suggesting that this relationship did not occur by chance. However, suppose the percentages were 54 per cent and 58 per cent respectively, and the Chi-square test revealed a statistically significant outcome. Based on this information, would you then conclude that a lower proportion of employees were satisfied with their jobs if they perceived their compensation was unfair? You might, but remember in this case there is only a 4 percentage point difference. This may not be a practically significant difference.
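A quick simulation can make the point about sample size concrete. The sketch below is illustrative only (simulated data; Python with NumPy and SciPy assumed): two variables engineered to correlate at about +0.1 come out highly statistically significant once n is large, even though the coefficient of determination shows they share only about 1 per cent of their variation.

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(42)

# Two variables that share only a whisper of common variation (r close to 0.1)
n = 10_000
x = rng.normal(size=n)
y = 0.1 * x + rng.normal(size=n)

r, p_value = pearsonr(x, y)
print(f'r = {r:.3f}, r-squared = {r**2:.3f}, Sig. = {p_value:.4f}')
# With n this large the Sig. value falls far below 0.05, yet r-squared shows
# the association explains only about 1 per cent of the variation.
```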


In conclusion, when interpreting statistical significance, do not simply rely on the statistical test to tell you whether or not there is an association between two variables. Use a healthy dose of common sense and look also at what the data are telling you to see if it is meaningful.

SUMMARY


GIVE EXAMPLES OF THE TYPES OF MARKETING QUESTIONS THAT MAY BE ANSWERED BY ANALYSING THE ASSOCIATION BETWEEN TWO VARIABLES, AND BE ABLE TO LIST AND APPLY THE VARIOUS TECHNIQUES TO DO SO

In many situations, researchers wish to test if two variables are interrelated, or associated. For example, marketers have often had an interest in how consumers' attitudes are associated with purchase intentions, or how different marketing variables (such as advertising and promotional spending) are associated with sales and other measures of performance. Many bivariate statistical techniques can be used to measure and understand such associations. Association can be assessed using a variety of techniques including covariance, Pearson's correlation coefficient, Spearman's rank-order correlation coefficient and the Chi-square test for independence. To ascertain which measure of association is most appropriate, researchers need to determine each variable's scale of measurement. For variables measured on an interval or ratio scale, Pearson's correlation coefficient is appropriate, and for variables measured on an ordinal scale, Spearman's rank-order correlation coefficient is appropriate. If the researcher has nominal scaled variables, then cross-tabulation and Chi-square tests should be used.

EXPLAIN THE CONCEPT OF BIVARIATE LINEAR REGRESSION AND BE ABLE TO INTERPRET THE OUTPUT FROM A BIVARIATE LINEAR REGRESSION

Bivariate linear regression investigates a straight-line relationship between one dependent variable and one independent variable. It is similar to correlation analysis with one important extension: because we estimate the line of best fit through the data, we can use bivariate regression analysis to predict values of a dependent variable based on values of an independent variable. The regression can be done intuitively by plotting a scatter diagram of the X and Y points and drawing a line that most closely fits the observed relationship. The least-squares method mathematically determines the best-fitting regression line for the observed data. The line determined by this method may be used to forecast values of the dependent variable, given a value for the independent variable. The goodness of the line's fit may be evaluated with a variant of the ANOVA (analysis of variance) technique or by calculating the coefficient of determination.

CALCULATE A SIMPLE CORRELATION COEFFICIENT AND A COEFFICIENT OF DETERMINATION, AND INTERPRET A CORRELATION MATRIX

The correlation coefficient, r, measures the strength and direction of the relationship between two variables and can be calculated using the formula on page 475. The coefficient of determination, r², measures the amount of variation in one variable explained by the variation in the other variable. It is used to assess the explanatory power of a variable. The results of a correlation computation often are presented in a correlation matrix. This is analogous to a between-city distance table, except that the research variables are substituted for cities and a coefficient of correlation is substituted for the number of kilometres. The main diagonal consists of correlations of 1.00, and beneath the main diagonal are the correlation coefficients between each pair of variables, which will be less than 1.00. Below the correlation coefficients are the Sig. values and the sample sizes for each correlation. The interpretation of a correlation matrix is the same for Pearson's correlation coefficient as it is for Spearman's rank-order coefficient.

BE ABLE TO DESIGN AND INTERPRET A CROSS-TABULATION AND TO TEST FOR ASSOCIATION USING A CHI-SQUARE TEST FOR INDEPENDENCE

In other situations, a commonly used tool by researchers is the Chi-square test for independence or goodness of fit. For two nominal variables, a researcher may wish to construct a contingency table or cross-tabulation. This displays the joint frequency distribution of two variables. We can use a crosstabulation to ‘eyeball’ the joint frequency distribution in the first instance. The Chi-square test is then used to determine the level of statistical significance of the association. The formula for the Chi-square test is shown on page 503. It allows the researcher to examine the difference between an observed joint frequency distribution and an expected joint frequency distribution (that is, if the two variables were independent). If there is a difference, then the two variables are associated. We would perform this test when we have two nominal scaled variables.
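As a rough illustration of matching technique to scale of measurement, the hypothetical sketch below (Python with pandas and SciPy assumed; the data are invented) computes a Pearson correlation matrix for interval items, a Spearman rank-order correlation where a ranking is involved, and a cross-tabulation with a Chi-square test for two nominal variables.

```python
import pandas as pd
from scipy import stats

# Hypothetical mini data set mixing scales of measurement
df = pd.DataFrame({
    'attitude':  [4, 5, 3, 4, 2, 5, 3, 4, 1, 5],    # interval (Likert-type)
    'intention': [3, 5, 3, 4, 2, 4, 2, 5, 1, 4],    # interval
    'rank_pref': [2, 1, 3, 2, 4, 1, 3, 2, 5, 1],    # ordinal ranking
    'gender':    ['F', 'M', 'F', 'F', 'M', 'M', 'F', 'M', 'F', 'M'],  # nominal
    'bought':    ['Y', 'Y', 'N', 'Y', 'N', 'Y', 'N', 'Y', 'N', 'Y'],  # nominal
})

# Interval/ratio variables: Pearson correlation matrix
print(df[['attitude', 'intention']].corr(method='pearson').round(2))

# Ordinal variable involved: Spearman's rank-order correlation
print(df[['attitude', 'rank_pref']].corr(method='spearman').round(2))

# Two nominal variables: cross-tabulation and Chi-square test for independence
observed = pd.crosstab(df['gender'], df['bought'])
chi2, p, dof, _ = stats.chi2_contingency(observed)
print(observed)
print(f'Chi-square = {chi2:.2f}, Sig. = {p:.3f}')
```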


UNDERSTAND HOW TO RUN CORRELATION, REGRESSION AND CHI-SQUARE ANALYSIS IN SPSS AND MICROSOFT EXCEL, AND BE ABLE TO INTERPRET THE OUTPUT

The tests covered in this chapter used SPSS and Excel. SPSS and Excel display a variety of different statistics and output when you run these tests, but not all of it is suitable to display in a report, although it ought to be understood by the researcher. This chapter showed how to conduct these tests in SPSS and Excel, and how to interpret the output. When writing up the results within a report, careful attention should be paid to including only the most relevant information. This would include the appropriate parameters from the test, as outlined in the chapter, and the probability of sampling error, ascertained from the Sig. value.

UNDERSTAND THE DIFFERENCE BETWEEN PRACTICAL AND STATISTICAL SIGNIFICANCE WHEN TESTING FOR ASSOCIATION

When interpreting tests for association, as with tests of difference, the researcher ought to examine the substantive significance of the association. Substantive significance refers to the importance of the association. If enough people are sampled, an association can be statistically significant, but if the association is very small (say, 0.02), it is probably not very important because the association is very weak. In other words, such an association may be statistically significant but not substantively significant.

KEY TERMS AND CONCEPTS
bivariate linear regression
Chi-square test for independence
coefficient of determination (r²)
correlation matrix
F-test (regression)
Pearson's correlation coefficient
test of association

QUESTIONS FOR REVIEW AND CRITICAL THINKING
1 The discussion in this chapter is limited to linear relationships. Try to diagram some nonlinear relationships that show r values of zero.
2 Draw some scattergraphs that you would be likely to see for each of the following correlation coefficients:
☑ r = 0
☑ r = 0.11
☑ r = 0.21
☑ r = 0.45
☑ r = 0.92
3 How does correlation differ from bivariate regression?
4 Consider that your boss wanted to predict how electronic word of mouth (eWOM) within the brand community varied based on trust in the brand community. Your boss also wanted to predict levels of eWOM for customers who had a brand identity of 4 (on a 5-point scale). What statistical technique would you use to do this?
5 Consumers often visit online health communities to gain support for a health condition. This support may be classified as solution-based support or as emotional support. The support they gain from visiting the website may be associated with website satisfaction. A correlation matrix for three variables (solution support, emotional support, website satisfaction) is shown below. Comment.

EXHIBIT 14.29 CORRELATION MATRIX FROM SPSS

CORRELATIONS
                                         solution    emotional    satisfaction
solution        Pearson correlation      1           .580**       .345**
                Sig. (two-tailed)                    .000         .000
                N                        270         270          270
emotional       Pearson correlation      .580**      1            .534**
                Sig. (two-tailed)        .000                     .000
                N                        270         270          270
satisfaction    Pearson correlation      .345**      .534**       1
                Sig. (two-tailed)        .000        .000
                N                        270         270          270

**Correlation is significant at the 0.01 level (two-tailed)


6 For each of the following examples, describe and explain the type of correlation coefficient that you would use:
☑ correlation between sales (measured in thousands of dollars) and advertising (measured in thousands of dollars)
☑ correlation between brand preference (measured as a ranking variable) and relative frequency of purchasing that brand (also measured as a ranking variable)
☑ correlation of brand preference for cricketing brands (measured as a ranking variable) and relative frequency of playing cricket (measured as a weekly average)
☑ correlation between attitude towards a product (measured on a Likert scale) and attitude towards a brand (also measured on a Likert scale).
7 Interpret the following regression equations:
a ŷ = â + b̂X = 2.2 + 0.6X, where ŷ = likelihood of visiting an online community and X = level of satisfaction with the community.
b ŷ = â + b̂X = 5.5 + 1.8X, where ŷ = likelihood of purchasing a new product and X = perceived relative advantage.
c What would be the limitations of using these types of regression equations to make these predictions?
8 a The following ANOVA summary table is the result of a regression of social norms (e.g., the influence of significant others such as family and friends) on intention to conserve water. Is the relationship statistically significant at the 0.05 level? Comment.

Source of variation           Sum of squares    d.f.    Mean square    F-value
Explained by regression       262.7             1       262.7          106.7
Unexplained by regression     659.6             268     2.5
Total                         922.3             269

8 b Based upon data from the same regression, the output below was obtained. According to the output, do social norms affect intention to conserve water? Write out the appropriate regression equation.

Model            Unstandardised coefficient    Standard error    t       Sig.
Constant         3.1                           0.5               6.1     0.000
Social norms     0.9                           0.1               10.3    0.000

9 An economist is attempting to predict the average total budget of retired couples in the Gold Coast based on average Australian urban retired couples' total budgets. An r² of 0.7824 is obtained. Will the regression be a good predictive model?

10 The following table gives a football team's season-ticket sales, percentage of games won and number of active alumni for the years 2007 to 2015.

Year    Season-ticket sales    Percentage of games won    Number of active alumni
2007    8 419                  58                         5 869
2008    10 253                 63                         6 212
2009    12 457                 75                         6 315
2010    13 285                 36                         6 860
2011    14 177                 27                         8 423
2012    15 730                 63                         9 000
2013    11 286                 56                         9 111
2014    16 892                 82                         8 456
2015    16 987                 93                         9 876

a Produce a correlation matrix for the variables (use either SPSS or Excel), and interpret the correlation between each pair of variables.
b Estimate a regression model for sales and percentage of games won.
c Estimate a regression model for sales and number of active alumni.

11 Are the different forms of consumer credit in the following table highly correlated?

Year    Travel cards ($)    Bank credit cards ($)    Retail cards ($)    Other credit cards ($)    Interest-free credit ($)
1       61                  828                      9 400               11 228                    939
2       76                  1 312                    10 200              12 707                    1 119
3       110                 2 639                    10 900              14 947                    1 298
4       122                 3 792                    11 500              17 064                    1 650
5       132                 4 490                    13 925              20 351                    1 804
6       164                 5 408                    14 763              22 097                    1 762
7       191                 6 838                    16 395              25 256                    1 832
8       238                 8 281                    17 933              28 275                    1 823
9       273                 9 501                    18 002              29 669                    1 893
10      238                 11 351                   19 052              32 622                    1 981
11      284                 14 262                   21 082              37 702                    2 074


12 Explain in your own words what a cross-tabulation is and what it is used for.
13 Explain when you would use a Chi-square test and give an example. Make sure you describe the scales used.
14 Calculate column percentages for the tables below. Firstly, explain the relationship that you observe. After you have done this, perform a Chi-square test on the data and comment on what it shows.
a The new product is superior to other alternatives on the market and there is no difference in perception between consumers and managers

            Consumers    Distributors
Agree       75           65
Disagree    48           15
Total       123          80

b Residents are receptive to drinking recycled water and there is no difference by gender

            Male    Female
Agree       25      16
Disagree    7       8
Total       32      24

c Store preference differs by age of shopper

               Store A    Store B
20–34          27         73
35–54          31         82
55 and over    11         93
Total          69         248

15 When conducting a Chi-square test, one of the assumptions is that the expected value for each cell must be at least 5. Suppose this assumption has been violated. What might you do to remedy it?
16 Suppose a regional city council is trying to compare perceptions of its city by its residents with the perceptions of the state capital by residents of the state capital using a variety of social indicators (e.g., crime, safety, services). To do this they ran a survey in the state capital and a survey in their own city. Two questions they asked about in the survey were 'Is your city a safe place to live?' and 'Does your city have good local services and facilities?' Respondents could agree, remain neutral or disagree. The results are shown below. Is the regional city perceived by residents to be safer? Is the regional city perceived by residents to have good local services and facilities?

Is your city a safe place to live?
            Regional city    State capital city
Agree       10               21
Neutral     7                17
Disagree    38               99
Total       55               137

Does your city have good local services and facilities?
            Regional city    State capital city
Agree       5                20
Neutral     2                6
Disagree    48               111
Total       55               137

17 Suppose a researcher commented that 'Chi-square tests and cross-tabulations are commonly used in market research projects because of their simplicity'. Comment on this statement. Do you agree or disagree? Explain your answer.
18 A manufacturer of disposable cleaning cloths told a retailer that sales for this product category closely correlated with sales of disposable nappies. The retailer thought he would check this out for his own sales-forecasting purposes. The researcher says: 'Disposable washcloths/wipes sales can be predicted with knowledge of disposable nappy sales.' Is this the right thing to say?


ONGOING PROJECT RUNNING SOME UNIVARIATE OR BIVARIATE STATISTICS? CONSULT THE PROJECT WORKSHEET FOR CHAPTERS 12, 13 AND 14 FOR HELP

Selecting a test to use is based on answering some simple questions (see the flowchart at the beginning of Part 6) about the hypotheses you are testing, how the variables are measured and how many variables to include. You then need to be able to interpret the output from the test. These steps can be followed by using the project worksheets for Chapters 12, 13 and 14 available from the CourseMate website. It's a good idea to know what tests you are going to use before you collect the data (then you can collect the right data).

WRITTEN CASE STUDY 14.1 NEIGHBOURHOOD RENEWAL AND THE USE OF MARKETING RESEARCH

Neighbourhood renewal is often a key focus of local government strategies to improve conditions and wellbeing of residents. One neighbourhood renewal initiative was concerned with trying to understand residents' attitudes and behaviours towards local services within a regional area of Australia. Private and public housing residents of the area were surveyed about their attitudes and usage of local services. Based upon the survey, the results in Exhibit 14.30 and Exhibit 14.31 were observed in relation to transport usage and resident participation in the community, both by housing type.

EXHIBIT 14.30 CROSS-TABULATION OF TRANSPORT USAGE BY HOUSING TYPE

What would you say is your main form of transport?    Private housing residents    Public housing residents
Car or motorbike                                       72                           53
Public transport                                       10                           28
Taxis                                                  2                            2
Cycling                                                5                            4
Walking                                                32                           15
Other                                                  2                            2
Don't go out much                                      1                            0

EXHIBIT 14.31 CROSS-TABULATION OF PARTICIPATION IN THE COMMUNITY BY HOUSING TYPE

How much do people in this neighbourhood participate in local activities?    Private housing residents    Public housing residents
Very little                                                                   46                           43
A moderate amount                                                             53                           48
A lot                                                                         20                           21

QUESTIONS

1 Interpret these results and conduct an appropriate test to see whether or not there is an association between transport usage and housing type.

2 Write up the results in a format appropriate for a manager who might be reading the final report.


WRITTEN CASE STUDY 14.2 HOW INNOVATIVE IS A NEW PRODUCT AND WHY?

Consumer perceptions of how innovative a product is might help to explain whether or not that product will be successful. Some say that perceived innovativeness is the degree of novelty and enhanced benefit that a product offers – in other words, the degree to which a product is newer and better.13 Based on a large-scale sample, researchers tried to ascertain how three concepts – perceived technological newness, perceived concept newness and perceived relative advantage – were associated with perceived innovativeness. The measures of each construct were constructed from multi-item Likert scales and the researchers used correlation coefficients to view the bivariate relationships between the constructs. The researchers found the results displayed in Exhibit 14.32.

EXHIBIT 14.32 CORRELATION MATRIX OF PERCEIVED INNOVATIVENESS WITH ITS ANTECEDENTS

                                  Perceived technology newness    Perceived concept newness    Perceived relative advantage    Perceived innovativeness
Perceived technology newness      1
Perceived concept newness         0.70**                          1
Perceived relative advantage      0.58**                          0.59**                       1
Perceived innovativeness          0.60**                          0.75**                       0.61**                          1

***p < 0.001  **p < 0.01  *p < 0.05

QUESTIONS

1 Interpret these results and explain their significance to managers. What practical implications might they have?

2 What are the limitations of using correlation coefficients to find out the relationship between these variables?

WRITTEN CASE STUDY 14.3 FOOD LABELLING AND COUNTRY OF ORIGIN

Country of Origin (COO) labels are used by food retailers in Australia and other countries to indicate where foods were made, produced or grown.14 They serve as an important quality signal to alleviate consumer risk perceptions. As an Australian food manufacturer you might be interested in understanding how consumers from different countries (e.g., potential export markets such as the UK, France, Spain, Belgium, Holland, Sweden, Turkey and America) perceive various foods from Australia (e.g., olive oil, wine, beer, pulses, cheese) in relation to their overall product quality, taste and food safety perceptions.

Suppose you were also asked to find out if there were any other patterns in the data, such that certain countries shared an association in their perceptions of Australian food with other countries.

QUESTIONS

1 Design the procedures you would use to help provide the information for this, clearly specifying how you would collect the data and what would be measured. 2 Describe how you would use a cross-tabulation and Chisquare test to help you. In your answer specifically state: (a) what questions would be asked of respondents; (b) how these questions would be anchored; (c) how you would actually run the test; and (d) draft some likely cross-tabulations without the data included.


ONGOING CASE STUDY MOBILE PHONE SWITCHING AND BILL SHOCK

The analysis is going well. The team is getting on top of the data analysis and gaining a better understanding about the segments which exist and their different perceptions and behaviours. These insights were well received at an interim meeting with management at AusBargain, but it has also become clear that management still have some questions that are left unanswered. Are bill shock and switching related? What factors are most likely to be associated with switching behaviour? Bill shock? The team regroup.

QUESTIONS

1 Define some hypotheses which could be developed about (a) the association between bill shock and switching behaviour, (b) the association between bill shock and other possible influencers, and (c) the association between switching intention and other possible influencers.
2 For each of the hypotheses that you define, clearly specify the analysis technique and the variables involved in the analysis.

NOTES
1 Alexander, D.L., Lynch, John G. & Wang, Qing (2008), 'As time goes by: Do cold feet follow warm intentions for really new versus incrementally new products?' Journal of Marketing Research, 45 (3), 307–319.
2 When correlating two interval- or ratio-scaled variables, the correct statistic to use is Pearson's correlation coefficient. However, as we will see later in the chapter, marketing data often come ordinally scaled. In these cases we can use a different type of nonparametric statistic called Spearman's correlation coefficient.
3 Again, this is the case for a Pearson's correlation coefficient. The procedure is a little different for other nonparametric tests of correlations. This will be discussed later in the chapter.
4 For example, see Selvanathan, Anthony, Selvanathan, Saroja, & Keller, Gerald (2013) Business Statistics: Australia/New Zealand, 6th edn, Cengage Learning: Melbourne.
5 For a t-test under the null hypothesis r = 0, t is distributed with d.f. = n − 2: t = r/s_r, where s_r = √((1 − r²)/(n − 2)). Table A.7 in the appendix provides the critical value of r for the Pearson correlation coefficient to test the null hypothesis that r equals zero.
6 See: http://tylervigen.com/spurious-correlations, accessed 17 March 2016.
7 Brady, M.K., Voorhees, C.M. & Brusco, M.J. (2012) 'Service sweethearting: Its antecedents and customer consequences,' Journal of Marketing, 76 (March), 81–98.
8 Please see the following article, which is related to this topic: Johnson, Devon S. & Lowe, Ben (2015), 'Emotional support, perceived corporate ownership and skepticism toward out-groups in virtual communities,' Journal of Interactive Marketing, 29, February, pp. 1–10.
9 Lowe, B., Lynch, J. & Lowe, J. (2015) 'Reducing household water consumption: A social marketing approach,' Journal of Marketing Management, 31 (3–4), 378–408.
10 This is a point estimate. A confidence interval can be calculated for this sales estimate; however, the topic is beyond the scope of this book.
11 Interested readers are referred to a more advanced statistics textbook to see how to perform this calculation. For example, Selvanathan, Anthony, Selvanathan, Saroja, & Keller, Gerald (2013) Business Statistics: Australia/New Zealand, 6th edn, Cengage Learning: Melbourne.
12 Before doing this, please make sure you have Excel's Analysis ToolPak installed.
13 Lowe, Ben & Alpert, Frank (2015) 'Forecasting consumer perception of innovativeness', Technovation, November, pp. 1–14.
14 http://www.foodstandards.gov.au/consumer/labelling/coo/Pages/default.aspx.

15 » WHAT YOU WILL LEARN IN THIS CHAPTER

To distinguish between univariate analysis, bivariate analysis and multivariate analysis.
To distinguish between the two basic groups in multivariate analysis: dependence methods and interdependence methods.
To discuss the concept of multiple regression analysis.
To define the coefficient of partial regression.
To interpret the statistical results of multiple regression analysis.
To define and discuss factor analysis.
To define and discuss cluster analysis.
To define and discuss multidimensional scaling.

MULTIVARIATE STATISTICAL ANALYSIS
Turning complicated data into useful psychographic segments

Roy Morgan Research is an Australian-based company with operations around the globe. It is Australia’s best-known and longest-established market research and public opinion polling company. The company gathers market research information by personal interviews, telephone interviews and self-administered questionnaires, and via the Internet. Each year the company interviews more than 50 000 people in Australia. Questions relate to lifestyle and attitudes, media consumption, Internet usage, brand and product usage, purchase intentions, retail visits, services, financial information, and recreation and leisure activities. Data purchased from the Australian Bureau of Statistics and other sources are added to create a very large and complicated set of figures that combine attitudes, values and behaviours with demographics and geographic location. Multivariate analysis is used to filter and summarise these data into information about customer groups, called Helix Personas, that go beyond simplistic demographics to an understanding of underlying values and patterns of behaviour.1 The main groups named in the Personas data service are:

(100) Leading Lifestyles ‘High income families, typically own their own home in the inner suburbs. Leading Lifestyles acknowledge their good lot in life, but make no apologies for it – they studied hard, worked hard, developed in-demand skills and professional networks, invested smartly, donated to personally important charities and helped out family members.’ Comprises 24 per cent of the population. Likes: cinema, city-working, education, gardening, networking, news websites, shopping, travel. (12 subgroups)


(200)  Metrotechs ‘Young, single, well educated, inner city professionals with high incomes, typically renting apartments. Cultured, connected, clued-in and cashed up. Say goodbye to the myth of the entitled Gen Y hipster expecting everything to be handed to them on a silver platter: Metrotechs are hard-working and ambitious.’ Comprises 13 per cent of the population. Likes: career, environment, education, fashion, fine dining, taxi travel, technology, wine. (10 subgroups)

(300) Today’s Families ‘Young families in the outer suburbs, living up to their above-average incomes. Their beloved gizmo-enriched home is the nucleus of their family. Compared to many, Today’s Families have got it good: they’re on relatively high incomes, they send their kids to private schools and they’re usually first with the latest technologies. But for this community, money’s always an issue: if it’s not mortgage payments, it’ll be school fees or some other annoying bill. In fact, if a financial genie granted them a wish, lower interest rates or reduced taxes would top the list.’ Comprises 10 per cent of the population. Likes: cars, coffee, take-away, gadgets, magazines, alcohol, reality TV, sport. (6 subgroups)

(400) Aussie Achievers ‘Closest to the average Australian, these young, educated, outer suburban families are working full time to pay off their expensive separate house. How wide is the middle? As the quintessentially average community, Aussie Achievers show just how diverse and multifaceted “ordinary” can be: leaders and followers, conservatives and progressives, bungeejumpers and jumper-knitters, up-to-the-minute or out-of-the-loop, loud-mouthed or tight-lipped, overthe-top or under-the-radar – this is the heartland.’ Comprises 10 per cent of the population. Likes: football, cars, E-readers, gambling, gardening, homerenovation, restaurants, television. (4 subgroups)

(500) Getting By ‘Young parents or older families with children still at home, outer suburbs, bargain hunters. Over the years, these young migrants from the Middle East, India, SouthEast Asia and Africa have formed and joined bustling multicultural districts scattered across major metropolitan areas. With children born overseas or here, Getting By are often in the process of building up, from scratch, their local skills, assets and cross-cultural networks. Many have yet to establish intergenerational wealth and are working to pay the rent and develop a platform for their children’s futures.’ Comprises 12 per cent of the population. Likes: cities, discount-coupons, fashion, freedom, health, looking good, movies, sport. (9 subgroups)

(600) Golden Years 'Conservative, risk-averse retirees focused on health, security and maintaining an income from investments or the pension, even if they're mortgage-free. Scattered across Australia in safe and senior-friendly enclaves – whether in the country, by the sea, in a bustling regional township or fringe metro suburb – these largely Anglo-Australian empty nesters, grandparents and retirees, either married or single, are now living on below-average incomes – yet without the greater expenses and obligations of years past.' Comprises 12 per cent of the population. Likes: AFL, animals, beer, news, discount shopping, lottery, movies, tennis. (6 subgroups)

(700) Battlers ‘Mostly Aussie-born, these struggling young families, single mums and retirees are focused on making ends meet. Many are welfare dependent. Young single mums, struggling young families, retirees surviving on the pension . . . Battlers come in many ages and lifestages. So what defines this community? Low incomes and trying financial circumstances, mostly. Not only is money tight for Battlers, unemployment is widespread. Keeping day-to-day costs down is a constant battle; trying to maintain some sense of security in this rapidly changing world a huge challenge. Is it any wonder they’re not waxing lyrical about the economy or the state of the nation?’ Comprises 19 per cent of the population. Likes: Australian beer, CD/DVD, video games, home improvement, appliances, lay-by, pokies, smoking. (9 subgroups)

Helix personas summary

Source: Roy Morgan Research, July 2012–June 2014

Applying multivariate analysis to research data

Computer technology has fostered the rapid diffusion of multivariate analysis in marketing research. A number of computer software packages have changed techniques that once were expensive and exotic into affordable and regular forms of analysis. In light of the multivariate statistical revolution, students need to understand these powerful tools of analysis. This chapter presents a nontechnical description of some multivariate methods; it does not include computation formulae.

Roy Morgan Helix Personas, summarised by various demographic, values and behavioural variables. Each bubble is one of the subgroups and seven colours define the seven main Persona groups.


THE NATURE OF MULTIVARIATE ANALYSIS

Social research generally and marketing problems in particular are inherently multidimensional. The price of motor vehicles is simultaneously influenced by inflation, advertising expenditures and the international exchange rates. Consumers can evaluate various shopping centres on the basis of many attributes. As researchers become increasingly aware of the multidimensional nature of their problems, they will use multivariate analysis more to help them solve complex problems.

The investigation of one variable at a time is referred to as univariate analysis, which we discussed in Chapter 12. Investigation of the relationship between two variables is bivariate analysis, discussed in Chapters 13 and 14. When problems are multidimensional and involve three or more variables, we use multivariate statistical analysis.

multivariate statistical analysis Statistical methods that allow the simultaneous investigation of more than two variables.

Multivariate statistical methods allow us to consider the effects of more than one variable at the same time. For example, suppose a forecaster wishes to estimate demand for house-roofing materials for the next five years. While past sales records alone might predict consumption, additional variables such as building approvals, inflation and population growth are likely to give greater insight into the determinants of demand for roofing. Consumers who are evaluating supermarkets may be concerned about the distance to each store, perceived cleanliness, price levels and various store attributes. Often we find that an apparent relationship between two variables disappears when we take into account a third variable that affects both variables. To understand problems such as these, researchers need multivariate analysis.

SURVEY THIS!

About this time of the semester, you may want to understand what factors lead to success in a Marketing Research course. Let's try to understand what factors are associated with studying. Notice that at least two questions in our data set deal with how much people study, and one in particular deals with how much time students spend studying Marketing Research. Take a look at the portion of the questionnaire shown (Courtesy of Qualtrics.com).

Let's examine this research question: Student involvement with different social behaviours is related to their study habits. Use the responses to the items shown in the screenshot, or from other portions of the questionnaire as you see fit, to explore this research question. This could be done by using each item as an individual predictor in a multiple regression model. However, there is a risk of multicollinearity and the results will be more complex than if data reduction is applied first. So, an alternative is to first identify underlying factors among the variables above by applying exploratory factor analysis with a varimax rotation. Then, create the small number of independent variables using the multi-item composites indicated by the factor analysis. The regression model becomes simpler to test and understand. Can you come up with a better model? Try adding two or three independent variables of your own using other variables to the model. Is your model better? Explain.
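One possible way to prototype the 'factor analysis first, then regression' approach outside SPSS is sketched below. It uses Python with scikit-learn (an assumption – the Survey This! exercise is built around the Qualtrics data set, and recent scikit-learn versions are needed for the varimax rotation), and the item names and simulated responses are purely hypothetical stand-ins for the questionnaire items.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import FactorAnalysis
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Hypothetical stand-in for the survey: six social-behaviour items plus study hours
n = 200
social = rng.normal(size=n)          # latent 'going out' factor
online = rng.normal(size=n)          # latent 'online socialising' factor
items = pd.DataFrame({
    'parties':      social + rng.normal(scale=0.5, size=n),
    'nights_out':   social + rng.normal(scale=0.5, size=n),
    'sport_club':   social + rng.normal(scale=0.5, size=n),
    'social_media': online + rng.normal(scale=0.5, size=n),
    'messaging':    online + rng.normal(scale=0.5, size=n),
    'gaming':       online + rng.normal(scale=0.5, size=n),
})
study_hours = 10 - 1.5 * social + 0.2 * online + rng.normal(size=n)

# Step 1: data reduction - exploratory factor analysis with a varimax rotation
fa = FactorAnalysis(n_components=2, rotation='varimax')
scores = fa.fit_transform(items)                       # factor scores per student
print(pd.DataFrame(fa.components_, columns=items.columns).round(2))  # loadings

# Step 2: regress study hours on the two factor composites instead of six items
reg = LinearRegression().fit(scores, study_hours)
print('R-squared:', round(reg.score(scores, study_hours), 2))
print('Factor coefficients:', reg.coef_.round(2))
```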


CLASSIFYING MULTIVARIATE TECHNIQUES

Exhibit 15.1 presents a useful classification of most multivariate statistical techniques. Two basic groups of multivariate techniques are dependence methods and interdependence methods. When there are clear dependent and independent variables then we use dependence methods, and when variables or cases are not regarded as dependent, but may be simply related or similar, then we use interdependence methods. Dependence methods are used when we believe that some variables have an effect on other variables. For example, answers to a purchase-intention question may be the result of brand name, price and a combination of product features. Interdependence methods are used when we want to group together similar variables or cases, such as finding groups of consumers with similar combinations of attitudes, or groups of variables that seem to measure the same thing. All multivariate methods fall into one of these two groups, as Exhibit 15.1 shows.

EXHIBIT 15.1 A CLASSIFICATION OF SELECTED MULTIVARIATE METHODS
Are some of the variables dependent on others? Yes → Dependence methods; No → Interdependence methods

The analysis of dependence

If we use a multivariate technique to explain or predict the dependent variable(s) on the basis of two or more independent variables, we are attempting to analyse dependence. A bank will routinely ask: 'Is this person a good credit risk, based on her age, income and marital status?' A fast-food chain will forecast potential sales of a new site based on population density, road traffic and competition. Multivariate dependence methods include multiple regression analysis, multiple discriminant analysis, logistic regression, multivariate analysis of variance and n-way cross-tabulation, among many others.

The analysis of interdependence

Interdependence methods of analysis are used to group things together and give them meaning. No one variable or variable subset is to be predicted from or explained by the others. The most common interdependence methods are exploratory factor analysis, cluster analysis and multidimensional scaling. In exploratory factor analysis, we might find that some variables in a research study are very closely related and so combine them to create a new variable with a composite name. In cluster analysis, we might find different groups of potential customers, or market segments, based on similar combinations of attitudes, values and demographics. Multidimensional scaling lets us visualise how objects – brands, market segments, variables – compare with each other.

multivariate dependence methods Multivariate statistical techniques that explain or predict one or more dependent variables on the basis of two or more independent variables.
interdependence methods Multivariate statistical techniques that are used to group things together and give them meaning.
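To give a feel for what an interdependence method does, the sketch below (Python with scikit-learn assumed; the ratings are simulated, not from any study in this book) uses k-means clustering to group respondents into segments purely on the similarity of their attitude ratings – no variable is treated as dependent. Cluster analysis itself is discussed later in the chapter.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)

# Simulated respondents: two attitude ratings (price sensitivity, brand focus)
# scattered around three loose 'segments'
segment_centres = np.array([[2, 6], [6, 2], [5, 5]])
ratings = np.vstack([
    centre + rng.normal(scale=0.8, size=(50, 2)) for centre in segment_centres
])

# Interdependence method: no dependent variable, just grouping similar cases
kmeans = KMeans(n_clusters=3, n_init=10, random_state=1).fit(ratings)

print('Segment sizes:', np.bincount(kmeans.labels_))
print('Segment centres (average ratings):')
print(kmeans.cluster_centers_.round(1))
```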


Influence of measurement scales

As in all forms of data analysis, the scale of measurement determines which multivariate technique is appropriate for the data. Table 15.1 and Exhibit 15.5 (see page 539) show that selection of a multivariate technique requires consideration of the types of measures used for both independent and dependent sets of variables. For ease of diagramming, Table 15.1 and Exhibit 15.5 refer to nominal and ordinal scales as nonmetric and interval and ratio scales as metric.

TABLE 15.1 » MULTIVARIATE ANALYSIS: CLASSIFICATION OF DEPENDENCE METHODS

                                     Independent variables
Dependent variable    Metric                                                 Nonmetric
Metric                Partial correlation; Multiple regression analysis      N-way univariate analysis of variance
Nonmetric             Discriminant analysis; Binary logistic regression      N-way cross-tabulation

TIPS OF THE TRADE

THE SEVEN COMMANDMENTS FOR USERS OF MULTIVARIATE METHODS2

1 Do not be technique oriented. Focus on management's needs, and then choose an appropriate analytical tool.
2 Consider multivariate models as information for management. Multivariate models (equations or perceptual maps) are an aid to, not a substitute for, managerial judgement.
3 Do not substitute multivariate methods for researcher skill and imagination. Statistics do not ensure causality and are not substitutes for common sense.
4 Develop communication skills. Management seldom accepts findings based on methods it doesn't understand.
5 Avoid making statistical inferences about the parameters of multivariate models. We are seldom certain of the distribution of a market population due to non-sampling and measurement errors.
6 Guard against the danger of making inferences about market realities when such inferences may be due to the peculiarities of the method. Be sure the statistical findings are consistent with sound theory and common sense.
7 Exploit the complementary relationship between functional and structural methods. Use one method to support another.

ANALYSIS OF DEPENDENCE

n-way cross-tabulation

We learned about cross-tabulations in Chapter 14 on bivariate analysis: n-way cross-tabulations are an extension of the same idea, with a cross-tabulation contained within the separate groups of a third nonmetric variable. Let's examine the case of an advertiser who wants to see if people have responded to a magazine campaign by trialling the 'new improved' brand. The advertiser wonders if gender may also have an influence on brand trial.

n-way cross-tabulation Where two nonmetric scaled variables are compared after accounting for the effects of a third (or more) nonmetric variable.


The cross-tabulation of product trial against gender, in Table 15.2A, shows that 59 per cent of women have trialled the brand compared with 42 per cent of men. The Chi-square statistic suggests that we would see such a distribution less than 5 per cent of the time purely by chance, so we would conclude there is a relationship between gender and trial. In Table 15.2B, it seems that magazine readership is not related to brand trial. The cross-tabulation shows that 63 per cent of magazine readers have trialled the brand, compared with 49 per cent of non-readers, but the difference, as indicated by the Chi-square statistic, is insufficient to conclude that we wouldn't see this result by chance. We'd be tempted to conclude that the magazine campaign has been a failure. And we'd be wrong.

TABLE 15.2A » CROSS-TABULATIONS: PRODUCT TRIAL (GENDER)

                 Female    Male    Total
Not trialled     35        32      67
%                41%       58%     48%
Trialled         50        23      73
%                59%       42%     52%
Total            85        55      140
%                100%      100%    100%

Chi-square = 3.87 (d.f. = 1) Sig. = 0.049

TABLE 15.2B » CROSS-TABULATIONS: PRODUCT TRIAL (MAGAZINE READERSHIP)

                 Not read magazine    Read magazine    Total
Not trialled     54                   13               67
%                51%                  37%              48%
Trialled         51                   22               73
%                49%                  63%              52%
Total            105                  35               140
%                100%                 100%             100%

Chi-square = 2.15 (d.f. = 1) Sig. = 0.143

If we take into account the effects of gender on the relationship between readership and trial, we see a different picture. In Table 15.2C, we ran the cross-tabulation between trial and readership separately for each level of the variable 'gender'. Among women we see that 67 per cent of readers have trialled the brand compared with 57 per cent of non-readers, and the Chi-square statistic suggests that there is little difference. Among men, however, we see that 60 per cent of readers have trialled the brand compared with just 31 per cent of non-readers. The Chi-square statistic suggests that there is a relationship! Among men the magazine campaign seems to have an effect on brand trial. In this case, women seem to be trialling the brand regardless of magazine readership, but men who read the magazine are much more likely to trial the brand than men who do not read the magazine. By layering the cross-tabulation as we have here, we have uncovered a relationship that was hidden by a third variable.


TABLE 15.2C » TRIAL BRAND × READ MAGAZINE × GENDER CROSS-TABULATION

Gender: Female
                 Not read magazine    Read magazine    Total
Not trialled     30                   5                35
%                43%                  33%              41%
Trialled         40                   10               50
%                57%                  67%              59%
Total            70                   15               85
%                100%                 100%             100%

Female Chi-square = 0.463 (d.f. = 1) Sig. = 0.496

Gender: Male
                 Not read magazine    Read magazine    Total
Not trialled     24                   8                32
%                69%                  40%              58%
Trialled         11                   12               23
%                31%                  60%              42%
Total            35                   20               55
%                100%                 100%             100%

Male Chi-square = 4.270 (d.f. = 1) Sig. = 0.037

One of the reasons that we did not see the relationship between readership and trial is the different numbers in each group: men and women, read and not-read. The combined effect of a 'lurking variable', with different group sizes, is an example of Simpson's Paradox. The unequal group sizes, in the presence of a lurking variable, can weight the results incorrectly. This can lead to seriously flawed conclusions. To avoid such flawed conclusions, we must have a very clear model, or theory, in mind of what the interrelationships among all the constructs in a study are, and then analyse the data separately for each group.
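The layered analysis in Table 15.2C is easy to reproduce programmatically. The sketch below (Python with NumPy and SciPy assumed) keys in the observed counts from Tables 15.2A–C and runs the Chi-square test on the pooled table and then separately for each gender, which is exactly the layering that exposes the hidden relationship.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Observed counts rebuilt from Tables 15.2A-C (rows: not trialled / trialled,
# columns: not read magazine / read magazine)
female = np.array([[30, 5],
                   [40, 10]])
male = np.array([[24, 8],
                 [11, 12]])

# Pooled table - adding the two layers gives the counts in Table 15.2B,
# which is what hides the relationship (Simpson's Paradox)
pooled = female + male

for label, table in [('Pooled', pooled), ('Female', female), ('Male', male)]:
    chi2, p, dof, _ = chi2_contingency(table, correction=False)
    print(f'{label}: Chi-square = {chi2:.2f}, d.f. = {dof}, Sig. = {p:.3f}')
```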

Partial correlation analysis

We learned about correlation in Chapter 14. Partial correlation analysis is the measure of correlation between two linear variables that also takes into account the influence of a third linear variable. Often we see that an apparent relationship changes when we include the effects of other potential influences. Consider the diagram in Exhibit 15.2. X and Y may be highly correlated with each other and we'd be tempted to believe that there was a relationship between them. However, they may both be affected by the third variable, Z. When this influence is accounted for, then the apparent relationship between X and Y may disappear.

partial correlation analysis An analysis of association between two linear variables after controlling for the effects of other variables.

EXHIBIT 15.2 RELATIONSHIP BETWEEN X AND Y CONFOUNDED BY Z (Z influences both X and Y)

Let's consider a manager who wonders whether it is worthwhile hiring university graduates for the sales team. That is, he wants to see if there is a relationship between the education levels of his salespeople and the level of commissions they earn. He knows that the level of experience of the sales representative also has an influence on commissions. Table 15.3A shows a correlation matrix of the three variables under consideration. It appears from this correlation of near-zero, −0.05, that there is no relationship at all between education level and commissions earned. The manager would be tempted to infer that experience is the only thing that counts, and decide to not hire well-educated salespeople. But note that experience is correlated with both commissions and education. If we remove the effect of experience from both these variables, then we see the true relationship between education and commissions. The partial correlation between education and commission, Table 15.3B, is actually 0.53. That's a moderately high positive correlation, and a huge change from −0.05. The sales manager should conclude that experienced candidates and well-trained candidates (and preferably both) are good hires.

TABLE 15.3A » CORRELATIONS

              Experience    Commission    Education
Experience    1
Commission    0.66          1
Education     −0.56         −0.05         1

TABLE 15.3B » PARTIAL CORRELATIONS, WITH EFFECT OF 3RD VARIABLE REMOVED

Control variable: Experience
              Commission    Education
Commission    1
Education     0.53          1

Partial correlation can be applied to more than one control variable; typically, when we want to include several control influences in a partial correlation then we can achieve exactly the same result with multiple regression.
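For readers who want to see the mechanics, a partial correlation is simply the correlation between the residuals that remain after the control variable has been regressed out of both variables. The sketch below (Python with NumPy assumed; the sales-force data are simulated to behave like the example above, not taken from it) shows a near-zero simple correlation turning into a moderate positive partial correlation once experience is controlled for.

```python
import numpy as np

def partial_corr(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    # Residuals of x and y after regressing each on z (with an intercept)
    zc = np.column_stack([np.ones_like(z), z])
    rx = x - zc @ np.linalg.lstsq(zc, x, rcond=None)[0]
    ry = y - zc @ np.linalg.lstsq(zc, y, rcond=None)[0]
    return np.corrcoef(rx, ry)[0, 1]

rng = np.random.default_rng(7)

# Simulated sales force: experience lifts commission but (here) is negatively
# related to education, masking education's own positive effect
n = 200
experience = rng.normal(size=n)
education = -0.6 * experience + rng.normal(scale=0.8, size=n)
commission = 1.0 * experience + 0.6 * education + rng.normal(scale=0.8, size=n)

print('Simple r (education, commission):',
      round(np.corrcoef(education, commission)[0, 1], 2))
print('Partial r controlling for experience:',
      round(partial_corr(education, commission, experience), 2))
```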

n-way univariate analysis of variance (ANOVA)

In Chapter 13 we briefly discussed one-way analysis of variance. A simple extension of this procedure allows the simultaneous measure of the differences in the average of a dependent variable under the influence of more than one nominal-scaled variable. There are several reasons we'd want to test several variables at once in ANOVA. First, as we found in n-way cross-tabulations and partial correlation, our investigation of the relationship between two variables often is confounded by the influences of other variables. Second, sometimes variables interact with each other in their effects on a dependent variable. Finally, the simultaneous examination of effects reduces the likelihood of mistakenly finding a statistically significant relationship when there really isn't one (Type I error). Thus, n-way univariate analysis of variance (ANOVA) simultaneously tests for the differences in the mean of a metric dependent variable among two or more nonmetric independent variables.

n-way univariate analysis of variance (ANOVA) A technique that simultaneously tests for the differences in the mean of a metric dependent variable among two or more nonmetric independent variables.

Consider the example of an online retailer of healthcare products, vitamins and dietary supplements. The retailer is experimenting with different messages and design aspects of the website. In one experiment, two different website features were systematically varied: (1) the website could emphasise either informational messages (product features, ingredients etc.) or transformational messages (image and broad benefit statements); and (2) the website could either encourage visitors to buy now with frequent links to the shopping cart, or let visitors continue to browse without pushing them. Thus, with some simple scripting, each website visitor was randomly placed into one of four different opening pages. Also, the systems administrator was able to learn from the Web log whether a buyer came to the site from a paid search link, a paid online advertisement link, or in some other way. The researcher felt that these three variables (method of site entry, message type and call to action) would have an effect on the amount of money spent on purchases. One hundred people purchased from the site.

The results of a three-way univariate ANOVA are presented in Tables 15.4A, 15.4B and 15.4C. We can see from the F-statistics in the ANOVA table that each of the three variables (method of site entry, message type, appeal) has an effect on amount spent. An examination of the averages can tell us which levels of each of those variables have the greatest effect. Also of interest is the importance of interaction effects. We see in the ANOVA table that the combined effects of message and appeal influence purchases in addition to these website features acting alone. Post-hoc analysis of the means for the joint occurrences of different levels of influence reveals where the effect lies: the type of appeal seems to have no effect when combined with a transformational message ($123 vs $126), but an informational message combined with an appeal to buy now causes buyers to spend nearly an extra $75 on average ($221 vs $295).

TABLE 15.4A » THREE-WAY ANALYSIS OF VARIANCE

Between-subjects factors
                           Value    Value label         N
Method of site entry       1        Browse              11
                           2        Paid ad             31
                           3        Paid search         58
Message                    0        Transformational    49
                           1        Informational       51
Appeal                     0        None                53
                           1        Buy now             47

TABLE 15.4B » THREE-WAY ANOVA: METHOD OF SITE ENTRY, MESSAGE PRESENTED AND PURCHASE APPEAL

Tests of between-subjects effects; dependent variable: Dollar sales
Source                        Type III sum of squares    d.f.    Mean square    F        Sig.
Corrected model               621 343                    10      62 134         32.3     0.00
Intercept                     1 347 296                  1       1 347 296      699.4    0.00
Entry                         47 910                     2       23 955         12.4     0.00
Message                       131 471                    1       131 471        68.3     0.00
Appeal                        9 680                      1       9 680          5.0      0.03
Entry × Message               11 311                     2       5 656          2.9      0.06
Entry × Appeal                2 254                      2       1 127          0.6      0.56
Message × Appeal              12 470                     1       12 470         6.5      0.01
Entry × Message × Appeal      812                        1       812            0.4      0.52
Error                         171 434                    89      1 926
Total                         4 978 815                  100
Corrected total               792 777                    99

R-squared = 0.8 (Adjusted R-squared = 0.8)


TABLE 15.4C » POST-HOC INTERACTION EFFECTS: MESSAGE BY APPEAL

Dependent variable: Dollar sales
                                                                95% confidence interval
Message             Appeal     Mean     Std. error    Lower bound    Upper bound
Transformational    None       123.5    10.8          102.1          144.9
                    Buy now    126.2    9.6           107.2          145.2
Informational       None       220.9    15.7          189.6          252.2
                    Buy now    294.5    9.4           275.9          313.2

Why do we check the ANOVA table for significant effects before we examine the means? We could simply run several t-tests and make it so much simpler. The problem is the possibility of Type I error (finding a relationship when there really isn't one) as a result of multiple testing of significance. If we are prepared to accept mean differences at the 95 per cent confidence level, then the chance of making a Type I error is 1 − 0.95 = 0.05, or 5 per cent. If we make two t-tests at once, then the chance of a Type I error becomes 1 − (0.95 × 0.95) = 1 − 0.90 = 0.10, or 10 per cent. Three t-tests make the chance of a Type I error 1 − (0.95 × 0.95 × 0.95) = 1 − 0.86 = 0.14, or 14 per cent, and so on. Using only paired t-tests makes it quite likely that we would wrongly find a 'significant' relationship. Looking for significance in the ANOVA table first takes into account the effects of all other influences before we draw conclusions about individual variables.
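A formula-based package can reproduce this style of analysis. The sketch below (Python with pandas and statsmodels assumed) simulates a stand-in for the website experiment – the factor names follow the text, but the numbers are invented – and fits a three-way ANOVA with all interactions before inspecting the message-by-appeal cell means.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(3)

# Simulated stand-in for the website experiment: 100 buyers, three nonmetric
# factors and a metric outcome (dollar sales)
n = 100
df = pd.DataFrame({
    'entry':   rng.choice(['Browse', 'Paid ad', 'Paid search'], size=n),
    'message': rng.choice(['Transformational', 'Informational'], size=n),
    'appeal':  rng.choice(['None', 'Buy now'], size=n),
})
base = 120 + 50 * (df['entry'] == 'Paid search') + 90 * (df['message'] == 'Informational')
interaction = 75 * ((df['message'] == 'Informational') & (df['appeal'] == 'Buy now'))
df['sales'] = base + interaction + rng.normal(scale=40, size=n)

# Three-way ANOVA with all interaction terms, analogous to Table 15.4B
model = smf.ols('sales ~ C(entry) * C(message) * C(appeal)', data=df).fit()
print(anova_lm(model, typ=3).round(2))

# Cell means for the message-by-appeal interaction, analogous to Table 15.4C
print(df.groupby(['message', 'appeal'])['sales'].mean().round(1))
```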

Multiple regression analysis

ONGOING PROJECT

Multiple regression analysis is an extension of bivariate regression analysis that allows for simultaneous investigation of the effect of two or more independent variables on a single, intervalscaled dependent variable. Chapter 14 illustrated bivariate linear regression analysis with an example concerning a construction company’s sales volume. In that example, variations in the dependent variable were attributed to changes in a single independent variable. Yet reality suggests that several factors are likely to affect such a dependent variable. For example, sales volume might depend not only on the number of building permits, but also on price levels, amount of advertising and income of consumers in the area. Thus, the problem requires identification of a linear relationship with multiple regression analysis. The multiple regression equation is: Y 5 b0 1 b1X1 1 b2X2 1 b3X3 1 … 1 bnXn

or, in shorthand: Y = Σ biXi

Consider the following situation. A new brand of a frequently purchased consumer product was introduced into 25 regional shopping centres across Australia. Advertising was set at different levels, measured on a per-capita basis, in each region. Not all supermarkets were prepared to take the new brand, so the percentage of available retail outlets stocking the new brand served as a measure of the level of distribution, and each region was tested with a different level of pricing relative to the most popular brand in the region. At the end of three months the brand manager was able to compile the figures presented in Table 15.5. Sales were regressed against the measures for promotion, distribution and relative price, and the results are presented in Table 15.6.

multiple regression analysis Where the effects of two or more metric-scaled independent variables on a single, metric-scaled dependent variable are investigated.


TABLE 15.5 » DATA – NEW PRODUCT SALES PER CAPITA WITH PER CAPITA ADVERTISING, DISTRIBUTION LEVEL AND RELATIVE PRICE

Region #   Promotion   Distribution   Relative price   Sales
1          4.40        97             0.49             5.410
2          5.49        54             1.51             3.600
3          7.40        61             1.69             5.550
4          0.63        11             0.91             2.060
5          7.19        10             0.57             5.560
6          3.62        52             0.32             4.950
7          1.31        30             0.61             2.990
8          3.45        40             0.15             3.970
9          3.86        95             0.84             4.300
10         5.27        80             0.85             6.090
11         4.35        82             0.85             4.970
12         7.69        53             0.97             6.690
13         4.63        34             0.79             2.190
14         8.94        17             1.35             2.840
15         9.75        75             0.74             7.390
16         8.32        96             0.36             7.070
17         6.08        90             1.36             0.960
18         8.84        33             0.97             5.470
19         3.95        98             1.27             4.360
20         4.24        18             1.30             3.110
21         2.11        51             0.66             2.770
22         1.96        52             0.85             5.130
23         4.94        28             1.01             5.420
24         4.35        88             1.88             3.960
25         1.52        89             0.71             2.490

Promotion = advertising dollars per 1000 residents. Distribution = percentage of retailers in the region stocking the new brand. Relative price = price of our brand relative to the current leading brand; 1.30 means that our brand is 30 per cent more expensive than the leading brand, and 0.70 means that it is 30 per cent less expensive. Sales = number of boxes sold per 1000 residents.
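As a sketch of how these data might be analysed in Python (one of several possible tools; the textbook itself works from standard statistical package output), the Table 15.5 figures can be entered directly and passed to statsmodels. Run on the full 25 regions, the output should broadly reproduce Table 15.6, allowing for rounding.

```python
# A minimal sketch using the Table 15.5 data (values transcribed from the table).
import pandas as pd
import statsmodels.formula.api as smf

regions = pd.DataFrame({
    'promotion':    [4.40, 5.49, 7.40, 0.63, 7.19, 3.62, 1.31, 3.45, 3.86, 5.27,
                     4.35, 7.69, 4.63, 8.94, 9.75, 8.32, 6.08, 8.84, 3.95, 4.24,
                     2.11, 1.96, 4.94, 4.35, 1.52],
    'distribution': [97, 54, 61, 11, 10, 52, 30, 40, 95, 80,
                     82, 53, 34, 17, 75, 96, 90, 33, 98, 18,
                     51, 52, 28, 88, 89],
    'price':        [0.49, 1.51, 1.69, 0.91, 0.57, 0.32, 0.61, 0.15, 0.84, 0.85,
                     0.85, 0.97, 0.79, 1.35, 0.74, 0.36, 1.36, 0.97, 1.27, 1.30,
                     0.66, 0.85, 1.01, 1.88, 0.71],
    'sales':        [5.410, 3.600, 5.550, 2.060, 5.560, 4.950, 2.990, 3.970, 4.300, 6.090,
                     4.970, 6.690, 2.190, 2.840, 7.390, 7.070, 0.960, 5.470, 4.360, 3.110,
                     2.770, 5.130, 5.420, 3.960, 2.490],
})

model = smf.ols('sales ~ promotion + distribution + price', data=regions).fit()
print(model.summary())   # R-squared, the ANOVA F and the coefficients, laid out as in Table 15.6
```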

INTERPRETING REGRESSION OUTPUT

Output from a multiple regression can be a bewildering range of numbers and symbols but, if approached in a systematic way, it is fairly straightforward and extremely informative. Different computer programs will present output in slightly varying ways, but generally it is presented in three parts: overall regression statistics, an ANOVA table and the regression equation. Examining the output in Table 15.6, we see the multiple-R is 0.725. This is a correlation coefficient – the correlation between the dependent variable (in this case, sales volume) and the combined effects of the independent variables (in this case, promotion, distribution and price).


As in bivariate regression, the coefficient of multiple determination, or multiple index of determination, indicates the percentage of variation in Y, the dependent variable, explained by the variation in the independent variables. The coefficient of multiple determination is more commonly called R-squared, which, not surprisingly, is multiple-R squared. R-squared can be interpreted as the proportion of the variation in the dependent variable that is explained by the combined effects of the independent variables. In Table 15.6 we see that R-squared is 0.526, which means that 52.6 per cent of the variance in sales is explained by the combined effects of promotion, distribution and price. Adjusted-R-squared is a slight adjustment to R-squared and is used only when comparing different regression equations with different degrees of freedom. Standard error is the same as the standard error in a t-test: it defines the confidence interval of any forecasts of the dependent variable made with the subsequent regression formula.

The ANOVA table in the regression output shows how R-squared is calculated and the level of statistical significance of the regression. Most people skip over the ANOVA table because it's seen as being too hard and because almost always the regression is statistically significant. Nevertheless, it is worth spending a few moments considering the output and learning that it is not as difficult as we might fear. In Table 15.6 the regression is based on 25 observations. If we know the sum total of all 25, then we need to know the value of only 24 of those observations to learn the value of the last one – so there are 24 degrees of freedom in total. Three independent variables mean that there are 3 degrees of freedom used in the regression. That leaves 21 degrees of freedom to capture the error, or residual, in the model.

The term 'SS' stands for 'sum of squares'. The total sum of squares is the sum of squared deviations from the mean of the dependent variable. That is, in our example, we take the difference between the sales in each observation and the average sales, square that difference, and then add up the results for all observations. The regression sum of squares gives us the sum of squared deviations in the dependent variable when we take into account the independent variables. Any business statistics textbook will walk you through the details of how this is done. The ratio of the regression SS to the total SS gives us an interesting statistic: R-squared. Here the regression SS = 35.106 and the total SS = 66.798. The ratio is 35.106/66.798 = 0.5255, which we can see in the regression statistics section of the output.

TABLE 15.6 » MULTIPLE REGRESSION OUTPUT – SALES AGAINST PROMOTION, DISTRIBUTION AND PRICE

Regression statistics
Multiple R          0.725
R-square            0.526
Adjusted R-square   0.458
Std. error          1.228
Observations        25

ANOVA
             d.f.   SS       MS       F       Sig.-F
Regression   3      35.106   11.702   7.754   0.001
Residual     21     31.692   1.509
Total        24     66.798

Regression equation
               Coefficients   Std. error   t-stat    b (beta)   p-value
Intercept      2.811          0.800        3.511                0.002
Promotion      0.402          0.101        3.983     0.611      0.001
Distribution   0.0175         0.008        2.288     0.344      0.033
Price          −1.424         0.600        −2.374    −0.364     0.027

The term 'MS' is the mean square: the average sum of squares for each degree of freedom. Thus MS is the SS divided by the d.f. The term 'F' is an F-statistic – the ratio of two mean squares created with different degrees of freedom. Here F is the ratio of the regression MS to the residual MS. The F-statistic is important because it permits us to determine the likelihood that we would get the same regression results purely by chance. Before advances in technology, we would look up a table of F-statistics with varying d.f. to determine statistical significance. Now computer printouts routinely give that figure for us. In our example, the ratio of mean squares is 7.75. The significance of this F-statistic, 0.001, tells us that the likelihood of seeing such a ratio with 3 and 21 degrees of freedom purely by chance is about one in a thousand, which is quite unlikely. Here we can say that there is some explanatory power in the regression model – there is a relationship between the independent variables and the dependent variable. Experienced researchers always look at the significance level of the F-statistic before they look at anything else in the regression output. If F is not statistically significant then there is no point in looking at anything else in the model. The F-statistic tells us whether the independent variables acting together can explain the variation in the dependent variable.

The final section of the regression output tells us which variables have an effect and how much, and gives a formula we can use to forecast future values of the dependent variable. For each independent variable there is a coefficient that is an estimate of the change in the dependent variable for each unit change in that independent variable. In Table 15.6 we can say that if promotional expenditure goes up by one dollar, then unit sales go up by 0.402 units per 1000 people. The coefficient for distribution suggests that unit sales go up by 0.0175 units for each one percentage point improvement in distribution. And if relative price increases by one unit, then unit sales fall by 1.424. The constant is a sort of starting point: it is the level of the dependent variable when all of the independent variables are at zero. So the coefficients define an equation that can be used to predict other values of the dependent variable. In our example, the equation is:

Sales = 2.811 + 0.402(promotion) + 0.0175(distribution) − 1.424(price)
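The ANOVA arithmetic described above is easy to verify. A quick sketch, using only the sums of squares and degrees of freedom printed in Table 15.6:

```python
# Recomputing the ANOVA quantities in Table 15.6 from the printed SS and d.f.
reg_ss, resid_ss = 35.106, 31.692
reg_df, resid_df = 3, 21

reg_ms = reg_ss / reg_df                    # mean square for the regression (11.702)
resid_ms = resid_ss / resid_df              # mean square for the residual (about 1.509)
f_stat = reg_ms / resid_ms                  # about 7.75
r_squared = reg_ss / (reg_ss + resid_ss)    # about 0.526
print(round(reg_ms, 3), round(resid_ms, 3), round(f_stat, 2), round(r_squared, 3))
```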

In words, we would say: sales in units per 1000 people can be estimated as 2.811 units, plus 0.402 units for each extra dollar of advertising per 1000 residents, plus 0.0175 units for each percentage point improvement in distribution, less 1.424 units for each unit change in relative price. In multiple regression, the coefficients b1, b2 and so on are called coefficients of partial regression. Remember that these coefficients are estimates. The true parameter, estimated by the coefficient, may be a bit higher or it may be a bit lower. How much higher or lower? The standard error of the estimate helps us work out a confidence interval. In fact, the true parameter may even be zero, meaning that the independent variable has no effect at all on the dependent variable. We can check for this possibility by seeing how far away zero is from the estimate. Generally, we say that if zero is more than about two standard errors from the coefficient then we can be pretty sure that the true parameter is not zero. Sound familiar? It's a t-test. Here we divide the coefficient by its standard error to get a t-statistic, and the p-value tells us the statistical significance of that coefficient. Generally, we say that if the p-value is less than 0.05 then we can be pretty confident that the true parameter is not zero and that the independent variable has an effect on the dependent variable.
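As a small illustration, the fitted equation and the coefficient t-tests can be applied directly. The promotion, distribution and price values passed to the function below are made-up inputs for a hypothetical region, not figures from the text; the coefficients and standard errors are those printed in Table 15.6.

```python
# Applying the Table 15.6 coefficients: a forecast, plus t = coefficient / standard error.
coefs = {'intercept': 2.811, 'promotion': 0.402, 'distribution': 0.0175, 'price': -1.424}
std_errors = {'promotion': 0.101, 'distribution': 0.008, 'price': 0.600}

def predict_sales(promotion, distribution, price):
    """Forecast unit sales per 1000 residents for given marketing-mix settings."""
    return (coefs['intercept']
            + coefs['promotion'] * promotion
            + coefs['distribution'] * distribution
            + coefs['price'] * price)

print(predict_sales(promotion=5.0, distribution=80, price=0.85))   # hypothetical region
for name, se in std_errors.items():
    print(name, round(coefs[name] / se, 2))   # roughly the t-statistics reported in Table 15.6
```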


Which independent variable has the greatest influence on the dependent variable? Many people are tempted to look only at the coefficients to find the largest number. That would be a mistake. Remember that the coefficients are defined as the unit change in the dependent variable for each unit change in the independent variable, so the coefficients are a function of the level of measurement of our data. Sales are in units per 1000 people; we could have measured sales per person, and the coefficients would all be one one-thousandth of the current figures. The percentage distribution was measured as the number of percentage points (97 per cent = 97); we could have measured that independent variable as a fraction (97 per cent = 0.97) and the coefficient would be one hundred times its current value. Changing the scale of measurement changes the coefficients, making it very difficult to work out which independent variable has a greater or lesser contribution.

The better way to discover which independent variable has the greatest influence is to remove the effects of the units of measurement altogether by using a standardised measurement. If we rescale all of the independent and dependent variables so that they have a mean of zero and a standard deviation of one, and then run the regression again, we get standardised coefficients – more often called 'standardised betas' or simply 'betas'. In Table 15.6 we can see that promotion has the highest beta (0.611), so it must have the greatest influence. Another tip is that it also has the highest t-statistic and the lowest (most significant) p-value.

The standardised betas also have a very special role: they are partial correlations. The betas can be interpreted as the correlation between each independent variable and the dependent variable when the effects of all the other independent variables are removed. So the partial correlation between promotion and sales, with the effects of distribution and price removed, is 0.61. This compares with a 'zero-order' correlation coefficient of 0.54 between promotion and sales, as shown in Table 15.7. What has caused the difference between the ordinary correlation and the partial correlation between promotion and sales? Examination of the correlation table shows that there is a low, but possibly important, correlation between promotion and price. (Maybe this company is providing more promotion in those regions where it has the higher price, or perhaps higher expenditure on promotion creates not only increased demand but also greater price inelasticity.) This phenomenon, known as multicollinearity, means that some of the information contained in promotion is also contained in price. Such correlation among regressors is very common in field research, especially in market and social research, and makes it difficult to completely separate the effects of one independent variable from another.

TABLE 15.7 » CORRELATION MATRIX

               Promotion   Distribution   Price    Sales
Promotion      1.000
Distribution   −0.003      1.000
Price          0.200       0.034          1.000
Sales          0.537       0.330          −0.230   1.000

PROBLEMS WITH MULTIPLE REGRESSION

A continuous, interval-scaled dependent variable is required in multiple regression, as in bivariate regression. Interval scaling is also required for the independent variables; however, dummy variables may be used. A dummy variable is a variable that has two distinct levels that are usually coded as 0 and 1.

An underlying assumption of multiple regression is that all the independent variables are independent of each other. That is, the independent variables should be uncorrelated with each other. Unfortunately, this is rarely the case. Each independent variable is usually correlated with the other independent variables.


multicollinearity A problem in multiple regression when the independent variables are correlated with each other, causing the parameter estimates to be unreliable.

Often, when the correlations are not high, this is not a big problem. Also, when the researcher is most interested in forecasting values rather than explaining results, collinearity among the regressors is rarely a problem. However, when the researcher wants to know which variables have the greatest effect, and by how much, then such 'multicollinearity' can be a very difficult problem.

Another regression example is useful for illustrating the effects of multicollinearity. Assume that a toy manufacturer wishes to forecast sales by sales territory. It is thought that competitors' sales, the presence or absence of a company salesperson in the territory (a binary variable) and primary school enrolment are the independent variables that might explain the variation in sales.


The data appear in Table 15.8, and Table 15.9 presents the results of the multiple regression:

Regression equation: Y = 102.18 + 0.387X1 + 115.2X2 + 6.73X3
Coefficient of multiple determination (R-squared): 0.85

The regression equation indicates that sales are positively related to X1, X2 and X3. The coefficients (bs) show the effect on the dependent variable of a 1-unit increase in any of the independent variables. The value b2 = 115.2 indicates that an increase of $115 200 in toy sales is expected with each additional unit of X2 (sales are measured in '000s). Thus, it appears that adding a company salesperson has a very positive effect on sales. Primary school enrolments also may help predict sales: an increase of 1 unit of enrolment (1000 students) indicates a sales increase of $6730. A 1-unit increase in competitors' sales volume (X1) in the territory adds little to the toy manufacturer's sales ($387).

TABLE 15.8 » DATA FOR A MULTIPLE REGRESSION PROBLEM

Y sales ('000s)   X1 competitors' sales ('000s)   X2 salesperson (1) or agent (0)   X3 primary school enrolment ('000s)
222               106                             0
304               213                             0                                 18
218               201                             0                                 22
501               378                             1                                 20
542               488                             0                                 21
790               509                             1                                 31
523               644                             0                                 17
667               888                             1                                 25
700               941                             1                                 32
869               1066                            1                                 36
444               307                             0                                 30
479               312                             1                                 22

TABLE 15.9 » MULTIPLE REGRESSION OUTPUT

Regression statistics
Multiple R          0.919
R-square            0.845
Adjusted R-square   0.787
Std. error          96.728
Observations        12

ANOVA
             d.f.   SS        MS        F      Sig.-F
Regression   3      408 805   136 268   14.6   0.001
Residual     8      74 850    9356
Total        11     483 655

Regression equation
                                       b         Std. error   t-stat   p-value
Intercept                              102.181   127.98       0.8      0.45
X1 competitors' sales ('000s)          0.387     0.12         3.11     0.01
X2 salesperson (1) or agent (0)        115.202   71.2         1.62     0.14
X3 primary school enrolment ('000s)    6.732     6.06         1.11     0.30

The correlation between Y and X1 (competitors' sales), with the effects of X2 (salesperson) and X3 (school enrolments) removed from both Y and X1, is the partial correlation. Because the partial correlation between sales and X1 has been adjusted for the effect produced by variation in X2 (and the other independent variables), the correlation coefficient obtained from a bivariate regression will not be the same as the partial coefficient in the multiple regression. In other words, the original value of b1 is the simple bivariate regression coefficient; in multiple regression, the coefficient b1 is the partial regression coefficient, for which the effects of the other independent variables are held constant.

When independent variables are highly correlated with each other, the parameter estimates in multiple regression become highly unreliable. This multicollinearity effect comes about because the information contained in one independent variable is also contained in another independent variable. That means the effect of one independent variable on the dependent variable becomes confused with the effects of the other independent variable. Table 15.10 shows the correlation matrix for all four of the variables in the example. Examination of the correlation matrix shows that all of the independent variables are moderately correlated with each other. If one of the independent variables is removed from the regression analysis, the parameter estimates will change. So how do we decide what the 'true' value for the parameter estimates should be? Unfortunately, there is no clear answer. Checks, such as removing some of the data or changing the set of independent variables, can give indications of how sensitive the parameter estimates are to the multicollinearity problem.

Multicollinearity is a problem when we need to explain the relationship between the dependent and independent variables; however, in many practical cases we don't really need to explain a relationship: we want only to predict the level of the dependent variable as a function of the independent variables. In such cases, multicollinearity is not really a problem, and non-significant parameter estimates are also not a concern – the researcher is simply interested in the amount of extra variance explained with the introduction of a new independent variable.

TABLE 15.10 » CORRELATION MATRIX

                                  Y      X1     X2     X3
Y sales                           1
X1 competitors' sales             0.86   1
X2 salesperson (1) or agent (0)   0.73   0.59   1
X3 primary school enrolment       0.68   0.58   0.5    1


The value of R-squared (0.845) in our example tells us that the independent variables accounted for about 85 per cent of the variance in the dependent variable. Typically, introducing additional independent variables into the regression equation explains more of the variation in Y than can be explained with fewer variables. In other words, the amount of variation explained by two independent variables in the same equation is usually greater than the variation in Y explained by either one separately. The effects on parameter estimates, significance levels and explained variance are illustrated in Table 15.11, where the same regression is repeated with one variable at a time removed from the analysis. We can see that sometimes the parameter estimates change dramatically. Sometimes an important regressor appears not to be statistically significant. In extreme cases a coefficient may even be negative when it would otherwise be positive, and vice versa. It is important, then, to check the reasonableness of your parameter estimates and to compare the standardised betas (bs) with the correlations to see if multicollinearity may be a problem, and then consider whether it matters a great deal.

TABLE 15.11 » EFFECTS OF MULTICOLLINEARITY ON COEFFICIENTS AND SIGNIFICANCE LEVELS WHEN OTHER INDEPENDENT VARIABLES ARE REMOVED

Variables included    X1 X2 X3            X1 X2               X2 X3               X1 X3
Multiple R            0.92                0.91                0.81                0.89
R-squared             0.85                0.82                0.66                0.79
Adjusted R-squared    0.79                0.78                0.58                0.75
Std. error            96.73               97.98               135.54              105.06

                      b         p         b         p         b         p         b         p
Intercept             102.18    0.45      231       0.002     63.19     0.731     57.16     0.683
X1                    0.39      0.01      0.44      0.004     –         –         0.47      0.004
X2                    115.20    0.14      134.69    0.086     208.73    0.046     –         –
X3                    6.73      0.30      –         –         14.30     0.099     9.14      0.185

There are several other assumptions needed for multiple regression (and other multivariate techniques), all of which require advanced study. Several excellent technical books deal with this topic.3 The growing availability of commercial computer programs allows the researcher to compute multiple regressions without a great deal of effort – or thought. Once again, managers should be aware of the limitations of the techniques they use and the importance of having some clear ideas of the nature of the relationships they expect to see in any analysis.

REAL WORLD SNAPSHOT

TOO MUCH OF A GOOD THING!

Researchers often test hypotheses by examining regression coefficients. We are looking for correlations but sometimes in all the wrong places. Financial data can be problematic to analyse. Consider the case of a financial manager trying to analyse gross margin per employee (dependent variable) using the following independent variables:
»» average sales per square metre per quarter
»» average labour costs per week
»» years of experience for the manager
»» job performance rating for the previous year (100-point scale).


Regression can be conducted in SPSS by clicking on ANALYZE, REGRESSION and then LINEAR. The VIF (variance inflation factors) column must be requested by clicking on STATISTICS and then checking COLLINEARITY DIAGNOSTICS. After doing so, the following results are obtained.

ANOVA
Model 1      Sum of squares   d.f.   Mean square   F       Sig.
Regression   142 566.5332     4      35 641.6333   13.57   0.0000008
Residual     91 934.43848     35     2626.698242
Total        234 500.9717     39

The F of 13.57 is highly significant (<0.001), so the variables together explain a large portion of the variance in the dependent variable. The model R-squared of 0.61 also supports this conclusion. The results for the individual independent variables show that, even though the model results appear strong, only one independent variable (sales) is statistically significant at the 95 per cent confidence level. Moreover, the standardised beta coefficients do not make sense. The betas for both sales and labour are beyond the range that a standardised beta should theoretically take (−1.0 to 1.0) – nothing can be correlated with something more than perfectly (which would be a correlation of 1.0 or −1.0). Notice also that the two VIF factors for sales and labour are in the 50s. Generally, when multiple VIF factors approach 5 or greater, problems with multicollinearity can be expected. In this case, the two variables 'average sales per square metre per quarter' and 'average labour costs per week' are so highly correlated that the regression program cannot separate them.

Coefficients(a)
Model 1        Unstandardised coefficients   Standardised coefficients   t       Sig.    VIF
               B          Std. error         Beta
(Constant)     171.242    235.937                                        0.73    0.472
Sales          0.091      0.0308             2.340                       2.94    0.006   56.4
Labour         −0.070     0.035              −1.588                      −2.01   0.053   55.9
Experience     −0.488     0.956              −0.054                      −0.51   0.613   1.0
Performance    −1.856     3.034              −0.069                      −0.61   0.545   1.1

a. Dependent variable: margin.

As often occurs with financial data, variables like these can be difficult to use as independent variables. In this case, the researcher may wish to rerun the model after dropping one of the offending variables.
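The same diagnostics are available outside SPSS. Below is a minimal sketch using statsmodels' variance_inflation_factor; the data are simulated (with 'sales' and 'labour' deliberately built to be nearly collinear) purely so the code runs, and they are not the store figures analysed above.

```python
# A minimal sketch of computing VIFs in Python; the data are simulated for illustration only.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(1)
sales = rng.normal(1000, 200, 40)
store = pd.DataFrame({
    'sales': sales,
    'labour': 0.6 * sales + rng.normal(0, 10, 40),   # almost a linear function of sales
    'experience': rng.normal(10, 3, 40),
    'performance': rng.normal(70, 10, 40),
})

X = sm.add_constant(store)
for i, name in enumerate(X.columns):
    if name != 'const':
        print(name, round(variance_inflation_factor(X.values, i), 1))   # large VIFs flag collinearity
```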

Binary logistic regression

Binary logistic regression is very similar to multiple regression: the independent variables are interval or ratio scaled, but the dependent variable is a simple binary measure: yes/no, buy or not buy, and so on. Many events occur as a sort of step function, or tipping point, where at one moment we move from stable to unstable, a person moves from indecision to decision, from non-purchase to purchase, and so on. Such a step function is conveniently estimated using the logistic curve, an S-shaped cumulative distribution function, as illustrated in Exhibit 15.3. When Z is negative, the value of the dependent variable, Y, is close to zero; at positive scores for Z, Y jumps up to a value close to one; and as Z approaches and then passes zero, Y makes the transition. The task, then, is to find a linear combination of independent variables to estimate the values of Z so that we get the best estimates for the binary dependent variable, Y.


Like a multiple regression formula, the logistic regression function is given by:

Zi = b1X1i + b2X2i + … + bnXni

binary logistic regression Establishes a rule for forecasting the value of a binary dependent variable from a combination of two or more metric independent variables.

where Zi = the score for the ith observation, bn = the coefficient for the nth variable and Xni = the ith value on the nth independent variable. The output from a binary logistic regression model is interpreted in the same way as for multiple linear regression. A 'goodness of fit' statistic, such as a pseudo-R-squared, is even included.

EXHIBIT 15.3 → THE LOGISTIC CURVE
[Line graph of the S-shaped logistic curve: probability (0 to 1) on the vertical axis against Z score (−4 to 4) on the horizontal axis.]
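A short sketch of the logistic curve and of fitting a binary logistic regression follows; the 0/1 outcomes are simulated so the code runs, standing in for a real buy / not-buy variable.

```python
# A minimal sketch: the S-shaped logistic curve, then a binary logistic regression
# fitted with statsmodels (simulated data; a real study would use observed outcomes).
import numpy as np
import statsmodels.api as sm

z = np.linspace(-4, 4, 9)
print(np.round(1 / (1 + np.exp(-z)), 3))   # near 0 for negative Z, near 1 for positive Z

rng = np.random.default_rng(2)
x = rng.normal(size=(200, 2))                                   # two metric independent variables
p = 1 / (1 + np.exp(-(0.8 * x[:, 0] + 0.5 * x[:, 1])))          # underlying probabilities
y = (rng.random(200) < p).astype(int)                           # binary dependent variable

result = sm.Logit(y, sm.add_constant(x)).fit()
print(result.summary())   # coefficients plus a pseudo R-squared 'goodness of fit'
```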

Table 15.12 summarises the techniques just discussed to help you decide which is most appropriate for your data and research question.

TABLE 15.12 » SUMMARY OF MULTIVARIATE TECHNIQUES FOR ANALYSIS OF DEPENDENCE

Technique                    Purpose                                                                                                                Dependent variables   Independent variables   Dependent measurement   Independent measurement
n-way cross-tabulation       To remove the effects of a third (or more) variable on the relationship between two nominal-scaled variables           1                     2 or more               Nominal                  Nominal
Partial correlation          To remove the effects of a third (or more) variable on the relationship between two interval or ratio-scaled variables 1                     2 or more               Interval                 Interval
n-way analysis of variance   To uncover the effect of several categorical variables on an interval-scaled dependent variable                        1                     2 or more               Interval                 Nominal
Multiple regression          To investigate simultaneously the effects of several independent variables on a dependent variable                     1                     2 or more               Interval                 Interval
Logistic regression          To predict the probability that an object or individual will belong in one of two mutually exclusive categories,       1                     2 or more               Nominal                  Interval
                             based on several independent variables


ANALYSIS OF INTERDEPENDENCE

We now turn our attention to the analysis of interdependence. Rather than attempting to predict a variable or set of variables from a set of independent variables, we use techniques such as factor analysis, cluster analysis and multidimensional scaling to better understand the structure of a set of variables or objects.

← EXHIBIT 15.4 MULTIVARIATE ANALYSIS: CLASSIFICATION OF INTERDEPENDENCE METHODS
[Flow chart: interdependence methods are classified by whether the inputs are metric. Metric inputs: factor analysis, cluster analysis, metric multidimensional scaling. Nonmetric inputs: nonmetric multidimensional scaling.]


Exploratory factor analysis

Suppose we measure the height, weight, occupation, education and income for 50 men. The results of a factor analysis might indicate that height and weight may be summarised by the underlying dimension of size. The variables occupation, education and income may be summarised by the underlying concept of social status. Thus, we have reduced five variables down to two new general variables, or factors. The purpose of factor analysis is to summarise the information contained in a large number of variables into a smaller number of factors. Factor analysis refers to a number of diverse techniques used to discern the underlying dimensions or regularity in phenomena.4 If a researcher has a set of interrelated variables, he or she may use factor analysis to untangle the linear relationships into separate patterns. For example, suppose a researcher collects questionnaire data on retailer managers' attitudes towards a distributor. Questions concern delivery, pricing arrangements, discounts, sales personnel, repair service and other relevant issues. The researcher wants to reduce the large number of variables to certain underlying constructs, or dimensions, that will summarise the important information contained in the variables. Factor analysis accomplishes this by combining the questions to create new, more abstract variables called factors. In general, the goal of factor analysis is parsimony – to reduce a large number of variables to as few dimensions, or constructs, as possible.

Most exploratory factor analysis solutions derive factors that are orthogonal. That is, the factors have zero correlation with each other. Each factor then can be considered to be independent of, and unrelated to, each of the other factors.

exploratory factor analysis A type of analysis used to discern the underlying dimensions or regularity in phenomena. Its general purpose is to summarise the information contained in a large number of variables into a smaller number of factors.


Solutions to factor analysis problems may be portrayed by geometrically plotting the values of each variable for all respondents or observations. Geometric axes may be drawn to represent each factor. New solutions are represented geometrically by rotation of these axes. Hence, a new solution with fewer or more factors is called a rotation.

INTERPRETING FACTOR RESULTS

The use of factor analysis to reduce a large number of variables to a few interpretable dimensions is illustrated in the following consumer behaviour example on consumer involvement. Data were gathered from 60 undergraduate students on their responses to an advertisement for a brand of exercise equipment. Ten questions from Zaichkowsky's revised personal involvement inventory5 were asked of each student. Investigators wanted to learn if those ten different aspects of personal involvement could be simplified into a smaller number of overall concepts. Based on the factor analysis reported in Table 15.13, they uncovered two separate underlying dimensions of involvement: factor #1 – important, relevant, means a lot to me, valuable and needed; factor #2 – interesting, exciting, appealing, fascinating and involving. Based on the variables that loaded highly on each factor, the factors could be interpreted as representing a cognitive dimension (beliefs and thoughts) and an affective dimension (feelings).

Factor loadings The factor loadings in Table 15.13 are analogous to the correlations of the original variables with the factor. Each factor loading is a measure of the importance of the variable in measuring each factor. The statement ‘This message is IMPORTANT to me’ has a high factor loading (0.879) on factor 1 and a relatively low loading on factor 2. Inspection of the table indicates that for each of the variables loadings are much higher on one factor than on the other factor. Factor loadings provide a means for interpreting and labelling the factors.

Factor scores Factor analysis procedures derive factor scores, which represent each observation’s calculated value, or score, on each factor. Factor scores then are similar to unstandardised coefficients in regression. The factor scores may be used in subsequent analysis. When the factors are to represent a new set of variables that may predict or be dependent on some phenomenon, the new input may be factor scores.

Eigenvalues Eigenvalues are the sums of the squared factor loadings for each factor. Recall that the sum of squared correlations is much the same as R-squared in multiple regression. In the context of factor analysis, an eigenvalue can be interpreted as the number of variables' worth of information contained in each factor. So, with 10 variables, an eigenvalue of 3.81 suggests that 3.81/10 = 0.381, or 38 per cent, of the information in all of the variables is captured, or summarised, by that factor.

Total variance explained Along with the factor loadings, Table 15.13 gives the percentage of total variance of all of the original variables explained by each factor. Factor 1 summarises 38 per cent of the variance, and factor 2 summarises 37 per cent of the variance. Together the two factors summarise 75 per cent of the total variance in all 10 variables. This explanation of variance is similar to R-squared in multiple regression.

Communalities These are the sum of the squared factor loadings for each variable. Like R-squared, communality is a measure of the percentage of variance in each variable explained by all of the factors. A relatively high communality indicates that a variable is well represented by the results of the factor analysis.


Recall that one of the underlying assumptions of multiple regression is that the independent variables are uncorrelated with each other. In addition to reducing a large number of variables to a manageable number of dimensions, factor analysis may reduce the problem of multicollinearity in multiple regression. If several independent variables are highly correlated, conducting a factor analysis as a preliminary step prior to regression analysis, and using factor scores in place of the original variables, may reduce the problem of having several intercorrelated independent variables. Thus, factor analysis may be used to meet the statistical assumptions of various models.

TABLE 15.13 » FACTOR ANALYSIS WITH VARIMAX ROTATION – 10 PERSONAL INVOLVEMENT VARIABLES6

                       Factor loadings
                       Factor #1              Factor #2
                       Cognitive dimension    Affective dimension    Communalities
Important              0.879                  0.097                  0.78
Relevant               0.896                  0.198                  0.84
Means a lot to me      0.849                  0.110                  0.73
Valuable               0.761                  0.142                  0.60
Needed                 0.850                  0.207                  0.77
Interesting            0.369                  0.733                  0.67
Exciting               0.164                  0.835                  0.73
Appealing              0.015                  0.853                  0.73
Fascinating            0.129                  0.896                  0.82
Involving              0.178                  0.888                  0.82
Eigenvalues            3.81                   3.68                   7.49
% variance explained   38                     37                     75

HOW MANY FACTORS?

This discussion has concentrated on summarising the patterns in the variables with a reduced number of factors. The question arises: 'How many factors will be in the problem's solution?' This question is complex because there can be more than one possible solution to any factor analysis problem, depending on factor rotation. The default on most statistical packages is to keep extracting new factors until no new factor captures as much variation in the data as a single variable does – that is, factors are retained only while each new factor, before rotation, has an eigenvalue greater than one. The default is a useful starting point, but the researcher should also consider issues such as the total amount of variance explained, the amount of information in each variable explained (the communalities) and, of course, the interpretation of the factor solutions. If the factor solutions are more easily interpreted with more or fewer factors, then choose that easier solution.

Cluster analysis

Cluster analysis refers to a body of techniques used to identify objects or individuals that are similar with respect to one criterion or several criteria. Where exploratory factor analysis is designed to discover natural groups of variables, cluster analysis is designed to discover natural groups of cases. The purpose of cluster analysis is to classify individuals or objects into a small number of mutually exclusive and exhaustive groups. The researcher seeks to determine how objects or individuals should be assigned to groups to ensure that there will be as much similarity within groups, and as much difference between groups, as possible. The clusters should have high internal (within-cluster) homogeneity and high external (between-cluster) heterogeneity.

cluster analysis A body of techniques for classifying individuals or objects into a small number of mutually exclusive groups, ensuring that there will be as much likeness within groups and as much difference among groups as possible.

A typical use of cluster analysis is to facilitate market segmentation by identifying individuals who have similar needs, lifestyles or responses to marketing strategies. For example, clusters, or subgroups, of 4WD vehicle owners may be identified on the basis of their similarity with respect to 4WD vehicle usage and the benefits they want from 4WD vehicles. Alternatively, the researcher might use demographic or lifestyle variables to group individuals into clusters identified as market segments. The Roy Morgan Helix Personas we saw at the beginning of this chapter were all identified using cluster analysis.

There are two major types of cluster analysis: hierarchical cluster analysis and K-means cluster analysis. In hierarchical cluster analysis, all the objects start as separate clusters on their own; the two clusters that are 'closest', or most similar to each other, are then joined together to make a new cluster, then the next two closest clusters are joined, and so on until all clusters are joined together. A graphical representation of this progressive agglomeration, called a dendrogram, helps the researcher decide how many clusters, or market segments, are appropriate.

We will illustrate hierarchical cluster analysis through an example relating to the preferences of 15 individuals for different features of outdoor and adventure tourism. Each person, attending a tourist resort in the mountain ranges of southeast Queensland, was asked a number of questions designed to capture their attitudes to, and expectations of, such resorts. The questions were designed to capture three factors: (1) preference for adventure (canoeing, abseiling etc.), (2) interest in ecology and concern for the environment, and (3) expectations of comfort (room service, air-conditioning etc.). The results appear in Table 15.14. It's difficult to see which people are similar to one another from the raw data.

TABLE 15.14 » ADVENTURE TOURISM DATA FOR CLUSTER ANALYSIS7

ID   Adventure   Ecology   Comfort
A    3.9         4.9       3.4
B    1.1         2.6       4.2
C    5.2         2.8       6.0
D    3.0         1.3       1.7
E    1.3         1.9       2.4
F    1.4         1.9       5.6
G    2.8         5.9       3.9
H    1.3         5.5       5.0
I    6.4         6.2       6.8
J    4.4         5.4       2.4
K    6.4         4.6       5.7
L    1.7         4.7       2.0
M    1.1         2.0       2.7
N    5.2         5.3       6.9
O    5.0         6.0       2.4
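For readers who want to reproduce the analysis, here is a minimal sketch using SciPy's Ward-linkage clustering on the Table 15.14 scores; cutting the tree at three clusters should recover groupings like the segments discussed below.

```python
# A minimal sketch of hierarchical (Ward) clustering on the Table 15.14 data.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Rows are respondents A to O; columns are adventure, ecology, comfort.
scores = np.array([
    [3.9, 4.9, 3.4], [1.1, 2.6, 4.2], [5.2, 2.8, 6.0], [3.0, 1.3, 1.7], [1.3, 1.9, 2.4],
    [1.4, 1.9, 5.6], [2.8, 5.9, 3.9], [1.3, 5.5, 5.0], [6.4, 6.2, 6.8], [4.4, 5.4, 2.4],
    [6.4, 4.6, 5.7], [1.7, 4.7, 2.0], [1.1, 2.0, 2.7], [5.2, 5.3, 6.9], [5.0, 6.0, 2.4],
])

links = linkage(scores, method='ward')                  # the agglomeration behind the dendrogram
labels = fcluster(links, t=3, criterion='maxclust')     # cut the tree into three segments
print(dict(zip('ABCDEFGHIJKLMNO', labels)))
```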

However, hierarchical cluster analysis shows that there are three distinct groups of people. The dendrogram in Exhibit 15.5 illustrates the derivation of the three groups very clearly, and a three-dimensional scatterplot of the 15 people in the space of the three factors also nicely illustrates the location of the three market segments (see Exhibit 15.6). We can see that Cluster #1 is made up of respondents A, G, H, J, L and O. This group has a slightly lower-than-average interest in adventure, a relatively higher concern for ecology and is not worried too much about creature comforts.


← EXHIBIT 15.5 DENDROGRAM OF HIERARCHICAL CLUSTER ANALYSIS
[Dendrogram (Ward linkage, squared Euclidean distance): linkage distance on the vertical axis against the 15 respondents on the horizontal axis, joining into three main branches.]

Cluster #2 is made up of respondents B, D, E, F and M. This segment has no desire for adventure activities, is unconcerned with ecology and the environment, and is not worried about luxuries. They seem to just want to get away from it all and escape for their holiday. Cluster #3 is made up of respondents C, I, K and N. Members of this segment are the most interested in adventure activities and the most concerned about a comfortable stay, and they have about the same high level of interest in ecology and the environment as the members of Cluster #1. An analysis of variance of the three clusters against the three factors enabled these descriptions.

In this example, individuals are grouped on the basis of their similarity, or proximity, to one another. The logic of cluster analysis is to group individuals or objects by their similarity to, or distance from, each other. In this simple three-dimensional example, we can easily see the natural groupings. The task for statistical routines is to derive a calculus for measuring the distances between objects and then to establish rules for combining close objects into groups separate from other groups. It gets more complicated in multiple dimensions.

← EXHIBIT 15.6 3D SCATTERPLOT OF CLUSTER ANALYSIS DATA
[Three-dimensional scatterplot of adventure vs ecology vs comfort (each axis roughly 2 to 6), showing the 15 respondents falling into three separate groupings.]


This example should help to clarify the difference between factor analysis and cluster analysis. In factor analysis the researcher might search for constructs that underlie the variables (e.g. population, retail sales, number of retail outlets); in cluster analysis the researcher would seek constructs that underlie the objects. Cluster analysis differs from multiple discriminant analysis in that the groups are not predefined: the purpose of cluster analysis is to determine how many groups really exist and to define their composition. It describes the objects in the sample at hand; it does not predict relationships.

Multidimensional scaling

Multidimensional scaling provides a means for measuring objects in multidimensional space on the basis of measures of the similarity of objects. The perceptual differences among objects are reflected in their relative distances in the multidimensional space. Traditionally, attitudes have been measured by using a scale for each component of an attitude and then combining the individual scores into an aggregate score. In a common form of multidimensional scaling, subjects are asked to evaluate an object's similarity to other objects. For example, a prestige car study may ask respondents to rate the similarity of a BMW to a Mercedes, and also to a Jaguar and other vehicles. The analyst then attempts to explain the differences among objects on the basis of the components of attitudes. The unfolding of the attitude components helps explain why objects are judged to be similar or dissimilar, and therefore which objects are really competitors in the minds of consumers. Computer algorithms are also able to construct similarity distances on the basis of data on attitudes or behaviours. As a simple example, we can take the published road distances, in kilometres, between major cities in Australia, as shown in Table 15.15.

TABLE 15.15 » ROAD DISTANCES BETWEEN MAJOR AUSTRALIAN CITIES

City            Adelaide   Alice Springs   Brisbane   Cairns   Canberra   Darwin   Melbourne   Perth   Sydney
Adelaide        –
Alice Springs   1533       –
Brisbane        2044       3100            –
Cairns          3143       2500            1718       –
Canberra        1204       2680            1268       2922     –
Darwin          3042       1489            3415       3100     3917       –
Melbourne       728        2270            1669       3387     647        4045     –
Perth           2725       3630            4384       5954     3911       4250     3430        –
Sydney          1427       2850            1010       2730     288        3991     963         4110    –
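As a sketch, a subset of the Table 15.15 distances (six of the nine cities) can be fed to scikit-learn's MDS routine to produce a map like Exhibit 15.7; the orientation of the result is arbitrary.

```python
# A minimal sketch of metric multidimensional scaling on six cities from Table 15.15.
import numpy as np
from sklearn.manifold import MDS

cities = ['Adelaide', 'Brisbane', 'Canberra', 'Melbourne', 'Perth', 'Sydney']
distances = np.array([
    [0,    2044, 1204, 728,  2725, 1427],
    [2044, 0,    1268, 1669, 4384, 1010],
    [1204, 1268, 0,    647,  3911, 288],
    [728,  1669, 647,  0,    3430, 963],
    [2725, 4384, 3911, 3430, 0,    4110],
    [1427, 1010, 288,  963,  4110, 0],
])

mds = MDS(n_components=2, dissimilarity='precomputed', random_state=0)
coords = mds.fit_transform(distances)          # two coordinates per city, as in Exhibit 15.7
for city, (x, y) in zip(cities, coords):
    print(f'{city:10s} {x:8.0f} {y:8.0f}')
```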

Multidimensional scaling cleverly transforms these distances into a map, as in Exhibit 15.7. We can see in the map that the relative positions of each of the cities are extraordinarily well captured, but that orientation (north–south, east–west) is ignored.

multidimensional scaling A statistical technique that locates objects in multidimensional space on the basis of measures of the similarity of objects.

perceptual map An application of multidimensional scaling to show graphically how objects are perceived by consumers.

In another example, Table 15.16 shows a sample of concurrent purchases made at an up-market menswear store in Germany. In the week under consideration, customers purchased an expensive suit and an expensive traditional shirt at the same time on 18 occasions. On 28 occasions, customers purchased two expensive suits at the same time. It's a fairly complicated matrix. If we subject this matrix of joint occurrences to multidimensional scaling, we see the perceptual map shown in Exhibit 15.8. A perceptual map shows graphically how objects are perceived by consumers relative to other objects.


← EXHIBIT 15.7 MULTIDIMENSIONAL SCALING SOLUTION TO INTER-CITY ROAD DISTANCES
[Two-dimensional map (Dimension 1 by Dimension 2) locating Perth, Adelaide, Melbourne, Canberra, Sydney, Brisbane, Cairns, Alice Springs and Darwin so that their relative positions reflect the road distances.]

TABLE 15.16 » JOINT PURCHASES IN UP-MARKET MENSWEAR STORE IN GERMANY8

   Product category        1    2    3    4    5    6    7    8    9    10   11   12   13   14
1  Expensive suit          28   18   13   6    10   2    2    3    17   10   24   25   1    0
2  Expensive trad. shirt   18   68   17   8    25   23   27   9    46   0    21   10   14   0
3  Expensive tie           13   17   0    0    10   0    6    10   22   4    3    23   3    0
4  Cheap tie               6    8    0    13   0    15   22   25   24   8    18   9    33   46
5  Imported shirt          10   25   10   0    20   3    6    25   5    1    7    5    0    16
6  Stylish shirt           2    23   0    15   3    0    13   13   109  48   281  43   3    0
7  Jeans                   2    27   6    22   6    13   26   26   222  146  197  167  12   18
8  Modern pants            3    9    10   25   25   13   26   57   275  88   117  146  21   67
9  Medium-priced shirt     17   46   22   24   5    109  222  275  487  57   178  46   87   12
10 Cheap suit              10   0    4    8    1    48   146  88   57   109  8    8    42   19
11 Cheap shirt             24   21   3    18   7    281  197  117  178  8    273  46   15   20
12 Cheap knitwear          25   10   23   9    5    43   167  146  46   8    46   110  14   24
13 Coloured socks          1    14   3    33   0    3    12   21   87   42   15   14   508  45
14 Modern jacket           0    0    0    46   16   0    18   67   12   19   20   24   45   88

We can see from the relative proximity of the objects in Exhibit 15.8 that certain products are frequently purchased together. Some we would expect to see, such as the joint purchase of expensive or imported shirts, ties, and suits to make up a complete ensemble. Others may be unexpected, such as jeans and medium-priced shirts, and modern pants with knitwear (sweaters). Information like this may help managers and floor staff to better upsell with complementary items.


The labelling of the dimension axes is a task of interpretation for the researcher and is not statistically determined. As with other multivariate techniques in the analysis of interdependence, there are several alternative mathematical techniques for multidimensional scaling, depending on the nature of the data.

EXHIBIT 15.8 → MULTIDIMENSIONAL SCALING SOLUTION TO JOINT PURCHASES OF UPMARKET MENSWEAR STORE
[Two-dimensional perceptual map (Dimension 1 by Dimension 2) locating the 14 product categories; items frequently purchased together, such as the expensive suit, expensive tie and expensive traditional shirt, appear close together.]

TABLE 15.17 » SUMMARY OF MULTIVARIATE TECHNIQUES FOR ANALYSIS OF INTERDEPENDENCE

Technique                     Purpose                                                                                                                                                     Type of measurement
Exploratory factor analysis   To summarise into a reduced number of factors the information contained in a large number of variables                                                      Interval
Cluster analysis              To classify individuals or objects into a small number of mutually exclusive and exhaustive groups, ensuring that there will be as much likeness within     Interval
                              groups and as much difference among groups as possible
Multidimensional scaling      To measure objects in multidimensional space on the basis of respondents' judgements of their similarity                                                    Varies depending on technique

TIPS OF THE TRADE

»» Dependence techniques are useful for predicting specific outcomes such as performance, profit, sales or value.
»» Interdependence techniques are useful for understanding the structure or dimensionality of data.
»» Measurement scale level, along with the purpose of the research (prediction or understanding the structure of the data), helps to identify the appropriate multivariate data analysis tool for a given situation.
»» Multiple regression results should use independent variables that are not highly correlated with each other.
   • VIFs greater than 5 signify problems with multicollinearity.
   • Multiple VIFs greater than 3 signify potential multicollinearity problems.


»» Factor analysis can be useful in reducing data before conducting a multiple regression analysis. The factor results can help identify and create factors that are not highly related.
   • Varimax rotation is the most frequently applied factor rotation technique.
   • When varimax rotation is used, the researcher can use factor scores to form composite measures for independent variables (constructs) and the resulting measures will be unrelated – meaning no multicollinearity.
   • Factor analysis is useful in providing evidence of measurement validity.
»» Statistical results from any technique (regression, partial least squares structural equation modelling (PLS-SEM) etc.) must be interpreted in light of the overall concern for internal and external validity – including generalisability.

SUMMARY

DISTINGUISH BETWEEN UNIVARIATE ANALYSIS, BIVARIATE ANALYSIS AND MULTIVARIATE ANALYSIS
The investigation of one variable at a time is referred to as univariate analysis. In univariate analysis we consider averages and distributions. Bivariate analysis is the investigation of how two variables may be related to each other. Multivariate statistical analysis methods allow us to investigate the relationships of many variables at the same time.

DISTINGUISH BETWEEN THE TWO BASIC GROUPS IN MULTIVARIATE ANALYSIS: DEPENDENCE METHODS AND INTERDEPENDENCE METHODS
There are two main groups of multivariate statistical techniques. Dependence methods try to explain or predict the value of a dependent variable as a function of changes in the value of two or more independent variables. Interdependence methods try to find groups of variables that seem to say the same thing, or groups of people who are similar to each other.

DISCUSS THE CONCEPT OF MULTIPLE REGRESSION ANALYSIS
Multiple regression is the most well-known multivariate dependency technique. As an extension of simple regression, it seeks to explain the variation in a dependent variable as a linear function of two or more independent variables. All variables must be interval scaled or ratio scaled, but some independent variables may be binary dummy variables (0, 1).

DEFINE THE COEFFICIENT OF PARTIAL REGRESSION
The coefficient of partial regression, or standardised beta, can be interpreted as the correlation between an independent variable and the dependent variable with the effects of all the other independent variables removed.

INTERPRET THE STATISTICAL RESULTS OF MULTIPLE REGRESSION ANALYSIS
When examining the output from multiple regression we look first at the F-statistic. A significant F-statistic shows that the independent variables together explain some of the variation in the dependent variable. The R-squared statistic indicates the proportion of variation in the dependent variable that is explained by the combined effects of the independent variables. The coefficients provide us with a formula for estimating other values of the dependent variable as a function of the independent variables. The significance of those coefficients tells us whether the individual variables really have any influence. If any independent variables are correlated with each other (known as multicollinearity), then their coefficients and their significance can become unreliable. Multicollinearity is a problem when interpreting regression, but it may not be a problem if prediction only is the goal.

DEFINE AND DISCUSS FACTOR ANALYSIS
Exploratory factor analysis is an interdependence method used to summarise into a small number of factors the information contained in a large number of variables. We use factor analysis to group variables that are highly correlated with each other into a single measure.

DEFINE AND DISCUSS CLUSTER ANALYSIS
Where factor analysis finds groups of variables, cluster analysis finds groups of cases. Cluster analysis classifies observations into a small number of mutually exclusive and exhaustive groups; these should have as much similarity within groups and as much difference between groups as possible.

DEFINE AND DISCUSS MULTIDIMENSIONAL SCALING
Another interdependence method is multidimensional scaling. It measures objects in a multidimensional space, based on measures of similarity or dissimilarity, such as respondents' judgements about their similarity or frequency of purchase.

KEY TERMS AND CONCEPTS
binary logistic regression, cluster analysis, dependence methods, exploratory factor analysis, interdependence methods, multicollinearity, multidimensional scaling, multiple regression analysis, multivariate statistical analysis, n-way cross-tabulation, n-way univariate analysis of variance (ANOVA), partial correlation analysis, perceptual map

QUESTIONS FOR REVIEW AND CRITICAL THINKING
1 How do multivariate statistical analysis methods differ from univariate and bivariate methods?
2 What is the distinction between dependence methods and interdependence methods?
3 What is the aim of multiple linear regression? Binary logistic regression?
4 Give an example of a situation in which each of the techniques mentioned in Question 3 might be used.
5 What is the aim of factor analysis? Cluster analysis? Multidimensional scaling?
6 Give an example of a situation in which each of the techniques mentioned in Question 5 might be used.
7 Why have computer software packages increased the use of multivariate analysis?
8 Why might a researcher want to use multivariate rather than univariate or bivariate analysis?
9 A researcher uses multiple regression to predict a client's national sales volume based on historical measures of gross domestic product, average household income, unemployment and the consumer price index. What should the researcher be obliged to tell the client about this multiple regression model?

ONGOING PROJECT RUNNING SOME UNIVARIATE OR BIVARIATE STATISTICS? CONSULT THE CHAPTER 15 PROJECT WORKSHEET FOR HELP

Selecting a test to use is based on answering some simple questions (see the flowchart at the beginning of Part 6) about the hypotheses you are testing, how the variables are measured and how many

variables to include. You then need to interpret the output from the test. These steps can be followed by using the project worksheet available from the CourseMate website. It’s a good idea to know what tests you are going to use before you collect the data (then you can collect the right data).

COURSEMATE ONLINE STUDY TOOLS Flip to the start of your textbook and use the tear-out card to login to CourseMate for Marketing Research. There you can test your understanding and revise chapter concepts with: ☑ interactive quizzes ☑ flashcards

☑ online research activities ☑ search me! activities ☑ videos.


WRITTEN CASE STUDY 15.1 COASTAL STAR SALES CORPORATION

Coastal Star Sales Corporation is a wholesaler based in northern NSW that markets leisure products from several manufacturers. Coastal Star has an 11-person sales team that sells to sporting-goods retailers in New South Wales and Queensland. Exhibit 15.9 shows the names of these 11 salespeople, gives some descriptive information about each person and provides the sales performance for each of the last two years. Use the data in Exhibit 15.9 to answer the following questions.

QUESTIONS
1 Calculate appropriate univariate statistics for each variable.
2 Derive new variables to reflect changes between sales in the current year and the previous year. Which is more meaningful, percentage change or dollar change?
3 Calculate all bivariate regression equations that will predict current sales. Which is the best one?
4 Using multiple regression, find a model that will help explain current sales.
5 Use cluster analysis to see if there are any natural groupings among the salespeople.

EXHIBIT 15.9 → COASTAL STAR SALES CORPORATION: SALESPERSON DATA

| Region | Salesperson | Age | Years of experience | Sales ($), previous year | Sales ($), current year |
| QLD | Jackson | 40 | 7 | 412 744 | 411 007 |
| QLD | Smith | 60 | 12 | 1 491 024 | 1 726 630 |
| QLD | La Forge | 26 | 2 | 301 421 | 700 112 |
| QLD | Miller | 39 | 1 | 401 241 | 471 001 |
| QLD | Mowen | 64 | 5 | 448 160 | 449 261 |
| NSW | Young | 51 | 2 | 518 897 | 519 412 |
| NSW | Fisk | 34 | 1 | 846 222 | 713 333 |
| NSW | Jing | 62 | 10 | 1 527 124 | 2 09 041 |
| NSW | Krieger | 42 | 3 | 921 174 | 1 030 000 |
| NSW | Liu | 64 | 5 | 463 399 | 422 798 |
| NSW | Weiner | 27 | 2 | 548 011 | 422 001 |
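The following is an illustrative Python sketch (not part of the case itself) of how the five questions could be tackled with pandas, statsmodels and scikit-learn. The file name coastal_star.csv, the column names and the choice of two k-means clusters are assumptions made for the sake of the example.

import pandas as pd
import statsmodels.formula.api as smf
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

# Hypothetical file and column names; enter the Exhibit 15.9 figures yourself.
df = pd.read_csv('coastal_star.csv')  # columns: region, salesperson, age, years_experience, sales_prev, sales_curr

# Question 1: univariate statistics for each variable
print(df.describe(include='all'))

# Question 2: change variables (dollar change and percentage change)
df['dollar_change'] = df['sales_curr'] - df['sales_prev']
df['pct_change'] = 100 * df['dollar_change'] / df['sales_prev']

# Question 3: bivariate regressions predicting current sales
for predictor in ['age', 'years_experience', 'sales_prev']:
    model = smf.ols(f'sales_curr ~ {predictor}', data=df).fit()
    print(predictor, 'R-squared =', round(model.rsquared, 3))

# Question 4: multiple regression explaining current sales
multiple = smf.ols('sales_curr ~ age + years_experience + sales_prev', data=df).fit()
print(multiple.summary())

# Question 5: cluster analysis on standardised variables
X = StandardScaler().fit_transform(df[['age', 'years_experience', 'sales_prev', 'sales_curr']])
df['cluster'] = KMeans(n_clusters=2, n_init=10, random_state=1).fit_predict(X)
print(df[['salesperson', 'cluster']])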

NOTES
1 Helix Personas® by Roy Morgan Research. Online: http://www.helixpersonas.com.au
2 Sheth, Jagdish N. (1977) ‘Seven commandments for users of multivariate methods’, in Sheth, J.N. (ed.) Multivariate methods, New York: American Marketing Association, pp. 333–5.
3 For excellent discussions of multivariate analysis, see Myers, J.H. & Mullet, G.M. (2003) Managerial applications of multivariate analysis in marketing, Thomson; Morrison, D.F. (2005) Multivariate statistical methods, 4th edn, Thomson; Lattin, J., Carroll, D. & Green, P. (2003) Analyzing multivariate data, Thomson; Hair, J., Black, B., Babin, B., Anderson, R. & Tatham, R. (2010) Multivariate data analysis, 7th edn, Pearson.
4 The purpose of this section is to discuss factor analysis techniques at an intuitive level. The discussion is not complicated by the various mathematically complex differences among the techniques. An excellent discussion of the mathematical aspects of factor analysis appears in Rummel, R.J. (1967) ‘Understanding factor analysis’, Journal of Conflict Resolution, 11(4), pp. 444–80.
5 Zaichkowsky, J.L. (1994) ‘The personal involvement inventory: Reduction, revision and application to advertising’, Journal of Advertising, 23(4), pp. 59–70.
6 Ten personal involvement inventory questions are seven-point semantic differential scales from Zaichkowsky, J.L. (1994) ‘The personal involvement inventory: Reduction, revision and application to advertising’, Journal of Advertising, 23(4), pp. 59–70.
7 Measures for adventure, ecology and comfort each are the average of five seven-point Likert scales designed to capture the degree to which each respondent is concerned with that aspect of outdoor travel. Constructs were drawn from Weaver, David & Lawton, Laura (2002) ‘Overnight ecotourist market segmentation in the Gold Coast hinterland of Australia’, Journal of Travel Research, 40, February, pp. 270–80.
8 Merkle, E. (1981) ‘Die Erfassung und Nutzung von Informationen über den Sortimentsverbund in Handelsbetrieben’, in Schriften zum Marketing, band 11. (‘The collection and use of information about assortment interrelationships in retail businesses’, in Writings on Marketing, vol. 11.)

PART SEVEN » FORMULATING CONCLUSIONS AND WRITING THE FINAL REPORT
16 » COMMUNICATING RESEARCH RESULTS: RESEARCH REPORT, ORAL PRESENTATION AND RESEARCH FOLLOW-UP

[Chapter-opening concept map: Reporting research. The written report follows the structure 1 Background, 2 Review of current knowledge, 3 Research method, 4 Results, 5 Conclusions, supported by language and style, graphics and communication tools. The oral presentation is shaped by the audience (which affects formality, complexity, length and style), preparation and rehearsal, voice and gestures, and graphic aids.]

WHAT YOU WILL LEARN IN THIS CHAPTER
→ To explain how the research report is the crucial means for communicating the whole research project.
→ To discuss the research report from a communications model perspective.
→ To define the term ‘research report’.
→ To outline the research report format and its parts.
→ To present qualitative research results.
→ To discuss the importance of using graphics in research reporting.
→ To explain how tables and charts are useful for presenting numerical information and how to interpret their various components.
→ To identify the various types of research charts.
→ To discuss how an oral presentation may be the most efficient means of supplementing the written report.

Crucial role for effective reporting
After spending days or weeks working on a project, the researcher is likely to feel that preparation of the report is just a dull formality. After all, the real work has been done; it just has to be put on paper and sent to the client. Treating it so lightly would be a bad idea. The project may have been well designed, the data carefully gathered and analysed by sophisticated statistical methods, and important conclusions reached, but if the project is not effectively reported, all of that effort will be wasted. Often the research report is the only part of the project that others see. If the people who need to use the research results have to wade through a disorganised presentation, are detoured by technical jargon they do not understand, or encounter sloppy language or sloppy thinking, they will probably discount the report and make decisions without it, just as if the project had never been done. Thus, the research report is a crucial means for communicating the whole project – the medium by which the project makes its impact on decisions. This chapter explains how research reports, oral presentations and follow-up conversations help communicate research results.

INSIGHTS FROM THE COMMUNICATIONS MODEL
communication process: The process by which one person or source sends a message to an audience or receiver and then receives feedback about the message.
Some insights from the theory of communications help to clarify the importance of the research report. Exhibit 16.1 illustrates one view of the communication process. Several elements influence successful communication:
1 the communicator – the source or sender of the message (the writer of the report)
2 the message – the set of meanings being sent to or received by the audience (the findings of the research project)
3 the medium – the way in which the message is delivered to the audience (the oral or written report itself)
4 the audience – the receiver or destination of the message (the manager who will make a decision based [we hope] on the report findings)
5 feedback – a communication, also involving a message and channel, that flows in the reverse direction (from the audience to the original communicator) and that may be used to modify subsequent communications (the manager’s response to the report).

← EXHIBIT 16.1 THE COMMUNICATION PROCESS
[Diagram: Who (1 Communicator) → Says what (2 Message) → In what way (3 Medium) → To whom (4 Audience) → With what effect (5 Feedback, which flows back from the original audience through a message and medium to the original communicator).]

This model of communication oversimplifies the case, though. It implies that the message flows smoothly from the writer to the reader, who in turn promptly provides the writer with feedback. It is much more complicated. The communicator and the audience each have individual fields of experience. These overlap to some extent, otherwise no communication would be possible. Nevertheless, there is much experience that is not common to both parties. As communicators send a message, they encode it in terms that make sense to them based on their fields of experience. As the individuals in the audience receive the message, they decode it based on their own fields of experience. The message is successfully communicated only if the parties share enough common experience for it to be encoded, transmitted and decoded with the same meaning. In the research setting, there is a communicator (the researcher) who has spent a great deal of time studying a problem. The researcher has looked at secondary sources, gathered primary data, used statistical techniques to analyse the data, and reached conclusions. When the report on the project is written, all this ‘baggage’ will affect its contents. A researcher may believe that the reader has a lot of background information on the project and produce pages and pages of unexplained tables, assuming the reader will unearth from them the same patterns that the researcher has observed. The report may contain technical terms such as parameter, F-distribution, hypothesis test and correlation, on the assumption that the reader will understand them. Another researcher may assume that the reader does not have a lot of background information and explain everything in terms suitable for a

primary-school child. Although the researcher’s intent is to ensure that the reader will not get lost, he or she may insult the reader in the process. Usually when readers receive a report, they have not thought much about the project. They may not know anything about statistics and they may have many other responsibilities. If the report cannot be understood quickly, they may put it on a stack of things to do one day. Simply delivering a report to its audience is not sufficient to ensure that it gets attention. The report needs to be written so as to draw on the common experience of the researcher and the reader. And it is the writer’s responsibility to make sure that it does so – not the reader’s. Unless a report is really crucial, a busy reader will not spend time and effort struggling through an inadequate or difficult-to-read document.

THE REPORT IN CONTEXT
A research report is an oral presentation and/or written statement, used to communicate research results, strategic recommendations and/or other conclusions to management or other specific audiences. Although this chapter deals primarily with the final written report that an extensive research project requires, remember that the final report usually is not the only report. For a small project, a short oral or written report on the results may be all that is needed. Extensive projects may involve many written documents, interim reports, a long final written report and several oral presentations. In addition, technical materials may be posted on an organisation’s intranet. In practice, many commercial marketing research projects are never written up as a formal document; instead, a series of face-to-face meetings, supported by PowerPoint slides, is all that is formally presented to the client. This chapter emphasises the final report, but other communications may be just as important to the project’s success. The chapter’s suggestions can be easily adapted to apply to shorter, less formal reports.

REPORT FORMAT Although every research report is custom-made for the project it represents, some conventions of report format are universal. They represent a consensus about the parts necessary for a good research report and how they should be ordered. Look at any research report – whether published as a commissioned government report, an academic journal article, a 10-minute PowerPoint presentation or a routine report between company departments – and you will see the same pattern. The body of the report always has the same five components, always presented in the same order. research report An oral presentation or written statement of research results, strategic recommendations and/or other conclusions to a specific audience. report format The make-up or arrangement of parts necessary to a good research report.

Body of the research report
1 Background
2 Current knowledge
3 Research method
4 Results
5 Conclusions


BACKGROUND The background section serves to place the research in context. It explains the nature of the problem, for whom it is being written, what prompted the research and so on. In a routine or simple short report, the background and current knowledge sections may be merged together into one section.

CURRENT KNOWLEDGE The second section is never actually called ‘Current knowledge’, but that is its function. More often, it has the heading ‘Previous research’ or ‘Literature review’. It is a review of what we currently know about the problem, plus previous research on the topic or related topics, including different research techniques that have investigated similar problems. If the report is an academic piece then it may be called a literature review, with different chapters covering theory, methodology, and previous results. A large government report, into the state of an industry for example, may include detailed secondary data as industry statistics. If the report is an internal memo with an update on a routine market share measure, then this section may be simply a reminder of how the research has been conducted in the past. The purpose of this section is to place the current research problem in the context of what we currently know about it and similar problems. After clearly stating what we know about a problem area, then we are in a better position to understand what we do not know. Section two typically concludes with specific statements about what we do not know – the gaps in our knowledge. After all, there is no value in conducting research when we already know the answer. When we clearly say what we do not know then, logically, we can outline specific questions for the research report.

RESEARCH METHOD Section three takes those specific research questions we wrote at the end of Section two, and then details how we went about finding answers to them. In a large study this section may take up two or more chapters. In a routine report this may be simply a single-paragraph reminder of how the data are gathered.

RESULTS
Section four presents the results of undertaking the research method, with data transformed into information suitable for the target readership. This section focuses on answering the research question as concisely as possible, using graphs or tables as appropriate. Be very careful that you present only results that are directly an outcome of your data gathering. It is not appropriate, for example, to say, ‘Brand awareness is only 16 per cent compared with the leading brand’s 30 per cent, so we need to engage in a promotion campaign to improve this figure.’ Here, brand awareness is an outcome of the research, and what to do about it is a call to action, presumably based on the conclusion that 16 per cent is too low. But maybe it is not too low under the circumstances, and maybe a promotion campaign is not the best way to improve the figure. That conclusion and that recommendation may be better given to an executive with experience in the industry, or may highlight a need for further research. The results section should be as objective as possible, presented in such a way that readers can easily draw conclusions themselves. If you do want to make value judgements or a call to action, then do this in the next section.

CONCLUSIONS Section five checks back to see if the information presented has answered the research questions, and points out any other questions left unanswered or that have emerged in the course of the study. Recommendations for action may be included in this section too.


Summary All reports have the same components in the same order. The five components may be only a few lines long in an informal report, or they may take several chapters in formal reports, but they are always the same components, in the same order. In some cases, say in some industry reports, previous research and the research method may be mentioned in only one or two sentences, with the details located in an appendix. If we expand these five sections then we can see the components of a formal survey-based research report, detailed next.

Body of a formal descriptive research report
1 Background
  1.1 Problem symptoms that prompted the research
  1.2 Sponsor of the research or the body for whom the report is written
2 Previous research
  2.1 Previous research on the topic or related topics
  2.2 Theories or models that help our understanding
  2.3 Research techniques available
  2.4 Gaps in our knowledge
3 Research method
  3.1 Specific research question (based on the gaps in our knowledge)
  3.2 Specific steps in answering the question
    3.2.1 Sampling frame
    3.2.2 Instrumentation
    3.2.3 Procedures
4 Results
  4.1 Checks that the sample matches the target population, and that measures are reliable
  4.2 Tables and graphs for each research question
  4.3 Explanation
5 Conclusions
  5.1 Summary of findings (Have we answered the question?)
  5.2 Contribution to our knowledge
  5.3 Limitations of the research
    5.3.1 Sampling frame, scope, time etc.
    5.3.2 Activities that could have been done differently or better
  5.4 Further research
    5.4.1 What we haven’t learned
    5.4.2 Other questions arising from the study
Usually different titles for headings are used, according to the topic under investigation, but these headings show the purpose of each section and subsection. And of course, depending on the nature of the topic or the research method, sub-sections may be removed and others added or expanded.

Two-stage research report – exploratory followed by descriptive Many research projects involve several distinct stages. Some marketing research reports, for example, require a qualitative, exploratory research task designed to find out the range of ideas or general opinions within a target population. A second research stage may be a quantitative, descriptive study


designed to learn to what extent those opinions are represented in the population. Commercial marketing research reports done in two stages would be presented as two distinct reports, each with the same five core components presented above. If the two stages are combined into the one report then we still see the same basic components for each stage:
1 Background
2 Previous research
3 Method
  3.1 Need for exploratory and confirmatory stages
4 Exploratory research
  4.1 Exploratory question
  4.2 Method
    4.2.1 Choice of exploratory research method
    4.2.2 Details of recruitment of participants, criteria for inclusion, location etc.
    4.2.3 Procedures
  4.3 Exploratory results
    4.3.1 Examples of responses, illustrating key findings
    4.3.2 Tables, lists and explanation
  4.4 Conclusions
    4.4.1 Findings that are unexpected or surprising
  4.5 Summary and contribution to the next stage
5 Quantitative research
  5.1 Specific research question arising from the qualitative stage
  5.2 Method
    5.2.1 Sampling frame, instrumentation and procedures
  5.3 Results
    5.3.1 Tables, graphs and discussion
6 Conclusions
Note that the same five core components are still there, appearing twice: once for the exploratory stage and then again for the descriptive stage. Again, most research reports may not have the same headings given here; they will have different names to suit the task.

Additional parts of the report TITLE PAGE The title page should state the title of the report, for whom the report was prepared, by whom it was prepared and the date of release or presentation. The title should give a brief but complete indication of the purpose of the research project. Addresses and titles of the preparer and recipient may also be included. On confidential reports, a list of the people to whom the report should be circulated may be supplied. For the most formal reports, a title fly page should precede the title page; only the title appears on this page.

TABLE OF CONTENTS A table of contents is essential to any report more than a few pages long. It should list the divisions and subdivisions of the report with page references. The table of contents is based on the final outline of the report, but it should include only the first-level subdivisions. For short reports it is sufficient to include only the main divisions. If the report includes many figures or tables, a list of these should immediately follow the table of contents.


If you use styles on your word processor to designate each of your headings, then you can create a table of contents automatically and keep it up to date if you make any changes.

EXECUTIVE SUMMARY The executive summary briefly explains why the research project was conducted, what aspects of the problem were considered, what the outcome was, and what should be done. It is a vital part of the report. Studies show that nearly all managers read a report’s summary, while only a minority reads the rest of the report. So your only chance to produce an impact may be in the summary. The executive summary is not another term for ‘Introduction’. The summary should be written to be self-sufficient; in fact, the summary is often detached from the report and circulated by itself. A manager must be able to read the Summary and understand all that is important in the study; purpose, method, results and recommendations. The summary should be written only after the rest of the report has been completed. It represents the essence of the report. It should be one page long (or, at most, two pages), so the writer must carefully sort out what is important enough to be included in it. Many pages of the full report may have to be condensed into one summarising sentence. Different parts of the report may be condensed more than others; the number of words in the summary need not be in proportion to the length of the section being discussed.

THE BODY The body constitutes the bulk of the report. It contains the basic components described earlier. Anything that you want your reader to see should be included in the body of your report. Up until the late twentieth century, researchers didn’t have word processors and had to place images, graphs and tables in appendices. Now we can place any graph or supporting material immediately next to the first mention of the item in the text. You should never require that your reader shuffle between your text and appendices to find a graph that you refer to.

APPENDICES An appendix presents the ‘too . . .’ material. Any material that is too technical or too detailed to go in the body should appear in the appendix. This includes materials of interest only to some readers or subsidiary materials not directly related to the objectives. Some examples of appendix materials are samples of questionnaires, special calculations or discussions of highly technical questions, detailed or comprehensive tables of results, and a bibliography (if appropriate). Like chapters or sections, appendices should be ordered and labelled. Present each new topic in a different appendix, beginning on a new page, with an easily understood title, and summary information explaining the contents of the appendix. Never assume that the reader will look at the appendix. If you want your reader to see a particular piece of information, place it in the body of your report.

Templates and styles All modern word processors have style sheets available to the writer. Unfortunately, most writers rarely use them. A style sheet (or ‘Styles’ in Microsoft Word) is a set of predefined formatting instructions that you can use repeatedly throughout a document. To make a heading, most people tend to highlight the text and systematically make changes to size, centring and so on. And you have to go through that same process for each heading. If you store the formatting commands in a style, you can apply that style any time you need it without having to do all of the reformatting. If you want to change the


design of your document, one click does it all. Without styles, your word processor thinks that your heading is just a normal paragraph. With styles, we ‘tag’ or identify parts of a document. For example, a major heading can be tagged as a Heading 1 and a section heading as Heading 2. Other components can be formally tagged as a figure or a table with proper style sheets. When your document is properly tagged, your word processor can generate or update a table of contents for you instantly. Styles can apply to text, and they can apply to text divisions such as paragraphs, headings, sections and chapters for numbering, line spacing, indenting and alignment.
Reasons for using styles:
→ Consistency – each section is formatted the same, providing a professional, clean-looking document.
→ Ease in modifying – if you want to change the look of a heading or a paragraph, you only need to update a given style once and then all of the instances of that style are automatically changed.
→ Efficiency – having created a style once, you can use it throughout the document without having to format each section individually.
→ Table of contents – styles can generate or update a table of contents quickly.
→ Faster navigation – the document map feature in Word lets you move quickly from section to section.
→ Outline view – styles allow you to outline and rearrange a document’s main topics.
→ Outline numbering – many report writers like to have key sections numbered; numbered styles automatically create and update numbered sections, even after rearranging sections.
An hour or so experimenting with styles in your word processor can save you many more hours of piece-by-piece editing when preparing your report. Once you have created a set of styles in a document that you’re happy with, you can save them together as a template for use in other documents. If you don’t use styles, then you have decided to spend more time on formatting than on thinking and writing.
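The same tagging idea carries over if reports are generated programmatically. Below is a minimal, illustrative sketch that assumes the third-party python-docx package is available; it demonstrates style-based tagging, not a tool the chapter requires.

from docx import Document  # third-party package: python-docx

doc = Document()
doc.add_heading('Market research report', level=0)  # Title style
for section in ['Background', 'Previous research', 'Research method', 'Results', 'Conclusions']:
    doc.add_heading(section, level=1)               # tagged with Word's built-in 'Heading 1' style
    doc.add_paragraph('Placeholder text for the ' + section.lower() + ' section.')
doc.save('report_skeleton.docx')                    # open in Word and insert an automatic table of contents

Because each section heading carries a proper Heading style, Word can build or refresh the table of contents from the generated file with one click.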

PRESENTING QUALITATIVE RESEARCH Regardless of whether your research is quantitative or qualitative, it should be presented in such a way that the reader can understand the circumstances under which the data were gathered, processed and interpreted. That way, the reader can see any sources of bias, or the special situations that might bring about a result. Qualitative research frequently involves personal interaction and subjective interpretation; the manner of an interviewer affects the manner of an interviewee, which, in turn, affects the questions and behaviour of the interviewer again. Importantly, qualitative researchers share a philosophical view that knowledge cannot be separated from the knower – that everything is an interpretation of events, and nothing speaks for itself. Reflexivity is the term used when the researcher understands that he or she has a direct effect on the process and outcomes of research. To demonstrate reflexivity, researchers develop very clear protocols for conducting research and for reporting their processes and results. The sections in a research report are the same, but much more effort and space is devoted to explaining the protocols and people involved so that any potential bias can be understood. In addition to the effect that an interviewer has on the responses of participants, other participants can have an effect on behaviour. The presence of others, and the nature of exchanges among them all should be explained in the research report. Remember that good qualitative research is very difficult and very time-consuming; much more difficult than quantitative research. Bad qualitative research is easy. If you want to do good qualitative research you need to invest the time needed to prepare, gather, process, analyse and then report on your research properly.

reflexivity Reflexivity entails the researcher being aware of his or her effect on the process and outcomes of research based on the premise that ‘knowledge cannot be separated from the knower’1.


Deriving themes The key outcome of qualitative research usually is a set of themes. These are key ideas or topics about how people think and feel about the issue under investigation. For example, research on how people deal with visiting a new café might cover such issues as familiarisation, the purchase process, and criteria for evaluation. Familiarisation might include sub-themes such as how people ‘learn’ what to expect from this type of café, where to sit, where to place an order, likely prices and special ‘language’ used for different types of coffee. Different themes may overlap with other themes, recognising that the same phenomenon may cover several different issues for participants, or may be looked at from different perspectives. Similarly, the purchase process and evaluation criteria may have additional sub-themes that emerge from the data. Depending on the research question, the researcher may have well-defined themes in mind before the data are gathered, and the purpose of the research is to learn how one cohort of people feels about each of them. On other occasions, the researcher may want themes to emerge from the data. Usually the research involves both, with the researcher having some key ideas to investigate while being open to discovering something completely new.

CONSENSUS AMONG PARTICIPANTS
Among focus groups it is valuable to see to what extent participants agree or disagree about particular issues or questions. The matrix in Table 16.1 offers a template for assessing the level of consensus in a focus group. The specific questions may be written down beforehand, but good interviewers always leave room for adding additional questions where they help the flow or add clarity to results.

TABLE 16.1 » MATRIX FOR ASSESSING LEVEL OF CONSENSUS IN FOCUS GROUP2

| Focus group question | Participant 1 | Participant 2 | Participant 3 | Participant 4 | Participant 5 | Participant 6 |
| 1 | | | | | | |
| 2 | | | | | | |
| 3 | | | | | | |
| . . . | | | | | | |

The following notations can be entered in the cells:
A = Indicated agreement (i.e., verbal or nonverbal)
D = Indicated dissent (i.e., verbal or nonverbal)
SE = Provided significant statement or example suggesting agreement
SD = Provided significant statement or example suggesting dissent
NR = Did not indicate agreement or dissent (i.e., nonresponse)
Additional coding may include non-verbal responses, such as sighs, laughter, frowns or smiles.

The consensus matrix may not appear in the final report, or may be included in an appendix. Its main purpose is to help the researcher extract the key themes and also to see how much participants agree with each other. The purpose of qualitative research varies. Usually qualitative research deals with ‘what’ and ‘why’ questions. ‘How much’ questions generally are the domain of quantitative researchers. The goal of focus-group and interview research is to explain the nature of a phenomenon – how people link ideas or events together. Counting responses to questions, such as presenting percentage expressing agreement, can add weight to your story, and for some readers it can give the report


more weight. But trying to quantify your qualitative data should not be the goal. The great value of interviews and focus groups is the discovery of different perspectives, and the exploration of reasons that people develop their views. So recognition of dissent from majority opinion, and the reasons for their opinion, are just as important as discovery of a majority opinion.
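As an illustration only, a coded consensus matrix such as Table 16.1 can be tallied quickly with pandas; the participant codes below are hypothetical.

import pandas as pd

# Hypothetical codes for six participants on three focus-group questions,
# using the notation from Table 16.1 (A, D, SE, SD, NR).
codes = pd.DataFrame(
    {'P1': ['A', 'D', 'NR'], 'P2': ['SE', 'A', 'A'], 'P3': ['A', 'SD', 'D'],
     'P4': ['NR', 'A', 'A'], 'P5': ['A', 'A', 'SE'], 'P6': ['D', 'NR', 'A']},
    index=['Q1', 'Q2', 'Q3'])

agreement = codes.isin(['A', 'SE']).mean(axis=1)  # share of participants signalling agreement
dissent = codes.isin(['D', 'SD']).mean(axis=1)    # share signalling dissent
print(pd.DataFrame({'agreement': agreement, 'dissent': dissent}))

Such a tally is an aid for the researcher when extracting themes; as the text notes, it should not become an attempt to turn qualitative data into survey-style percentages.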

Qualitative presentation styles
The following are just a few of the many ways qualitative findings can be arranged and presented:3
→ Natural – The data are presented in a shape that resembles the phenomenon being studied. For instance, if the data are excerpts from a café visit, where the researcher wants to understand the sequence of events that affect perceived service quality, then we might present them in a sequential order or in an order that represents the flow of the visit itself.
→ Most simple to most complex – For the sake of understanding, start the presentation of data with the simplest example you have. As the complexity of each example or exemplar presented increases, the reader will have a better chance of following the presentation.
→ First discovered/constructed to last discovered/constructed – The data are presented in a chronicle-like fashion, showing the course of the researcher’s personal journey in the study. This style is reminiscent of an archaeological style of presentation: what was the first ‘relic’ excavated, then the second and so forth. In the café research study, for example, the report may begin with the observation that people come in to the café, stand for a moment (some look confused) and then walk out. The research proceeds from there.
→ Storytelling – Data are arranged as a narrative. Researchers plot out the data in a fashion that allows them to transition from one exemplar to another, just as narrators arrange details in order to best relate the particulars of the story. Stories have different forms, but usually there is a build-up and a ‘climax’ where there is a significant discovery, or a problem is resolved.
→ Most important to least important or from major to minor – Like the journalistic style of the inverted pyramid, the most important ‘findings’ are presented first and the minor ‘discoveries’ come last.
→ Dramatic presentation – This one is the opposite of the inverted pyramid style. With the dramatic arrangement scheme, researchers order their data presentation so as to save the surprises and unforeseen discoveries for last.
Whatever the style you use for presenting your findings, the theme derived from your findings should be clearly explained. Include at least one, and preferably several, different quotes from your participants to emphasise and bring life to each of your themes. That doesn’t mean you present a list of quotes, or transcripts, and then write a list of ‘findings’. Themes should be carefully explained and sorted, and presented to help the reader. Quotes should accompany the themes and be identified by the participant ID (usually a number, but it can be a pseudonym).

Checklist for reporting qualitative research
The following checklist is adapted from one specifically designed for reporting on interviews and focus groups.4 Most items should be presented in the body of your report so that the reader can assess the validity of your findings. Some, such as coding guides and transcripts (if required), are normally presented in appendices.


| Domain | Item | Description |
| Research team and reflexivity | | |
| Personal characteristics | Interviewer/facilitator | Identify which of the report contributors conducted the interview or focus group: gender, qualifications, experience or training in interview techniques. |
| Relationship with participants | Participant’s knowledge of the researcher | Did the participants know the interviewer before the research? What did the interviewer and interviewee know about each other? What was the interviewee told about the interviewer? |
| Study design | | |
| Theoretical framework | Methodological orientation | What methodological orientation underpinned the study? For example, grounded theory, to build theories from the data; ethnography, to understand the culture of groups with shared characteristics; phenomenology, to describe the meaning and significance of experiences; and content analysis, to systematically organise data into a structured format. |
| Participant selection | Sampling | How were participants selected? For example, purposive, convenience, snowball. How were they approached? For example, face-to-face, advertisement, email. How many participants, and how many of those who were asked refused to participate or dropped out? |
| Setting | Data collection setting | Where were data collected? For example, in a café, in the workplace, at home. Were others present besides the interviewer and participants? Visible to others? Demographics of participants, dates and times. |
| Data collection | Protocol | Details of interview guide: questions, prompts, pilot testing. Audio or video recording to collect data? Were field notes made during and/or after the interview or focus group? Duration of interviews or focus group? Was data saturation achieved over the course of the interviews? Were transcripts returned to participants for comment and/or correction? |
| Analysis and findings | | |
| Data analysis | Coding and checking | How was coding of data done? Number of coders, coding tree. Were themes identified in advance or derived from the data? Was any software used to manage the data? What? Did participants provide feedback on the findings? |
| Reporting | Clarity and consistency | Were participant quotations presented to illustrate the themes/findings? Was each quotation identified, e.g., by participant number or pseudonym? Major themes presented clearly. Range of views presented. Consistency between data presented and the findings. |

GRAPHIC AIDS
graphic aids: Pictures or diagrams used to clarify complex points or emphasise a message.

The person who first said ‘A picture is worth a thousand words’ probably had graphic aids in mind. Used properly, graphic aids can clarify complex points or emphasise a message. Used improperly or sloppily, however, they can distract or even mislead. The key to effective use of graphic aids is to make them an integral part of the text. The graphics should always be interpreted in the text. This does not mean that the writer should exhaustively explain an obvious chart or table; it means that the text should point out the key elements of any graphic aid and relate them to the discussion in progress. Always explain to the reader what conclusions should be drawn from any table or picture. Several types of graphic aids may be useful in research reports; these include tables, charts, maps and diagrams. The following discussion briefly covers the most common ones – tables and charts.


Tables Tables are most useful for presenting numerical information, especially when several pieces of information have been gathered about each item discussed. For example, consider how hard it would be to follow all the information in Table 16.2 if it were presented in narrative form. Using tables allows a writer to point out significant features without getting bogged down in detail. The body of the report should include only relatively short summary tables; comprehensive tables should be reserved for an appendix. TABLE 16.2 »

PARTS OF A TABLE5
Characteristics of new and longer-term residents, capital cities and coastal centres

| | | New residents: from capital cities (a) | New residents: from non-metropolitan areas (b) | New residents: total (c) | Longer-term residents |
| Capital city (a) | | | | | |
| Population | ’000 | 260.1 | 393.3 | 1 648.0 | 10 626.7 |
| Proportion of people who: are students (d) | % | 18.4 | 24.6 | 27.8 | 14 |
| are in the labour force (e) | % | 82.5 | 78.3 | 73.6 | 77.4 |
| are in low-skilled occupations (f) | % | 29.5 | 41.9 | 39.8 | 41.8 |
| live in low income households (g) | % | 8.6 | 14.1 | 11.8 | 16.1 |
| live in high income households (g) | % | 37 | 23.8 | 24.6 | 22.1 |
| Coastal centre | | | | | |
| Population | ’000 | 187 | 228.9 | 530.5 | 2 015.5 |
| Proportion of people who: are students (d) | % | 14.4 | 19 | 18.9 | 11.8 |
| are in the labour force (e) | % | 75.9 | 75.6 | 74.9 | 76.5 |
| are in low-skilled occupations (f) | % | 38.3 | 47 | 42.9 | 47.4 |
| live in low income households (g) | % | 16.3 | 18.9 | 16.6 | 21.6 |
| live in high income households (g) | % | 20.2 | 16.2 | 17.8 | 14.7 |

(a) Greater Capital City Statistical Areas (GCCSA) (b) All areas in Australia outside of capital cities (c) Includes new residents who were overseas in 2006 (d) Includes people aged 15 years and over (e) Includes people aged 15 to 64 years (f) Employed people (g) Includes people in private households only.

Each table should include the following:
1 Table number. This allows for simple reference from the text to the table. If the text includes many tables, a list of tables should be included just after the table of contents.
2 Title. The title should indicate the contents of the table and be complete enough to be intelligible without referring to the text. The table number and title are generally placed at the top because the table is read from the top down.
3 Stubheads and bannerheads. The stubheads contain the captions for the rows of the table, and the bannerheads (or boxheads) contain those for the columns. Note that stubheads let us combine multiple cross-tabulations into one table.


4 Notes. Any explanations or qualifications for particular table entries or sections should be given in footnotes immediately under the table.
5 Source notes. If a table is based on material from one or more secondary sources rather than on new data generated by the project, the sources should be acknowledged, usually below the table.
Table 16.3 illustrates a typical table from a survey research report; it cross-tabulates demographics with survey responses. Table 16.4 shows how data from a statistical test might be reported in table form. Note that this table is very different from what may be output from a statistical analysis program; only the useful information has been presented.

TABLE 16.3 » REPORTING FORMAT FOR A TYPICAL CROSS-TABULATION FROM A SURVEY6
Gender profile of attendees at professional sport events

| | Proportion by gender: Male | Proportion by gender: Female | Deviation by gender: Male | Deviation by gender: Female |
| Rugby Union | 66 | 34 | 0 | 0 |
| Cricket | 73 | 27 | 7 | –7 |
| Motor sport | 73 | 27 | 7 | –7 |
| Horse racing | 56 | 44 | –10 | 10 |
| Soccer | 71 | 29 | 5 | –5 |
| NRL | 63 | 37 | –3 | 3 |
| Basketball | 62 | 37 | –4 | 3 |
| Tennis match | 59 | 41 | –7 | 7 |
| Golf | 68 | 32 | 2 | –2 |
| Average | 66 | 34 | | |

TABLE 16.4 » REPORTING FORMAT FOR A TYPICAL STATISTICAL TEST7
Highest education distribution of shoppers carrying bluetooth enabled devices vs non–bluetooth enabled devices

| | Secondary school | Tertiary education | Postgraduate |
| Bluetooth enabled | 56% | 38% | 60% |
| Everyone else | 44% | 63% | 40% |
| | N = 55 | N = 80 | N = 65 |

Chi-square = 9.92, d.f. = 2, p < 0.01

Statistical output Most computer-based statistical analysis programs present output in tables and charts. Advances in computer graphical user interfaces (GUI) allow researchers to easily highlight sections from a statistics program and cut and paste directly into a word processor report. Many of the tables in this book have been prepared this way. Care must be given, however, to edit such charts and tables so that they do their job simply, and without confusion or misrepresentation. A multitude of decimal places does not make a table more accurate; it just makes the table more difficult to read. If you present a percentage figure to just one decimal place then you are claiming


accuracy of one in one thousand. Can we honestly claim such accuracy with a sample of, say, 200 people? Most often it is better to bring percentages to whole numbers only. Similarly, there is rarely any value in presenting statistical tests (correlations, t-tests, Chi-square, F-statistics and so on) with more than one or two decimal places, or significance levels with more than two decimal places. Taking the time to edit statistical output in this way makes your report much more tidy and helps the reader understand and interpret your results. Many statistical packages present much more detail than is needed in any normal report. Remove all that extraneous output. For example, a correlation matrix in SPSS by default includes the correlation coefficient, sample size and significance level in each cell. Sample size is almost always the same, so remove those entries from the table and simply include it once in a footnote to the table. If all correlations are statistically significant then simply say so, also in a footnote. If some are significant and others are not then you can indicate these with one or more asterisks. Thus, a large and busy table can be reduced by two-thirds by removing repeated or redundant information. Exhibit 16.2 shows an example of a cross-tabulation as it appears in SPSS output. Exhibit 16.3 is the same information edited for presentation in a report. Similarly, Exhibit 16.4 is an example of a correlation matrix from SPSS, and Exhibit 16.5 shows the same correlation matrix with all of the redundant information removed; see how much shorter and easier the matrix is to read and understand. In summarising and explaining statistical output for your reader, do not write or say anything that you do not understand. If you do then it shows – unless you are reporting to someone who knows and understands less than yourself, and that is a dangerous assumption. EXHIBIT 16.2 →EXAMPLE OF CROSS-TABULATION FROM SPSS OUTPUT

Interest in GPS navigation system by usual type of driving

| Interest in GPS navigation system | | School, shopping, work | Around city for work | Long country driving | Total |
| No interest | Count | 53 | 50 | 25 | 128 |
| | % within type of driving | 81.5% | 62.5% | 55.6% | 67.4% |
| Interest | Count | 12 | 30 | 20 | 62 |
| | % within type of driving | 18.5% | 37.5% | 44.4% | 32.6% |
| Total | Count | 65 | 80 | 45 | 190 |
| | % within type of driving | 100.0% | 100.0% | 100.0% | 100.0% |

Chi-square tests
| | Value | d.f. | Asymp. Sig. (two-sided) |
| Pearson Chi-square | 9.656a | 2 | 0.008 |
| Likelihood ratio | 10.126 | 2 | 0.006 |
| Linear-by-linear association | 8.850 | 1 | 0.003 |
| n of valid cases | 190 | | |
a. 0 cells (0.0%) have expected count less than 5. The minimum expected count is 14.68.


EXHIBIT 16.3 →SPSS OUTPUT EDITED FOR REPORT PRESENTATION

Level of interest in GPS navigator by usual type of driving

| | School, shopping, work | Around city for work | Long country driving | Total |
| No interest | 81% | 62% | 56% | 67% |
| Interest | 19% | 38% | 44% | 33% |
| n | 65 | 80 | 45 | 190 |

Chi-square = 9.66, d.f. = 2, Sig. < 0.01.
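For illustration, the edited table and test statistic above can be reproduced from the raw counts in Exhibit 16.2 with a few lines of Python, assuming pandas and SciPy are available.

import pandas as pd
from scipy.stats import chi2_contingency

# Raw counts from Exhibit 16.2 (rows: level of interest; columns: usual type of driving)
counts = pd.DataFrame(
    [[53, 50, 25], [12, 30, 20]],
    index=['No interest', 'Interest'],
    columns=['School, shopping, work', 'Around city for work', 'Long country driving'])

chi2, p, dof, expected = chi2_contingency(counts)
column_pct = (100 * counts / counts.sum()).round(0).astype(int)  # whole-number column percentages
print(column_pct)
print(f'Chi-square = {chi2:.2f}, d.f. = {dof}, p = {p:.3f}')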

EXHIBIT 16.4 →EXAMPLE OF CORRELATION MATRIX FROM SPSS OUTPUT

Correlations

| | | Average cost (meal for two) | Distance from the city (kilometres) | Number of dishes on menu | Average waiting time |
| Average cost (meal for two) | Pearson correlation | 1.000 | –0.394* | –0.368* | –0.016 |
| | Sig. (two-tailed) | | 0.026 | 0.038 | 0.933 |
| | n | 32 | 32 | 32 | 32 |
| Distance from the city (kilometres) | Pearson correlation | –0.394* | 1.000 | 0.217 | 0.376* |
| | Sig. (two-tailed) | 0.026 | | 0.232 | 0.034 |
| | n | 32 | 32 | 32 | 32 |
| Number of dishes on menu | Pearson correlation | –0.368* | 0.217 | 1.000 | 0.319 |
| | Sig. (two-tailed) | 0.038 | 0.232 | | 0.075 |
| | n | 32 | 32 | 32 | 32 |
| Average waiting time | Pearson correlation | –0.016 | 0.376* | 0.319 | 1.000 |
| | Sig. (two-tailed) | 0.933 | 0.034 | 0.075 | |
| | n | 32 | 32 | 32 | 32 |

*Correlation is significant at the 0.05 level (two-tailed).

EXHIBIT 16.5 →CORRELATION MATRIX EDITED FOR REPORT

| | Average cost | Distance from the city | Number of dishes on menu | Average waiting time |
| Average cost (meal for two) | 1.00 | | | |
| Distance from the city (kilometres) | –0.39* | 1.00 | | |
| Number of dishes on menu | –0.37* | 0.22 | 1.00 | |
| Average waiting time | –0.02 | 0.38* | 0.32 | 1.00 |

n = 32; *correlation is significant at the 0.05 level (two-tailed).
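A sketch of how such an edited correlation matrix could be produced automatically is shown below; it assumes pandas and SciPy, and uses randomly generated stand-in data with hypothetical column names rather than the restaurant data behind Exhibit 16.4.

import numpy as np
import pandas as pd
from scipy import stats

def edited_correlation_table(df, alpha=0.05):
    # Lower-triangle correlation matrix rounded to two decimal places,
    # with an asterisk marking correlations significant at the alpha level.
    cols = df.columns
    out = pd.DataFrame('', index=cols, columns=cols)
    for i, a in enumerate(cols):
        for j, b in enumerate(cols):
            if j > i:
                continue
            r, p = stats.pearsonr(df[a], df[b])
            out.loc[a, b] = f'{r:.2f}' + ('*' if (a != b and p < alpha) else '')
    return out

# Hypothetical data standing in for the 32 restaurants behind Exhibit 16.4
rng = np.random.default_rng(1)
restaurants = pd.DataFrame(rng.normal(size=(32, 4)),
                           columns=['avg_cost', 'distance_km', 'n_dishes', 'wait_time'])
print(edited_correlation_table(restaurants))

Editing the output in one reusable function, rather than by hand, keeps every report consistent: two decimal places, a single n reported once, and significance shown only with asterisks.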


Charts
Charts translate numerical information into visual form so that relationships may be easily grasped. The accuracy of the numbers is reduced to gain this advantage. Each chart should include the following:
1 Figure number. Charts (and other illustrative material) should be numbered in a separate series from tables. The numbers allow for easy reference from the text. If there are many charts, a list of them should be included after the table of contents.
2 Title. The title should describe the contents of the chart and be independent of the text explanation. The number and title may be placed at the top or bottom of the chart.
3 Explanatory legends. Enough explanation should be put on the chart to spare the reader a need to look at the accompanying text. Such explanations should include labels for axes, scale numbers and a key to the various quantities being graphed.
4 Source and footnotes. Any secondary sources for the data should be acknowledged. Footnotes may be used to explain items, although they are less common for charts than for tables.
Charts are subject to distortion, whether unintentional or deliberate. A common way of introducing distortion is to begin a scale at some value larger than zero. Exhibit 16.6 shows how this exaggerates the differences between values.

← EXHIBIT 16.6 DISTORTION OF CHART FROM BROKEN AXIS8
[Horizontal bar chart, ‘Proportion who agree with statements about downloading’, comparing ‘100% legal’ consumers with ‘illegal downloaders’ on four statements: ‘I think that you should be able to download or access the content you want for free from the Internet’; ‘It is easy to find content on the Internet for free that would usually be paid for’; ‘If you had paid for a digital file you should then be able to share it with others’; ‘It is wrong to access content online without the creators/artist’s permission’. The horizontal axis begins at 35 per cent rather than zero, which exaggerates the differences between the two groups.]
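A minimal sketch of the distortion, using matplotlib and hypothetical agreement figures, is shown below; the only difference between the two panels is where the vertical axis starts.

import matplotlib.pyplot as plt

# Two renderings of the same (hypothetical) agreement figures, showing how a
# broken axis exaggerates the gap between groups.
groups = ['100% legal', 'Illegal downloaders']
agree_pct = [48, 62]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(7, 3))
ax1.bar(groups, agree_pct)
ax1.set_ylim(45, 65)                     # axis starts at 45%: the difference looks enormous
ax1.set_title('Broken axis')
ax2.bar(groups, agree_pct)
ax2.set_ylim(0, 100)                     # zero baseline: the difference stays in proportion
ax2.set_title('Axis from zero')
for ax in (ax1, ax2):
    ax.set_ylabel('Per cent who agree')
plt.tight_layout()
plt.savefig('broken_axis_demo.png', dpi=150)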

PIE CHARTS Most pie charts are useless. They are popular because they can look pretty. However, they rarely communicate information better than alternative chart types. The idea behind a pie chart is to show graphically how a whole is sliced up among various groups. As shown in Exhibit 16.7, each angle, or ‘slice’, is proportional to its percentage of the whole and should be labelled with its description and percentage. The problem with pie charts is that most people are not very good at linking an angle or a slice of pie with the relative size of something, relative to other slices. Most often, people must first read the categories of the chart, and then the percentage figures – in other words, they’re looking at a round table, decorated with angled sections. It may look attractive but it’s not informative. More often the same information can be presented in a much more useable way with a bar graph. But because they can be pretty, pie charts can be a good way to draw the reader’s attention, so a pie chart is an excellent tool if you have nothing of value to say in the text.


If you can’t help yourself, or you have nothing to say, make sure you use pie charts properly. They work only for categorical data, not for ordinal data. That is, it should make no difference what order the data are presented in. So never use a pie chart for, say, a 1–5 agree-disagree scale.

EXHIBIT 16.7 → A PIE CHART OFTEN IS JUST A PRETTY (BUT DIFFICULT TO READ) ROUND TABLE9
[Pie chart: Smartphone OS market share, Australia, June 2015 – Android 57%, iOS 35%, Windows 6%, BlackBerry 1%, Other 0%]

EXHIBIT 16.8 → IT LOOKS PRETTY BUT IT’S EVEN MORE DIFFICULT TO READ AND INTERPRET WITHOUT REFERRING TO THE LABEL TEXT10
[The same market-share data as Exhibit 16.7, rendered as a three-dimensional pie chart]


text, then you may as well use a table instead of a graph! Exhibit 16.8 is the same graph, but this time it’s dressed up a little with the addition of 3D. This design may be pretty but it is even less effective at communicating any clear message. For the information to be useful the reader must ignore the graph and search the text only. The pie chart is simply a round table with colour inside. Now consider the same information again but presented as a bar graph in Exhibit 16.9. We can see easily that Android is about 1.5 times as popular as iOS, and about ten times more popular than Windows. And that should be the purpose of a graph.

← EXHIBIT 16.9 BAR GRAPHS ARE EASY TO READ AND INTERPRET11
[Vertical bar graph: Smartphone OS market share, Australia, June 2015 – bars for Android, iOS, Windows, BlackBerry and Other on a 0–70% axis]

BAR CHARTS
Bar charts show changes in the value of a dependent variable at discrete intervals of the independent variable. Bar charts are ideal when the independent variable is ordinal, such as, say, a 1 to 5 ratings scale. Exhibit 16.10 shows a multiple bar chart, which presents how several variables are related to the primary variable. In each case, each bar needs to be clearly identified with a different colour or pattern. Do not use too many divisions or dependent variables. Too much detail obscures the essential advantage of charts, which is to make relationships easy to see.

← EXHIBIT 16.10 MULTIPLE BAR CHART12
[Horizontal multiple bar chart, ‘Proportion who agree with statements about downloading’, comparing ‘100% legal’ consumers with ‘illegal downloaders’ on the four downloading statements from Exhibit 16.6, on a 0–70% axis]


The bar graph in Exhibit 16.10 has the additional advantage that it can be photocopied in black and white without risk of losing its communication value. It is worth considering several different presentation options for each of your graphs. Exhibit 16.11, a stacked multiple bar chart, and Exhibit 16.12, a normal multiple bar chart, present the same information in two different ways. Which communicates better for you? The stacked bar chart highlights the fact that the parts all sum to the same amount in all groups, which may not be as clear in the regular bar chart. It’s easy to compare the size of the outermost categories for each group, because they have the same starting points. It’s not so easy to compare the inner categories, however.

EXHIBIT 16.11 → MULTIPLE STACKED BAR CHART SHOWS SUM TO WHOLE13
[Horizontal stacked bar chart, ‘Proportion of downloaders who consumed any content legally/illegally (past 3 months)’, for movies, music, TV and video games, with each bar split into ‘100% legal’, ‘mix of legal and illegal’ and ‘100% illegal’ segments summing to 100%]

EXHIBIT 16.12 → MULTIPLE BAR CHARTS ALLOW COMPARISONS OF DIFFERENT CATEGORIES14
[Vertical multiple bar chart of the same data, with separate bars for ‘100% legal’, ‘mix of legal and illegal’ and ‘100% illegal’ within each content category]

LINE GRAPHS Line graphs are useful for showing the relationship between two continuous variables. That is, both variables are at least interval scaled. Line graphs are not appropriate when the data are discrete classifications. The dependent variable generally is shown on the vertical axis and the independent variable on the horizontal axis. The most common independent variable for such charts is time, but it is by no means the only one. The multiple line graph (Exhibit 16.13) shows the relationship of more than one dependent variable to the independent variable. All variables should be either interval or ratio scaled. The line for each


dependent variable should be in a different colour or pattern and should be clearly labelled. Do not try to squeeze in too many variables: this can quickly lead to confusion rather than clarification. If you can label each line then most often you don’t need to include a legend, which can easily take up more than a third of the space of the graph.

← EXHIBIT 16.13 MULTIPLE LINE GRAPH15
[Multiple line graph: Smartphone OS market share, Australia, 2012–2015, with directly labelled lines for Android, iOS, Windows, BlackBerry and Other on a 0–70% axis]
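A minimal matplotlib sketch of a directly labelled multiple line graph is shown below; only the 2015 shares come from the exhibit, and the earlier-year figures are illustrative assumptions.

import matplotlib.pyplot as plt

# Multiple line graph with each line labelled directly, so no legend is needed.
# Only the 2015 shares come from the exhibit; earlier years are illustrative.
years = [2012, 2013, 2014, 2015]
series = {'Android': [45, 52, 55, 57], 'iOS': [40, 37, 36, 35], 'Windows': [4, 5, 6, 6]}

fig, ax = plt.subplots(figsize=(6, 3.5))
for name, shares in series.items():
    ax.plot(years, shares)
    ax.annotate(name, xy=(years[-1], shares[-1]), xytext=(5, 0),
                textcoords='offset points', va='center')  # label at the end of each line
ax.set_xticks(years)
ax.set_ylabel('Market share (%)')
ax.set_title('Smartphone OS market share, Australia: 2012-2015')
plt.tight_layout()
plt.savefig('os_share_lines.png', dpi=150)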

Maximise the ‘data to ink’ ratio
A pretty graph is not necessarily one that communicates well. The default graphics in popular software, such as Microsoft Excel and Microsoft PowerPoint, make it very easy to create reasonably attractive charts and graphs for inclusion in your reports and presentations. Unfortunately, often these graphics focus more on eye-catching colour than on actually communicating information well. A simple rule is to maximise the ‘data to ink’ ratio. In any graph ‘ink’ is turned into titles, labels, colour, lines, edges and spaces. Some ink is necessary ‘data’, some defines necessary boundaries, such as lines defining axes, and some is decoration. If you can minimise as much of the decoration as possible then your graphs will be simple, and easy to read and understand. The best graphs use minimum colour, no fancy decoration and no repeated information. When colours are used, they should be complementary.
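As a sketch of the ‘data to ink’ idea, the code below redraws the Exhibit 16.9 figures as a bar chart with matplotlib using one muted colour, direct value labels and no superfluous borders; the styling choices are suggestions, not a prescribed format.

import matplotlib.pyplot as plt

# Minimal-ink bar chart of the Exhibit 16.9 market-share figures: one muted colour,
# direct value labels instead of a legend, and no top or right border.
labels = ['Android', 'iOS', 'Windows', 'BlackBerry', 'Other']
share = [57, 35, 6, 1, 0]

fig, ax = plt.subplots(figsize=(5, 3))
bars = ax.bar(labels, share, color='0.4')
ax.bar_label(bars, fmt='%d%%')                  # values on the bars, so no gridlines are needed
ax.set_ylabel('Market share (%)')
ax.set_title('Smartphone OS market share, Australia: June 2015')
for spine in ('top', 'right'):
    ax.spines[spine].set_visible(False)          # remove non-data ink
plt.tight_layout()
plt.savefig('os_share_bar.png', dpi=150)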

EXPLORING RESEARCH ETHICS

RESEARCH RESULTS ARE NOT THE SAME AS RESEARCH RECOMMENDATIONS

It’s very tempting and easy to make a statement like ‘Results show that only 10 per cent of the general population is aware of your brand, which means you urgently need to embark on an advertising campaign.’ No, the results do not mean any such thing. The results simply say that there is 10 per cent brand awareness. The word ‘only’ is a value statement, suggesting that the number is very small. In fact, it may be normal, or even quite large, if we’re talking about a very specialised product category. A number is only ‘low’ or ‘high’ relative to some other comparable figure, so we would want to know how close competitors’ awareness levels are. Further, advertising is just one method of achieving brand awareness, and it is almost never an efficient method if the target market can be well defined.

Typically, a client commissions a research report because he or she needs to make a decision. Ideally that decision has been communicated to the research supplier to ensure that the most appropriate information is gathered. So we would expect the research report to be tailored towards the decision-making and information needs of the client. But tailoring is not a licence to abandon objectivity: the results themselves must still be reported as objectively as possible. What can be worse is the confounding of a particular agenda with the research results. Most people interpret information to match their expectations. One of this book’s authors once read a news report that said, ‘Today’s voting intentions suggest an outright rejection of the government’s fiscal policy.’ No, they do not: voting intentions alone tell us nothing about the reasons behind them. On the same day, we read a research report claiming, ‘Our low market share strongly suggests a change in packaging design.’ Without evidence about why people do or do not buy, the results provide no basis for such a conclusion. The AMSRS Code of Professional Behaviour says, under Rule #26, ‘When reporting on the results of a market research project the Researcher must make a clear distinction between the findings as such, the Researcher’s interpretation of these, and any recommendations based on them.’

THE ORAL PRESENTATION

Most research reports are presented orally as well as in writing. Usually this is in the form of a meeting between the researchers and the decision-makers.

oral presentation  A spoken summary of the major findings, conclusions and recommendations, given to clients or line managers to provide them with the opportunity to clarify any ambiguous issues by asking questions.

The purpose of an oral presentation is to highlight the most important findings of a research project and provide clients or line managers with the opportunity to clarify any ambiguous issues by asking questions. The oral presentation may be as simple as a short conference with a manager at the client organisation’s location or as formal as a report to the board of directors. The key to effective presentation in either situation is preparation. Communication specialists often suggest that a person preparing an oral presentation begin at the end.16 In other words, while preparing a presentation, researchers should think about what they want the client to know once it has been completed. The researcher should select the three or four most important findings for emphasis and rely on the written report for a full summary. The researcher needs to be ready to defend the results of the research. This is not the same as being defensive; rather, it means being prepared to deal in a confident, competent manner with the questions that arise. Remember that even the most reliable and valid research project is worthless if the managers who must act on its results are not convinced of its importance.

The principles of good speech-making apply to a research presentation. Lecturing or reading a script to the audience is sure to impede communication. Do not read a prepared text word for word. If you rely on brief notes, are familiar with the subject and do as much rehearsal as the occasion calls for, you will communicate better. Avoid research jargon and use short, familiar words. Maintain eye contact with the audience and repeat the main points. An oral presentation is often organised around a standard format: ‘Tell them what you are going to tell them, tell them, and tell them what you just told them.’

Learn by example and practice

Some of the most inspiring and memorable presentations are found in the TED talks (TED.com) given by leaders in science, technology and the arts. While it is usually invisible to the audience, most TED talks have a very well-defined structure and a strictly controlled presentation format of speech, visualisations and a time limit of less than 18 minutes. The following tips are adapted from one of the many guides for TED talks.17

Introduction
• A strong introduction is crucial.
• Start with a clear statement of your message – something the audience cares about.
• Be brief.
• Don’t open with a string of stats.



Body
In presenting your topic and evidence:
• Make a list of all the evidence you want to present.
• Order all of the items in your list based on what a person needs to know before they can understand the next point, and from least to most exciting.
• Now cut out everything you possibly can without losing the integrity of your argument. There is no such thing as a good presentation that is also ‘too short’.
• Minimise jargon.



Conclusion
• Leave your audience feeling positive towards you and your research. Don’t use your conclusion only to summarise what you’ve already said; tell your audience how your results might affect the organisation or, better still, them personally.
• If appropriate, give your audience a call to action.



Graphs and infographics
• Keep graphs visually clear, especially if the content is complicated.
• No slide should support more than one point.
• Use as little text as possible – if your audience is reading, they are not listening.
• Avoid using bullet points; consider putting different points on different slides.



Rehearse. Rehearse. Rehearse.
Most TED speakers we see on video have gone through at least four formal presentations in front of TED organisers before the TED event, over a period of three to six months, and many more practice presentations in their own time. They focus on rearranging their points to make the most engaging story, and on cutting out anything that is not essential.

REAL WORLD SNAPSHOT

ONLINE REPORTS: EASY TO GET, EASY TO IGNORE18

A variety of commercially available computer programs provide detailed data on website usage. Among these titles are Clickz, OpenTracker, CrazyEgg and WebTrends. These programs can gather details and generate reports about the behaviour of website visitors (for example, new visitors, returning visitors and subscribers to the company’s email newsletter). Behaviours that can be tracked include the bounce rate (the rate at which people leave a site after viewing only one page), the links that visitors click on, the purchases they make, the amount of time they spend on the website and the number of visitors who abandon the electronic cart without making a purchase. Research has shown that about three-quarters of senior marketing managers read regular reports on page views, click-throughs and the cost of click-throughs. More importantly, there was a direct positive relationship between the use of these measures and the market orientation, innovativeness and recent performance of the business.
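For readers who want to see what such a metric looks like in practice, here is a minimal sketch of how a bounce rate could be computed from a raw page-view log. The column names and the pandas-based approach are illustrative assumptions, not the inner workings of any of the packages named above.

```python
import pandas as pd

# Hypothetical page-view log: one row per page view.
views = pd.DataFrame({
    "session_id": ["a", "a", "b", "c", "c", "c", "d"],
    "page":       ["home", "pricing", "home", "home", "blog", "signup", "home"],
})

# A 'bounce' is a session that viewed exactly one page.
pages_per_session = views.groupby("session_id")["page"].count()
bounce_rate = (pages_per_session == 1).mean()

print(f"Bounce rate: {bounce_rate:.0%}")   # 2 of 4 sessions bounced -> 50%
```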


Audio-visual aids

Graphic aids should be designed to convey a simple, attention-getting message that supports a point on which the audience should focus its thinking. As in written reports, presenters should interpret graphics for the audience. The best slides are easy to read and interpret, using a large font, complementary colours and highlights that enhance the readability of charts. Used to excess, however, these same features can distract an audience, so don’t go overboard with clever artistic devices.

GRAPHICS PACKAGES

There are several computer presentation packages on the market. All do a similar job and all are good. Microsoft PowerPoint is the most popular program, and Macintosh users have an excellent package in Apple Keynote. Free services like Zoho or Google Docs and open-source packages such as OpenOffice, LibreOffice or FreeOffice all include word processor, spreadsheet and presentation software with functions that will satisfy almost all users. All these presentation packages offer similar features, such as formatting templates and integration with other software. And they all include features for enhancing the presentation, such as sound effects, animation of text and images, slide transitions, fonts, and integration with movie and audio files.

Use presentation enhancement features prudently. Nothing identifies the ‘newbie’ more than the use of every possible animation, sound effect and transition that can be found. Too much fuss distracts from the message and ultimately defeats the purpose of the presentation. Some experts have argued that PowerPoint and similar programs oversimplify complex information so that important issues are lost. PowerPoint presentations can cause audience members to sit passively and accept the information in front of them, failing to link it with other information or to engage critically with the research results.19 If you feel that you must use a clever animation or sound effect, use it just once, to bring attention to and highlight an important piece of information. The best presentations using audio-visual tools tend to be fairly straightforward. Note that the best-quality handmade chocolates are sold in very simple packaging.

ASSERTION-EVIDENCE

Technical presentations, such as those of research results, differ from lectures or inspirational talks. Slide presentations that accompany technical reports have been found to benefit from a technique called the assertion-evidence approach.20 Unlike traditional slides, which might have a short heading and several point-form pieces of text, the assertion-evidence approach argues for a longer, full-sentence heading of two to three lines stating an assertion. This is followed by evidence to support the assertion – a graphic or short text. When an audience is interested in your topic they will make the effort to read a title, especially when the speaker reads it out loud. In tests, researchers found that longer slide headings presented as assertions were remembered better and enjoyed more than slides with traditional topic headings. Graphs, too, benefit from longer titles that summarise the key findings as well as describe the content.

SPEAKING

Your tone of voice and physical movements can also enhance or detract from your message. Gestures can highlight key points in your message and make presentations more interesting.21 Here are some tips on how to gesture:
→→ If you’re nervous, hold a pen in one hand and pretend that it’s a conductor’s baton, or even a sword. You’ll be surprised how much more powerful you feel.
→→ Don’t try to talk about something you don’t understand. Unless you have a career in crime or politics, it will show in your voice and your mannerisms. Leave out any material that you’re not completely sure of, or leave it for someone else to discuss.
→→ Face your audience and make your gestures above your waist, preferably at about head height, so that people can easily see your face and your gestures.
→→ Practise.

SUMMARY

EXPLAIN HOW THE RESEARCH REPORT IS THE CRUCIAL MEANS FOR COMMUNICATING THE WHOLE RESEARCH PROJECT

The research report may be the only part of the project that others ever see. The presentation of the report reflects the perceived quality of the research, so the report must be organised, concise and clear in language, with a minimum of unfamiliar jargon. Only then will managers see value in the research report and take action based on its findings.

DISCUSS THE RESEARCH REPORT FROM A COMMUNICATIONS MODEL PERSPECTIVE

The communications model shows the process by which one person or source sends a message through a communications medium to an audience or receiver and then receives feedback about the message.

DEFINE THE TERM ‘RESEARCH REPORT’

A research report is an oral presentation and/or written statement, the purpose of which is to communicate research results, strategic recommendations and/or other conclusions to management or other specific audiences. Conclusions and recommendations should relate directly, and only, to the findings of the research carried out.

OUTLINE THE RESEARCH REPORT FORMAT AND ITS PARTS

The report format should be varied to suit the level of formality of the particular situation. The body of the research report always has the same five components in the same order:
1 Background
2 What we currently know, concluding with what we don’t know
3 Research method
4 Results
5 Conclusions.

EXPLAIN HOW TO PRESENT QUALITATIVE RESEARCH RESULTS

Qualitative research reports focus on learning ‘what’ and ‘why’ about people’s thoughts and behaviour. Results are presented as a set of themes, which are key ideas drawn from the researcher’s interpretation of the data. Usually the themes are accompanied by verbatim quotes from participants.

DISCUSS THE IMPORTANCE OF USING GRAPHICS IN RESEARCH REPORTING

Graphics are pictures or diagrams used to clarify complex points or emphasise a message. Their first purpose is to communicate; graphics should not be just a colourful way of filling in a page.

EXPLAIN HOW TABLES AND CHARTS ARE USEFUL FOR PRESENTING NUMERICAL INFORMATION AND HOW TO INTERPRET THEIR VARIOUS COMPONENTS

Tables are best used when there is a large amount of quantitative information to present at one time. They allow detailed comparisons of exact numbers. Charts are best used when we want to show a pattern or a relationship.

IDENTIFY THE VARIOUS TYPES OF RESEARCH CHARTS

Different types of data are better presented in different types of graph. Categorical data are best presented in bar charts, while time series data are best presented in line graphs. Continuous or ratio-scaled data are best presented using histograms. Pie charts may be pretty, but they are rarely useful for communicating effectively.

DISCUSS HOW AN ORAL PRESENTATION MAY BE THE MOST EFFICIENT MEANS OF SUPPLEMENTING THE WRITTEN REPORT

For many managers, the personal presentation by the researcher is the only contact with the research and its results. Thus, the oral presentation should be sufficiently detailed and targeted to the audience so that all questions can be answered. The researcher must extract and present the most important key findings in a professional and entertaining way.


KEY TERMS AND CONCEPTS

communication process
graphic aids
oral presentation
report format
research follow-up
research report

QUESTIONS FOR REVIEW AND CRITICAL THINKING

1 Why is it important to think of the research report from a communications perspective?
2 What role do verbatim quotes play in qualitative research reports?
3 How can you demonstrate validity and reliability in a qualitative research report?
4 Go to the Website for the Australia and New Zealand Marketing Academy (ANZMAC) at http://www.anzmac.org. Go to the proceedings of a recent ANZMAC conference and browse some of the papers. (You may find a paper written by one of your lecturers.) Do they meet the standards set forth in this chapter?
5 How does the oral presentation of research differ from the written research report?
6 Think about presentations in the past that you have really enjoyed and from which you learned something valuable. What features made the presentation so good?
7 What ethical concerns arise when you prepare (or read) a report?

ONGOING PROJECT

WRITING UP YOUR FINAL REPORT FOR YOUR RESEARCH STUDY? CONSULT THE CHAPTER 16 PROJECT WORKSHEET FOR HELP

Download the Chapter 16 project worksheet from the CourseMate Website. It outlines a series of steps taken in this chapter to develop a good research report. Present your report in a logical and systematic manner, with graphs and tables that communicate.

COURSEMATE ONLINE STUDY TOOLS

Flip to the start of your textbook and use the tear-out card to log in to CourseMate for Marketing Research. There you can test your understanding and revise chapter concepts with:
☑ case projects
☑ crosswords on key concepts
☑ flashcards
☑ online video activities
☑ videos.

NOTES

1 Anderson, L. (2008) ‘Reflexivity’, in The SAGE Dictionary of Qualitative Management Research, R. Thorpe & R. Holt (eds). DOI: http://dx.doi.org/10.4135/9780857020109
2 Onwuegbuzie, A.J., Dickinson, W.B., Leech, N.L. & Zoran, A.G. (2009) ‘A Qualitative Framework for Collecting and Analyzing Data in Focus Group Research’, International Journal of Qualitative Methods, 8(3).
3 Constas, M.A. (1992) ‘Qualitative analysis as a public event: The documentation of category development procedures’, American Educational Research Journal, 29, 253–266.
4 Tong, A., Sainsbury, P. & Craig, J. (2007) ‘Consolidated criteria for reporting qualitative research (COREQ): a 32-item checklist for interviews and focus groups’, International Journal for Quality in Health Care, 19(6), 349–357. doi:10.1093/intqhc/mzm042
5 Australian Bureau of Statistics (2012) Report 2071.0 – ‘Reflecting a Nation: Stories from the 2011 Census, 2012–2013’. Online: http://www.abs.gov.au/ausstats/abs@.nsf/Lookup/2071.0main+features702012–2013
6 Lees, G., Morrison, A. & Robertson, M. (2014) ‘Duplication of Purchase Law in Sport Event Markets: New Zealand Case Study’, Proceedings of ANZMAC Conference 2014, Griffith University, Brisbane, 1–3 Dec. 2014.
7 Adapted from: Bogomolova, S., Page, W. & Phua, P. (2014) ‘Validating bluetooth logging as metric for shopper behaviour studies’, Proceedings of ANZMAC Conference 2014, Griffith University, Brisbane, 1–3 Dec. 2014.
8 Australian Bureau of Statistics (2011) Business Use of Information Technology, 2009–10, Cat. No. 8167.0 (24 June 2015). Online: https://www.communications.gov.au/departmental-news/new-online-copyright-infringement-research-released, accessed March 2016.
9 TNS Australia (2015) Online Copyright Infringement Research: A Marketing Research Report (263103193), prepared for the Department of Communications (24 June 2015). Online: https://www.communications.gov.au/departmental-news/new-online-copyright-infringement-research-released, accessed March 2016.
10 Kantar Worldpanel (2015) Smartphone OS Sales Market Share. Online: http://www.kantarworldpanel.com/global/smartphone-os-market-share, accessed March 2016.
11 Kantar Worldpanel (2015) Smartphone OS Sales Market Share. Online: http://www.kantarworldpanel.com/global/smartphone-os-market-share, accessed March 2016.
12 Australian Education International (2012) ‘International Student Data’, accessed at www.aei.gov.au in January 2013.
13 TNS Australia (2015) Online Copyright Infringement Research: A Marketing Research Report (263103193), prepared for the Department of Communications (24 June 2015). Online: https://www.communications.gov.au/departmental-news/new-online-copyright-infringement-research-released, accessed March 2016.
14 TNS Australia (2015) Online Copyright Infringement Research: A Marketing Research Report (263103193), prepared for the Department of Communications (24 June 2015). Online: https://www.communications.gov.au/departmental-news/new-online-copyright-infringement-research-released, accessed March 2016.
15 Australian Education International (2012) ‘International Student Data’, accessed at www.aei.gov.au in January 2013.
16 ‘A speech tip’, Communication Briefings, 14(2), p. 3.
17 TEDx Speaker’s Guide. Online: http://www.storage.ted.com/tedx/manuals/tedx_speaker_guide.pdf; also see Anderson, C. (2016) TED Talks: The Official TED Guide to Public Speaking, Houghton Mifflin Harcourt.
18 Based on Mintz, O. & Currim, I.S. (2013) ‘What drives managerial use of marketing and financial metrics and does metric use impact performance of marketing mix activities?’, Journal of Marketing, 77 (March), pp. 17–40.
19 See, for example, Tufte, E. (2003) ‘PowerPoint is evil: Power corrupts. PowerPoint corrupts absolutely’, Wired Magazine, 11.09, September, accessed at www.wired.com/wired/archive/11.09/ppt2.html March 2016; Hill, A., Arford, T., Lubitow, A. & Smollin, L.M. (2012) ‘I’m Ambivalent about It: The Dilemmas of PowerPoint’, Teaching Sociology, 40 (July), pp. 242–56.
20 See: Alley, M. (2013) The Craft of Scientific Presentations, 2nd edn, New York: Springer-Verlag; & Alley, M. (2013) ‘Scientific Presentations: The Assertion-Evidence Approach’. Online: http://writing.engr.psu.edu/speaking.html, University Park: Penn State, accessed March 2016.
21 See DeFinis (2010) ‘Use Gestures to Engage your Audience’, DeFinis Communications, accessed at http://www.definiscommunications.com on 11 October 2012.

APPENDIX A » STATISTICAL TABLES

TABLE A.1 » RANDOM DIGITS

37751

04998

66038

63480

98442

22245

83538

62351

74514

90497

50915

64152

82981

15796

27102

71635

34470

13608

26360

76285

99142

35021

01032

57907

80545

54112

15150

36856

03247

40392

70720

10033

25191

62358

03784

74377

88150

25567

87457

49512

18460

64947

32958

08752

96366

89092

23597

74308

00881

88976

65763

41133

60950

35372

06782

81451

78764

52645

19841

50083

83769

52570

60133

25211

87384

90182

84990

26400

39128

97043

58900

78420

98579

33665

10718

39342

46346

14401

13503

46525

54746

71115

78219

64314

11227

41702

54517

87676

14078

45317

56819

27340

07200

52663

57864

85159

15460

97564

29637

27742

34990

62122

38223

28526

37006

22774

46026

15981

87291

56946 64519

02269

22795

87593

81830

95383

67823

20196

54850

46779

43042

53600

45738

00261

31100

67239

02004

70698

53597

62617

92565

12211

06868

87786

59576

61382

33972

13161

47208

96604

67424

32620

60841

86848

85000

04835

48576

33884

10101

84129

04015

77148

09535

10743

97871

55919

45274

38304

93125

91847

85226

19763

46105

25289

26714

73253

85922

21785

42624

92741

03360

07457

75131

41209

50451

23472

07438

08375

29312

62264

72460

99682

27970

25632

34096

17656

12736

27476

21938

67305

66960

55780

71778

52629

51692

71442

36130

70425

39874

62035 53950

14824

95631

00697

65462

24815

13930

02938

54619

28909

34001

05618

41900

23303

19928

60755

61404

56947

91441

19299

77718

83830

29781

72917

10840

74182

08293

62588

99625

22088

60930

05091

35726

07414

49211

69586

20226

08274

28167

65279

94180

62151

08112

26646

07617

42954

22521

09395

43561

45692

81073

85543

47650

93830

07377

87995

35084

39386

93141

88309

18467

39689

60801

46828

38670

88243

89042

78452

08032

72566

60643

59399

79740

17295

50094

66436

92677

68345

24025

36489 08325

73372

61697

85728

90779

13235

83114

70728

32093

74306

18395

18482

83245

54942

51905

09534

70839

91073

42193

81199

07261

28720

71244

05064

84873

68020

39037

68981

00670

86291 25605

61679

81529

83725

33269

45958

74265

87460

60525

42539

11815

48679

00556

96871

39835

83055

84949

11681

51687

55896

99007

35050

86440

44280

20320

97527

28138

01088

49037

85430

06446

65608

79291

16624

06135

30622

56133

33998

32308

29434


TABLE A.2 » AREA UNDER THE NORMAL CURVE

(Entries give the area under the standard normal curve between the mean, 0, and Z.)

Z     .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
0.0  .0000  .0040  .0080  .0120  .0160  .0199  .0239  .0279  .0319  .0359
0.1  .0398  .0438  .0478  .0517  .0557  .0596  .0636  .0675  .0714  .0753
0.2  .0793  .0832  .0871  .0910  .0948  .0987  .1026  .1064  .1103  .1141
0.3  .1179  .1217  .1255  .1293  .1331  .1368  .1406  .1443  .1480  .1517
0.4  .1554  .1591  .1628  .1664  .1700  .1736  .1772  .1808  .1844  .1879
0.5  .1915  .1950  .1985  .2019  .2054  .2088  .2123  .2157  .2190  .2224
0.6  .2257  .2291  .2324  .2357  .2389  .2422  .2454  .2486  .2518  .2549
0.7  .2580  .2612  .2642  .2673  .2704  .2734  .2764  .2794  .2823  .2852
0.8  .2881  .2910  .2939  .2967  .2995  .3023  .3051  .3078  .3106  .3133
0.9  .3159  .3186  .3212  .3238  .3264  .3289  .3315  .3340  .3365  .3389
1.0  .3413  .3438  .3461  .3485  .3508  .3531  .3554  .3577  .3599  .3621
1.1  .3643  .3665  .3686  .3708  .3729  .3749  .3770  .3790  .3810  .3830
1.2  .3849  .3869  .3888  .3907  .3925  .3944  .3962  .3980  .3997  .4015
1.3  .4032  .4049  .4066  .4082  .4099  .4115  .4131  .4147  .4162  .4177
1.4  .4192  .4207  .4222  .4236  .4251  .4265  .4279  .4292  .4306  .4319
1.5  .4332  .4345  .4357  .4370  .4382  .4394  .4406  .4418  .4429  .4441
1.6  .4452  .4463  .4474  .4484  .4495  .4505  .4515  .4525  .4535  .4545
1.7  .4554  .4564  .4573  .4582  .4591  .4599  .4608  .4616  .4625  .4633
1.8  .4641  .4649  .4656  .4664  .4671  .4678  .4686  .4693  .4699  .4706
1.9  .4713  .4719  .4726  .4732  .4738  .4744  .4750  .4756  .4761  .4767
2.0  .4772  .4778  .4783  .4788  .4793  .4798  .4803  .4808  .4812  .4817
2.1  .4821  .4826  .4830  .4834  .4838  .4842  .4846  .4850  .4854  .4857
2.2  .4861  .4864  .4868  .4871  .4875  .4878  .4881  .4884  .4887  .4890
2.3  .4893  .4896  .4898  .4901  .4904  .4906  .4909  .4911  .4913  .4916
2.4  .4918  .4920  .4922  .4925  .4927  .4929  .4931  .4932  .4934  .4936
2.5  .4938  .4940  .4941  .4943  .4945  .4946  .4948  .4949  .4951  .4952
2.6  .4953  .4955  .4956  .4957  .4959  .4960  .4961  .4962  .4963  .4964
2.7  .4965  .4966  .4967  .4968  .4969  .4970  .4971  .4972  .4973  .4974
2.8  .4974  .4975  .4976  .4977  .4977  .4978  .4979  .4979  .4980  .4981
2.9  .4981  .4982  .4982  .4983  .4984  .4984  .4985  .4985  .4986  .4986
3.0  .49865 .4987  .4987  .4988  .4988  .4989  .4989  .4989  .4990  .4990
4.0  .49997


TABLE A.3 » DISTRIBUTION OF t FOR GIVEN PROBABILITY LEVELS

          Level of significance for one-tailed test
          .10      .05      .025     .01      .005     .0005
          Level of significance for two-tailed test
d.f.      .20      .10      .05      .02      .01      .001
1         3.078    6.314    12.706   31.821   63.657   636.619
2         1.886    2.920    4.303    6.965    9.925    31.598
3         1.638    2.353    3.182    4.541    5.841    12.941
4         1.533    2.132    2.776    3.747    4.604    8.610
5         1.476    2.015    2.571    3.365    4.032    6.859
6         1.440    1.943    2.447    3.143    3.707    5.959
7         1.415    1.895    2.365    2.998    3.499    5.405
8         1.397    1.860    2.306    2.896    3.355    5.041
9         1.383    1.833    2.262    2.821    3.250    4.781
10        1.372    1.812    2.228    2.764    3.169    4.587
11        1.363    1.796    2.201    2.718    3.106    4.437
12        1.356    1.782    2.179    2.681    3.055    4.318
13        1.350    1.771    2.160    2.650    3.012    4.221
14        1.345    1.761    2.145    2.624    2.977    4.140
15        1.341    1.753    2.131    2.602    2.947    4.073
16        1.337    1.746    2.120    2.583    2.921    4.015
17        1.333    1.740    2.110    2.567    2.898    3.965
18        1.330    1.734    2.101    2.552    2.878    3.922
19        1.328    1.729    2.093    2.539    2.861    3.883
20        1.325    1.725    2.086    2.528    2.845    3.850
21        1.323    1.721    2.080    2.518    2.831    3.819
22        1.321    1.717    2.074    2.508    2.819    3.792
23        1.319    1.714    2.069    2.500    2.807    3.767
24        1.318    1.711    2.064    2.492    2.797    3.745
25        1.316    1.708    2.060    2.485    2.787    3.725
26        1.315    1.706    2.056    2.479    2.779    3.707
27        1.314    1.703    2.052    2.473    2.771    3.690
28        1.313    1.701    2.048    2.467    2.763    3.674
29        1.311    1.699    2.045    2.462    2.756    3.659
30        1.310    1.697    2.042    2.457    2.750    3.646
40        1.303    1.684    2.021    2.423    2.704    3.551
60        1.296    1.671    2.000    2.390    2.660    3.460
120       1.289    1.658    1.980    2.358    2.617    3.373
∞         1.282    1.645    1.960    2.326    2.576    3.291


TABLE A.4 » CHI-SQUARE DISTRIBUTION

Critical values for the area in the shaded right tail (α)

Degrees of freedom (d.f.)    .10        .05        .01
1                            2.706      3.841      6.635
2                            4.605      5.991      9.210
3                            6.251      7.815      11.345
4                            7.779      9.488      13.277
5                            9.236      11.070     15.086
6                            10.645     12.592     16.812
7                            12.017     14.067     18.475
8                            13.362     15.507     20.090
9                            14.684     16.919     21.666
10                           15.987     18.307     23.209
11                           17.275     19.675     24.725
12                           18.549     21.026     26.217
13                           19.812     22.362     27.688
14                           21.064     23.685     29.141
15                           22.307     24.996     30.578
16                           23.542     26.296     32.000
17                           24.769     27.587     33.409
18                           25.989     28.869     34.805
19                           27.204     30.144     36.191
20                           28.412     31.410     37.566
21                           29.615     32.671     38.932
22                           30.813     33.924     40.289
23                           32.007     35.172     41.638
24                           33.196     36.415     42.980
25                           34.382     37.652     44.314
26                           35.563     38.885     45.642
27                           36.741     40.113     46.963
28                           37.916     41.337     48.278
29                           39.087     42.557     49.588
30                           40.256     43.773     50.892

How to use this table, an example: in a chi-square distribution with 6 degrees of freedom (d.f.), the area to the right of a critical value of 12.592 – that is, the α area – is .05.
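If you have Python available, the critical values in Tables A.2 to A.6 can also be reproduced (and extended to probability levels not printed here) with scipy.stats rather than read from the printed pages. The snippet below is a small sketch of that idea and is not part of the original appendix.

```python
from scipy import stats

# Area under the standard normal curve between 0 and Z (Table A.2).
print(stats.norm.cdf(1.96) - 0.5)            # ~0.4750

# Two-tailed t critical value, alpha = .05, 20 d.f. (Table A.3).
print(stats.t.ppf(1 - 0.05 / 2, df=20))      # ~2.086

# Chi-square critical value, right-tail area .05, 6 d.f. (Table A.4).
print(stats.chi2.ppf(1 - 0.05, df=6))        # ~12.592

# F critical value, alpha = .05, 4 and 20 degrees of freedom (Table A.5).
print(stats.f.ppf(1 - 0.05, dfn=4, dfd=20))  # ~2.87
```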


TABLE A.5 »

CRITICAL VALUES OF F(v1, v2) FOR α = .05

[Diagram: F distribution with the right-tail area of .05 shaded beyond the critical value F.]

v2 = Degrees of freedom for denominator

v1 = Degrees of freedom for numerator

1

2

3

4

5

6

7

8

9

10

12

15

20

24

30

40

60

120

`

254

1

161

200

216

225

230

234

237

239

241

242

244

246

248

249

250

251

252

253

2

18.5

19.0

19.2

19.2

19.3

19.3

19.4

19.4

19.4

19.4

19.4

19.4

19.5

19.5

19.5

19.5

19.5

19.5

19.5

3

10.1

9.55

9.28

9.12

9.01

8.94

8.89

8.85

8.81

8.79

8.74

8.70

8.66

8.64

8.62

8.59

8.57

8.55

8.53

4

7.71

6.94

6.59

6.39

6.26

6.16

6.09

6.04

6.00

5.96

5.91

5.86

5.80

5.77

5.75

5.72

5.69

5.66

5.63

5

6.61

5.79

5.41

5.19

5.05

4.95

4.88

4.82

4.77

4.74

4.68

4.62

4.56

4.53

4.50

4.46

4.43

4.40

4.37

6

5.99

5.14

4.76

4.53

4.39

4.28

4.21

4.15

4.10

4.06

4.00

3.94

3.87

3.84

3.81

3.77

3.74

3.70

3.67

7

5.59

4.74

4.35

4.12

3.97

3.87

3.79

3.73

3.68

3.64

3.57

3.51

3.44

3.41

3.38

3.34

3.30

3.27

3.23

8

5.32

4.46

4.07

3.84

3.69

3.58

3.50

3.44

3.39

3.35

3.28

3.22

3.15

3.12

3.08

3.04

3.01

2.97

2.93

9

5.12

4.26

3.86

3.63

3.48

3.37

3.29

3.23

3.18

3.14

3.07

3.01

2.94

2.90

2.86

2.83

2.79

2.75

2.71

10

4.96

4.10

3.71

3.48

3.33

3.22

3.14

3.07

3.02

2.98

2.91

2.85

2.77

2.74

2.70

2.66

2.62

2.58

2.54

11

4.84

3.98

3.59

3.36

3.20

3.09

3.01

2.95

2.90

2.85

2.79

2.72

2.65

2.61

2.57

2.53

2.49

2.45

2.40

12

4.75

3.89

3.49

3.26

3.11

3.00

2.91

2.85

2.80

2.75

2.69

2.62

2.54

2.51

2.47

2.43

2.38

2.34

2.30 2.21

13

4.67

3.81

3.41

3.18

3.03

2.92

2.83

2.77

2.71

2.67

2.60

2.53

2.46

2.42

2.38

2.34

2.30

2.25

14

4.60

3.74

3.34

3.11

2.96

2.85

2.76

2.70

2.65

2.60

2.53

2.46

2.39

2.35

2.31

2.27

2.22

2.18

2.13

15

4.54

3.68

3.29

3.06

2.90

2.79

2.71

2.64

2.59

2.54

2.48

2.40

2.33

2.29

2.25

2.20

2.16

2.11

2.07

16

4.49

3.63

3.24

3.01

2.85

2.74

2.66

2.59

2.54

2.49

2.42

2.35

2.28

2.24

2.19

2.15

2.11

2.06

2.01

17

4.45

3.59

3.20

2.96

2.81

2.70

2.61

2.55

2.49

2.45

2.38

2.31

2.23

2.19

2.15

2.10

2.06

2.01

1.96

18

4.41

3.55

3.16

2.93

2.77

2.66

2.58

2.51

2.46

2.41

2.34

2.27

2.19

2.15

2.11

2.06

2.02

1.97

1.92

19

4.38

3.52

3.13

2.90

2.74

2.63

2.54

2.48

2.42

2.38

2.31

2.23

2.16

2.11

2.07

2.03

1.98

1.93

1.88

20

4.35

3.49

3.10

2.87

2.71

2.60

2.51

2.45

2.39

2.35

2.28

2.20

2.12

2.08

2.04

1.99

1.95

1.90

1.84

21

4.32

3.47

3.07

2.84

2.68

2.57

2.49

2.42

2.37

2.32

2.25

2.18

2.10

2.05

2.01

1.96

1.92

1.87

1.81 1.78

22

4.30

3.44

3.05

2.82

2.66

2.55

2.46

2.40

2.34

2.30

2.23

2.15

2.07

2.03

1.98

1.94

1.89

1.84

23

4.28

3.42

3.03

2.80

2.64

2.53

2.44

2.37

2.32

2.27

2.20

2.13

2.05

2.01

1.96

1.91

1.86

1.81

1.76

24

4.26

3.40

3.01

2.78

2.62

2.51

2.42

2.36

2.30

2.25

2.18

2.11

2.03

1.98

1.94

1.89

1.84

1.79

1.73

25

4.24

3.39

2.99

2.76

2.60

2.49

2.40

2.34

2.28

2.24

2.16

2.09

2.01

1.96

1.92

1.87

1.82

1.77

1.71

30

4.17

3.32

2.92

2.69

2.53

2.42

2.33

2.27

2.21

2.16

2.09

2.01

1.93

1.89

1.84

1.79

1.74

1.68

1.62

40

4.08

3.23

2.84

2.61

2.45

2.34

2.25

2.18

2.12

2.08

2.00

1.92

1.84

1.79

1.74

1.69

1.64

1.58

1.51 1.39

60

4.00

3.15

2.76

2.53

2.37

2.25

2.17

2.10

2.04

1.99

1.92

1.84

1.75

1.70

1.65

1.59

1.53

1.47

120

3.92

3.07

2.68

2.45

2.29

2.18

2.09

2.02

1.96

1.91

1.83

1.75

1.66

1.61

1.55

1.50

1.43

1.35

1.25



3.84

3.00

2.60

2.37

2.21

2.10

2.01

1.94

1.88

1.83

1.75

1.67

1.57

1.52

1.46

1.39

1.32

1.22

1.00


TABLE A.6 »


CRITICAL VALUES OF F(v1, v2) FOR α = .01

[Diagram: F distribution with the shaded right-tail area beyond the critical value F.]

v1 = Degrees of freedom for numerator

1

v2 = Degrees of freedom for denominator

1

2

3

4

5

6

7

8

9

10

12

4052 5000 5403 5625 5764 5859 5928 5982 6023 6056 6106

15 6157

20

24

30

40

60

120



6209 6235 6261 6287 6313 6339 6366

2

98.5

99.0

99.2

99.2

99.3

99.3

99.4

99.4

99.4

99.4

99.4

99.4

99.4

99.5

99.5

99.5

99.5

99.5

99.5

3

34.1

30.8

29.5

28.7

28.2

27.9

27.7

27.5

27.3

27.2

27.1

26.9

26.7

26.6

26.5

26.4

26.3

26.2

26.1

4

21.2

18.0

16.7

16.0

15.5

15.2

15.0

14.8

14.7

14.5

14.4

14.2

14.0

13.9

13.8

13.7

13.7

13.6

13.5

5

16.3

13.3

12.1

11.4

11.0

10.7

10.5

10.3

10.2

10.1

9.89

9.72

9.55

9.47

9.38

9.29

9.20

9.11

9.02

6

13.7

10.9

9.78

9.15

8.75

8.47

8.26

8.10

7.98

7.87

7.72

7.56

7.40

7.31

7.23

7.14

7.06

6.97

6.88

7

12.2

9.55

8.45

7.85

7.46

7.19

6.99

6.84

6.72

6.62

6.47

6.31

6.16

6.07

5.99

5.91

5.82

5.74

5.65

8

11.3

8.65

7.59

7.01

6.63

6.37

6.18

6.03

5.91

5.81

5.67

5.52

5.36

5.28

5.20

5.12

5.03

4.95

4.86

9

10.6

8.02

6.99

6.42

6.06

5.80

5.61

5.47

5.35

5.26

5.11

4.96

4.81

4.73

4.65

4.57

4.48

4.40

4.31

10

10.0

7.56

6.55

5.99

5.64

5.39

5.20

5.06

4.94

4.85

4.71

4.56

4.41

4.33

4.25

4.17

4.08 4.00

3.91

11

9.65

7.21

6.22

5.67

5.32

5.07

4.89

4.74

4.63

4.54

4.40

4.25

4.10

4.02

3.94

3.86

3.78

3.69

3.60

12

9.33

6.93

5.95

5.41

5.06

4.82

4.64

4.50

4.39

4.30

4.16

4.01

3.86

3.78

3.70

3.62

3.54

3.45

3.36

13

9.07

6.70

5.74

5.21

4.86

4.62

4.44

4.30

4.19

4.10

3.96

3.82

3.66

3.59

3.51

3.43

3.34

3.25

3.17

14

8.86

6.51

5.56

5.04

4.70

4.46

4.28

4.14

4.03

3.94

3.80

3.66

3.51

3.43

3.35

3.27

3.18

3.09

3.00 2.87

15

8.68

6.36

5.42

4.89

4.56

4.32

4.14

4.00

3.89

3.80

3.67

3.52

3.37

3.29

3.21

3.13

3.05

2.96

16

8.53

6.23

5.29

4.77

4.44

4.20

4.03

3.89

3.78

3.69

3.55

3.41

3.26

3.18

3.10

3.02

2.93

2.84

2.75

17

8.40

6.11

5.19

4.67

4.34

4.10

3.93

3.79

3.68

3.59

3.46

3.31

3.16

3.08

3.00

2.92

2.83

2.75

2.65

18

8.29

6.01

5.09

4.58

4.25

4.01

3.84

3.71

3.60

3.51

3.37

3.23

3.08

3.00

2.92

2.84

2.75

2.66

2.57

19

8.19

5.93

5.01

4.50

4.17

3.94

3.77

3.63

3.52

3.43

3.30

3.15

3.00

2.92

2.84

2.76

2.67

2.58

2.49

20

8.10

5.85

4.94

4.43

4.10

3.87

3.70

3.56

3.46

3.37

3.23

3.09

2.94

2.86

2.78

2.69

2.61

2.52

2.42

21

8.02

5.78

4.87

4.37

4.04

3.81

3.64

3.51

3.40

3.31

3.17

3.03

2.88

2.80

2.72

2.64

2.55

2.46

2.36

22

7.96

5.72

4.82

4.31

3.99

3.76

3.59

3.45

3.35

3.26

3.12

2.98

2.83

2.75

2.67

2.58

2.50

2.40

2.31

23

7.88

5.66

4.76

4.26

3.94

3.71

3.54

3.41

3.30

3.21

3.07

2.93

2.78

2.70

2.62

2.54

2.45

2.35

2.26

24

7.82

5.61

4.72

4.22

3.90

3.67

3.50

3.36

3.26

3.17

3.03

2.89

2.74

2.66

2.58

2.49

2.40

2.31

2.21

25

7.77

5.57

4.68

4.18

3.86

3.63

3.46

3.32

3.22

3.13

2.99

2.85

2.70

2.62

2.53

2.45

2.36

2.27

2.17

30

7.58

5.39

4.51

4.02

3.70

3.47

3.30

3.17

3.07

2.98

2.84

2.70

2.55

2.47

2.39

2.30

2.21

2.11

2.01

40

7.31

5.18

4.31

3.83

3.51

3.29

3.12

2.99

2.89

2.80

2.66

2.52

2.37

2.29

2.20

2.11

2.02

1.92

1.80

60

7.08

4.98

4.13

3.65

3.34

3.12

2.95

2.82

2.72

2.63

2.50

2.35

2.20

2.12

2.03

1.94

1.84

1.73

1.60

120

6.85

4.79

3.95

3.48

3.17

2.96

2.79

2.66

2.56

2.47

2.34

2.19

2.03

1.95

1.86

1.76

1.66

1.53

1.38



6.63

4.61

3.78

3.32

3.02

2.80

2.64

2.51

2.41

2.32

2.18

2.04

1.88

1.79

1.70

1.59

1.47

1.32

1.00


TABLE A.7 » CRITICAL VALUES OF THE PEARSON CORRELATION COEFFICIENT

          Level of significance for one-tailed test
          .05      .025     .01      .005
          Level of significance for two-tailed test
d.f.      .10      .05      .02      .01
1         .988     .997     .9995    .9999
2         .900     .950     .980     .990
3         .805     .878     .934     .959
4         .729     .811     .882     .917
5         .669     .754     .833     .874
6         .622     .707     .789     .834
7         .582     .666     .750     .798
8         .549     .632     .716     .765
9         .521     .602     .685     .735
10        .497     .576     .658     .708
11        .476     .553     .634     .684
12        .458     .532     .612     .661
13        .441     .514     .592     .641
14        .426     .497     .574     .623
15        .412     .482     .558     .606
16        .400     .468     .542     .590
17        .389     .456     .528     .575
18        .378     .444     .516     .561
19        .369     .433     .503     .549
20        .360     .423     .492     .537
21        .352     .413     .482     .526
22        .344     .404     .472     .515
23        .337     .396     .462     .505
24        .330     .388     .453     .496
25        .323     .381     .445     .487
26        .317     .374     .437     .479
27        .311     .367     .430     .471
28        .306     .361     .423     .463
29        .301     .355     .416     .456
30        .296     .349     .409     .449
35        .275     .325     .381     .418
40        .257     .304     .358     .393
45        .243     .288     .338     .372
50        .231     .273     .322     .354
60        .211     .250     .295     .325
70        .195     .232     .274     .303
80        .183     .217     .256     .283
90        .173     .205     .242     .267
100       .164     .195     .230     .254

TABLE A.8 » CRITICAL VALUES OF T IN THE WILCOXON MATCHED-PAIRS SIGNED-RANKS TEST

          Level of significance for two-tailed test
N         .05      .02      .01
6         1        –        –
7         2        0        –
8         4        2        0
9         6        3        2
10        8        5        3
11        11       7        5
12        14       10       7
13        17       13       10
14        21       16       13
15        25       20       16
16        30       24       19
17        35       28       23
18        40       33       28
19        46       38       32
20        52       43       37
21        59       49       43
22        66       56       49
23        73       62       55
24        81       69       61
25        90       77       68
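The critical r values in Table A.7 are derived from the t distribution in Table A.3; if you ever need a number of degrees of freedom that is not printed, the standard conversion r = t / sqrt(t² + d.f.) reproduces them. The short Python sketch below is an illustrative aid, not part of the original appendix.

```python
import math
from scipy import stats

def critical_r(df: int, alpha: float = 0.05) -> float:
    """Two-tailed critical value of Pearson's r, where df = n - 2."""
    t = stats.t.ppf(1 - alpha / 2, df)   # two-tailed t critical value (Table A.3)
    return t / math.sqrt(t ** 2 + df)    # convert the t critical value to r

print(round(critical_r(10, 0.05), 3))    # 0.576, matching Table A.7 at d.f. = 10
print(round(critical_r(28, 0.01), 3))    # 0.463, matching Table A.7 at d.f. = 28
```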

GLOSSARY

A

a priori codes

Codes created before data have been gathered and examined, usually drawn from existing theory and current knowledge of the research domain. acquiescence bias

A category of response bias that results because some individuals tend to agree with all questions or to concur with a particular position. administrative error

An error caused by the improper administration or execution of the research task. alternative hypothesis

A statement indicating the opposite of the null hypothesis. analysis of variance (ANOVA)

Analysis involving the investigation of the effects of one treatment variable on an interval-scaled dependent variable; a hypothesis-testing technique to determine whether statistically significant differences in means occur between three or more groups. applied research

Research conducted when a decision must be made about a real life problem. at-home scanning system

A system that allows consumer panellists to perform their own scanning after taking home products, using hand-held wands that read UPC symbols. attitude

An enduring disposition to consistently respond in a given manner to various aspects of the world; composed of affective, cognitive and behavioural components. attribute

A single characteristic or fundamental feature of an object, person, situation or issue. auspices bias

basic experimental design

central-limit theorem

basic (pure) research

central location interviewing

An experimental design in which a single independent variable is manipulated to measure its effect on another single dependent variable. Research conducted to expand the limits of knowledge, to verify the acceptability of a given theory or to learn more about a certain concept. behavioural differential

A rating scale instrument similar to a semantic differential, developed to measure the behavioural intentions of subjects towards future actions. binary logistic regression

Establishes a rule for forecasting the value of a binary dependent variable from a combination of two or more metric independent variables. bivariate linear regression

A measure of linear association that investigates straightline relationships of the type Y 5 a 1 bX, where Y is the dependent variable, X is the independent variable, and a and b are two constants to be estimated. blinding

A technique used to control subjects’ knowledge of whether or not they have been given a particular experimental treatment. branch question

A filter question used to determine which version of a second question will be asked.

checklist question

A fixed-alternative question that allows the respondent to provide multiple answers to a single question by ticking off items. Chi-square ( x2 ) test

A hypothesis test that allows for investigation of statistical significance in the analysis of a frequency distribution. Chi-square test for independence

A test that statistically analyses significance in a joint frequency distribution. choice

A measurement task that identifies preferences by requiring respondents to choose between two or more alternatives. choice modelling

cluster analysis

An attempt to recontact an individual selected for a sample who was not available initially. case study method

The qualitative research technique that intensively investigates one or a few situations similar to the problem situation.

category scale

A fixed-alternative rating scale with an equal number of positive and negative categories; a neutral point or point of indifference is at the centre of the scale.

In an Internet questionnaire, a small graphic box, next to an answer, that a respondent clicks on to choose that answer; typically, a tick or an X appears in the box when the respondent clicks on it.

C

B

balanced rating scale

check box

callback

categorical (classificatory) variable

Taking a questionnaire that has previously been translated into another language and having a second, independent translator translate it back to the original language.

Telephone interviews conducted from a central location using lines at fixed charges.

Closely related to conjoint analysis, but focuses on the patterns of choices made by respondents instead of ratings data.

Bias in the responses of subjects caused by their being influenced by the organisation conducting the study. back translation

The theory that, as sample size increases, the distribution of sample means of size n, randomly selected, approaches a normal distribution.

A variable that has a limited number of distinct values. A rating scale that consists of several response categories, often providing respondents with alternatives to indicate positions on a continuum. causal research

Research conducted to identify cause-and-effect relationships among variables. census

An investigation of all the individual elements that make up a population.

A body of techniques for classifying individuals or objects into a small number of mutually exclusive groups, ensuring that there will be as much likeness within groups and as much difference among groups as possible. cluster sampling

An economically efficient sampling technique in which the primary sampling unit is not the individual element in the population but a large cluster of elements; clusters are selected randomly. code book

A book that identifies each variable in a study and gives the variable’s description, code name and position in the data matrix. codes

Rules for interpreting, classifying and recording data in the coding process; also, the actual numerical or other character symbols assigned to raw data.


coding

The process of assigning a numerical score or other character symbol to previously edited data in preparation for analysis. coefficient of determination (r2)

A measure obtained by squaring the correlation coefficient; that proportion of the total variance of a variable that is accounted for by knowing the value of another variable. cohort effect

A change in the dependent variable that occurs because members of one experimental group experience different historical situations from members of other experimental groups. communication process

The process by which one person or source sends a message to an audience or receiver and then receives feedback about the message. comparative rating scale

Any measure of attitudes that asks respondents to rate a concept in comparison with a benchmark explicitly used as a frame of reference. completely randomised design

An experimental design that uses a random process to assign subjects (test units) to treatments to investigate the effects of only one independent variable. compromise design

An approximation of an experimental design, which may fall short of the requirements of random assignment of subjects or treatments to groups. computer-assisted telephone interview (CATI)

Technology that allows answers to telephone interviews to be entered directly into a computer for processing.

confidence level

A percentage or decimal value that shows how confident a researcher can be about being correct. It states the long run percentage of confidence intervals that will include the true population mean. conjoint analysis

A range of techniques for inferring the relative importance of product attributes by decomposing overall evaluations of different patterns of stimuli. constancy of conditions

A situation in which subjects in experimental groups and control groups are exposed to situations identical except for differing conditions of the independent variable. constant comparison

The process of routinely comparing each new piece of text with other similarly coded pieces, and their contexts, to ensure consistent coding and to help discover new themes. constant error

An error that occurs in the same experimental condition every time the basic experiment is repeated; a systematic bias. constant-sum scale

A measure of attitudes in which respondents are asked to divide a constant sum to indicate the relative importance of attributes; respondents often sort cards, but the task may also be a rating task. construct validity

The ability of a measure to provide empirical evidence consistent with a theory based on the concepts. consumer panel

concept

A generalised idea about a class of objects, attributes, occurrences or processes.

A longitudinal survey of the same sample of individuals or households to record their attitudes, behaviour or purchasing habits over time.

concept testing

content analysis

Any qualitative research procedure that tests some sort of stimulus as a proxy for an idea about a new, revised or repositioned product service or strategy. conceptual definition

A verbal explanation of the meaning of a concept. It defines what the concept is and what it is not. conclusions and report preparation stage

The stage in which the researcher interprets information and draws conclusions to be communicated to decision-makers. confidence interval estimate

A specified range of numbers within which a population mean is expected to lie; an estimate of the population mean based on the knowledge that it will be equal to the sample mean plus or minus a small sampling error.

The systematic observation and quantitative description of the manifest content of communication.

so that the test marketer can be guaranteed distribution. controlled store test

A hybrid between a laboratory experiment and a test market; test products are sold in a small number of selected stores to actual customers. convenience sampling

The sampling procedure of obtaining those people or units that are most conveniently available. correlation matrix

The standard form for reporting correlational results. counterbiasing statement

An introductory statement or preamble to a potentially embarrassing question that reduces a respondent’s reluctance to answer by suggesting that certain behaviour is not unusual. cover letter

A letter that accompanies a questionnaire to induce the reader to complete and return the questionnaire. criterion validity

The ability of a measure to correlate with other standard measures of the same construct or established criterion. critical values

The values that lie exactly on the boundary of the region of rejection. cross-check

The comparison of data from one source with data from another source to determine the similarity of independent projects. cross-sectional study

A study in which various segments of a population are sampled and data are collected at a single moment in time. customer discovery

Involves mining data to look for patterns identifying who is likely to be a valuable customer.

D data cleaning

A variable that has an infinite number of possible values.

The process used to determine inaccurate, incomplete or unreasonable data and then improving the quality through correction of detected errors and omissions.

contrived observation

data conversion

control group

data-gathering stage

continuous variable

Observation in which the investigator creates an artificial environment in order to test a hypothesis. The group of subjects exposed to the control condition in an experiment – that is, not exposed to the experimental treatment. control method of test marketing

A ‘minimarket test’ using forced distribution in a small city. Retailers are paid for shelf space

The process of changing the original form of the data to a format suitable to achieve the research objective; also called data transformation. The stage in which the researcher collects the data. data matrix

A rectangular arrangement of data in rows and columns. Typically, records are in the rows and the variables are in the columns.


data mining

The use of powerful computers to dig through volumes of data to discover patterns about an organisation’s customers and products. It is a broad term that applies to many different forms of analysis. data processing and analysis stage

The stage in which the researcher performs several interrelated procedures to convert the data into a format that will answer management’s questions. data-processing error

A category of administrative error that occurs because of incorrect data entry, incorrect computer programming or other procedural errors during data analysis. database marketing

The use of customer databases to promote oneto-one relationships with customers and create precisely targeted promotions. debriefing

The process of providing subjects with all pertinent facts about the nature and purpose of an experiment after its completion. decision-makers’ objectives

Managerial goals expressed in measurable terms. degrees of freedom

The number of observations minus the number of constraints or assumptions needed to calculate a statistical term. demand characteristics

Experimental design procedures or situational aspects of an experiment that provide unintentional hints about the experimenter’s hypothesis to subjects. dependent variable

A criterion or variable to be predicted or explained. The criterion or standard by which the results of an experiment are judged; a variable expected to be dependent on the experimenter’s manipulation of the independent variable. depth interview

A relatively unstructured, extensive interview in which the interviewer asks many questions and probes for in-depth answers. descriptive research

Research designed to describe characteristics of a population or phenomenon. determinant-choice question

A fixed-alternative question that requires the respondent to choose one response from multiple alternatives. dialogue box

A window that opens on a computer screen to prompt the user to enter information. direct observation

A straightforward attempt to observe and record what naturally occurs; the investigator does not create an artificial situation.

discussion guide

A document prepared by the focus group moderator that contains remarks about the nature of the group and outlines the topics or questions to be addressed. disguised question

An indirect question that assumes the purpose of the study must be hidden from the respondent. disproportional stratified sample

A stratified sample in which the sample size for each stratum is allocated according to analytical considerations. door-to-door interview

A personal interview conducted at respondents’ doorsteps in an effort to increase the participation rate in a survey. double-barrelled question

A question that may induce bias because it covers two issues at once. double-blind design

A technique in which neither the subject nor the experimenter knows which are the experimental and which the controlled conditions. drop-down box

In an Internet questionnaire, a space-saving device that reveals responses when they are needed but otherwise hides them from view. drop-off method

A survey method that requires the interviewer to travel to the respondent’s location to drop off questionnaires that will be picked up later. dummy tables

Representations of the actual tables that will be in the findings section of the final report; used to provide a better understanding of what the actual outcomes of the research will be.

E editing

The process of checking and adjusting data for omissions, legibility and consistency, and readying them for coding and storage. electronic test markets

A system of test marketing that measures results based on Universal Product Code data; often scanner-based consumer panels are combined with high-technology television broadcasting systems to allow experimentation with different advertising messages via split-cable broadcasts or other technology. email surveys

Surveys distributed through electronic mail. emergent codes

sometimes called ‘grounded codes’, appear in the process of reading and examining the data codes created before data have been gathered and examined, usually drawn from existing theory and current knowledge of the research domain.


environmental scanning

Information gathering and fact-finding that is designed to detect indications of environmental changes in their initial stages of development. equivalent-form method

A method that measures the correlation between alternative instruments, designed to be as equivalent as possible, administered to the same group of subjects. error trapping

Using software to control the flow of an Internet questionnaire – for example, to prevent respondents from backing up or failing to answer a question. ethnography

Represents ways of studying cultures through methods that involve becoming highly active within that culture. experiment

A research method in which conditions are controlled so that one or more independent variables can be manipulated to test a hypothesis about a dependent variable. Experimentation allows evaluation of causal relationships among variables while all other variables are eliminated or controlled. experimental group

The group of subjects exposed to the experimental treatment. experimental treatments

Alternative manipulations of the independent variable being investigated. expert interview

A special form of judgement sampling where the researcher decides that a certain group of people have special knowledge or expertise that may substitute for a much larger sample of nonexpert respondents. exploratory factor analysis

A type of analysis used to discern the underlying dimensions or regularity in phenomena. Its general purpose is to summarise the information contained in a large number of variables into a smaller number of factors. exploratory research

Initial research conducted to clarify and define a problem. external data

Data created, recorded or generated by an entity other than the researcher’s organisation. external validity

The ability of an experiment to generalise beyond the experiment data to other subjects or groups in the population under study. extremity bias

A category of response bias that results because some individuals tend to use extremes when responding to questions.


eye-tracking monitor

A mechanical device used to observe eye movements. Some eye monitors use infrared light beams to measure unconscious eye movements.

F

face (or content) validity
Professional agreement that a scale's content logically appears to accurately reflect what was intended to be measured.

factorial design
An experiment that investigates the interaction of two or more independent variables on a single dependent variable.

field
A collection of characters that represents a single type of data, such as the answer to a question.

field editing
Preliminary editing on the same day as the interview to catch technical omissions, check legibility of handwriting and clarify responses that are logically or conceptually inconsistent.

field experiment
An experiment conducted in a natural setting, where complete control of extraneous variables is not possible.

file
A collection of related records, with accompanying information about the nature of the data.

filter question
A question that screens out respondents who are not qualified to answer a second question.

fixed-alternative question
A question in which the respondent is given specific, limited alternative responses and asked to choose the one closest to his own viewpoint.

focus group interview
An unstructured, free-flowing interview with a small group of people.

forced answering software
Software that prevents respondents from continuing with an Internet questionnaire if they fail to answer a question.

forced-choice rating scale
A fixed-alternative rating scale that requires respondents to choose one of the fixed alternatives.

frequency determination question
A fixed-alternative question that asks for an answer about general frequency of occurrence.

frequency distribution
A set of data organised by summarising the number of times a particular value of a variable occurs.

F-test
A procedure used to determine whether there is more variability in the scores of one sample than in the scores of another sample.

F-test (regression)
A procedure to determine whether more variability is explained by the regression or unexplained by the regression.

functional magnetic resonance imaging (fMRI)
A magnetic scan that reveals in different colours which parts of the brain are active in real time.

funnel technique
Asking general questions before specific questions in order to obtain unbiased responses.
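To make the frequency distribution entry concrete, here is a minimal Python sketch; the rating question and the response values are hypothetical examples, not data from the text.

    from collections import Counter

    # Hypothetical answers to a 5-point rating question (1 = strongly disagree, 5 = strongly agree)
    responses = [4, 5, 3, 4, 4, 2, 5, 3, 4, 1, 4, 5]

    # A frequency distribution counts how many times each value of the variable occurs
    frequency_distribution = Counter(responses)

    for value in sorted(frequency_distribution):
        print(value, frequency_distribution[value])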

G

graphic aids
Pictures or diagrams used to clarify complex points or emphasise a message.

graphic rating scale
A measure of attitude that allows respondents to rate an object by choosing any point along a graphic continuum.

grounded theory
Represents an inductive investigation in which the researcher poses questions about information provided by respondents or taken from historical records; the researcher asks the questions to him- or herself and repeatedly questions the responses to derive deeper explanations.

guinea pig effect
An effect on the results of an experiment caused by subjects changing their normal behaviour or attitudes in order to cooperate with an experimenter.

H

Hawthorne effect
An unintended effect on the results of a research experiment caused by the subjects knowing that they are participants.

hermeneutics
An approach to understanding phenomenology that relies on analysis of texts through which a person tells a story about him- or herself.

hermeneutic unit
Refers to a text passage from a respondent's story that is linked with a key theme from within this story or provided by the researcher.

hidden observation
Observation in which the subject is unaware that observation is taking place.

history effect
The loss of internal validity caused by specific events in the external environment, occurring between the first and second measurements, that are beyond the control of the experimenter.

hypothesis
An unproven proposition or supposition that tentatively explains certain facts or phenomena; a probable answer to a research question.

hypothetical construct
A variable that is not directly observable but is measurable through indirect indicators, such as verbal expression or overt behaviour.

I

iceberg principle
The idea that the dangerous part of many marketing problems is neither visible to nor understood by marketing managers.

image profile
A graphic representation of semantic differential data for competing brands, products or stores to highlight comparisons.

independent samples t-test for difference of means
A technique used to test the hypothesis that the mean scores on some interval- or ratio-scaled variables are significantly different for two independent samples or groups.

independent variable
A variable that is expected to influence a dependent variable.

index (or composite) measure
A composite measure of several variables used to measure a single concept; a multi-item instrument.

instrumentation effect
An effect on the results of an experiment caused by a change in the wording of questions, a change in interviewers or other changes in procedures used to measure the dependent variable.

interaction effect
The influence on a dependent variable of combinations of two or more independent variables.

interactive help desk
In an Internet questionnaire, a live, real-time support feature that solves problems or answers questions respondents may encounter in completing the questionnaire.

interdependence methods
Multivariate statistical techniques that are used to group things together and give them meaning.

internal and proprietary data
Secondary data that originate inside the organisation.

internal validity
Validity determined by whether an experimental treatment was the sole cause of changes in a dependent variable or whether the experimental manipulation did what it was supposed to do.

Internet survey
A self-administered questionnaire posted on a website.

interval scale
A scale that both arranges objects according to their magnitudes and distinguishes this ordered arrangement in units of equal intervals.
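As an illustration of the independent samples t-test entry, the sketch below computes the t statistic by hand using a pooled estimate of the standard error. The two groups of scores are invented for illustration only; in practice the test would normally be run in SPSS, Excel or a statistics package.

    from statistics import mean, variance

    # Hypothetical satisfaction scores for two independent groups of respondents
    group_a = [7, 8, 6, 9, 7, 8]
    group_b = [5, 6, 7, 5, 6, 6]

    n1, n2 = len(group_a), len(group_b)

    # Pooled estimate of the standard error (assumes the two group variances are equal)
    pooled_variance = ((n1 - 1) * variance(group_a) + (n2 - 1) * variance(group_b)) / (n1 + n2 - 2)
    pooled_se = (pooled_variance * (1 / n1 + 1 / n2)) ** 0.5

    # t statistic for the difference between the two sample means
    t_statistic = (mean(group_a) - mean(group_b)) / pooled_se
    degrees_of_freedom = n1 + n2 - 2
    print(round(t_statistic, 2), degrees_of_freedom)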


interviewer bias
A response bias that occurs because the presence of the interviewer influences respondents' answers.

interviewer cheating
The practice of filling in fake answers or falsifying questionnaires while working as an interviewer.

interviewer error
Mistakes made by interviewers who fail to record survey responses correctly.

ipsative scale
A measurement designed to compare items from the same person, usually ranking or choice tasks where items are explicitly evaluated on their relative attractiveness.

item nonresponse
The failure of a respondent to provide an answer to a survey question; the technical term for an unanswered question on an otherwise complete questionnaire.

J

judgement (purposive) sampling
A nonprobability sampling technique in which an experienced individual selects the sample based on personal judgement about some appropriate characteristic of the sample member.

L

laboratory experiment
An experiment conducted in a laboratory or other artificial setting to obtain almost complete control over the research setting.

leading question
A question that suggests or implies certain answers.

Likert scale
A measure of attitudes designed to allow respondents to rate how strongly they agree or disagree with carefully constructed statements, ranging from very positive to very negative attitudes towards some object; several scale items may be used to form a summated index.

loaded question
A question that suggests a socially desirable answer or is emotionally charged.

longitudinal study
A survey of respondents at different times, thus allowing analysis of response continuity and changes over time.

M

mail survey
A self-administered questionnaire sent to respondents through the mail.

main effect
The influence of a single independent variable on a dependent variable.

mall intercept interview
A personal interview conducted in a shopping mall.

managerial action standard
A performance criterion or objective that expresses specific actions that will be taken if the criterion is achieved.

market-basket analysis
A form of data mining that analyses anonymous point-of-sale transaction databases to identify coinciding purchases or relationships between products purchased and other retail shopping information.

market penetration
The percentage of potential customers who make at least one trial purchase.

market tracking
The observation and analysis of trends in industry volume and brand share over time.

marketing research
The systematic and objective process of generating information to aid in making marketing decisions.

matching
A procedure for the assignment of subjects to groups that ensures each group of respondents is matched on the basis of pertinent characteristics.

maturation effect
An effect on the results of an experiment caused by experimental subjects maturing or changing over time.

mean
A measure of central tendency; the arithmetic average.

median
A measure of central tendency that is the midpoint; the value below which half the values in a distribution fall.

mixed-mode survey
A study that employs any combination of survey methods.

mode
A measure of central tendency; the value that occurs most often.

model building
The use of secondary data to help specify relationships between two or more variables. Model building can involve the development of descriptive or predictive equations.

moderator
The person who leads a focus group discussion.

monadic rating scale
Any measure of attitudes that asks respondents about a single concept in isolation.

mortality (sample attrition) effect
A sample bias that results from the withdrawal of some subjects from the experiment before it is completed.

multi-attribute model
A means of measuring an attitude to an object by asking respondents to evaluate each part of the object. Attitude scores are based on the product of belief strength and the evaluation of the consequences.

multicollinearity
A problem in multiple regression when the independent variables are correlated with each other, causing the parameter estimates to be unreliable.

multidimensional scaling
A statistical technique that locates objects in multidimensional space on the basis of measures of the similarity of objects.

multiple-grid question
Several similar questions arranged in a grid format.

multiple regression analysis
Where the effects of two or more metric-scaled independent variables on a single, metric-scaled dependent variable are investigated.

multivariate dependence methods
Multivariate statistical techniques that explain or predict one or more dependent variables on the basis of two or more independent variables.

multivariate statistical analysis
Statistical methods that allow the simultaneous investigation of more than two variables.

N

n-way cross-tabulation
Where two nonmetric scaled variables are compared after accounting for the effects of a third (or more) nonmetric variable.

n-way univariate analysis of variance (ANOVA)
A technique that simultaneously tests for the differences in the mean of a metric dependent variable among two or more nonmetric independent variables.

netnography
A type of ethnography that analyses the free behaviour of individuals on the Internet. In particular, the posts, tweets and other online contributions of participants are studied, usually by observation. In the market research field it is also known as 'social listening'.

neural network
A form of artificial intelligence in which a computer is programmed to mimic the way that the human brain processes information.

no contact
A person who is not at home or who is otherwise inaccessible on the first and second contact.

nominal scale
A scale in which the numbers or letters assigned to objects serve as labels for identification or classification.
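A minimal sketch of the three measures of central tendency defined above (mean, median and mode); the visit counts are hypothetical.

    import statistics

    # Hypothetical number of store visits reported by ten respondents
    visits = [2, 3, 3, 4, 2, 5, 3, 6, 2, 3]

    print(statistics.mean(visits))    # arithmetic average
    print(statistics.median(visits))  # midpoint of the ordered values
    print(statistics.mode(visits))    # most frequently occurring value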



non-forced-choice rating scale
A fixed-alternative rating scale that provides a 'no opinion' category or that allows respondents to indicate that they cannot say which alternative is their choice.

nonprobability sampling
A sampling technique in which units of the sample are selected on the basis of personal judgement or convenience; the probability of any particular member of the population being chosen is unknown.

nonrespondent
A person who is not contacted or who refuses to cooperate in the research.

nonresponse error
The statistical differences between a survey that includes only those who responded and a perfect survey that would also include those who failed to respond.

normal distribution
A symmetrical, bell-shaped distribution that describes the expected probability distribution of many chance occurrences.

normative scale
A measurement designed to make comparisons among other people on a particular construct item, usually a rating scale where items are evaluated in isolation from other items.

null hypothesis
A statement about a status quo asserting that any change from what has been thought to be true will be due entirely to random sampling error.

numerical scale
An attitude rating scale similar to a semantic differential except that it uses numbers (instead of verbal descriptions) as response options to identify response positions.

O

observation
The systematic process of recording the behavioural patterns of people, objects and occurrences as they are witnessed.

observer bias
A distortion of measurement resulting from the cognitive behaviour or actions of a witnessing observer.

one-group pretest–posttest design
A quasi-experimental design in which the subjects in the experimental group are measured before and after the treatment is administered, but there is no control group.

one-sample t-test
A hypothesis test that uses the t-distribution rather than the Z-distribution; it is used when testing a hypothesis with a small sample size and an unknown population standard deviation.

one-shot design
An after-only design in which a single measure is recorded after the treatment is administered.

online focus group
A focus group whose members use Internet technology to carry on their discussion.

online test market
An online panel used to test market new products and advertising copy. Global coverage, but respondents are usually paid.

open-ended box
In an Internet questionnaire, a box where respondents can type in their own answers to open-ended questions.

open-ended response question
A question that poses some problem and asks the respondent to answer in her own words.

operational definition
An explanation that gives meaning to a concept by specifying the activities or operations necessary to measure it.

opt in
To give permission to receive selected email, such as questionnaires, from a company with an Internet presence.

oral presentation
A spoken summary of the major findings, conclusions and recommendations, given to clients or line managers to provide them with the opportunity to clarify any ambiguous issues by asking questions.

order bias
Bias caused by the influence of earlier questions in a questionnaire or by an answer's position in a set of answers.

ordinal scale
A scale that arranges objects or alternatives according to their magnitude in an ordered relationship.

P

paired comparison
A measurement technique that involves presenting the respondent with two objects and asking the respondent to pick the preferred object. More than two objects may be presented, but comparisons are made in pairs.

paired samples t-test
A technique used to test the hypothesis that mean scores differ on some interval- or ratio-scaled variable between related or paired samples.

partial correlation analysis
An analysis of association between two linear variables after controlling for the effects of other variables.

Pearson's correlation coefficient
A statistical measure of the covariation, or association, between two variables.

percentage distribution
A frequency distribution organised into a table (or graph) that summarises percentage values associated with particular values of a variable.

perceptual map
An application of multidimensional scaling to show graphically how objects are perceived by consumers.

performance-monitoring research
Research that regularly provides feedback for evaluation and control of marketing activity.

personal interview
Face-to-face communication in which an interviewer asks a respondent to answer questions.

phenomenology
A philosophical approach to studying human experiences based on the idea that human experience itself is inherently subjective and determined by the context in which people live.

picture frustration
A version of the thematic apperception test (TAT) that uses a cartoon drawing for which the respondent suggests dialogue the characters might engage in.

pilot study
A collective term for any small-scale exploratory research technique that uses sampling but does not apply rigorous standards.

point estimate
An estimate of the population mean in the form of a single value, usually the sample mean.

pooled estimate of the standard error
An estimate of the standard error for a t-test of independent means that assumes the variances of both groups are equal.

population
Any complete group of entities that share some common set of characteristics.

population distribution
A frequency distribution of the elements of a population.

population element
An individual member of a population.

population parameters
Variables in a population or measured characteristics of the population.

pop-up boxes
In an Internet questionnaire, boxes that appear at selected points and contain information or instructions for respondents.

posttest-only control group design
An after-only design in which the experimental group is tested after exposure to the treatment and the control group is tested at the same time without having been exposed to the treatment; no premeasure is taken. Random assignment of subjects and treatment occurs.

preliminary tabulation
A tabulation of the results of a pretest to help determine whether the questionnaire will meet the objectives of the research.
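A minimal sketch of Pearson's correlation coefficient, computed directly from deviation scores; the advertising-spend and sales figures are hypothetical.

    from statistics import mean

    # Hypothetical data: advertising spend (x, $000) and sales (y, $000) for eight stores
    x = [10, 12, 14, 16, 18, 20, 22, 24]
    y = [55, 58, 62, 61, 66, 70, 69, 74]

    mean_x, mean_y = mean(x), mean(y)

    # Covariation of x and y, scaled by the variability of each variable
    numerator = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    denominator = (sum((xi - mean_x) ** 2 for xi in x)
                   * sum((yi - mean_y) ** 2 for yi in y)) ** 0.5
    r = numerator / denominator
    print(round(r, 3))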


pretest–posttest control group design
A true experimental design in which the experimental group is tested before and after exposure to the treatment and the control group is tested at the same two times without being exposed to the experimental treatment.

pretesting
A screening procedure that involves a trial run with a group of respondents to iron out fundamental problems in the survey design.

probability
The long-run relative frequency with which an event will occur.

probability sampling
A sampling technique in which every member of the population has a known, non-zero probability of selection.

probing
A method used in personal interviews in which the interviewer asks the respondent for clarification of answers to standardised questions.

problem definition
The crucial first stage in the research process – determining the problem to be solved and the objectives of the research.

problem definition stage
The stage in which management seeks to identify a clear-cut statement of the problem or opportunity.

program strategy
The overall plan to conduct a series of marketing research projects; a planning activity that places each marketing project in the context of the company's marketing plan.

projective technique
An indirect means of questioning that enables a respondent to project beliefs and feelings onto a third party, an inanimate object or a task situation.

proportional stratified sample
A stratified sample in which the number of sampling units drawn from each stratum is in proportion to the population size of that stratum.

psychogalvanometer
A device that measures galvanic skin response, a measure of involuntary changes in the electrical resistance of the skin.

pupilometer
A mechanical device used to observe and record changes in the diameter of a subject's pupils.

push button
A small outlined area on a dialogue box, such as a rectangle or an arrow, that the respondent clicks on to select an option or perform a function (such as Submit) on an Internet questionnaire.

push technology
Internet information technology that automatically delivers content to the researcher's or manager's desktop.

Q

qualitative research
Initial research and interpretive research that is not based on numerical analysis.

quasi-experimental design
A research design that cannot be classified as a true experiment because it lacks adequate control of extraneous variables.

quota sampling
A nonprobability sampling procedure that ensures that various subgroups of a population will be represented on pertinent characteristics to the exact extent that the investigator desires.

R

radio button
In an Internet questionnaire, a circular icon, resembling a button, that activates one response choice and deactivates others when a respondent clicks on it.

random digit dialling
The use of telephone exchanges and a table of random numbers to contact respondents with unlisted phone numbers.

random sampling error
A statistical fluctuation that occurs because of chance variation in the elements selected for a sample.

randomisation
A procedure in which the assignment of subjects and treatments to groups is based on chance.

randomised block design
An extension of the completely randomised design in which a single extraneous variable that might affect test units' response to the treatment is identified and the effects of this variable are isolated by being blocked out.

randomised response questions
A research procedure used for dealing with sensitive topics, in which a random procedure determines which of two questions a respondent will be asked to answer.

range
The distance between the smallest and the largest values of a frequency distribution.

ranking
A measurement task that requires respondents to rank order a small number of stores, brands or objects on the basis of overall preference or some characteristic of the stimulus.

rating
A measurement task that requires respondents to estimate the magnitude of a characteristic or quality that a brand, store or object possesses.

ratio scale
A scale that has absolute rather than relative quantities, and an absolute zero where there is an absence of a given attribute.

recode
To use a computer to convert original codes used for raw data into codes that are more suitable for analysis.

record
A collection of related fields – the answers to all questions by one respondent.

reflexivity
Reflexivity entails the researcher being aware of his or her effect on the process and outcomes of research, based on the premise that 'knowledge cannot be separated from the knower'.

refusal
A person who is unwilling to participate in a research project.

reliability
The degree to which measures are free from random error and therefore yield consistent results.

repeat-purchase rate
The percentage of purchasers who make a second or repeat purchase.

repeated measures
An experimental technique in which the same subjects are exposed to all experimental treatments to eliminate any problems due to subject differences.

report format
The make-up or arrangement of parts necessary to a good research report.

research design
A master plan that specifies the methods and procedures for collecting and analysing needed information.

research design stage
The stage in which the researcher determines a framework for the research plan of action by selecting a basic research method.

research objective
The researcher's version of the marketing problem; it explains the purpose of the research in measurable terms and defines standards for what the research should accomplish.

research proposal
A written statement of the research design that includes a statement explaining the purpose of the study and a detailed, systematic outline of procedures associated with a particular research methodology.

research report
An oral presentation or written statement of research results, strategic recommendations and/or other conclusions to a specific audience.
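A minimal sketch of randomisation as defined above (chance-based assignment of subjects to the experimental and control groups); the subject identifiers are hypothetical.

    import random

    # Hypothetical pool of 20 subjects to be assigned to two groups
    subjects = [f"S{i:02d}" for i in range(1, 21)]

    random.shuffle(subjects)              # assignment is based purely on chance
    midpoint = len(subjects) // 2
    experimental_group = subjects[:midpoint]
    control_group = subjects[midpoint:]

    print(experimental_group)
    print(control_group)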



respondent
The person who verbally answers an interviewer's questions or provides answers to written questions.

respondent-driven sampling
A respondent recruitment method where respondents are rewarded for their interviews and also for recruiting others to be interviewed. Newly recruited respondents make contact with the researcher to earn the same rewards as the earlier respondent. The researcher does not know the respondent until the respondent makes contact.

respondent error
A category of sample bias resulting from some respondent action or inaction such as nonresponse or response bias.

response bias
A bias that occurs when respondents either consciously or unconsciously tend to answer questions with a certain slant that misrepresents the truth.

response latency
The amount of time it takes to make a choice between two alternatives; used as a measure of the strength of preference.

response rate
The number of questionnaires returned or completed divided by the number of eligible people who were asked to participate in the survey.

reverse coding
Where the value assigned for a response is treated oppositely from the other items.

role-playing technique
A projective technique that requires the subject to act out someone else's behaviour in a particular setting.

S

sample
A subset, or some part, of a larger population used to make inferences about the larger population.

sample bias
A persistent tendency for the results of a sample to deviate in one direction from the true value of the population parameter.

sample distribution
A frequency distribution of a sample.

sample statistics
Variables in a sample or measures computed from sample data.

sample-selection error
An administrative error caused by improper sample design or sampling procedure execution.

sample survey
A more formal term for a survey.

sampling distribution
A theoretical probability distribution of sample means for all possible samples of a certain size drawn from a particular population.

sampling frame
A list of elements from which a sample may be drawn; also called working population. When a list is infeasible, the sampling frame is a highly detailed explanation of how a representative subset of the target population will be contacted.

sampling frame error
An error that occurs when certain sample elements are not listed or are not accurately represented in a sampling frame.

sampling stage
The stage in which the researcher determines who is to be sampled, how large a sample is needed, and how sampling units will be selected.
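A small worked example of the response rate entry above, using hypothetical figures.

    # Hypothetical mail survey: 400 eligible people were asked to participate and 124 returned questionnaires
    completed = 124
    eligible = 400

    response_rate = completed / eligible
    print(f"{response_rate:.1%}")  # prints 31.0%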

sampling unit
A single element or group of elements subject to selection in the sample.

scale
Any series of items that are arranged progressively according to value or magnitude; a series into which an item can be placed according to its quantification.

scanner-based consumer panel
A type of consumer panel in which participants' purchasing habits are recorded with a laser scanner rather than a purchase diary.

scientific method
The techniques and procedures used to recognise and understand marketing phenomena.

secondary data
Data that have been previously collected for some purpose other than the one at hand.

selection effect
A sampling bias that results from differential selection of respondents for the comparison groups.

self-administered questionnaire
A survey in which the respondent takes the responsibility for reading and answering the questions.

self-selection bias
A bias that occurs because people who feel strongly about a subject are more likely to respond to survey questions than people who feel indifferent about it.

semantic differential
A measure of attitudes that consists of a series of 7-point rating scales that use bipolar adjectives to anchor the beginning and end of each scale.

sensitivity
A measurement instrument's ability to accurately measure variability in stimuli or responses.

sentence completion method
A projective technique in which respondents are required to complete a number of partial sentences with the first word or phrase that comes to mind.

significance level
The critical probability in choosing between the null and alternative hypotheses; the probability level that is too low to warrant support of the null hypothesis.

simple-dichotomy (dichotomous alternative) question
A fixed-alternative question that requires the respondent to choose one of two alternatives.

simple random sampling
A sampling procedure that assures each element in the population of an equal chance of being included in the sample.

simulated test market
A research laboratory in which the traditional shopping process is compressed into a short time span.

single-source data
Diverse types of data offered by a single company. The data are usually integrated on the basis of a common variable such as geographic area or store.

situation analysis
A preliminary investigation or informal gathering of background information to familiarise researchers or managers with the decision area.

snowball sampling
A sampling procedure in which initial respondents are found and additional respondents are obtained from information provided by the initial respondents.

social desirability bias
Bias in responses caused by respondents' desire, either conscious or unconscious, to gain prestige or appear in a different social role.

Solomon four-group design
A true experimental design that combines the pretest–posttest with control group design and the posttest-only with control group design, thereby providing a means for controlling the interactive testing effect and other sources of extraneous variation.

sorting
A measurement task that presents a respondent with several objects or product concepts and requires the respondent to arrange the objects into piles or to classify the product concepts.

split-ballot technique
Using two alternative phrasings of the same questions for respective halves of a sample to elicit a more accurate total response than would a single phrasing.
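A minimal sketch of simple random sampling from a sampling frame; the customer list and sample size are hypothetical.

    import random

    # Hypothetical sampling frame: a list of 1000 customers from which 50 are to be drawn
    sampling_frame = [f"customer_{i}" for i in range(1, 1001)]

    # Simple random sampling: every element has an equal chance of being selected
    sample = random.sample(sampling_frame, k=50)
    print(sample[:5], len(sample))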


split-half method
A method for assessing internal consistency by checking the results of one-half of a set of scaled items against the results from the other half.

standard deviation
A quantitative index of a distribution's spread, or variability; the square root of the variance for a distribution.

standard error of the mean
The standard deviation of the sampling distribution.

standardised normal distribution
A purely theoretical probability distribution that reflects a specific normal curve for the standardised value, Z.

Stapel scale
A measure of attitudes that consists of a single adjective in the centre of an even number of numerical values.

static group design
An after-only design in which subjects in the experimental group are measured after being exposed to the experimental treatment and the control group is measured without having been exposed to the experimental treatment; no premeasure is taken.

status bar
In an Internet questionnaire, a visual indicator that tells the respondent what portion of the survey he or she has completed.

stratified sampling
A probability sampling procedure in which simple random subsamples that are more or less equal on some characteristic are drawn from within each stratum of the population.

streaming media
Multimedia content, such as audio or video, which can be accessed on the Internet without being downloaded first.

structured question
A question that imposes a limit on the number of allowable responses.

summated scale
A scale created by simply summing (adding together) the response to each item making up the composite measure. The scores can be (but do not have to be) averaged by the number of items making up the composite scale.

survey
A method of collecting primary data in which information is gathered by communicating with a representative sample of people.

systematic (non-sampling) error
Error resulting from some imperfect aspect of the research design, such as mistakes in sample selection, sampling frame error or nonresponses from persons who were not contacted or refused to participate.

systematic sampling
A sampling procedure in which a starting point is selected by a random process and then every nth number on the list is selected.

T

t-distribution
A symmetrical, bell-shaped distribution that is contingent on sample size. It has a mean of zero and a standard deviation equal to 1.

tachistoscope
A device that controls the amount of time a subject is exposed to a visual image.

telephone interview
A personal interview conducted by telephone; the mainstay of commercial survey research.

television monitoring
Computerised mechanical observation used to obtain television ratings.

test marketing
A scientific testing and controlled experimental procedure that provides an opportunity to measure sales or profit potential for a new product or to test a new marketing plan under realistic marketing conditions.

test of differences
An investigation of a hypothesis stating that two (or more) groups differ with respect to measures on a variable.

test–retest method
Administering the same scale or measure to the same respondents at two separate points in time to test for stability.

test tabulation
Tallying of a small sample of the total number of replies to a particular question in order to construct coding categories.

test units
Subjects or entities whose responses to experimental treatments are observed or measured.

testing effect
In a before-and-after study, the effect of pretesting may sensitise subjects when taking a test for the second time, thus affecting internal validity.

tests of association
A general term that refers to a number of bivariate statistical techniques used to measure whether or not two variables are associated with each other.

thematic apperception test (TAT)
A projective technique that presents a series of pictures to research subjects and asks them to provide a description of or a story about the pictures.

third-person technique
A projective technique in which the respondent is asked why a third person does what she does or what she thinks about a product. The respondent is expected to transfer her attitudes to the third person.

time-location sampling
A two-stage sampling procedure in which knowledgeable experts brief the researcher on locations and times that members of the target population may meet, followed by cluster sampling of venues and times and face-to-face contacts.

time series design
An experimental design used when experiments are conducted over long periods of time. It allows researchers to distinguish between temporary and permanent changes in dependent variables.

total variance
The sum of within-group variance and between-group variance.

tracking study
A type of longitudinal study that uses successive samples to compare trends and identify changes in variables, such as consumer satisfaction, brand image or advertising awareness.

type I error
An error caused by rejecting the null hypothesis when it is true.

type II error
An error caused by failing to reject the null hypothesis when the alternative hypothesis is true.
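A minimal sketch of the standard deviation and standard error of the mean entries; the 25 satisfaction scores are hypothetical, and the standard error is estimated from the single sample as s divided by the square root of n.

    from math import sqrt
    from statistics import stdev, variance

    # Hypothetical sample of 25 satisfaction scores
    scores = [6, 7, 5, 8, 7, 6, 9, 7, 6, 8, 5, 7, 7, 6, 8, 7, 9, 6, 7, 5, 8, 7, 6, 7, 8]
    n = len(scores)

    s = stdev(scores)          # standard deviation: square root of the sample variance
    print(variance(scores), s)
    print(s / sqrt(n))         # estimated standard error of the mean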

U

unbalanced rating scale
A fixed-alternative rating scale that has more response categories piled up at one end and an unequal number of positive and negative categories.

undisguised question
A straightforward question that assumes the respondent is willing to answer.

univariate statistical analysis
A type of analysis that assesses the statistical significance of a hypothesis about a single variable.

unstructured question
A question that does not restrict the respondents' answers.

V

validity
The ability of a scale to measure what was intended to be measured.

variable
Anything that may assume different numerical or categorical values.

variable piping software
Software that allows variables to be inserted into an Internet questionnaire as a respondent is completing it.



variance
A measure of variability or dispersion. Its square root is the standard deviation.

virtual-reality simulated test market
An experiment that attempts to reproduce the atmosphere of an actual retail store with visually compelling images appearing on a computer screen.

visible observation
Observation in which the observer's presence is known to the subject.

voice pitch analysis

A physiological measurement technique that records abnormal frequencies in the voice that are supposed to reflect emotional reactions to various stimuli.

W

welcome screen

The first Web page in an Internet survey, which introduces the survey and requests that the respondent enter a password or PIN.

word association test

A projective technique in which the subject is presented with a list of words, one at a time, and asked to respond with the first word that comes to mind.

INDEX

A

a priori codes 387, 391 absolute magnitudes 242 absolute value 445 accuracy 24, 119, 178, 344 of attitude measurement 262 of data 100–1, 117, 161, 177, 561 degree of 359 acquiescence bias 133 ad hoc samples, recruited 363 administrative error 134–5 advance notification 155 advertising 12, 27, 41, 180, 185 advertising research 120 after-only design 207–9 age 182 aggregations 101 aided-recall question format 316 alternative hypotheses 417 stating 426 alternatives, screening alternatives 64–5 ambiguity 19–20, 41, 48, 88 avoiding 313–14 see also exploratory research analogies 388 analysis of dependence 519–34, 542 of interdependence 519, 535–43 mathematical/statistical analysis of scales 241–2 of qualitative research data 386–9 systematic 180–1 analysis of variance (ANOVA) 213, 453–63, 491, 494 in Excel 462–3 identifying and partitioning the total variation 456–7 in SPSS 460–1 see also n-way univariate analysis of variance analytics 175 anchoring effect 318 anecdote circles 67 anonymity 86, 146, 151, 162, 187, 383–4 controversial issues and 151 lack of anonymity 143 answering machines 147 answers clear and simple 92 confidential 159 highly structured 149 ‘no answer’ 263 probing complex answers 141 anthropological research 11 applied research 7 artefacts 187 artificial environments 187, 195 assertion-evidence approach 570

association, tests of 472–509 Association of Market and Social Research Organisations (AMSRO) 145–6 assumptions 532 avoiding 315 at-home research 174 at-home scanning systems 184 attitude rating scales 309 attitude-measuring process 246–7, 261 importance to managers 246 attitudes 79 attitude rating scales 247 as hypothetical constructs 245–6 intentions and 264 multi-attribute attitude score 264–6 physiological measures of 247 unobservable 174 variance along continua 247 attributes 242 beliefs and 266 audience 549 audience surveys 25 audio recordings 386 audio-visual aids 570–1 auspices bias 133, 155 Australian and New Zealand Standard Industrial Classification (ANZSIC) 380 Australian Bureau of Statistics (ABS) 399 estimates from 99 social trends 137 Australian Direct Marketing Association (ADMA) 145–6 Australian Market & Research Society’s Code of Professional Behaviour 362 Australian Market and Social Research Society (AMSRS) code of ethics 165 scale changes 264 Australian Recording Industry Association (ARIA) 107 authority 117 average deviation 404

B

back translation 333 balanced incomplete block (BIB) designs 267, 269 sample 270–1 balanced rating scale 263 ‘bandwagon effect’ 311 bar charts 565–6 bar codes 184 basic (pure) research 7 basic experimental designs 206–7 before–after with control group design 209–10 behaviour buyer 80 consumer see consumer behaviour

expressive 173–4 ‘natural’ 67 switching 76–8 verbal 173–4 behavioural data 138 behavioural differential 258 behavioural intention, measuring 256–7 behavioural observation 173–87 behavioural patterns 173 benefits versus costs 14–15 best-guess estimates 132 best–worst conjoint 295–9 advantages 296–8 disadvantages 298–9 best–worst scaling 258–9, 266–74 disadvantages 270 ‘beta’ 529, 532 bias 6, 101, 178, 201–2, 204, 306–8, 313, 324–6 possible sources 358 response 131–4 also under individual bias big data 111 problems with 175 secondary research with 99–122 big-ticket purchase situations 175 bill shock case study 59, 96, 125, 170, 191, 233, 283–4, 337, 439, 469, 515 binary logistic regression 533–4 bipolar rating scales 251 bivariate analysis 442–64, 518 case study 545 bivariate linear regression 484–5, 498 bivariate statistical analysis 472–509 blinding 201 blogging 16, 127 body language 86 body movements 175 books 118–19 brain scanning technology, case study 190 branch questions 319 brand preference 254 branding campaigns 69 brand-name evaluation 10 brands 65 name changes 199 prestige brands 180 budget constraints 51 business trends, in marketing research 15–16 business-to-business marketing research 151 buyer behaviour 42, 80 buzzwords 111

C

callbacks 144, 148, 161–2 caller ID services 147 cannibalisation 228 cartoon tests 84




case study method 68–9 primary advantage 69 categorical variable 47, 197 category labels 262 category scales 247–8 question wording 247 causal research 20, 25–7 causality based on correlations – danger 487, 498 correlation is not causation 476–7 cause-and-effect relationship studies 25–6 Celsius temperature scale 240 census 119, 139, 343–4 versus inferential statistics 416 updates 120 central location interviewing 148 central tendency, measures of 400–2 central-limit theorem 409, 411–13 certainty 18–19, 31 charts 563–7 figure number, title, explanatory legends, source and footnotes 563 chat rooms 86 cheating 135, 143 checklist question 309 Chi-squares case study 514 distribution 577 in Excel 432–3, 504–8 in SPSS 431–2, 502–4 test for goodness of fit 429–31, 499–508 choice between alternatives 247 best technique? 262 scale that forces 263 choice-based conjoint 290–5 advantages and disadvantages 294–5 circular-flow concept 17 clarity, in research questions and hypotheses 49 classificatory variable 47, 197 click-through rate 278 closed-response questions 307 types 308–10 cluster analysis 519, 537–40 case study 545 cluster samples 28, 444 cluster sampling 355–6 C-OAR-SE measurement procedure 246 code books 384 code construction 381 codes 380, 389 numerical 382 coding code-able data 387 dummy coding 379 open-ended questions 382–5 poor 380 precoding 324–6 qualitative data 387–8 qualitative research data 386–9 strategies 388–9

coding data 29, 380–5 coefficient of determination 476–7, 491–2, 494 coefficient of multiple determination 527 coefficients of partial regression 528 cognitive behavioural therapy 69 cognitive phenomena 174 cognitive reactions 186–7 cohort effect 202 cohort studies 137 coin flipping 260–1 commercial data sources 119–20 communalities 536–7 communication nonverbal communication 175–6 research results 549–71 with respondents 139–50 two-way 139 see also Internet communications model 549–50 communications process 549 communicators 549 comparative rating scale 262 comparison 388–9 multiple 456 of sampling techniques 360 competitive pricing study 10 competitive situations 225 competitor information 69 competitors 162 competitive environment, reading incorrectly 227 complementary evidence 176–7 completely randomised design 213 completeness checks 379 complexity avoidance 310–11 composite measures 242–4 compromise design 211 computer–aided personal interview (CAPI) 235 computer–aided web interview (CAWI) 235 computer–assisted data collection methods 178 computer–assisted telephone interviewing (CATI) 148–9 computerised qualitative analysis software 87 computerised voice-activated telephone interview 149 concept testing 10, 64–5 screening alternatives 64–5 concepts 237 conceptual definitions 238 conclusions drawing 30–1 formulating 549–71 concurrent validity 277 confidence intervals 414–16 calculating with t-distribution 425 confidence level 366, 414, 418–19 confidential information 143 Confidential Unit Record Files (CURFs) 120 confidentiality 69, 149 conflicting claims 80

conjoint analysis 261, 285–300 analysis using best–worst scaling 295–9 background 285–6 choice-based conjoint 290–5 ratings-based conjoint 286–90 connectors 388 consensus among participants 556–7 constancy of conditions 200 constant comparison 388–9 constant experimental error 201–2 constant-sum scales 254, 259 consumer attitude 120 consumer behaviour 24, 67, 69, 178, 184 identifying for a product category 103 reasons 79 consumer confidence case study 283 consumer durables 135 consumer needs 65 consumer panels 137–8, 160–3 scanner-based panels 184 consumer perceptions 10 case study 469 Consumer Price Index 240 consumer research 8–9 neuroscience approaches to 186 projective techniques in 82 consumer surveys 228 consumer utility 285–300 consumers ‘creative and cool’ young consumers 87 digital consumers 132 typical consumers versus professional respondents 78 consumption data 120 consumption patterns 103 content analysis 180–1 content validity 276–7 continuous variable 47 contrived observation 178 control groups 197 control method of test marketing 221 controlled store test 195–6 convenience 24 convenience sampling 356–7, 367 cookies 161 cooperation 146–7 in case studies 69 of respondents 160–1 correlation 275 correlation is not causation 476–7 correlation–regression comparison 483 nonparametric 482–3 running in Excel 480–2 running in SPSS 478–80 correlation analysis 483 correlation coefficients 498, 526–7 strength 474 see also Pearson’s correlation coefficient correlation matrix 477–8, 498 costs 343–4, 367 benefits versus costs 14–15


cost-effectiveness of Internet surveys 159 of mail surveys 150 of personal interviews 143 of telephone interviews 145 counterbiasing statement 313 counting 241 coupons 10–12 cover letters 153 example 154–5 coverage 117 credit cards 445–6 criterion validity 277 critical values 419, 578–80 cross-sectional studies 136 cross-tabulations 417, 499–508, 559–61 case study 514 in Excel 504–8 layering 521 n-way 520–2 in SPSS 502–4 culture 67 currency 117 customer discovery 110–11 customer experience quality (CXQ) index 250–1 customer groups 516–17 customer loyalty 274 measuring 238 customer relationship management 111–13 customer–salesperson interactions 175 ‘cut-through’ ads 186 cutting 388

D

DAR (day-after recall) scores 120 data availability of 14 big data 99–122, 175 cleaning 432 cross-checks of 101 editing and coding 29 empirical 21 generation 175 historical 22 making usable 400–5 missing data, options for 385 raw see raw data secondary 22–3, 25 secondary data sources 112–20 single-source 184 tabulation 29 data analysis 29–30, 54, 399–434, 442–64, 472–509, 518–43 after responses 379 after survey 379–80 editing 378–9 stages 377–80 before the survey 377–8 during the survey 378 tabulating ‘don’t know’ answers 379 data cleaning 384

data collection 173, 177 computer-assisted methods 178 editing and coding 377–92 data conversion 101 data gathering 29, 54, 145 data integrity 377, 380 data matrix 380, 384 data mining 109–11 data review 379 ‘data to ink’ ratio, maximise 567–8 data transformation 551 see also data conversion data-base management software 390–2 database marketing 103, 111–13 databases 99 data-processing error 134 debriefing 216–17 deception 179 decimal places 561 decision makers ascertaining objectives 43–4 uncertainty of 5, 41 decision making choice between alternatives – decision time 177–8 consumer decision-making, insights 176–7 nature of 14 strategic see strategic decision making decision-oriented research objectives 50–1 deference 175 degrees of freedom 424, 426, 445, 501, 527–8 deliberate falsification 131–2 demand characteristics 203–4 reducing 204 demographics 142 comparisons 130 composition 225 demographic data 138 demographic profiles 182 demographic updates 120 dendograms 538–9 Department of Foreign Affairs and Trade (DFAT) 121 dependence analysis 519–34, 542 dependent variables 47–8, 472 decide on choice of 196–9 linear association with independent variables 483, 498 selection and measurement 198–9 depth interviews 79–80, 88 descriptive research 20, 24–5, 138 descriptive statistics 399 determinant-choice question 309 detractors 273–4 deviation scores 403–4 diagnostic analysis 24 diagnostic tools, focus groups as 78 qualitative research, diagnosing situations 64 diagrammatic representation 47, 207 nonmetric and metric descriptors 520


dialogue boxes 162 diaries 378 differences of means 442–9 differential interviewer techniques 142–3 digital technology 139 direct observation 177–9 direct observation–interviewing combined 178–9 errors associated with 178 instantaneous 177 direct pharmaceutical marketing 172 direct questioning 67 ‘likelihood of purchase’ 257 disaggregates 113 discounts 10–11 discussion 86 discussion guides (focus groups) 76–8 disguised questions 80, 136 dispersion, measures of 402–5 disproportional sampling 354–5 distribution 205, 403 distribution channels 113–20 distribution research 11 distribution system 113 do-not-call register 145–6 ‘don’t know’ answers 379 door-to-door interviews 143–4 double-barrelled questions 314–15 double-blind design 201 dramatic findings presentation 557 DrinkWise case study 57–8 drop-down boxes 328 drop-off method 157 dummy coding 379 dummy tables 55 dummy variable 529 durables 219

E

eco-friendly products 197 e-commerce market 35 editing data 29 eigenvalues 536 electric cars, case study 124 electroencephalography (EEG) 186–7 electronic interactive media 138–9 electronic test markets 221 email surveys 157–9 emergent codes 388 emma (Enhanced Media Metrics Australia) 235 emotional reactions 42, 186 empirical data 21 empirical evidence 7 entrapment 179 entrepreneurship 238 environment, diagnostic information regarding 8 environmental conditions, focus group interviews 75 environmental scanning 105 equivalent-form method 275 error checking 384–5



error trapping 329–30 errors anticipating 377 associated with direct observation 178 observation data interpretation as error 178 in survey research 128–9 type I and type II errors 420–1 espionage 227 estimates/estimation 528 best-guess 132 consumer savings example 108–9 estimating market potential for geographic areas 106–8 of parameters 413–16 point estimates 413–14 rule-of-thumb 135 ethical issues in experimentation 216–17 observation of human behaviour 179 of product placement 460 in survey research 165–6 ethnographic studies 178 ethnography 67–8 evaluation beliefs and 266 of data 54 of measure 274–9 of research (overall) 54 of secondary data 100, 102 of websites 117 events 177 physical trace evidence as visible mark 179–80 evidence complementary 176–7 empirical 7 physical traces 179–80 Excel 433 case study 438 Chi-squares in 432–3, 504–8 conducting an ANOVA in 462–3 correlation 480–2 cross-tabulations in 504–8 independent samples t-tests in 447–9 one-sample t-test in 428 paired samples t-tests in 450–2 regression in 496–8 exhaustive code categories 381 existential psychology 69 expected price 272 experimental designs basic versus factorial 206 better 209–11 complex 212–13 quasi-experimental 207–9 selection and implementation 206–16 symbolism for diagramming 207 experimental groups 197 experimental research 193–229 experimental treatments 196–7 several levels 198

experimenter bias 204 experiments 26–7 artificiality 195 ethical issues in 216–17 field and laboratory 194–6 field experiments application see test marketing nature of 193–217 validity issues in 200–6 expert interview 357 exploratory factor analysis 519, 535–7 factor results interpretation 536–7 number of factors 537 exploratory research see qualitative research expressive behaviour 173–4 external data 113 external validity 205 extraneous variables 205–6 problems controlling 206 extremity bias 133 eye contact 177 ‘eyeballing’ 432, 485–6 eye-tracking monitors 185–6 eye-tracking software, case study 190

F

face validity 276–7 Facebook 16, 31, 65, 87, 110 face-to-face contact 139, 142, 550 absence of 146 facial expressions 86, 175, 178 fact-finding 103–5 factor loadings 536 factor scores 536 factorial experimental designs 206, 214–15 falsification, deliberate 131–2 feedback 140–1, 549 field editing 378 field experiments 194–6 application see test marketing fields 380 files 380 filter question 319 first-to-last discovered findings presentation 557 fishbone diagrams 47 fixed-alternative questions 307, 381 flexible questioning 162 focus group interviews 22–3, 71–9, 176 advantages 73–4 case studies 95–6 ‘co-creating focus groups’ 74–5 consensus among participants 556–7 environmental conditions 75 focus groups as diagnostic tools 78 group composition 74–5 homogeneous groups 74–5 informal ‘continuous’ focus group 86 moderators 75–6 planning outline 76–8 shortcomings 79 follow-ups 152–3, 155, 549–71

food consumption, ‘rubbish’ project 180 food labelling 514 forced answering software 330 forced-choice rating scale 245, 263 forecasting 9, 107–8, 508 incorrect volume forecasts 227 see also predictions foreign markets 15 formal descriptive research reports, body 552 F-ratio 457–60 frequencies 506 frequency distributions 400 frequency-determination question 309 F-statistic 524, 528 F-test 456–7, 491 functional magnetic resonance imaging (fMRI) 186–7 funnel technique 318

G

galvanic skin response (GSR) 185, 247 gamification 140 gender 177, 182 generalisation 69 Generation X 24 Generation Y 9, 24 Generation Z 105 generational differences 105 geodemographic 121 geographic flexibility, of mail surveys 150 geographical areas 100, 112, 135 market potential estimation for 106–8 gift method 276 global marketing research 15, 333 global markets 333 global research, sources 121–2 globalisation 15 goodness of fit, test for 429–31, 499–508 Google 109 consumer surveys case study 170 government data sources 119 graphic aids 558–68 graphic interaction 216 graphic rating scales 254–6 graphical user interfaces (GUI) 326, 329, 560 graphics packages 570 graphs, presentation options 566 ‘greenwashing’ 197 ‘grounded codes’ 388 grounded theory 68 group designs 208–12 groups collection of strangers – group? 89–90 experimental or control 197 focus group composition 74–5 ‘group’ versus ‘non-group’ 90 guinea pig effect 204 ‘guru’ syndrome 92–3

H

hackers 162 ‘halo effect’ 79


halo effect 276 happy face scales 255 Hawthorne effect 204 head nods 175 hermeneutic units 67 hermeneutics, stories or inputs into 67 hidden observation 175 high-technology systems 221 historical data 22 history effect 202 home shopping services 11 homogeneity 275, 344 human behaviour, observation of 175–7, 179 human interactive media 138–9 hypotheses 49, 302–3 alternative 417 bizarre 88 clarity in 49 defined 417 null see null hypothesis stating 417 hypotheses testing 19–21, 417–21, 442–3, 492, 498, 500, 532–3 artificial environments for see contrived observation case study 438–9 example 418–20 parametric versus nonparametric hypothesis tests 423 procedure 418 hypothetical constructs 245–6, 261

I

iceberg principle 44 ideas generation 65 illegal downloading 132 image profiles 252 incentives 157 independent samples t-tests for differences of means 442–9 in Excel 447–9 in SPSS 446–7 independent variables 47–8, 472 decide on choice of 196–9 linear association with dependent variables 483, 498 manipulation 196–8 more than one 198 in-depth observations 178 index measures 242, 264, 404–5 indexing services 118–19 Indigenous categories 388 inferential statistics 399, 406, 413–14, 416 information assisting decision making 14 diagnostic information 8 information–raw data, distinction 377 outdated 100 as a product and distribution channels 113–20 for questionnaire design 302–3 single-source data-integrated 121 information processing 187

information technologies 15 innovation 182 Instagram 87 instrumentation 203 integrated marketing research 12 intentions 256–7 attitudes and 264 interaction 76, 138–9, 208 customer–salesperson 175 graphic 216 interactivity of Internet surveys 159–60 interaction effect 214 interactive help desks 330 interactive media 85, 138–9 interdependence analysis 519, 535–43 internal and proprietary data 112–13 internal consistency 274 internal validity 196, 202–4 international market research, sampling frames 348 Internet 11, 85, 115–17, 139 changes everything 361–7 growth of 15–16 penetration 158 secondary data sites 116 Internet questionnaires 326–31 Internet service provider (ISP) 364 Internet surveys 159–63, 329, 361–2 avoiding open-ended questions 382 case study 170 New Zealand flag designs 128–9, 131 thank-you page 330 interpretation 178 interpreter bias 88 interquartile range 403 interval scales 240–1 interviewer bias 133, 163, 308 interviewer cheating 135, 143 interviewer error 135 interviewer influence 142–3 interviewers absence of 152 roles 79 interviews 29 communicating with respondents 139–50 depth 79–80 direct observation–interviewing combined 178–9 door-to-door 143–4 expert 357 field editing after 378 focus group see focus group interviews length of 141, 147 personal 139–44 practice interviews 378 telephone 145–50 inventories 180 investigations, defining terms for 100 in-vivo codes 388 ipsative scales, normative versus ipsative scales 244–5

item analysis 249 item nonresponse 141, 379

J

jargon 311 judgement sampling 357 judgemental samples 28 Juster Scale 257

K

Kelvin temperature scale 240 key-words-in-context 388 kiosk interactive surveys 163

L

laboratory experiments 194–6 ladder scales 255 landline phones, pros and cons 148 language – simple and conversational 310–11 latency 177–8, 187 leading questions 311–13 least squares method of regression analysis 486–9 computation 488 letters 180 levels of significance 576 Leximancer 389–90 libraries 115 lifestyle considerations 225 Likert scales 248–51, 463 limit checks 379 line graphs 566–7 ‘line of best fit’ 484, 486–7, 494–5 LinkedIn 87 listening 76 literature review 22, 45, 47 ‘Literature review’ 551 loaded questions 311–13 longitudinal studies 137–8 loyalty, customer 238, 274 luxury markets 31

M

magnetoencephalography (MEG) 186 mail questionnaires 150–7 keying with codes 157 length 152 mail surveys global considerations 157 increasing response rates for 153–7 main effect 214 main study phase (data gathering) 29 mall intercept interviews 144 managerial action standard 51 managerial value (of marketing research) 7–13 market basket analysis 110 market parameters 108 market penetration/repeat-purchase rate formula 228–9 market potential, estimating for geographic areas 106–8 market research panels 163 market segments 9

market share data 120 market tracking 104 marketing concept, adoption 127 marketing mix 6, 205 planning and implementing 9–12 marketing performance, analysing 12–13 marketing problems influence on objectives/research design 51 nature of 41 translation into research objectives 50 marketing research application 4 benefits versus costs 14–15 business trends in 15–16 case studies 35–6, 58, 513 defined 6–7 determining when to conduct 15 global 15 integrated 12 managerial value of 7–13 marketing concept adoption 127 nature of 5–7 need for 13–15 process flowchart 17, 30 role 4–32 student views of 64 technique or function classification 20 worst job 209 marketing researchers, conducting surveys 138–9 marketing strategy interactions in 208 stages 8–13 markets foreign 15 luxury 31 target see target markets mass narrative capture 67 matching 200 maturation effect 202–3 MaxQDA 390–1 mean squared deviation 404 means 400–1 differences of 442–9, 464 measure of spread 404 measurement 236–79 attitude and scale determination 244–66 best–worst scaling 266–74 evaluation of measure 274–9 how it is to be measured 238 of measurement 422 measurement rule application 239–42 measuring consumer utility see conjoint analysis measuring physiological reactions 185–7 number of measures determination 242–4 process steps 236–66, 274–9 ProPoints (Weight Watchers) 238 scales selection 261–4 units of 100–1 what is to be measured 236–8

measurement scales 520, 542 measures of central tendency 400–2 measures of dispersion 402–5 mechanical observation 174, 181–7 media (data sources) 119 media coverage/efficiency 225 media isolation 226 media research 12 median 402 medium 549 memos 389 messages 549 metadata 98, 110 metaphors 388 method of summated ratings 248–51 metrics 183, 278 midpoint bias 133 millennial women 9 mindfulness 69 ‘minimarket test’ 221 misrepresentation 131–2 unconscious 132 missing information 388 mixed-mode surveys 163 mobile phones 146 experiments 196 pros and cons 148 switching case study 59, 96, 125, 170, 191, 233, 283–4, 337, 439, 469, 515 mobile technology 347 mode 402 model building 105–9 moderators 75–6 online moderators 86 sensitive and effective 79 moments of truth (POMP dimension) 250–1 monadic rating scale 262 money money helps 153–4 time is money 152 monitoring 185 Media Monitors 181 television 181–2 website traffic 181, 183 mortality 203 most-to-least important findings presentation 557 motion picture cameras 181 motivations 309 unobservable 174 moving average forecasting 107–8 multi-attribute attitude score 264–6 disadvantages 266 multicollinearity 529–30, 532, 542–3 multidimensional scaling 261, 519, 540–3 multinomial logit (MNL) 269 multiple index of determination 527 multiple regression analysis 525–33, 542 case study 545 problems with 529–33 multiple response questions 381–2

tabulation 382 multiple-grid question 321 multivariate dependence methods 519 seven commandments 520 multivariate statistical analysis 261 classifying techniques 519–20 nature of 518 research data application 517 mutually exclusive and independent code categories 381 mutually exclusive (response) alternatives 310 mystery shoppers 178

N

naive interviewers 67 Nation of Numbers (Gallop) 346 ‘natural’ behaviour 67 natural findings presentation 557 natural zero value 240 naturalism 195 net promoter score (NPS) 273–4 netnography 16, 68 neural networks 109–10 neuromarketing 8, 185 neuroscience 186–7, 190 estimated costs and applications 187 neutrality 133 newspapers articles 180, 235 readership versus circulation 131 Nielsen 138, 155, 182, 363 ‘no opinion’ category 263 nominal scale 239, 241 information provided by 240 non-forced-choice rating scale 263 noninteractive media 139 nonparametric correlation 482–3 nonparametric hypothesis tests 423 nonparametric statistics 463 nonparametric tests 463 nonprobability sampling 28, 356–9 versus probability sampling 350–1 non-random sampling error 348–9 nonrespondents 130 nonresponse error 129–31, 349 cultural influences 130 non-sampling errors 128, 351, 378 nonverbal behaviour 175, 178 normal curve, area under 575 normal distribution 405–8 normality 410 normative scales, normative versus ipsative scales 244–5 ‘nose counting’ 25 nuisance variables 200 null hypothesis 417, 443, 445, 452, 456, 459, 476, 492, 496, 500 stating 426 ‘number crunching’ 111 numbers 253 numerical codes, meaning transfer to 382

numerical measurement, lack in qualitative research 63 numerical scales 253, 498 NVivo 391–2 n-way cross-tabulations 520–2 n-way univariate analysis of variance (ANOVA) 523–5

O

objectives case study 439 decision-makers’ 43–4 research objectives 50–1, 103–12, 302 survey objectives 127 web advisers’ 183 objectivity 6, 117 observation 25, 173–87 artificial environments 187 content analysis 180–1 defined 173 direct 177–9 of human behaviour 175–7, 179 mechanical 174, 181–7 nature of studies 174–5 observable phenomena 173–4 observational research by Australian companies 174 observer’s passive role 177 participant 67 of physical objects 179–80 product innovation success 182 scientific tool 173 scientifically contrived 178 visible and hidden 175 observer bias 178 ‘off-time’ online focus group 87 one-group pretest–posttest design 208 one-sample t-test in Excel 428 in SPSS 427 one-shot design 207–8 one-way mirrors 176 online blogging 16 online ethnography 16 case study 35 online focus groups 85 online panels 367 online reports 569 online surveys see Internet surveys online test markets 224 ‘on-time’ focus group 87 open-ended boxes 328 open-ended comments, meaning extraction 383–4 open-ended questions, coding 382–5 open-ended response question 306, 308 operational definitions 238 operationalisation, case study 283 opportunities, identifying and evaluating 8–9 opt-in lists 363–4 oral presentations 568–71 learn by example and practice 568–9

order bias 317, 324–6 order of presentation 201 ordinal data 482 ordinal scale 239, 241 information provided by 240 organisations, opportunities, identifying and evaluating 8–9 outcomes anticipating 55 outcome focus (POMP dimension) 250–1 overattention 226 OzTAM 181–2, 277

P

packaging 13, 27, 185 page views 183 paging layout 327 paired comparisons 258–9 paired samples t-tests 442, 449–52 in Excel 450–2 in SPSS 449–50 panel samples 363 pantry audits 180 parameters estimation 413–16 population parameters 399 parametric hypothesis tests 423 partial correlation analysis 522–3, 529, 531 participant observation 67 participants, consensus among 556–7 participation high participation 142 of respondents 160–1 willingness to participate 144 passives 273–4 pawing 388 peace of mind (POMP dimension) 250–1 Pearson’s correlation coefficient 473–83, 580 example 475–6 pencil and paper questionnaires 379, 382 People Meters 182, 184 people-provoked problems 48 percentage distribution 400 perceptual maps 540 perceptual-interpretative (apperception) use 83 performance-monitoring research 12–13 periodicals 118–19 periodicity bias 353–4 permission 158 personal identification number (PIN) 160 personal interviews 139–44, 165–6 advantages 140–2 disadvantages 142–3 global considerations 144 personal involvement inventory 249–50 personal video recorders (PVRs) 182 personalised and flexible questioning 162 personality traits 45, 110 pharmaceutical advertising 172 phenomena 386–7, 529 observable 173–4

phenomenology 66–7 physical actions 173–4 physical inventories 180 physical objects 173–4 observation of 179–80 physical trace evidence 179–80 physiological reactions, measuring 185–7 picture frustration 84 pie charts 563–5 pilot studies 23 placebo technique 204 point estimates 413–14 political polls 211–12, 351 push polling 312 POMP dimensions 250–1 pooled estimate of the standard error 444 population 27 advance knowledge of 361 homogeneous 344, 365 national versus local project 361 parameters 399 sampling rare/hidden populations 358–9 size of 225 target 28, 130, 344–7, 367 see also sampling population distribution 408–10 population element 343 pop-up boxes 328–9 positive psychology 69 posttest-only control group design 210–11 power gestures 176 PowerPoint 550 practicality 279 predictions 19, 27–8, 498, 508 see also forecasting predictive validity 277 pre-existing codes 387 preferences 132, 140 brand 254 unobservable 174 preliminary tabulation 331 premiums 12 pre-screening 177 pretesting effects 203 pretesting phase 29, 331–2, 377–8 survey research 165 pretest–posttest control group design 209–10 ‘Previous research’ 551 price 205 quality and 10–11 pricing 272–3 pricing research 10–11 privacy 98 principles 118 probability 400, 500–1 distribution of t 576 probability sampling 28, 351–6 versus nonprobability sampling 350–1 probing 141

problem definition 18–20, 54, 236 error/omission in 19 importance of proper 42–3 process 43–8 research process and 41–55 time spent defining 51 see also causal research; descriptive research problem discovery 18–20 problems clarifying nature of 64 concepts relevant to 237 dimension 40 identifying problem not symptoms 45–6 people-provoked 48 versus symptoms 46 understanding background 44–5 producers 118 product experience (POMP dimension) 250–1 product innovation 182 product involvement 249 product placement 460 product research 10 by consumers 11 products use 9 product testing 10 products 205 case study 514 eco-friendly 197 information as 113–20 innovation of 182, 514 new product development 453 product categories 103 testing new product ideas see qualitative research uncommon products 138 projective techniques 80–5, 88 cartoon tests 84 projections from findings 88–9 sentence completion method 81–2 thematic apperception test (TAT) 83 third-person technique and role playing 82–3 word association tests 81 promoters 273–4 promotion 205 promotion research 12 prompting 320, 329 proportional sampling 354–5 props 141–2 protocols 555 psychogalvanometers 185 public opinion research 120 pull factors 77 pupilometers 185 purchase behaviour data 120 pure research see basic (pure) research purposive sampling 357 push buttons 327 push factors 77 push technology 105 p-values 416

Q

qualitative data coding 387–8 sources 386–7 qualitative research 20–3 analysing responses 70–1 case study 96 common techniques used 71–88 danger 89 data coding and analysis 386–9 defined 63 deriving themes 556–7 focus – in-depth understanding/ insight 63, 66, 89 modern technology and 85–8 orientations 66–9 presentation styles 557 presenting 555–8 purpose 556–7 versus quantitative research 66 reporting checklist 557–8 seven deadly sins 89–93 software for 389–92 tools 72, 84–5 uses 63–5 warning 88–93 quality 10–11 quality-of-life surveys 177 quantitative research, versus qualitative research 66 quasi-experimental designs 207–9 questioning direct 67 personalised and flexible 162 questionnaire design for global markets 333 individual questions, content determination 304–5 information sought 302–3 physical characteristics determination 321–31 question response form 306–10 question sequence determination 317–20 question wording determination 310–17 re-examine/revise 331–2 steps 302–31 type determination 303 questionnaires 17, 29 blind keying 157 case studies 338, 394–5 completeness of 141 examples 322–6 Internet 326–31 mail 150–7 pencil and paper 379, 382 product registration questionnaire 156 relevance 302 self-administered 150–63 traditional 321–6 questions also under individual question in absence of research 14

asking questions in qualitative research 91 avoiding burdensome 316–17 disagreement with 133 disguised 80, 136 double-barrelled 314–15 interesting 154–5 leading and loaded 311–13 misinterpretation 141 multiple response 381–2 probing 79 for questionnaire design 304–5 randomised response 260–1 sensitive 304–5 sequence in questionnaire 317–20 standardised 152 structured 136 types of 422 unanswered see item nonresponse undisguised 136 unstructured 136 wording 247, 310–17, 376 Quit Campaign case study 35–6 quota samples 28 quota sampling 357–8 advantages 358 quotations 557

R

race 177 radio buttons 328 raised eyebrows 175 random digit dialling 147 case study 371 random digits 574 random error, sample size and 364–5 random sampling, simple 353 random sampling error 128, 199, 348–9, 447 randomisation 199–200 randomised block design 213–14 randomised response questions 260–1 randomised-response model 305 range 403 rank order 258 ranking 246, 258–9 best technique? 262 ranking data 482 rapport 76 rating 246 best technique? 262 rating scales advantages and disadvantages 256 category labels use (if any) 262 negative values on 385 unbalanced or balanced? 263 ratings-based conjoint 286–90 advantages and disadvantages 289–90 ratio scales 240–2 information provided by 240 rationale 117

raw data extracting meaning from see qualitative research information–raw data, distinction 377 transforming into information 377–92 raw slope coefficients 498 real-time data capture 161 reasonableness checks 379, 532 recoding 243, 384 recording behavioural patterns 173 of nonverbal behaviour 175 records 380 verbal and pictorial 173–4 reflexivity 555 refusal 130, 144 regression correlation–regression comparison 483 diagrammatic representation 484 regression analysis 417, 483–98 drawing a regression line 489–90 interpreting regression output 526–9 least squares method of 486–9 running in Excel 496–8 running in SPSS 492–5 regression coefficients 498, 532–3 relevant variables, determination 46–8 reliability 274–5, 410 reliability versus validity 277–8 reliable results 344 repeatability 274 repeated measures 206 repeat-purchase rate/market penetration formula 228–9 reporting research see research reports representative samples 129–30, 145, 147, 161 less than perfectly 349–50 research also under individual research type basic and applied research 7 essence of see scientific method experimental research 8 global 121–2 guidelines 79 marketing see marketing research realistic results 21–2 research results versus research recommendations 567–8 systematic enquiry 13, 16 type, influence of uncertainty on 20–1 research design 21, 54 ‘best’ 27 influence of marketing problem statement on 51 planning 21–3, 63–93, 99–122, 127–66, 173–87, 193–229, 236–79, 302–33 secondary data 103–12 survey research 163–4 research hypotheses clarity in 49 see also hypotheses research method selection 20

research objectives 17, 57–8 decision-oriented 50–1, 302 influence of marketing problem statement on 51 manageable number of 51 statement of 20–1 research process alternatives 18 problem definition and 41–55 role 4–32 stages in 16–31 research program, strategy 31–2 research proposals 52–5 ‘Don’t say it, write it.’ 53, 55 example (abbreviated) 52–3 format 53, 55 research questions clarity in 49 stating 48 research reports 367 additional parts 553–4 background, current knowledge, research method, results, conclusions 551 body 550–1 in context 550 executive summary, the body, appendices 554 preparing 30–1 report format 550–5 summary 552 table of contents 553–4 technical terms 549 templates and styles 554–5 title page 553 type of 54 research results, bogus responses case study 337–8 researcher interpretation 67 resources 361 respondent error 129–35 minimising through unobtrusive observation 175 respondent-driven sampling 359 respondents 67, 78–9, 127 anonymity of 151, 162 characteristics 177 communicating via interviews 139–50 feedback to 141 herding into central location 90–1 intimidating 91 lack of anonymity 143 mail surveys, convenience of 150–1 matching 200 participation and cooperation 160–1 response bias 131–4, 180 readership versus circulation 131 types 132–4 response latency 177–8, 187 response positions 262 response rates 129–30, 152–7, 162 devices for increasing 156–7

plots of patterns 155 responses inconsistencies 379 naturally negative 383–4 reverse coding 242 review 379 literature 22, 45 risk minimisation 31, 219 role playing 82–3 rounding 101 Roy Morgan Research Australia 7, 24, 120, 131, 138 R-squared 524, 527, 530, 532–4, 536 ‘rubbish’ project 180 rule-of-thumb estimates, for systematic error 135 rules 239

S

safety 217 sales forecasts 9 sales ratios 228 sample attrition 203 sample bias 129 sample design 359–61 sample distribution 408–10 sample statistics 399 sample surveys see surveys samples adequacy 367 independent 442–9 paired 449–52 representative 129–30, 145, 147, 161, 349–50 selection 54, 199 samples size 364–5, 416–17 case study 371 determining factors for questions involving mean 365–6 determining on basis of judgement 366–7 sample-selection error 134 sampling 27–8 frame 345–8 practical concepts 344–50 rare/hidden populations 358–9 reasons for conducting 343–4 terminology 343 units 348–9 sampling distribution of the sample mean 408–10 examples 412 sampling error 351, 378 sampling frame error 347–8 sampling problems 79 scale categories 262 scale recoding 243 scales 422, 498 best–worst see best–worst scaling computing values 242–4 constant-sum 254, 259 graphic rating 254–6

scales (Continued) Likert scale 248–51, 463 mathematical and statistical analysis of 241–2 measurement scales 520 normative versus ipsative scales 244–5 numerical 253 selection 261–4 Stapel 253 types of 239–41 also under individual scales see also measurement scanner data 221 availability of 226 scanner-based research 183–4 scatter diagrams 474 scattergraphs 479, 483, 492–4, 496 scatterplots 538–9 scientific inquiry 173 scientific method 7 scientific scrutiny 74 scores 249, 252 screening 165 telephone calls 130 scrolling layout 327 secondary data 22–3, 25 advantages 99 availability of 99 disadvantages 100–1 pertinence to project 100 use with survey data 108–9 secondary data research with big data 99–122 design 103–12 secondary data research 99–102 secondary data sources 112–20 typical objectives 103–12 secrecy 195–6 loss of 219 security 74 concerns regarding 162 selection effect 203 selective sampling 362 self-administered questionnaires 150–63 self-selection bias in 130–1 using other distribution forms 157–63 self-contained trading area 226 self-selection bias 130–1 semantic differential 148, 249, 251–2 sensitivity 210, 278 sensitive questions 304–5 sentence completion method 81–2 sequence discovery 111 serendipity 73 shopping experiences 176–7 mystery shoppers 178 shopping mall intercepts 143–4 significance 508–9 significance levels 561 statistical 500 for tests of differences 463–4 tests of statistical significance 476, 490–2

significance levels 418 simple attitude scales 247 simple random samples 28 selection 353 simple-dichotomy (dichotomous alternative) question 308–9 simple-to-complex findings presentation 557 Simpson’s Paradox 522 single measures 264 single-source data 184 single-source data-integrated information 121 situation analysis 44–5, 64 skip questions 326 slider scales 255 smiles 175 SMS messages 109 snowball sampling 358 snowballing 73 social desirability bias (SDB) 134 social entrepreneurship, defined 238 social listening 110 social marketing case study 468 social media 41, 70 growth of 16 social networking 87 ‘likes’ 65 social progress, case study 282–3 social science queries 388 software development 87–8 Solomon four-group design 211 sorting 246–7, 259–60, 388 best technique? 262 spam 158–9 spatial relations and locations 173–4 speaking 570–1 Spearman rank coefficient 482 specialisation 74 speed 74 of Internet surveys 159 of telephone interviews 145 split-ballot technique 313 split-half method 275 spontaneity 74 spreadsheets 379–80, 385 see also Excel SPSS 433, 561–2 case study 438 Chi-squares in 431–2, 502–4 conducting an ANOVA in 460–1 correlation in 478–80 cross-tabulations in 502–4 independent samples t-tests in 446–7 one-sample t-test in 427 paired samples t-tests in 449–50 regression in 492–5 scattergraphs in 479 squishing 316 standard deviation 405, 448 reasons for use 404 standard error of the mean 409 ‘standardised beta’ 529, 532

standardised coefficients 529, 532 standardised normal distribution 406–8 standardised questions 152 standardised regression coefficients 498 static group design 208–9 statistical analysis appropriate technique 422–3 bivariate analysis 442–64 multivariate statistical analysis 518–43 univariate analysis 399–434 statistical inference 413–14 see also inferential statistics Statistical Local Areas (SLAs) 100 statistical output 560–2 statistical packages 561 statistical significance 500 for tests of differences 463–4 statistical tables 574–80 statistical trend analysis 108 statistics 434 status bar 327–9 status gestures 176 stereotyping 134 stimulation 74 stimuli 185, 254 store conditions, unrealistic 226 storytelling 557 straight trend projections 228 straight-line relationships 484 strategic decision making, marketing research, managerial value of 7–13 stratified samples 28 stratified sampling 354 streaming media 85 Strongly Agree, Agree, Disagree and Strongly Disagree categories 327 structure 74 structured questions 136 student surrogates 205 summated findings 70 summated ratings, method of 248–51 summated scale 242–3 survey data, secondary data use with 108–9 survey research 127–66 classifying methods 136–8 design selection 163–4 errors in 128–9 ethical issues in 165–6 method determination (questionnaires) 303 no contacts 130 pretesting 165 reducing error 135 that mixes modes 163 survey sponsorship 155–6 surveys 5, 24–5, 193 advantages 127–8 collecting qualitative data 63 conducting, ways of 138–9 email 157–9 errors categories 129 good survey flow provision 319–20

hybrids 136 Internet 159–63 kiosk interactive 163 literature review 22 methods pros and cons 164 nature of 127–8 objectives 127 representative 28 respondent error 129–35 shortcomings 128 ‘sweethearting’ 477–8 switching behaviour 76–8 symbolism 207 symptoms versus problems 46 synergy 73 systematic (non-sampling) error 128–9, 349 rule-of-thumb estimates for 135 sample size and 365 systematic analysis 180–1 systematic enquiry 13, 16 systematic sampling 353–4

T

tables 559–60 difficult to read see pie charts notes, source notes 560 table number, title, stubheads, bannerheads 559 tabulation 29 preliminary 331 tachistoscope 205 tagging 98, 110 target markets, analysing and selecting 9 target population 28, 346–7 defining 344–5, 367 demographics comparisons 130 taste 142 t-distribution 423–33 confidence intervals 425 univariate tests 426–7 technical materials 550 ‘too . . .’ material 554 technological advances 149, 222 technology 15 digital 139 mobile 347 qualitative research and 85–8 scanner-based research 183–4 TED talks tips 568–9 telemarketers 347 telemarketing 145–6 telephone interviews 145–50 automatic 149 characteristics 145–8 global considerations 150 limited duration 148 telescoping 316 television monitoring 181–2 television programs 180 temporal classification 136–8

temporal patterns 173–4 terms, defining 100 test marketing 26, 217–28 advantages 217, 219 functions 220 projecting results 228–9 to test market or not decision 218–19 test markets Australia as 226 case studies 233 electronic 221 functions 219–20 length decision 224 online 224 overused 226 place to conduct 224–6 results estimation/projection 226–8 selection considerations 225–6 simulated 222 type decision 220–4 test product sales–total company sales ratio 228 test units destruction 344 selection and assignation 199–200 testing effect 210 test–retest method 274–5 tests of association 472–509 basics 472 cross-tabulations – Chi-square test for goodness of fit 499–508 Pearson’s correlation coefficient 473–83 regression analysis 483–98 statistical and practical significance for 508–9 tests of differences 442–64 appropriate 442 nonparametric statistics for 463 statistical and practical significance for 463–4 tests/testing 203, also under individual tests concept testing 64–5 test tabulation 383 testing new product ideas 64 tests of statistical significance 490–2 in a virtual world 223 text mining 88 software 389–90 text(s) 386 in respondent stories 67 text-based conversation 68 thematic analysis 70–1 thematic apperception test (TAT) 83 picture frustration version of 84 themes 68, 83, 87 theory development 386 Theory of Planned Behaviour (TPB) 421 theory of reasoned action approach (TRA) 264 third-person technique 82–3 time 361

time constraints 343–4, 367 of systematic research 13 time is money 152 time lapse 227–8 time series designs 211–12 time series with control group design 212 time-lapse photography 181 time-location sampling 358–9 time-shift viewing (TSV) service 182 timing 351 topics 80 ‘halo effect’ 79 sensitive topics 138 ‘top-line’ results 92 total variance 456 explained 536 tracking studies 137 trade associations (data sources) 119 trade-offs 18, 194, 364 traditional questionnaires 321–6 Transcranial Magnetic Stimulation (TMS) 186–7 trend analysis 104–5 new trends 107 social trends 137 statistical 108 trends, business trends in marketing research 15–16 triangulation 67 ‘true’ feelings 176 t-statistic 525, 528 t-test hypothesis 476 t-tests 417, 464 independent samples 442–9 paired samples 449–52 Twitter 87, 110 case study 124 two-stage research reports, exploratory followed by descriptive 552–3 two-way communication 139 two-way mirrors 71, 75, 85, 179 type I and type II errors 420–1

U

unbalanced rating scale 263 uncertainty 5, 18–20 of decision makers 41 influence on research type 20–1 unconscious misrepresentation 132 undisguised questions 136 units of analysis 46 units of measurement 100–1 univariate analysis 518 univariate statistical analysis 399–434 univariate tests 423 using t-distribution 426–7 Universal Product Codes (UPC) 12, 183 unlisted phone numbers 147 unmarked text 388 unmet needs 47 unsolicited emails 158–9

unstructured questions 136 untruthfulness 180

V

validity 187, 275–7, 410 of data 317 establishing 276–7 external 205 internal 196, 202–4 issues in experiments 200–6 reliability versus validity 277–8 value(s) chart distortion 563 computing scales values 242–4 consumer perception of 10 critical values 419, 578–80 of managerial decision 14 managerial value of marketing research 7–13 missing values, coding for 385 van Westendorp Price Sensitivity Meter 272–3 variable piping software 330 variables 45 categorical 197 classificatory 197 covariance between see Pearson’s correlation coefficient data matrices for 384 dummy 529 expected relationships between 49, 302–3

extraneous 205–6 ‘lurking’ 522 nuisance 200 numbers of 422 relevant variables determination 46–8 also under individual variables variance 404–5, 536 varimax rotation 543 vendors 118 verbal and pictorial records 173–4 verbal behaviour 173–4 verbatim comments 383 VicHealth 10 video recordings 386 videoconferencing 85 viral communication 16 virtual-reality simulations 223–4 visible observation 175 visual aids 141–2 visual appeal (of Internet surveys) 159–60 visual cues 178 visual medium 148 voice pitch analysis 185–6 volume forecasts, incorrect 227

W

web metrics 183 website traffic monitoring 181, 183, 362–3 weight control 195, 497–8

weights 249 welcome screens 160 Wilcoxon matched-pairs signed-ranks test 580 willingness to participate 304 willingness to pay 272 word association tests 81 word of mouth 16 words complex and simple 311 making a difference with 307–8 meaning of 305 word repetitions 388 see also jargon; language – simple and conversational World Bank, estimates from 99

Y

Yahoo! 119 YouTube 109

Z

Z distribution 423 zero point 240, 242 Z-test 417, 426, 443 Z-value 407