Understanding Research Methods for Evidence-Based Practice in Health [2 ed.] 9780730369264


English · 141 pages · 2017



BRIEF CONTENTS

About the authors vii
1. Introduction to research, the research process and EBP 1
2. Asking questions and searching for evidence 11
3. Reviewing literature 21
4. Qualitative research 32
5. Quantitative research 43
6. Levels of evidence 55
7. Statistics for the non-statistician 73
8. Mixed methods research 88
9. Sampling 97
10. Ethics 106
11. Getting evidence into practice 114
12. Challenges to evidence-based practice 124
Index 133

CONTENTS

About the authors vii

CHAPTER 1
Introduction to research, the research process and EBP 1
1.1 The meaning of ‘evidence-based practice’ 2
1.2 Apprehension towards evidence-based practice 5
  Decision making by anecdote 6
  Decision making by press cutting 6
  Decision making by expert opinion 6
  Decision making by cost minimisation 7
1.3 Before you start: formulate the problem 8

CHAPTER 2
Asking questions and searching for evidence 11
2.1 Different types of searching for evidence 12
  Informal 12
  Focused looking for answers 13
  Searching the literature 13
2.2 Differences between primary and secondary research 13
2.3 Effective search strings 13
  Steps for effective searching 14
  One-stop shopping: federated search engines 16
2.4 Other avenues for how to search for evidence 17
  Searching for information using social media 19

CHAPTER 3
Reviewing literature 21
3.1 Is a paper worth reading at all? 23
  Question 1. Who wrote the paper? 23
  Question 2. Is the title appropriate and illustrative, and is the abstract informative? 23
  Question 3. What was the research design, and was it appropriate to the question? 24
  Question 4. What was the research question, and why was the study needed? 24
  Question 5. Do the results or findings answer the question? 24
3.2 Reviewing the methods of primary research papers 25
  Sample and setting: who are the participants, and where is the study being carried out? 25
  What data-collection methods were used? 26
  How was the data analysed? 26
3.3 Reviewing the methods of secondary (review) papers 26
  Question 1. What is the focused clinical question that the review addressed? 27
  Question 2. Was a thorough search of the appropriate database(s) carried out, and were other potentially important sources explored? 27
  Question 3. Who evaluated the studies, and how? 28
  Question 4. How sensitive are the results to the way the review has been performed? 28
  Question 5. Have the results been interpreted sensibly, and are they relevant to the broader aspects of the problem? 28
  Meta-analyses and meta-syntheses 28

CHAPTER 4
Qualitative research 32
4.1 Qualitative research explained 33
4.2 The difference between qualitative and quantitative research 34
4.3 Qualitative methodologies and data collection strategies 35
  Qualitative sampling 36
  Data collection 36
4.4 Evaluating papers that describe qualitative research 37
  Question 1. Did the paper describe an important clinical problem addressed via a clearly formulated question? 38
  Question 2. Was a qualitative approach appropriate? 38
  Question 3. How were (a) the setting and (b) the subjects selected? 38
  Question 4. What was the researcher’s perspective, and has this been taken into account? 38
  Question 5. What methods did the researcher use for collecting data, and are these described in enough detail? 39
  Question 6. What methods did the researcher use to analyse the data, and what quality control measures were implemented? 39
  Question 7. Are the results credible and, if so, are they clinically important? 40
  Question 8. What conclusions were drawn, and are they justified by the results? 40
  Question 9. Are the findings of the study transferable to other settings? 40

CHAPTER 5
Quantitative research 43
5.1 Why and how quantitative research is done 44
5.2 Quantitative research designs 46
  Intervention studies 46
  Observational studies 47
5.3 Measurement 48
  Variables — independent and dependent 49
  Reliability and validity in measurement 50

CHAPTER 6
Levels of evidence 55
6.1 Clinical questions in healthcare 56
  Finding the best evidence 56
  NHMRC and evidence-based practice 57
  How researchers answer clinical questions 57
6.2 Matching clinical questions to NHMRC levels of evidence 58
  NHMRC evidence levels for intervention studies 60
6.3 How bias threatens the validity of research evidence 63
6.4 Evaluating the evidence – quality of evidence and grades of recommendations for practice guidelines 65
6.5 Levels within levels 66

CHAPTER 7
Statistics for the non-statistician 73
7.1 Storing quantitative data in a data set 74
7.2 Descriptive statistics for summarising sample characteristics 75
  Descriptive statistics for categorical variables 75
  Descriptive statistics for continuous variables 76
7.3 The researchers ‘setting the scene’ 77
  Have the researchers tested the assumption that their groups are comparable? 77
  What assumptions apply to the shape of the data? 77
  Have valid assumptions been made about the nature and direction of causality? 78
7.4 Probability and confidence 79
  How are p values interpreted? 79
  Confidence intervals 81
7.5 Clinical importance of treatment effects 81
  Clinical importance 81
7.6 Summarising treatment effects from multiple studies of interventions in a systematic review 84

CHAPTER 8
Mixed methods research 88
8.1 An overview of mixed methods research 89
  Why use mixed methods in nursing and health sciences research? 89
8.2 Different mixed methods designs 90
  Convergent study 90
  Sequential study 91
  Multiphase (multilevel) study 91
  Embedded study 92
8.3 Integration in mixed methods research 93
  Integrating the research question 93
  Research design 93
  Sampling 93
  Analysis 93
  Interpretation 94
8.4 Mixed method design considerations 94
  Weighting (dominance) 94
  General challenges associated with mixed methods studies 94

CHAPTER 9
Sampling 97
9.1 Understanding the terminology around sampling 98
  Theoretical population (or target population) 98
  Study population (or accessible population) 98
  Sampling 99
  Sample 99
  Sampling frame 99
9.2 Types of sampling 99
  Probability sampling 99
  Non-probability sampling 100
  Sampling methods 100
9.3 Sampling error 102
9.4 Calculating sample size 102
  Quantitative research 102
  Qualitative research 103

CHAPTER 10
Ethics 106
10.1 Ethical principles 107
  Autonomy: patients/clients are free to determine their own actions 107
  Beneficence: acting to benefit human kind 108
  Justice: obligation to treat fairly 108
  Non-maleficence: avoiding or minimising harm 108
  Respect for human dignity 108
  Confidentiality: maintenance of privileged information, including the right to privacy and anonymity 109
  Veracity: obligation to tell the truth 109
10.2 The role and function of human research ethics committees 109
10.3 Judging the ethical aspects of a published journal article 111

CHAPTER 11
Getting evidence into practice 114
11.1 Adoption of evidence-based practice (EBP) 115
  Individual barriers 115
  Organisational barriers 116
11.2 Encouraging individuals to implement evidence-based practice 116
11.3 Organisational support of evidence-based practice 117
  Integrated care pathways 117
  Clinical practice guidelines 118
11.4 The client perspective in evidence-based practice 119
  Patient-reported outcome measures (PROMs) 119
  Shared decision-making 119
  Option grids 120

CHAPTER 12
Challenges to evidence-based practice 124
12.1 When evidence-based practice is done badly 126
12.2 When evidence-based practice is done well 126
  Guidelines devalue professional expertise 126
  The guidelines are too narrow (or too broad) 127
  The guidelines are out of date 127
  The client’s perspective is ignored 127
  There are too many guidelines 128
  Practical and logistical problems 128
  The evidence is confusing 128
12.3 Achieving evidence-based practice 129

Index 133

ABOUT THE AUTHORS

Trisha M Greenhalgh
Dr Trisha Greenhalgh is a Professor of Primary Care Health Sciences at the University of Oxford and a practising GP. She completed a BA in Social and Political Sciences at the University of Cambridge in 1980 and a medical degree at the University of Oxford in 1983. Trisha’s research aims to apply the traditional aspects of medicine while embracing the opportunities of contemporary science to improve health outcomes for patients. She is the author of more than 240 peer-reviewed publications as well as a number of scholarly textbooks. Trisha has received numerous accolades for her work, including twice winning the Royal College of General Practitioners Research Paper of the Year Award, and receiving the Baxter Award from the European Health Management Association. In 2001, she was awarded an OBE for services to evidence-based medical care.

John Bidewell
Dr John Bidewell is a Lecturer in Research Methods at the School of Science and Health at Western Sydney University. From an early career in school teaching, John moved into psychology, acquiring three degrees while always maintaining an interest in education. Opportunities arose in applied social research and data analysis, leading John in that direction. For many years, he provided technical and inspirational support to academic and student researchers, covering every stage of the research process from concept to publication, and especially data analysis and interpretation, at Western Sydney University’s nursing and midwifery school. John has provided consultancy services in research and statistics to business, industry and governments, and has taught research methods and statistics to nursing, business and allied health students at undergraduate and postgraduate levels.

Elaine Crisp
Dr Elaine Crisp is a Registered Nurse (RN) and Lecturer at the School of Nursing at the University of Tasmania, where she coordinates both the Bachelor of Nursing (BN) course and the Translational Research unit within the BN. This dual role enables her to ensure the BN highlights the connection between research and clinical practice. She has also taught research methods to nursing and allied health students at the postgraduate level, encouraging clinicians to understand and use research evidence in their everyday practice. Elaine worked as an RN in aged care and in the perioperative area before commencing her PhD, which combined her love of history and nursing. Her major research interests are nursing and welfare history, aged and dementia care, and nurse education.

Amanda Lambros
Amanda E Lambros is a Professional Speaker, Author and Clinical Counsellor as well as a past Clinical Fellow. She has completed a Bachelor of Health Sciences at the University of Western Ontario (2001), a Postgraduate Diploma of Ethics (2002), a Master of Forensic Sexology (2004) and a Master of Counselling (2014). Amanda has developed, coordinated and taught evidence-informed health practice to thousands of Interprofessional First Year Health Sciences students throughout her career. Amanda’s private practice focuses on relationships, mental health, and grief and loss. Providing her clients with the most up-to-date and evidence-based care is imperative to her, and she has a strong focus on EBP, ethics and communication. Amanda has received numerous accolades for her work, including NifNex 100 Most Influential Business Owners, a Telstra Business Award nomination and a Telstra Business Woman of the Year nomination.

Jane Warland
Dr Jane Warland is an Associate Professor at the School of Nursing and Midwifery at the University of South Australia (UniSA). She worked as a midwife from 1988 to 2007, and gained her PhD from the University of Adelaide in 2007. Jane was appointed as an academic staff member to the School of Nursing and Midwifery in February 2008, and teaches a foundational research course in the undergraduate midwifery program. Her own program of research is STELLAR (stillbirth, teaching, epidemiology, loss, learning, awareness and risks). Jane has a track record in research using qualitative, quantitative and mixed methods. She has a strong interest in research ethics and served two terms as a member of the UniSA Human Research Ethics Committee. Jane has written numerous book chapters about research — she has more than 90 publications, including books, chapters and peer-reviewed journal articles.

CHAPTER 1

Introduction to research, the research process and EBP

LEARNING OBJECTIVES
1.1 What does ‘evidence-based practice’ mean?
1.2 Why are people apprehensive about evidence-based practice?
1.3 How do we get started with evidence-based practice?

1.2 Apprehension towards evidence-based practice

LEARNING OBJECTIVE 1.2 Why are people apprehensive about evidence-based practice?

Unfortunately, some people groan, or feel apprehensive, when evidence-based practice is mentioned because they have heard through the grapevine that it’s all about statistics and number crunching. Before we go much further, let’s clarify something and pop that bubble! EBP is not ‘all about statistics’. Yes, statistics, numbers, equations, odds ratios, confidence intervals and so on are all terms you will hear in EBP, but understanding how to interpret and implement EBP is much more important at this stage. Numbers are great but, in reality, if you don’t understand what the numbers mean, then you are not really able to implement EBP. Likewise, if you don’t understand how the researchers did the math behind the numbers, you can’t double-check their work — which is sometimes just as important. With this in mind, Greenhalgh and Donald proposed an alternative definition of EBP that makes the mathematical element explicit:

Evidence-based practice is the use of mathematical estimates of the risk of benefit and harm, derived from high-quality research on population samples, to inform clinical decision making in the diagnosis, investigation or management of individual patients.7
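To make those ‘mathematical estimates of the risk of benefit and harm’ a little more concrete, here is a minimal sketch of two classic EBP quantities, absolute risk reduction and number needed to treat. The trial counts are invented purely for illustration; they do not come from any study cited in this chapter.

```python
# Toy numbers, NOT real trial data: outcome counts from a hypothetical
# randomised trial with a control group and a treatment group.
control_events, control_n = 20, 100   # 20 of 100 untreated clients had the outcome
treated_events, treated_n = 10, 100   # 10 of 100 treated clients had the outcome

cer = control_events / control_n      # control event rate (0.20)
eer = treated_events / treated_n      # experimental event rate (0.10)

arr = cer - eer                       # absolute risk reduction
nnt = 1 / arr                         # clients treated per extra good outcome
rr = eer / cer                        # relative risk

print(f"ARR = {arr:.2f}, NNT = {nnt:.0f}, relative risk = {rr:.2f}")
# ARR = 0.10, NNT = 10, relative risk = 0.50
```

On these made-up numbers, ten clients would need to be treated to prevent one additional bad outcome, which is exactly the kind of estimate the Greenhalgh and Donald definition asks us to bring to clinical decisions.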

Students regularly allow the idea of ‘numbers’ and a ‘new language’ to blur their excitement for learning about EBP — please don’t let this happen to you. Be open to learning about the topic, which will stay with you for the rest of your life. The more open you are to learning about EBP, the better the practitioner you will become!

The second reason that people often groan when you mention evidence-based practice is that there are plenty of daunting new (and often long) words that look like a foreign language. While it is like a new language, absolutely everything in EBP can be broken down into simple and manageable steps. For example, students are often stumped by retrospective longitudinal cohort design (which you will learn about in this resource), but once they break it down, it makes perfect sense:
• retrospective — in the past (think of the word ‘retro’)
• longitudinal — over a long period of time
• cohort — a group of people
• design — type of study.
Now that it’s broken down into manageable chunks, you instantly know that ‘retrospective longitudinal cohort design’ means a type of study that looks back at a group of people over a long period in the past. So, please don’t feel overwhelmed by the words — they all make sense, but some just need to be broken down first.

Anyone who works face-to-face with clients knows that it is necessary to seek new information before making a clinical decision. Health practitioners spend countless hours searching through libraries, books and online sources to inform their practice. In general, we wouldn’t put a client on a new drug or through a new treatment without evidence that it is likely to work — but, unfortunately, best practice is not always followed.

There have been a number of surveys on the behaviours of health professionals. In the United States in the 1970s, only around 10–20 per cent of all health technologies then available (i.e. drugs, procedures, operations, etc.) were evidence-based; in the 1990s, that figure improved to 21 per cent.8 Studies of the interventions offered to consecutive series of clients suggested that 60–90 per cent of clinical decisions, depending on the specialty, were ‘evidence-based’.9 Unfortunately, due to various excuses and limitations, we are still selling our clients short much of the time. A large survey by an Australian team looked at 1000 clients treated for the 22 most commonly seen conditions in a primary-care setting. The researchers found that while 90 per cent of clients received evidence-based care for coronary heart disease, only 13 per cent did so for alcohol dependence.10 Furthermore, the extent to which any individual practitioner provided evidence-based care varied across the sample from 32 per cent of the time to 86 per cent of the time. A more recent study found that one in three hospitals is not meeting performance metrics, with failure to implement EBP among the leading reasons.11 Another study suggested that medical error is now the third leading cause of death in the United States.12 These findings suggest plenty of room for improvement; with a new wave of practitioners, hopefully we can increase the application of EBP so that the majority of health consumers receive evidence-based care.

Let’s look at the various approaches that many health professionals use to reach their decisions in reality — all of which are examples of what EBP isn’t . . . so please do not practise these!


Decision making by anecdote

When Trisha Greenhalgh was a medical student, she was able to join a distinguished professor on his daily ward rounds. On seeing a new client, he would ask about their symptoms, turn to the massed ranks of juniors around the bed, and relate the story of a similar client encountered a few years previously. ‘Ah, yes. I remember we gave her such-and-such and she was fine after that’. He was cynical, often rightly, about new drugs and technologies, and his clinical acumen was second to none. Nevertheless, it had taken him 40 years to accumulate his expertise, and the largest medical textbook of all — the collection of cases that were outside his personal experience — was forever closed to him.

Anecdote (storytelling) has an important place in clinical practice.13 It is common practice for students and practitioners to listen to professors, tutors and clients and memorise their stories or ‘scripts’ — what was wrong with particular clients, and what their outcomes were — for later use. Health professionals glean crucial information from clients’ illness narratives — most crucially, perhaps, what being ill means to the client. Experienced health professionals take account of the accumulated ‘illness scripts’ of all their previous clients when managing subsequent clients — but that doesn’t mean simply doing the same for client B as you did for client A if your treatment worked, and doing precisely the opposite if it didn’t! We would not be human if we ignored our personal clinical experiences, but we would be better to base our decisions on the collective experience of thousands of health professionals treating millions of clients, rather than on what we as individuals have seen and felt.

Decision making by press cutting

Imagine trawling the internet, magazines, newspapers and information presented throughout all forms of media, ‘cutting and pasting’ to create, for lack of a better analogy, a scrapbook of treatments and cures — continuously altering your practice in line with the various conclusions. For example, reading that ‘probiotics improve your mood’14 and advocating that all clients take probiotics. The advice was in print, and it was recent, so it must surely replace what was previous practice. This approach to clinical decision making is, unfortunately, still very common. How many doctors do you know who justify their approach to a particular clinical problem by citing the results section of a single published study, even though they might fail to tell you:
• the methods used to obtain those results
• whether the trial was randomised and controlled
• the number, age, sex and disease severity of the clients involved
• how many withdrew from (‘dropped out of’) the study and why
• by what criteria clients were judged ‘cured’
• if the findings of the study appeared to contradict those of other researchers, whether any attempt was made to validate (confirm) and replicate (repeat) them
• whether the statistical tests that allegedly proved the authors’ point were appropriately chosen and correctly performed (see the chapter on statistics for the non-statistician)
• whether the client’s perspective has been systematically sought and incorporated via a shared decision-making tool.
Health practitioners who like to cite the results of medical research studies therefore have a responsibility to ensure that they first go through a checklist like this before simply making decisions by press cutting.

Decision making by expert opinion

In extreme cases, an ‘expert opinion’ may consist simply of the lifelong bad habits and personal press cuttings of an ageing health professional — and a consensus of such experts can simply multiply the misguided views of any one of them. Table 1.1 gives examples of practices that were at one time widely accepted as good clinical practice, but that have subsequently been discredited by high-quality clinical trials. Serious harm can be done by applying guidelines that are not evidence-based. It is a major achievement of the EBP movement that almost no guideline these days is produced solely by expert opinion!


TABLE 1.1 Examples of harmful practices once strongly supported by ‘expert opinion’

Approximate time period | Clinical practice accepted by experts of the day | Practice shown to be harmful in(a) | Impact on clinical practice
From 500 BC | Bloodletting (for just about any acute illness) | 1830s | Bloodletting ceased around 1910
Early 1900s | Complete immobilisation in the form of splints and braces on the affected limbs for infantile paralysis (polio) | 1930s | Although very hot dressings and physical therapy (what we now know as physiotherapy) were deemed to be effective, they were not fully implemented for the treatment of polio until the 1950s
1957 | Thalidomide for ‘morning sickness’ in early pregnancy, which led to the birth of over 8000 severely malformed babies worldwide | 1960 | The teratogenic effects of this drug were so dramatic that thalidomide was rapidly withdrawn when the first case report appeared
From at least 1900 | Bed rest for acute low back pain | 1986 | Many doctors still advise people with back pain to ‘rest up’
1960s | Benzodiazepines (e.g. diazepam) for mild anxiety and insomnia, initially marketed as ‘non-addictive’ but subsequently shown to cause severe dependence and withdrawal symptoms | 1975 | Benzodiazepine prescribing for these indications fell in the 1990s
1970s | Intravenous lignocaine in acute myocardial infarction, with a view to preventing arrhythmias, subsequently shown to have no overall benefit and in some cases to cause fatal arrhythmias | 1974 | Lignocaine continued to be given routinely until the mid-1980s
Late 1990s | Cox-2 inhibitors (a new class of non-steroidal anti-inflammatory drug), introduced for the treatment of arthritis, were later shown to increase the risk of heart attack and stroke | 2004 | Cox-2 inhibitors for pain were quickly withdrawn following some high-profile legal cases in the United States, although new uses for cancer treatment (where risks may be outweighed by benefits) are now being explored

(a) Interestingly, bloodletting was probably the first practice for which a randomised controlled trial was suggested.

Decision making by cost minimisation

The popular press tends to be horrified when it emerges that a treatment has been withheld from a client for reasons of cost. Managers, politicians and, increasingly, health professionals can count on being pilloried when a child with a rare cancer is not sent to a specialist unit in the United States or a frail elderly woman is denied a drug to stop her visual loss from macular degeneration. Yet, in the real world, all healthcare is provided from a limited budget, and it is increasingly recognised that clinical decisions must take into account the economic costs of a given intervention. Clinical decision making purely on the grounds of cost (‘cost minimisation’ — purchasing the cheapest option with no regard to how effective it is) is generally ethically unjustified, and we are right to object vocally when this occurs. Expensive interventions should not, however, be justified simply because they are new, because they ought to work in theory, or because the only alternative is to do nothing — but rather because they are very likely to save life or significantly improve its quality.


1.3 Before you start: formulate the problem

LEARNING OBJECTIVE 1.3 How do we get started with evidence-based practice?

If midwifery students are asked what they know about childbirth and pain management, they can talk at length about the different pain management techniques, how they measure pain by having a conversation with the woman in labour, what the textbooks define as pain — the list goes on. They are truly aware of the concept of ‘pain’ and its management during the labouring process. However, when the students are asked a practical question such as ‘Mrs Janes wants the most effective, non-invasive and non-pharmacological pain management technique — what would you advise her as her options?’, they appear startled. One student replies ‘Mrs Janes can have absolutely anything she wants to manage her pain!’ — a well-meaning response, but it doesn’t provide the best possible evidence to Mrs Janes, especially if she’s asking direct questions. The response could be relaxation techniques,15 massage and reflexology,16 aromatherapy,17 and so on — but the student would need to review the evidence. Students may sympathise with Mrs Janes’s predicament, but they often draw a blank as to where to find information such as this, which could well be the one thing that Mrs Janes needs or wants to know.

Experienced health professionals might think they can answer Mrs Janes’s question from their own personal experience, but few of them would be right. Even if they were right on this occasion, they would still need an overall system for converting all of the information about a client (age, ethnicity, subjective pain scale, etc.), the particular values and preferences (utilities) of the client, and other things that could be relevant (a hunch, a half-remembered article, the opinion of a more experienced colleague or a paragraph discovered by chance while flicking through a textbook) into a succinct summary. The summary would need to cover what the problem is, and what specific additional items of information we need to solve that problem and come up with a desired outcome.

Sackett et al., in a book subsequently revised by Straus,18 explained that a good clinical question should include three components.
• First, define precisely whom the question is about (i.e. ask ‘How would I describe a group of clients similar to this one?’).
• Next, define which manoeuvre (treatment, intervention, etc.) you are considering in this client and, if necessary, a comparison manoeuvre (e.g. placebo or current standard therapy).
• Finally, define the desired (or undesired) outcome (e.g. reduced mortality, better quality of life, or overall cost savings to the health service).

Thus, in Mrs Janes’s case, we might ask: ‘In a thirty-year-old Caucasian woman with a high pain threshold, two previous labouring/birthing experiences, no coexisting illness and no significant past medical history, whose blood pressure is currently X/Y, would the benefits of suggesting massage and reflexology provide her with the desired outcome of an effective, non-invasive and non-pharmacological pain management technique to decrease pain during labour?’ Note that in framing the specific question, we have already established that Mrs Janes has previously experienced labour and birth twice. Knowing this, we recognise that she may also have previously experienced invasive or pharmacological interventions and is aware of the discomfort of labouring and birthing.
Remember that Mrs Janes’s alternative to an effective, non-invasive and non-pharmacological pain management technique is potentially invasive and may have side effects — on not only Mrs Janes, but also the birthing process and/or the baby about to be born. Not all of the alternative approaches would help Mrs Janes or be acceptable to her, but it would be quite appropriate to seek evidence as to whether they might help her — especially if she was asking to try one or more of these remedies. Before you start, give one last thought to your client in labour. In order to determine her personal priorities (how much does she value a 10 per cent reduction in her pain to still experience the labour and birthing process compared to the inability to feel below her belly button?), you will need to approach Mrs Janes, not anybody else in that labouring room, and start the dialogue towards providing evidence-based care.
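If it helps to see Sackett’s three components laid out explicitly, the short sketch below wraps them in a small Python helper. The ClinicalQuestion class and its field names are hypothetical scaffolding invented for this example, not part of Sackett’s or Straus’s published method.

```python
# Hypothetical helper: assembles the three components of a good clinical
# question (population, manoeuvre plus optional comparison, outcome).
from dataclasses import dataclass

@dataclass
class ClinicalQuestion:
    population: str   # whom is the question about?
    manoeuvre: str    # the treatment or intervention being considered
    comparison: str   # e.g. placebo or current standard therapy
    outcome: str      # the desired (or undesired) outcome

    def focused(self) -> str:
        return (f"In {self.population}, does {self.manoeuvre} "
                f"(compared with {self.comparison}) achieve {self.outcome}?")

question = ClinicalQuestion(
    population="a thirty-year-old woman in labour with two previous births",
    manoeuvre="massage and reflexology",
    comparison="pharmacological analgesia",
    outcome="effective, non-invasive, non-pharmacological pain relief",
)
print(question.focused())
```

Forcing yourself to fill in each field separately, as the example does, is the point of the exercise: a question with a missing component is usually too vague to search on.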


SUMMARY
Evidence-based practice is much more than ‘reading an academic article’. It includes, at a minimum, the best available research evidence, clinical expertise and an understanding of the client’s values and preferences. EBP is not about learning a technique, skill or treatment methodology once and applying it for the rest of your career; it should be a continuous loop of learning and improvement using the Ask, Acquire, Appraise, Apply and Evaluate model. Although some students (and even health practitioners) attempt to avoid learning about EBP because they believe it is all math-driven and hard to understand, it is much more than just numbers. Once you learn the techniques, you will begin to see evidence-based information all around you, and you will continue to apply EBP throughout your life as both a health consumer and, eventually, a health practitioner. When the best question is formulated, it becomes much easier to find the best evidence to answer it.

KEY TERMS
evidence-based care Care that is based on evidence-based practice (supported by scientific evidence, clinical expertise and client values).
evidence-based practice (EBP) A practice that is supported by scientific evidence, clinical expertise and client values.
knowledge An accepted body of facts or ideas that is acquired through the use of the senses, reason or research methods.
retrospective longitudinal cohort design A type of study conducted over a long period, looking at a group of people in the past.

WEBSITES
1 Centre for Research in Evidence-based Practice (CREBP): https://bond.edu.au/researchers/research-strengths/university-research-centres/centre-research-evidence-based-practice
2 Centre for Evidence-based Medicine: www.cebm.net
3 ‘What is evidence-based medicine?’, British Medical Journal: www.bmj.com/content/312/7023/71
4 HPNA Position statements ‘Evidence-Based Practice’, Hospice and Palliative Nurses Association (US): https://advancingexpertcare.org/position-statements
5 Joanna Briggs Institute: www.joannabriggs.org
6 Australasian Cochrane Centre: http://aus.cochrane.org
7 ‘Evidence-based information’, QUT Library: www.library.qut.edu.au/search/howtofind/evidencebased
8 ‘Answering Clinical Questions’, University of Western Australia: www.meddent.uwa.edu.au/teaching/acq
9 ‘Evidence based practice’, University of Tasmania: https://utas.libguides.com/ebp
10 ‘Introduction to evidence-based practice’, Duke University Medical Center Library and the Health Sciences Library at the University of North Carolina: https://guides.mclibrary.duke.edu/ebmtutorial

ENDNOTES
1. M. Hurley et al., ‘Hip and knee osteoarthritis: a mixed methods review,’ Cochrane Database of Systematic Reviews 4 (2018). doi:10.1002/14651858.CD010842.pub2.
2. D. L. Sackett et al., ‘Evidence-based practice: what it is and what it isn’t,’ BMJ: British Medical Journal 312, no. 7023 (1996): 71.
3. Sackett, ‘Evidence-based practice,’ 71.
4. K. Head et al., ‘Saline irrigation for allergic rhinitis,’ Cochrane Database of Systematic Reviews 6 (2018). doi:10.1002/14651858.CD012597.pub2.
5. D. L. Sackett and R. B. Haynes, ‘On the need for evidence-based practice,’ Evidence-based Practice 1, no. 1 (1995): 4–5.
6. T. Greenhalgh, ‘Is my practice evidence-based?,’ BMJ: British Medical Journal 313, no. 7063 (1996): 957.
7. T. Greenhalgh, How to read a paper: the basics of evidence-based medicine (Oxford: Blackwell-Wiley, 2006).
8. M. Dubinsky and J. H. Ferguson, ‘Analysis of the national institutes of health medicare coverage assessment,’ International Journal of Technology Assessment in Health Care 6, no. 3 (1990): 480–8.
9. D. L. Sackett et al., ‘Inpatient general practice is evidence-based,’ The Lancet 346, no. 8972 (1995): 407–10.


10. W. B. Runciman et al., ‘CareTrack: assessing the appropriateness of health care delivery in Australia,’ Medical Journal of Australia 197, no. 10 (2012): 549.
11. B. Melnyk, ‘Are you getting the best health care? Evidence says: maybe not,’ The Conversation, June 9, 2016, https://theconversation.com/are-you-getting-the-best-health-care-evidence-says-maybe-not-59206.
12. M. A. Makary and M. Daniel, ‘Medical error — the third leading cause of death in the US,’ BMJ: British Medical Journal 353 (2016): i2139.
13. J. Macnaughton, ‘Anecdote in clinical practice,’ in Narrative based practice: dialogue and discourse in clinical practice, ed. T. Greenhalgh and B. Hurwitz (London: BMJ Publications, 1998).
14. L. Steenbergen et al., ‘A randomized controlled trial to test the effect of multispecies probiotics on cognitive reactivity to sad mood,’ Brain, Behavior, and Immunity (2015): 258–64. doi:10.1016/j.bbi.2015.04.003.
15. C. A. Smith et al., ‘Relaxation techniques for pain management in labour,’ Cochrane Database of Systematic Reviews 3 (2018). doi:10.1002/14651858.CD009514.pub2.
16. C. A. Smith et al., ‘Massage, reflexology and other manual methods for pain management in labour,’ Cochrane Database of Systematic Reviews 3 (2018). doi:10.1002/14651858.CD009290.pub3.
17. C. A. Smith, C. T. Collins, and C. A. Crowther, ‘Aromatherapy for pain management in labour,’ Cochrane Database of Systematic Reviews 7 (2011). doi:10.1002/14651858.CD009215.
18. S. E. Straus et al., Evidence-based practice: how to practice and teach EBP, 4 ed. (Edinburgh: Churchill Livingstone, 2010).

ACKNOWLEDGEMENTS
Adapting author for this Australian edition: Amanda Lambros
Photo: © Halfpoint / Shutterstock.com


CHAPTER 2

Asking questions and searching for evidence

LEARNING OBJECTIVES
2.1 What are the different reasons we might search for evidence?
2.2 What are the differences between primary research and secondary literature?
2.3 How do you construct an effective search string?
2.4 What are some other avenues for how to search for evidence?

Focused looking for answers

Focused looking for answers takes a much more detailed approach, especially if we intend to trust the ‘answer’ we find and apply it directly to the care of a client. When we find information, we need to know how to tell whether it is trustworthy and, if so, when it is OK to stop looking — we don’t need to find absolutely every study that may have addressed the topic. This kind of query is increasingly well served by new synthesised information sources whose goal is to support evidence-based care and the transfer of research findings into practice. This is discussed in more detail shortly.

Searching the literature

When writing an essay or an article for publication, searching the literature involves an entirely different process. The purpose here is less to influence client care directly than to identify the existing body of research that has already addressed a problem and to clarify the gaps in knowledge that require further research. For this kind of searching, you need knowledge about information resources and some basic skills in searching. A simple search of one database can often be enough. However, you should be aware that if you want to search systematically (for example, for a systematic review of the literature), then multiple relevant databases need to be searched and citation chaining needs to be employed to ensure that you are being thorough. If this is your goal, you should consult an information professional, such as a health librarian.

2.2 Differences between primary and secondary research

LEARNING OBJECTIVE 2.2 What are the differences between primary research and secondary literature?

Literature that reports primary research is from a single research study. Primary sources can be found in a variety of ways. You could look at the reference lists and hyperlinks in secondary sources. You could identify them directly from journal alerts — for example, via RSS feeds, table-of-contents services or more focused topical information services. You could also search databases such as PubMed/Medline, EMBASE, PASCAL, the Cochrane Library, CINAHL (Cumulative Index to Nursing and Allied Health Literature), Web of Science, Scopus or Google Scholar.

Secondary research reports ‘synthesised’ findings and usually takes the form of a literature review. A literature review will typically:
• examine multiple primary research papers
• summarise the research papers.
A literature review of quantitative articles may also include a meta-analysis. A review of qualitative literature will often include a meta-synthesis, in which findings are combined to provide stronger evidence on which to base practice. Secondary literature is also useful for quickly understanding what is already known about a topic; but, if you would like (or have been asked) to locate evidence from primary research, then the rest of this chapter is for you.

2.3 Effective search strings

LEARNING OBJECTIVE 2.3 How do you construct an effective search string?

PubMed is one of the most frequently accessed online resources for physicians and health professionals worldwide, probably because it is free and well known. When conducting a basic PubMed search, you can use two or three search words — but this approach characteristically turns up hundreds or thousands of references, many of which may be irrelevant to your topic of interest. This is certainly not an effective way to search, but it is the reality of how most people do search.1 It is surprisingly easy to improve the efficiency of this kind of approach and become much more effective at basic searching. Simple tools that are part of most database search engines help to focus a basic search and produce better results.
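As a concrete illustration, here is a minimal sketch of a focused search string sent to PubMed programmatically. The NCBI E-utilities esearch endpoint used below is real and publicly documented; the clinical topic and the Boolean string itself are invented for the example.

```python
# Build a focused Boolean search string and run it against PubMed's
# esearch endpoint (part of NCBI's public E-utilities API).
import json
import urllib.parse
import urllib.request

# AND narrows, OR broadens; quotes keep a phrase together; the asterisk
# truncates, so labour* also matches labours and labouring.
term = '("pain management" OR analgesia) AND labour* AND (massage OR reflexology)'

params = urllib.parse.urlencode(
    {"db": "pubmed", "term": term, "retmax": 20, "retmode": "json"})
url = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?" + params

with urllib.request.urlopen(url) as response:
    result = json.load(response)["esearchresult"]

print("Total hits:", result["count"])   # how many records match
print(result["idlist"])                 # PubMed IDs of the first 20
```

A loose two-word search of the same topic typically returns far more records; combining synonyms with OR and linking concepts with AND is what trims the list to something reviewable.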


TABLE 2.1 (continued)

Activity | Example tools
Project and document management | Basecamp, Evernote, GoodReader, Papers
Cloud services (software) | Google Docs
Cloud storage | Dropbox, Google Drive, MS Office 365
Information retrieval | RSS feeds, Google Alerts
Reference managers | Mendeley, Zotero, Endnote web, CiteULike
Video conferencing | Skype, Zoom, Google Hangouts Meet, MS Lync
Survey tools | SurveyMonkey, Qualtrics, Poll Daddy, Google Forms
Mobile devices | Text messaging, apps
Social networking | Facebook, Twitter, Yammer
Blogging | WordPress, Blogger
Microblogging | Twitter, Instagram, TweetDeck, Tumblr
Curation | Scoop.it, Pinterest
Video hosting | YouTube, Vimeo
Altmetrics | Impactstory.org, altmetrics.com

The table also maps each tool to the research activities it supports: professional profile and networking; collaboration and project management; information retrieval, filtering and management; data collection and storage; dissemination and translation; and clinical databases and registries.
SUMMARY
There are three main types of searching for evidence: informal, focused looking for answers and searching the literature. Primary research is from a single research study, whereas secondary literature usually examines multiple primary research papers and summarises them. There are many tools, databases and search engines that can help in effective searching (such as PubMed, Google Scholar and Trip). Other ways of searching for and gathering evidence include consulting librarians and making use of social media.

KEY TERMS
advanced search Allows you to set specific parameters for your results, such as year, author, title or study type. This option usually means you have fewer, more relevant hits than results from a basic search.
Boolean operators The use of AND, OR and NOT to expand or exclude keywords in a search, which results in more focused or productive search results.
primary research An original piece of research from a single study.
search string A list of words or phrases which, when combined, are used to search for a specific piece of information contained in a database.
secondary research Research that has examined multiple primary research papers and summarised them.
truncation A searching technique used when searching databases in which a word ending is replaced by a symbol — an asterisk (*), a question mark (?) or a dollar sign ($) — to enable the search to include all possible endings of the word.
wildcards A searching technique used when searching databases in which a letter in a word is replaced by a symbol — an asterisk (*), a question mark (?) or a dollar sign ($) — to enable the search to include alternate spellings of the word.
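To see how the truncation and wildcard conventions defined above behave, here is a small illustrative sketch that mimics the ? and * symbols with Python's standard fnmatch module. Real databases do this matching on the server side; the word list is made up for the demonstration.

```python
# fnmatch uses the same conventions as many databases: * matches any run
# of characters (truncation), ? matches exactly one character (wildcard).
from fnmatch import fnmatch

terms = ["nurse", "nurses", "nursing", "woman", "women", "midwife"]

print([t for t in terms if fnmatch(t, "nurs*")])  # ['nurse', 'nurses', 'nursing']
print([t for t in terms if fnmatch(t, "wom?n")])  # ['woman', 'women']
```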

WEBSITES
1 ‘Finding the evidence’ from the University of Oxford’s Centre for Evidence-Based Medicine includes a series of helpful resources, including videos on searching databases: www.cebm.net/category/ebmresources/tools/finding-the-evidence
2 ‘PubMed tutorial’ from PubMed offers an overview of what PubMed does and doesn’t do, as well as some exercises to help you get used to how to use it: www.nlm.nih.gov/bsd/disted/pubmedtutorial/cover.html

ENDNOTE
1. A. Hoogendam et al., ‘Answers to questions posed during daily patient care are more likely to be answered by UpToDate than PubMed,’ Journal of Medical Internet Research 10, no. 4 (2008): e29.

ACKNOWLEDGEMENTS
Adapting author for this Australian edition: Jane Warland
Photo: © Twin Design / Shutterstock.com
Photo: © nopporn / Shutterstock.com
Figures 2.1, 2.2 and 2.3: © PubMed / Public Domain
Figure 2.4: © Trip Database
Table 2.1: Reprinted with permission of John Wiley & Sons, Inc.


CHAPTER 3

Reviewing literature

LEARNING OBJECTIVES
3.1 Is a paper worth reading at all?
3.2 How do we appraise the methods of primary research papers?
3.3 How do we appraise the methods of secondary (review) papers?

3.1 Is a paper worth reading at all?

LEARNING OBJECTIVE 3.1 Is a paper worth reading at all?

There are multitudes of health claims published in advertisements, websites, newspapers and magazines that profess to be based on research but are, in fact, unsubstantiated, ineffective and may even have the potential to cause real damage. (Several websites mentioned at the end of this chapter explore and expose a great number of these.7) Most clinicians can spot the flaws in these quite easily and rarely waste time reading them. Nevertheless, when it comes to articles in scholarly journals, many of us assume that these must be completely dependable, as we are told they have been subject to peer review — that is, other experts in the field have determined their validity, originality and importance before publication. However, this is no assurance that a published paper is completely trustworthy or worthwhile. Not all journals are high quality; there has been an explosion of health-related journals in recent years, many of which exist only online, and although these usually claim to be peer reviewed, they do not always follow the rigorous standards necessary to ensure scholarly worth. Even within reputable journals, published papers show flaws such as lack of originality, wrong design choices, inadequate sample sizes, unjustified conclusions, conflicts of interest and poor writing. However, even if a paper is not perfect, there may be good, practical reasons for this, and it doesn’t mean you can’t read it and learn from it in some way after considering the flaws.

Before you can use a research paper to inform your own practice, you need to assess both the quality of the study and how useful it is to you. You are looking for three main things:
1. results (what did the study find?)
2. validity (do the results match the conclusions?)
3. applicability (will they help you with your own clients?).8

Most research papers appearing in journals are presented more or less in standard introduction, methods, results and discussion (IMRaD) format:
• Introduction (why the authors decided to do this particular piece of research)
• Methods (how they did it and how they chose to analyse their results)
• Results (what they found)
• Discussion (what they think the results mean).

The introduction may include background and a short literature review, and the discussion will usually include conclusions, although these sections may all have their own separate headings. Your evaluation should rest mainly on the design of the methods section, which will be covered in more detail later in this chapter. Here are five general questions to ask of all papers.

Question 1. Who wrote the paper?

As you become more knowledgeable in your field or specialty, you will also become familiar with the key researchers in that area. In the meantime, you can make some judgements on an author’s expertise from their qualifications and affiliations. Are these in an area that relates to the topic they are writing about? In particular, you should look to see if there may be a conflict of interest or possible grounds for bias in the paper; for example, is the researcher funded by the company that manufactures the intervention that is being tested? Are they investigating an intervention they have designed themselves and have a vested interest in proving it successful? This does not necessarily invalidate the research, but it will let you know there may be an underlying agenda and that you should examine the results carefully to see how the researcher avoided bias.

Question 2. Is the title appropriate and illustrative, and is the abstract informative?

A good title will tell you the phenomenon being studied, the population and the methodology used by the researcher. For example, ‘Young people’s experience living with Crohn’s disease: a phenomenological study’ tells you the topic (Crohn’s disease), the population (young people) and the design (phenomenology). The abstract should summarise the entire paper briefly, but with enough detail to help you decide whether to read the whole article. Some abstracts are structured, following the same IMRaD pattern as the extended paper, while others are narrative. However, all should let you know what the study is about, how it was carried out and what was found. These aspects will not tell you the quality of the study itself, but they will give you a good idea of how well organised or well written the paper will be.

SUMMARY
Reviewing research papers fulfils two functions: it enables us to appraise the evidence base to inform our own practice, and it helps us to educate and advise our clients. Whether a paper describes quantitative, qualitative or mixed methods research, almost every article will follow the same pattern, known as the IMRaD format: introduction, methods, results and discussion. We should ask questions about each section to evaluate the quality of the evidence for ourselves and judge its applicability to the context in which we practise. The major aspects to consider when reading a primary (original) research paper are the sample/participants, setting, how data is collected and how it is analysed. A systematic review is a secondary paper providing an overview of primary studies, which asks a focused question on a specific clinical problem; contains a clear statement of objectives, sources and methods; and has been conducted in a rigorous way that is explicit, transparent and reproducible.

KEY TERMS
clinical significance The relevance or importance of the results/findings to real-life practice.
grey literature Information published by organisations other than commercial publishers, such as government reports, company brochures, conference papers, policy statements, and academic theses and dissertations.
herd immunity Indirect protection from infection due to a high level of immunity in the community.
homogeneity The compatibility of the results of one study with those of another.
meta-analysis Statistical procedure for combining results from existing quantitative studies with measured outcomes, effectively increasing sample size and precision.
meta-synthesis Synthesis of a number of qualitative studies examining the same phenomenon.
peer review Evaluation of a paper by experts in that field, usually done ‘blind’ (i.e. the reviewer does not know who has written the paper) to avoid bias.
primary paper An article written by the researchers who carried out the study it describes; new research with original findings (also known as original research or empirical research).
publication bias When published research is not representative of all completed studies in that area, leading to false conclusions about the entire body of evidence.23 Published research is often skewed towards positive results.
secondary paper A paper that reviews several original/primary papers written by other people; no new research has been carried out by the authors.
statistical significance The likelihood that a result reflects more than chance; usually accepted when the p-value is less than 0.05 (i.e. less than a 5 per cent probability that a result at least this large would arise from chance alone).
systematic review Secondary research evidence that answers a focused clinical question through a structured and rigorous synthesis of original, primary studies chosen with strict and transparent inclusion and exclusion criteria.
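To see the p < 0.05 convention from the statistical significance definition in action, here is a toy sketch using SciPy’s independent-samples t-test. The pain-score numbers are invented solely for illustration and carry no clinical meaning; SciPy is a third-party package you would need to install.

```python
# Toy illustration (made-up numbers): do mean pain scores differ between
# two groups by more than chance alone would plausibly explain?
from scipy import stats

control = [6.1, 5.8, 6.4, 6.0, 5.9, 6.3, 6.2, 5.7]
treatment = [5.2, 5.5, 4.9, 5.4, 5.1, 5.6, 5.0, 5.3]

t_stat, p_value = stats.ttest_ind(treatment, control)
print(f"p = {p_value:.4f}; statistically significant at 0.05: {p_value < 0.05}")
```

Remember the definitions above: a small p-value speaks only to chance, not to clinical significance, which is a separate judgement about real-life relevance.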

WEBSITES
1 The American Council on Science and Health provides evidence-based information on health claims that have no scientific basis: www.acsh.org
2 Dr Ben Goldacre’s Bad Science blog covers a wide variety of dubious ‘scientific’ claims and has links to many useful resources and sites: www.badscience.net
3 The Nightingale Collaboration is a British group that challenges questionable health claims and reports them to regulatory bodies. The website reports the results of these investigations: www.nightingalecollaboration.org
4 The Critical Appraisal Skills Programme website includes a number of tools you can use to help you evaluate and understand research evidence: www.casp-uk.net
5 The National Health and Medical Research Council is the Australian Government body supporting health and medical research: www.nhmrc.gov.au
6 The Centre for Health Evidence has links to user guides and tools for EBM calculations: www.cche.net
7 The following article provides an interesting perspective on vaccination discussions: https://theconversation.com/everyone-can-be-an-effective-advocate-for-vaccination-heres-how-111828


ENDNOTES
1. B. Deer, ‘Revealed: MMR research scandal,’ The Sunday Times, February 22, 2004; B. Deer, ‘How the case against the MMR vaccine was fixed,’ BMJ: British Medical Journal 342 (2011): c5347. doi:10.1136/bmj.c5347.
2. D. K. Flaherty, ‘The vaccine–autism connection: a public health crisis caused by unethical medical practices and fraudulent science,’ Annals of Pharmacotherapy 45, no. 10 (2011): 1302–4. doi:10.1345/aph.1Q318.
3. Department of Social Services, ‘Australian social services legislation amendment (No Jab, No Pay) Bill 2015. No 158 of 2015,’ Australian Government.
4. See the Australian Health Practitioner Regulation Agency (AHPRA) website (www.ahpra.gov.au) for Codes of Conduct for registered health practitioners, 2014.
5. See the Nursing and Midwifery Board of Australia (NMBA) website (www.nursingmidwiferyboard.gov.au) for Registered Nurse standards for practice, 2016.
6. See the New Zealand Nurses Organisation website (www.nursingcouncil.org.nz) for the Code of Conduct and Standards of Professional Nursing Practice.
7. E. Gambrill, Critical thinking in clinical practice (New Jersey: John Wiley & Sons, 2012).
8. A. Rees, C. Beecroft, and A. Booth, ‘Critical appraisal of the evidence,’ in The research process in nursing, ed. K. Gerrish and J. Lethlean (Chichester: John Wiley & Sons, 2015); D. Rogers, ‘Can we trust the evidence? Critically reviewing a research paper,’ in Healthcare research: a textbook for students and practitioners, ed. P. Roberts and H. Priest (Chichester: John Wiley & Sons, 2010).
9. L. Ottrey et al., ‘“Meal Realities” — an ethnographic exploration of hospital mealtime environment and practice,’ Journal of Advanced Nursing 74 (2017): 603–13.
10. A. Shorten and C. Moorley, ‘Selecting the sample,’ Evidence-Based Nursing 17, no. 2 (2014): 32–33.
11. L. Ward et al., ‘Yoga for the management of pain and sleep in rheumatoid arthritis: a pilot randomized controlled trial,’ Musculoskeletal Care 16 (2018): 39–47.
12. H. Noble and J. Smith, ‘Issues of validity and reliability in qualitative research,’ Evidence-Based Nursing 18, no. 2 (2015): 34–5; R. Heale and A. Twycross, ‘Validity and reliability in quantitative studies,’ Evidence-Based Nursing 18, no. 3 (2015): 66–7.
13. J. Freeman and S. J. Walters, ‘Examining relationships in quantitative data,’ in The research process in nursing, ed. K. Gerrish and J. Lethlean (Chichester: John Wiley & Sons, 2015).
14. Noble and Smith, ‘Issues of validity,’ 66–7.
15. D. Moher et al., ‘Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement,’ Annals of Internal Medicine 151, no. 4 (2009): 264–9.
16. K. M. Bruins Slot et al., ‘Oral anticoagulants versus antiplatelet therapy for preventing stroke and systemic embolic events in patients with atrial fibrillation,’ Cochrane Database of Systematic Reviews 2 (2012). doi:10.1002/14651858.CD009538.
17. T. Greenhalgh and R. Peacock, ‘Effectiveness and efficiency of search methods in systematic reviews of complex evidence: audit of primary sources,’ BMJ: British Medical Journal 331, no. 7524 (2005): 1064–5.
18. For example, the failure to publish negative results of a clinical trial for an anti-arrhythmic drug, Lorcainide, in 1980, led to the deaths of over 100 people on similar medications in the following decades. See A. J. Cowley et al., ‘The effect of lorcainide on arrhythmias and survival in patients with acute myocardial infarction,’ International Journal of Cardiology 40 (1993): 161–6.
19. J. H. Ruotsalainen et al., ‘Preventing occupational stress in healthcare workers,’ Cochrane Database of Systematic Reviews 4 (2015). doi:10.1002/14651858.CD002892.pub5.
20. Ruotsalainen et al., ‘Preventing Occupational Stress.’
21. A. Shorten and B. Shorten, ‘What is meta-analysis?,’ Evidence-Based Nursing 16, no. 1 (2013): 3–4.
22. M. Sandelowski, S. Docherty, and C. Emden, ‘Qualitative metasynthesis: issues and techniques,’ Research in Nursing & Health 20 (1997): 365–71; K. Flemming, ‘The synthesis of qualitative research and evidence-based nursing,’ Evidence-Based Nursing 10 (2007): 68–71.
23. H. R. Rothstein, A. J. Sutton, and M. Borenstein, Publication bias in meta-analysis — prevention, assessment and adjustments (Chichester: John Wiley & Sons, Ltd, 2005).

ACKNOWLEDGEMENTS
Adapting author for this Australian edition: Elaine Crisp
Photo: © Dragon Images / Shutterstock.com
Figure 3.1: © L. Ottrey et al., ‘“Meal realities” — an ethnographic exploration of hospital mealtime environment and practice,’ Journal of Advanced Nursing 74 (2017): 603–13.
Figure 3.2: © L. Ward et al., ‘Yoga for the management of pain and sleep in rheumatoid arthritis: a pilot randomized controlled trial,’ Musculoskeletal Care 16 (2018): 39–47.
Figure 3.3: © J. H. Ruotsalainen et al., ‘Preventing occupational stress in healthcare workers,’ Cochrane Database of Systematic Reviews 4 (2015). doi:10.1002/14651858.CD002892.pub5.


CHAPTER 4

Qualitative research

LEARNING OBJECTIVES
4.1 What is qualitative research?
4.2 What distinguishes qualitative from quantitative?
4.3 What are the different qualitative methodologies?
4.4 How will you evaluate papers that describe qualitative research?

Questions like ‘Why do young people text while driving?’ or ‘What encourages people to eat a healthy diet?’ cannot and should not be answered by leaping in and measuring the first aspect of the problem that we (the outsiders) think might be important. Rather, we need to hang out, listen to what people have to say, and explore the ideas and concerns that the individuals themselves come up with. After a while, researchers may notice that little new information is emerging from the data, a point known as saturation, which may prompt the researcher to make observations in a different way or, ultimately, signal the end of that particular study. In addition, qualitative researchers might begin their observations or explorations with one method and go on to use a selection of others, as explained later in the chapter.

4.2 The difference between qualitative and quantitative research
LEARNING OBJECTIVE 4.2 What distinguishes qualitative from quantitative?

Quantitative research begins with an idea (usually articulated as a hypothesis), which then, through measurement, generates data and, by deduction, allows a conclusion to be drawn. Qualitative research is different. It begins with an intention to explore a particular area, collects ‘data’ (e.g. observations, interviews, documents — even emails can count as qualitative data), and generates ideas and hypotheses from these data largely through what is known as inductive reasoning.7 The strength of the quantitative approach lies in its reliability (repeatability) — that is, the same measurements should yield the same results time after time. The strength of qualitative research lies in validity (closeness to the truth) — that is, good qualitative research, using a selection of data collection methods, really should touch the core of what is going on rather than just skimming the surface. Table 4.1 summarises the differences between the qualitative and quantitative approaches to research.8 In reality, there is a great deal of overlap between them, the importance of which is increasingly being recognised.9

TABLE 4.1  Qualitative versus quantitative research10

                         | Qualitative                                                           | Quantitative
Common purpose           | Discover ideas with general research objects                          | Test hypotheses or specific research questions
Approach                 | Observe and interpret                                                 | Measure and test
Data collection approach | Unstructured and free form                                            | Structured; response categories are provided
Reasoning                | Inductive                                                             | Deductive
Researcher independence  | The researcher is intimately involved and the results are subjective  | The researcher is uninvolved and the results are objective
Strength                 | Validity                                                              | Reliability
Samples                  | Small samples — often in ‘natural’ settings                           | Large samples to produce generalisable results
Question                 | What is X? (classification)                                           | How many Xs? (enumeration)

The validity of qualitative methods is often greatly improved when the researcher incorporates three approaches into their research:
1. using more than one method in combination (a process sometimes known as triangulation)
2. thinking carefully about what is going on and how their own perspective might be influencing the data (an approach known as reflexivity)
3. including more than one researcher to analyse the same data independently (to demonstrate inter-rater reliability).11
The last point is contentious because, more recently, inter-rater reliability has become less credible as a measure of quality in qualitative research. Appraisers of qualitative papers increasingly seek to assess the competence and reflexivity of a single researcher rather than confirm that the findings were ‘checked by someone else’. This change is attributable to an important insight. In most qualitative research, one person knows the data far better than anyone else, so the idea that two heads are better than one simply isn’t true — a researcher brought in merely to verify ‘themes’ may rely far more on personal preconceptions and guesswork than the main field worker. So, if you do choose to take this approach to improve the validity of the research design, make the effort to implement the inter-rater reliability process properly (one common agreement statistic is sketched below). For further insight into inter-rater reliability, see Armstrong and colleagues, who provide a great overview of the ‘place of inter-rater reliability in qualitative research’.12
Those who are not familiar with the rigour of qualitative research may believe that it constitutes little more than hanging out and watching leaves fall. It is beyond the scope of this text to take you through the substantial literature on how to (and how not to) proceed when observing, interviewing, leading a focus group and so on. Nevertheless, sophisticated methods for all these techniques certainly do exist and, if you are interested, you can refer to the excellent British Medical Journal series by Scott Reeves and colleagues.13
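Returning to inter-rater reliability: the text does not give a formula, but one statistic commonly reported when two researchers independently code the same data is Cohen’s kappa, which corrects raw agreement for the agreement expected by chance. A minimal Python sketch, with invented coders, codes and excerpts (treat it as an illustrative aside, not a procedure prescribed by this chapter):

```python
# Hypothetical illustration: two coders have each assigned one thematic code
# to the same 10 interview excerpts. Cohen's kappa corrects their raw
# agreement for the agreement expected if both coded at random.
from collections import Counter

coder_a = ["access", "cost", "access", "trust", "cost",
           "access", "trust", "access", "cost", "trust"]
coder_b = ["access", "cost", "trust", "trust", "cost",
           "access", "trust", "cost", "cost", "trust"]

n = len(coder_a)
observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n  # raw agreement

# Chance agreement: probability both coders independently pick the same code.
freq_a, freq_b = Counter(coder_a), Counter(coder_b)
expected = sum((freq_a[c] / n) * (freq_b[c] / n)
               for c in set(coder_a) | set(coder_b))

kappa = (observed - expected) / (1 - expected)
print(f"Observed {observed:.2f}, chance {expected:.2f}, kappa {kappa:.2f}")
```

Here the coders agree on 8 of 10 excerpts (0.80), but because chance alone would produce some agreement, kappa is lower (about 0.71), which is why kappa is preferred over simple percentage agreement.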

4.3 Qualitative methodologies and data collection strategies
LEARNING OBJECTIVE 4.3 What are the different qualitative methodologies?

Similar to quantitative research methodologies, there are a few basic qualitative methodologies that you should be familiar with before embarking on a qualitative research project. The approach you choose typically guides the research project. Qualitative methods really come into their own when researching the ‘yet-to-be-known’ — that is, where the variables are poorly understood, ill-defined and cannot be controlled by other means. In these circumstances, the hypothesis may not become apparent until the study is well under way, whereas in quantitative methods the hypothesis is present before the study commences. This is one of the main reasons that, from the start, the qualitative researcher must ensure that the research focus is identified, with particular questions to try to answer throughout the study. The methods of qualitative research (see table 4.2) allow for, and often require, modification of the research question in the light of findings generated along the way — a technique known as progressive focusing.14 This continual process of reassessing the study, the information and the findings is why qualitative researchers must allow periods away from their study for reflection, planning and consultation with colleagues.

TABLE 4.2  Qualitative research methods

Ethnography (passive observation): Systematic watching of behaviours and conversations in naturally occurring settings — known as the study of cultures and people. In passive observation, the researcher observes the culture from the point of view of the participant.

Ethnography (participant observation): Observation in which the researcher also occupies a role or part in the setting in addition to observing. In participant observation, the researcher observes the culture from the point of view of the participant — essentially becoming embedded in the culture under observation.

Phenomenology: Focuses on people’s lived experiences and how they interpret those experiences. You begin to explore the ‘phenomenon’ of the situation — a person’s perceptions, perspectives and understandings of an event.

Case studies: Used to look at individuals, a small group of participants, or a group as a whole. This approach begins with a clear definition of the case being researched, and the researcher explores the people, behaviours and so on based on the defined research question. The result of a case study is typically a report providing great detail about the case that was researched.

Grounded theory: Literally ‘from the ground up’. This approach starts with no preconceptions and develops themes and ideas as the data are collected and analysed. Often the result is a clearly defined and unpacked process, or a theory that offers an explanation about the primary idea of interest.

Semi-structured interview: Face-to-face, telephone or Skype conversation with participant(s) with the purpose of exploring issues or topics in detail. Uses a broad list of questions or topics (known as a topic guide).


Interviews
The purpose of choosing to conduct an interview as part of the data collection process is to gain further insight from the participants. Interviews give the researcher the opportunity to explore questions in more depth, query previous findings and gain further understanding about the topic being researched. Interviews can be conducted either one-on-one or with groups (of up to 12 participants), better known as a focus group. In addition to deciding how many participants to interview, the researcher must also decide whether to conduct semi-structured or unstructured interviews. Semi-structured interviews occur when the researcher comes prepared with a set of essential questions for the participants to answer. The nature of semi-structured interviews also allows the participant to provide additional comments and insights, generally around the concepts of the essential questions. Example questions from a semi-structured interview of parents of frequently hospitalised sick children under the age of five might include the following.
1. How often would you say that your child has been hospitalised in the last 12 months? (This can be verified with quantitative data.)
2. Talk me through the process of when you first noticed your child was becoming ill and when you decided to bring them to the hospital.
3. As a parent, what is the most frustrating or upsetting issue that you face having a child frequently in the hospital?
4. As a parent of a sick child, what would you like other parents to know?
5. Do you feel that you are being heard when you are talking to the hospital staff? (Ask them to explain further with questions such as: How so? How often? When does this not occur?)
Unstructured interviews, on the other hand, occur when the researcher has no predetermined questions to ask of the participants. Instead, the researcher begins the interview with a broad statement or question and allows the participant’s responses to guide the subsequent questions. An example of the broad opening statement in an unstructured interview of parents of frequently hospitalised sick children under the age of five might be: ‘Tell me about your experiences as a parent of a child who is frequently hospitalised’.

Observations
In observational research, there are various levels of participation on a continuum along which the researcher can engage, from passive observation to embedded participation. When using observations, it is important that researchers understand the difference between ‘being native’ and ‘going native’. The term ‘going native’ was first coined by Malinowski, who reported that in order to truly understand ‘natives’ the researcher should be fully immersed in the culture they are studying.16 This immersion served various purposes, including increasing participants’ engagement and gaining their trust. However, without proper checks in place, the researcher may be at risk of ‘going native’ — becoming so involved that the lines between being the researcher and being a participant are so blurred that neither the researcher, nor others looking at the study, can tell which is which. A real-world example would be an undercover police officer working on a drug bust who then becomes a drug lord himself.17

4.4 Evaluating papers that describe qualitative research
LEARNING OBJECTIVE 4.4 How will you evaluate papers that describe qualitative research?

By its very nature, qualitative research is non-standard, unconfined and dependent on the subjective experience of both the researcher and the researched. It explores what needs to be explored and provides depth and meaning to the exploration. Qualitative research is an in-depth, interpretive task, not a technical procedure. It depends crucially on a competent and experienced researcher exercising the kind of skills and judgements that are difficult, if not impossible, to measure objectively. Some people have argued that critical appraisal checklists detract from the quality of qualitative research because they encourage a mechanistic, protocol-driven approach.18 Others would argue that a checklist of sorts is ideal for setting some ground rules. The best attempt to offer guidance has been made by Dixon-Woods and colleagues.19 The nine-question list that follows provides one of the best guides for appraising qualitative research.20

CHAPTER 4 Qualitative research 37

Question 1. Did the paper describe an important clinical problem addressed via a clearly formulated question?
One of the first things you should look for in any research paper is a statement of why the research was carried out and what specific question it addressed. Qualitative papers are no exception to this rule: there is absolutely no scientific value in interviewing or observing people just for the sake of it. Papers that cannot define their topic of research more closely than ‘we decided to interview 20 patients with epilepsy’ inspire little confidence that the researchers really knew what they were studying or why. You might be more inclined to read on if the paper stated in its introduction something like:

    Epilepsy is a common and potentially disabling condition, and a significant proportion of patients do not remain fit-free on medication. Antiepileptic medication is known to have unpleasant side effects, and several studies have shown that a high proportion of patients do not take their tablets regularly. We therefore decided to explore patients’ beliefs about epilepsy and their perceived reasons for not taking their medication.

Although a definitive research question may not be clearly focused at the outset of a qualitative study, it should be formulated by the time the report is written.

Question 2. Was a qualitative approach appropriate?
If the objective of the research was to explore, interpret or obtain a deeper understanding of a particular clinical issue, qualitative methods are almost certainly the most appropriate ones to use. If, however, the research aimed to achieve some other goal (such as determining the incidence of a disease or the frequency of an adverse drug reaction, testing a cause-and-effect hypothesis, or showing that one drug has a better risk–benefit ratio than another), qualitative methods are clearly inappropriate. If you think a case-control study, cohort study or randomised trial would have been better suited to the research question posed in the paper than the qualitative methods actually used, then a qualitative approach was probably not the right choice.

Question 3. How were (a) the setting and (b) the subjects selected?
When asking the question ‘Whom is the study about?’ in quantitative research, it is important to ensure that a random sample (probability sample) of participants is recruited. A random sample will ensure that the results reflect, on average, the condition of the population from which that sample was drawn. In qualitative research, however, we are not interested in an ‘on-average’ view of a patient population. We want to gain an in-depth understanding of the experience of particular individuals or groups, and we should therefore deliberately seek out individuals or groups who fit the bill. If, for example, we wished to study the experience of women who had water births, we would be perfectly justified in going out of our way to find women who have had a range of different water birth experiences — a water birth in hospital, a home water birth, an induced water birth, a water birth delivery by a medical student, and so on. We would also wish to select some women who have had shared antenatal care between an obstetrician and their community midwife, and some women who had been cared for by community midwives throughout the pregnancy. Finally, we might choose to study women who gave birth in the setting of a large, modern, ‘high-tech’ maternity unit that offered water birth, as well as some who did so in a small community hospital. Of course, all these specifications will give us ‘biased’ samples, but that is exactly what we want. Watch out for qualitative research where the sample has been selected (or appears to have been selected) purely on the basis of convenience. In the abovementioned example, taking the first dozen patients to pass through the nearest labour ward would be the easiest way to notch up interviews, but the information obtained might be considerably less helpful.

Question 4. What was the researcher’s perspective, and has this been taken into account?
Given that qualitative research is necessarily grounded in real-life experience, a paper describing such research should not be ‘trashed’ simply because the researchers have declared a particular cultural perspective or personal involvement with the participants of the research. Quite the reverse: they should be congratulated for doing just that. It is important to recognise that there is no way of fully controlling for observer bias in qualitative research. This is most obviously the case when participant observation is used (see table 4.2), but it is also true for other forms of data collection and of data analysis. If, for example, the research concerns the experience of adults with asthma living in damp and overcrowded housing and the perceived effect of these surroundings on their health, the data generated by techniques such as focus groups or semi-structured interviews are likely to be heavily influenced by what the interviewer believes about this subject and by whether they are employed by the hospital chest clinic, the social work department of the local authority, or an environmental pressure group. However, because it is inconceivable that someone with no views at all and no ideological or cultural perspective could have conducted the interviews, the most that can be required of the researchers is that they describe in detail where they are coming from so that the results can be interpreted accordingly. It is for this reason that qualitative researchers generally prefer to write up their work in the first person (‘I interviewed the participants’ rather than ‘the participants were interviewed’), because this makes explicit the role and influence of the researcher.

Question 5. What methods did the researcher use for collecting data, and are these described in enough detail?
Rather than merely collecting numbers and analysing them, qualitative researchers get the opportunity to explore the beliefs, hopes, fears and attitudes of people from various ethnic groups around the world, which makes the work engaging but also demands careful description of its methods. The methods section of a qualitative paper often cannot be written in shorthand or dismissed by reference to someone else’s research techniques. It may have to be lengthy and discursive because it is telling a unique story without which the results cannot be interpreted. As with the sampling strategy, there are no hard and fast rules about exactly what details should be included in this section of the paper. You should simply ask, ‘Have I been given enough information about the methods used?’ and, if you have, use your common sense to assess, ‘Are these methods a sensible and adequate way of addressing the research question?’

Question 6. What methods did the researcher use to analyse the data, and what quality control measures were implemented?
The data analysis section of a qualitative research paper is the opportunity for the researcher(s) to demonstrate the difference between sense and nonsense. Having amassed a thick pile of completed interview transcripts or field notes, the genuine qualitative researcher has hardly begun. It is simply not good enough to flick through the text looking for ‘interesting quotes’ to support a particular theory. The researcher must find a systematic way of analysing the data and, in particular, must seek to detect and interpret items of data that appear to contradict or challenge the theories derived from the majority.
By far the most common way of analysing qualitative data is thematic analysis, in which the researchers go through printouts of free text, draw up a list of broad themes and allocate coding categories to each. For example, a ‘theme’ might be patients’ knowledge about their illness; within this theme, codes might include ‘transmissible causes’, ‘supernatural causes’, ‘causes due to own behaviour’, and so on. Note that these ‘codes’ are specific to each research analysis, because the point of the research is to explore the interviewees’ taxonomy, whether the researcher agrees with it or not. Thematic analysis is often tackled by drawing up a matrix or framework with a new column for each theme and a new row for each ‘case’ (e.g. an interview transcript), and cutting and pasting relevant segments of text into each box.21 Another type of thematic analysis is the constant comparative method, in which each new piece of data is compared with the emerging summary of all the previous items, allowing step-by-step refinement of an emerging theory.22
When you eventually want to learn more about how to code and analyse qualitative data, software programs are available to help researchers through this tedious process; at this point, though, you simply need to understand the types of qualitative methodologies. Be mindful that qualitative analysis is often open to interpretation, with ambiguities throughout the data and between interpreters of the data; this is perfectly normal, as the process is subjective in nature.
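To make the framework matrix just described concrete, here is a minimal Python sketch; the cases, themes and quoted segments are invented for illustration and are not taken from any study:

```python
# A framework matrix for thematic analysis: one row per case (interview
# transcript), one 'column' per theme, each cell holding the text segments
# coded to that theme for that case. All data below are invented.
framework = {
    "Interview 01": {
        "knowledge of illness": ["I thought you could catch epilepsy from stress."],
        "medication beliefs":   ["The tablets make me foggy, so I skip them at work."],
    },
    "Interview 02": {
        "knowledge of illness": ["My gran said it runs in the family."],
        "medication beliefs":   [],  # nothing coded to this theme yet
    },
}

# Reading 'down a column' (one theme across all cases) shows how different
# participants talk about the same idea.
for case, themes in framework.items():
    for segment in themes["medication beliefs"]:
        print(f"{case}: {segment}")
```

Dedicated qualitative analysis software does essentially this, at scale, with tools for retrieving, recoding and comparing segments.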


Question 7. Are the results credible and, if so, are they clinically important?
We cannot assess the credibility of qualitative results via the precision and accuracy of measuring devices, nor their significance via confidence intervals and numbers needed to treat. The most important tool to determine whether the results are sensible and believable, and whether they matter in practice, is plain common sense. One important aspect of the results section to check is whether the authors cite actual data. Claims such as ‘general practitioners did not usually recognise the value of annual appraisal’ would be more credible if one or two verbatim quotes from the interviewees were reproduced to illustrate them, such as:

    Participant LJ0109, a general practitioner in a large city practice, stated, ‘Although I see the value in having an annual [appraisal], I’m busy and feel as though my time is already stretched. I’m simply not sure where I’d find the time to incorporate this into my practice’ (LJ0109, transcript, p. 8, lines 45–46).

The results should be independently and objectively verifiable (e.g. by including longer segments of text in an appendix or online resource), and all quotes and examples should be indexed so that they can be traced back to an identifiable interviewee and data source.

Question 8. What conclusions were drawn, and are they justified by the results?
A quantitative research paper, presented in standard introduction, methods, results and discussion (IMRAD) format, should clearly distinguish the study’s results (usually a set of numbers) from the interpretation of those results. The reader should have no difficulty separating what the researchers found from what they think it means. In qualitative research, however, such a distinction is rarely possible, as the results are by definition an interpretation of the data. Therefore, when assessing the validity of qualitative research, it is necessary to ask whether the interpretation placed on the data accords with common sense and whether the researcher’s personal, professional and cultural perspective is made explicit, so the reader can assess the ‘lens’ through which the researcher has undertaken the fieldwork, analysis and interpretation. This can be a difficult exercise, because the language used to describe things such as meanings and motives may carry interpretations different from those that the participants themselves shared with the researcher. Compare, for example, the two statements, ‘Three women went to the well to get water’ and ‘Three women met at the well and each was carrying a pitcher’. Mays and Pope suggest three useful questions for determining whether the conclusions of a qualitative study are valid.23
1. How well does this analysis explain why people behave in the way they do?
2. How comprehensible would this explanation be to a thoughtful participant in the setting?
3. How well does the explanation cohere with what we already know?

Question 9. Are the findings of the study transferable to other settings?
One of the most common criticisms of qualitative research is that the findings of any qualitative study pertain only to the limited setting in which they were obtained. In fact, this is not necessarily any truer of qualitative research than of quantitative research. Look back at the example of women’s water birth experiences described in question three. A convenience sample of the first dozen women to give birth would provide little more than the collected experiences of those 12 women. A purposive sample as described in question three would extend the transferability of the findings to women having a wide range of water birth experiences. Moreover, by making iterative adjustments to the sampling frame as the research study unfolds, the researchers will be able to develop a theoretical sample and test new theories as they emerge. For example (and note, these are not actual findings), the researchers might find that well-educated women seem to have more psychologically traumatic experiences than less-educated women. This might lead to a new hypothesis about women’s expectations (the better educated the woman, the more she expects a ‘perfect water birth experience’), which would, in turn, lead to a change in the purposive sampling strategy (we now want to find extremes of maternal education), and so on. The more the research has been driven by this kind of progressive focusing and iterative data analysis, the more its findings are likely to be transferable beyond the sample itself.

SUMMARY
Health professionals have traditionally placed high value on number-based data, which may in reality be misleading and/or irrelevant to the real issues. The increasing popularity of qualitative research in the health sciences has arisen largely because quantitative methods provided either no answers, or the wrong answers, to important questions in both clinical care and service delivery. Qualitative research is not second rate by virtue of being a ‘soft’ science. As with quantitative research methods, the question, sampling and data collection strategy should all be well considered before embarking on the qualitative path — except in the case of grounded theory, in which no preconceptions should be brought to the research journey. On the other hand, unlike quantitative research, non-probability sampling techniques are often used, because the researcher is being purposeful and strategic in finding participants with the experiences that they want to explore further. Data collection processes such as interviews, focus groups and observations are some of the ways in which researchers are able to gather much more meaning and depth. Data analysis in qualitative research is often subjective, time consuming and hard to replicate; however, the results often provide a story that simply cannot be explained by numbers alone. Critical appraisal of qualitative research is a relatively underdeveloped science, and qualitative researchers are still refining the questions posed in this chapter.

KEY TERMS
deductive Going from the general to the specific.
inductive Going from the specific to the general.
inter-rater reliability When two or more researchers assess the data and results and give consistent estimates of the same phenomenon.
progressive focusing The ability to review the research at various stages and adjust accordingly, if required, to improve the study.
reflexivity Examining oneself as the researcher, as well as the research relationship; making the research process and the researcher a focus of inquiry.
reliability In measurement, the consistency and repeatability of measurement scores over time, measurement equipment and people doing the measuring.
saturation The point at which little or no new data is generated from the participants, and the sample size is believed to be adequate.
triangulation Improves the validity of a study by cross-validating data: research results are verified through two or more sources, typically through the application or combination of different research methods attempting to answer the same questions.
validity In measurement, the accuracy of a measurement: whether a measurement system measures what it’s supposed to measure, and only what it’s supposed to measure; depends in part on reliability.

WEBSITES
1 The Qualitative Report (TQR): http://tqr.nova.edu/websites
2 International Institute for Qualitative Methodology (University of Alberta): www.iiqm.ualberta.ca
3 QualPage (University of Georgia): www.qualitativeresearch.uga.edu/QualPage

ENDNOTES
1. T. Greenhalgh, How to read a paper, 5th ed. (Oxford: John Wiley & Sons, 2014).
2. N. K. Denzin and Y. S. Lincoln, The SAGE handbook of qualitative research (London: Sage, 2011).
3. N. Black, ‘Why we need qualitative research,’ Journal of Epidemiology and Community Health 48, no. 5 (1994): 425–26.
4. Australian Institute of Health and Welfare, Healthy communities: overweight and obesity rates across Australia, 2014–15 (Cat. no: HPF27) (Canberra: AIHW, 2018).
5. B. Wong, ‘Second marriages are more likely to end in divorce. Here’s why,’ Huffington Post, March 3, 2017, www.huffingtonpost.com.au/entry/second-marriages-are-more-likely-to-end-in-divorce-hereswhy_n_58b88e38e4b0b99894162a07.
6. A. Gheondea-Eladi, ‘Is qualitative research generalizable?’ Journal of Community Positive Practices XIV, no. 3 (2014): 114–24.
7. Denzin and Lincoln, The SAGE handbook.
8. N. Mays and C. Pope, ‘Qualitative research in health care: assessing quality in qualitative research,’ BMJ: British Medical Journal 320, no. 7226 (2000): 50.
9. M. Dixon-Woods et al., Integrative approaches to qualitative and quantitative evidence (London: Health Development Agency, 2004).
10. Adapted from J. F. Gilgun, ‘Reflexivity and qualitative research,’ Current Issues in Qualitative Research 1, no. 2 (2010): 1–8.
11. Gilgun, ‘Reflexivity,’ 1–8.
12. D. Armstrong et al., ‘The place of inter-rater reliability in qualitative research: an empirical study,’ Sociology 31, no. 3 (1997): 597–606.
13. S. Reeves et al., ‘Qualitative research: why use theories in qualitative research?’ BMJ: British Medical Journal 337, no. 7670 (2008): 631–34; L. Lingard, M. Albert, and W. Levinson, ‘Grounded theory, mixed methods, and action research,’ BMJ: British Medical Journal 337, no. aug07_3 (2008): a567–a67; A. Kuper, L. Lingard, and W. Levinson, ‘Critically appraising qualitative research,’ BMJ: British Medical Journal 337 (2008): a1035; A. Kuper, S. Reeves, and W. Levinson, ‘Qualitative research: an introduction to reading and appraising qualitative research,’ BMJ: British Medical Journal 337, no. 7666 (2008): 404–7; S. Reeves, A. Kuper, and B. D. Hodges, ‘Qualitative research methodologies: ethnography,’ BMJ: British Medical Journal 337 (2008): a1020.
14. Mays and Pope, ‘Qualitative research in health care,’ 50.
15. J. Creswell, Research design: qualitative, quantitative, and mixed methods approaches, 3rd ed. (Los Angeles: Sage Publications, 2009).
16. B. Malinowski, Argonauts of the Western Pacific: an account of native enterprise and adventure in the archipelagoes of Melanesian New Guinea (London: Routledge and Kegan Paul, 1922; enhanced edition reissued Long Grove, IL: Waveland Press, 2013).
17. L. Collins, ‘Exclusive: he stole guns and rolls of cash from gangsters, sold cocaine from drug lords and when asked if he was police or a trafficker said “BOTH” — now “New York’s most corrupt cop” tells of his shame,’ Daily Mail, May 12, 2015, www.dailymail.co.uk/news/article-3077358/He-stole-guns-rolls-cash-gangsters-sold-cocaine-drug-lords-asked-policetrafficker-said-New-York-s-corrupt-cop-tells-shame.html#ixzz3zucqKoDd.
18. R. S. Barbour, ‘Checklists for improving rigour in qualitative research: a case of the tail wagging the dog?,’ BMJ: British Medical Journal 322, no. 7294 (2001): 1115.
19. M. Dixon-Woods et al., ‘The problem of appraising qualitative research,’ Quality and Safety in Health Care 13, no. 3 (2004): 223–25.
20. In addition, there are also the guidelines developed by the McMaster University Occupational Therapy Evidence-Based Practice Research Group and revised by Letts et al. See L. Letts, ‘Guidelines for critical review form: qualitative studies (Version 2.0),’ 2007, http://srs-mcmaster.ca/wp-content/uploads/2015/05/Guidelines-for-Critical-Review-FormQualitative-Studies.pdf.
21. L. Spencer and G. Britain, Quality in qualitative evaluation: a framework for assessing research evidence (London: Government Chief Social Researcher’s Office, Cabinet Office, 2003).
22. B. G. Glaser, ‘The constant comparative method of qualitative analysis,’ Social Problems 12, no. 4 (1965): 436–45.
23. Mays and Pope, ‘Qualitative research in health care,’ 50.

ACKNOWLEDGEMENTS
Adapting author for this Australian edition: Amanda Lambros
Photo: © Jacob Lund / Shutterstock.com


CHAPTER 5

Quantitative research

LEARNING OBJECTIVES
5.1 Why and how is quantitative research done?
5.2 What are the major quantitative research designs?
5.3 What is the role of consistent and accurate measurement in quantitative research and healthcare?

TABLE 5.1  Aims and reasons for quantitative research

Aim or reason:
• to describe and measure the world beyond an individual person’s experiences
• to gain knowledge applying to a population, beyond the individual or sample
• to demonstrate cause and effect (e.g. that smoking causes health problems, or that a treatment will benefit patients)
• to test ideas, to see whether what should work (in theory) really does work
• to verify predictions, so we know what will happen based on what has already happened, so we know what to do and what to avoid in the future
• to reveal the scientific principles and laws of nature
• to enable the development of technologies that advance society and improve people’s standard of living.

Compared to qualitative research, which:
• reveals individual truths based on people’s unique interpretations, feelings, values and beliefs — knowledge about how people experience the world, not just what they experience
• attempts to understand individuals’ unique perspectives about their lives, which helps us to empathise with them as clients, personalising our clinical work and enhancing therapeutic relationships
• is more concerned with developing explanatory theories rather than rigorously testing theories that are already developed
• aims to improve relationships between people through mutual understanding, improving quality of life through social rather than technological means.

Now let’s look at how quantitative research is done and compare it with qualitative research (table 5.2).

TABLE 5.2  Quantitative research methods compared with qualitative research

Quantitative: Large sample sizes; within practical limits, the larger the sample the better.
Qualitative: Small sample sizes; a dozen people or fewer are normally acceptable.

Quantitative: Collecting a small amount of information from each of a large number of people.
Qualitative: Collecting a large amount of information from each of a small number of people.

Quantitative: Preference for probability (i.e. random) sampling methods to help ensure that participants in the study resemble the wider population whom the participants should ideally represent, which improves the ability to generalise the results. Shared characteristics are of more interest than individuals’ uniqueness.
Qualitative: Non-probability sampling methods, such as the researcher personally selecting individuals for their special characteristics. Representativeness of the sample is less important for qualitative than quantitative research. The uniqueness of the individual, rather than similarity to others, is valued.

Quantitative: Precisely planned, relatively inflexible study designs using scientific methods. Clinical trials testing the effectiveness of healthcare treatments are an example. Designs may be observational or use interventions.
Qualitative: Studies are planned, but without necessarily consistent data collection methods across individuals. Designs are observational, without treatments or other interventions.

Quantitative: Researchers’ personal values should not influence results. Methods should be exactly repeatable by other researchers to enable comparison of results across studies.
Qualitative: Research is collaborative between the researcher and participants; interpretation of results may explicitly reflect the researcher’s values. Exact replication across studies is not so important.

Quantitative: Accurate, repeatable measurement of amounts (quantities); counting and categorising. Proven accuracy and consistency of measurement are favoured.
Qualitative: Less emphasis on counting or measurement; instead collects verbal or text information — qualities rather than quantities.

Quantitative: Numerical data are analysed statistically to describe the sample and to present results as counts (numbers of people), percentages of groups, averages and other summary statistics; also to estimate or predict characteristics of a population based on data from a sample.
Qualitative: Little or no statistical analysis or presentation of numerical results; analysis identifies common or notable themes from participants’ responses. Main results are presented in narrative text form. Sample characteristics may be summarised statistically.

Quantitative: Interpretations are largely confined to answering a specific research question determined before data collection and analysis, while allowing for unexpected findings. Open-ended speculation is discouraged.
Qualitative: Results are interpreted discursively and open-endedly, with allowance for disciplined speculation. The research agenda may be revised during the study in response to unexpected findings.


Thus, we can see how quantitative research relies on tightly planned sampling and data collection methods, rigorous numerical measurement of amounts or categories, and statistical presentation of results with a view to answering a specific question or testing an idea.

5.2 Quantitative research designs
LEARNING OBJECTIVE 5.2 What are the major quantitative research designs?

Scientific research in health is planned and set up before the project commences. Quantitative research begins with a design. This section describes major designs for quantitative research.

Intervention studies
With intervention studies, the researchers do something — the ‘intervention’ — with study participants to bring about a change. The researchers then measure the amount of change after the intervention compared with before, to see the effect of that intervention. A basic example of an intervention study is to treat patients and measure changes to their health condition after treatment. Among intervention studies are experimental designs, which include randomised controlled trials (RCTs) and non-randomised controlled trials. With a standard experimental design, the participants (who are commonly patients and sometimes called ‘subjects’) are allocated to (i.e. put into) one of at least two groups. One group experiences a planned treatment (the intervention), while the other group or groups receive no treatment or a pseudo-treatment — a placebo that resembles the real treatment but has no known reason to work. Multiple treatment groups are also allowed, enabling comparison of treatments. Measurements of participants’ health are taken before and after treatment, and over the same period of time for participants receiving no treatment or the placebo. If the treatment is effective, there will be an improvement, or a larger improvement, for the treatment group compared with no treatment, an alternative genuine treatment, or a placebo. Experimental designs used in clinical trials are summarised using the PICO system. PICO stands for population, intervention, comparison and outcome measures, as outlined in table 5.3. The study summarised is a planned study design (also known as a protocol) from Sales et al. testing the effect of exercise on older persons’ health and wellbeing.3

TABLE 5.3  PICO for the Sales et al. randomised controlled trial

PICO term    | What the PICO term stands for                                                                                        | Example from Sales et al.
Population   | The population of interest. Can also be a clinical group with a health condition for which a treatment needs testing. | Older persons, aged from 60 to 90 years.
Intervention | The treatment. What the researcher–clinicians do with the intention of causing a change to participants’ condition.   | Exercise in a park.
Comparison   | Control or placebo condition, for comparing results against the real treatment.                                       | Social activity and a booklet about preventing falls; no exercise.
Outcome      | Measurement of what the researchers are trying to change or improve with the intervention.                            | A test of balance; hand-grip strength; walking; lower limb strength; health-related quality of life; compliance with the exercise; feasibility and safety of the exercise; other psychosocial measures; falls and physical activity for 12 months.

Using PICO elements as search terms, you can search online databases for experimental evidence about the effectiveness of healthcare treatments for various health conditions and types of patients.
Compared to non-experimental designs, experimental designs provide better evidence about causes and effects, such as whether a treatment causes any improvement to a patient’s condition. The RCT, where participants are allocated to treatment and control groups randomly (using a chance method), is ideal for finding evidence for or against causes and effects, because this design helps to rule out explanations other than the treatment for changes to patients’ health occurring during the study. The chapter on the levels of evidence elaborates on the important advantages of RCTs over weaker designs.
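The ‘chance method’ of allocation just mentioned can be illustrated with a minimal Python sketch. The participant IDs and seed are invented, and real trials typically use more elaborate schemes (e.g. blocked or stratified randomisation) managed by dedicated software:

```python
# A minimal sketch of simple random allocation for a two-arm trial.
import random

participants = [f"P{i:02d}" for i in range(1, 21)]  # 20 hypothetical participants
random.seed(42)              # fixed seed only so this example is reproducible
random.shuffle(participants)  # chance ordering of the whole sample

treatment = participants[:10]  # first half  -> intervention arm
control = participants[10:]    # second half -> control/placebo arm
print("Treatment:", treatment)
print("Control:  ", control)
```

Because allocation depends only on chance, known and unknown participant characteristics tend to balance out across the arms, which is exactly what rules out the alternative explanations discussed above.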

While the controlled clinical trial is the best design for testing the effects of treatments, withholding proper treatment that could benefit the patient can be unethical, ruling out a control or placebo comparison group for some studies. For this reason, non-experimental studies lacking a control or placebo condition are sometimes used. A non-experimental study that is also an intervention study is called a case series. The levels of evidence chapter explains the limitations of case series designs further.

Observational studies
With observational studies, there are no deliberate treatments. Instead, the researchers allow events to happen naturally, observing what happens passively rather than attempting to make changes, and collecting information without doing anything to influence the data. Observational studies are conducted when interventions are impractical or unethical, such as research into the effects of environmental hazards. The effect of natural sun exposure on the risk of skin cancer can be known only through observational studies: using sunbeds in artificial tanning salons to test their effects on cancer risk would be unethical as an intervention. This section describes observational study designs commonly encountered in allied health.
Descriptive research, as the name suggests, describes a sample and compares groups within a sample on their characteristics or opinions. Questionnaires or other types of survey are the usual data collection method. An example of an observational descriptive study is the Kenney et al. survey about the effects of informal caregiving on carers’ physical and mental health.4
Diagnostic accuracy studies evaluate how well a diagnostic or assessment procedure:
• correctly identifies people who have the health condition the procedure is designed to detect
• correctly identifies people who do not have the health condition.
In healthcare, accurate assessment and diagnosis are essential. Inaccurate diagnoses can lead to the wrong treatment, delays to the correct treatment, missing a necessary treatment if a health condition is not identified promptly, or unnecessary treatment that is wasteful and possibly harmful. Goossens et al. tested the diagnostic accuracy of simple clinical tests that physiotherapists could easily perform for detecting meniscal tears of the knee, and found these tests were not accurate compared with the gold standard, arthroscopy — an invasive medical procedure.5 Diagnostic accuracy research has looked for interesting and innovative ways to detect serious health conditions; for example, dogs have been tested for their ability to identify cancer by odour, with mixed results.6
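The two bullet points above correspond to what are conventionally called sensitivity and specificity. The chapter does not give the formulas, so the following is an illustrative Python sketch with invented counts, judged against a gold-standard diagnosis:

```python
# Sensitivity: how often the test correctly identifies people who HAVE the
# condition. Specificity: how often it correctly identifies people who do NOT.
# All counts below are invented for illustration.
true_positives = 45   # test positive, condition present
false_negatives = 5   # test negative, condition present (missed cases)
true_negatives = 90   # test negative, condition absent
false_positives = 10  # test positive, condition absent (false alarms)

sensitivity = true_positives / (true_positives + false_negatives)  # 0.90
specificity = true_negatives / (true_negatives + false_positives)  # 0.90
print(f"Sensitivity {sensitivity:.2f}, specificity {specificity:.2f}")
```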
Epidemiological research looks at how diseases and health conditions arise among various groups (defined by their genetic characteristics, physical condition, geographic location, occupation and lifestyle choices), in order to identify hazards that make people sick, or to find protective factors against illness or disability. Within a population, the groups most at risk of acquiring a health condition, injury or disability can be identified through these predictive factors. Using this information, measures can be taken by clinicians or policymakers to reduce the risk for those most at risk. Epidemiological research assists health service planning, because it can predict the future demand for health services among demographic, community and lifestyle groups, and geographical areas. Major study designs for epidemiological research include:
• cohort studies, comparing the rate at which a health condition occurs among people exposed to a hazard with the rate for other people not exposed to the hazard (a worked example appears at the end of this section)
• case-control studies, comparing the rates of prior exposure to a hazard among people with a health condition with the rate of prior exposure for other people without that health problem.
Epidemiologists (i.e. people who conduct epidemiological research) further distinguish between:
• prospective studies, working forwards in time, using earlier events such as exposure to a hazard or a treatment to predict later events such as disease, death, disability, or even survival and recovery
• retrospective studies, working backwards in time, starting with effects (health problems) and measuring what happened earlier that might have increased the risk (the hazard)
• longitudinal studies, which track people over time to see what happens to them, even over a period of years (e.g. starting with a sample of 20 year olds, keeping in contact with them for 20 years and measuring them again when they reach 40 years of age)
• cross-sectional studies, working at a single point in time rather than following people over time the way a longitudinal study does (e.g. comparing 20 year olds and 40 year olds now).
Prospective and longitudinal studies are scientifically stronger, although they can be more difficult to conduct, than retrospective and cross-sectional studies. Longitudinal prospective cohort studies are considered the ideal design for epidemiological research, even though they are more difficult and expensive to conduct than easier, but scientifically inferior, retrospective designs such as the case-control study.
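Here is the cohort-style comparison flagged in the first list above, as a minimal Python sketch; the counts are invented:

```python
# Incidence of a condition among people exposed to a hazard versus people
# not exposed, and the resulting risk ratio. Numbers are invented.
exposed_cases, exposed_total = 30, 1000
unexposed_cases, unexposed_total = 10, 1000

risk_exposed = exposed_cases / exposed_total        # 0.03 (3 per 100)
risk_unexposed = unexposed_cases / unexposed_total  # 0.01 (1 per 100)

risk_ratio = risk_exposed / risk_unexposed
print(f"Risk ratio: {risk_ratio:.1f}")  # 3.0: exposed people face triple the risk
```

A risk ratio of 1.0 would mean the hazard makes no difference; values above 1.0 suggest the exposure increases risk, and values below 1.0 suggest it is protective.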


The contribution of epidemiological research to public health is immense, and it affects our lives. If you’re a young person feeling tired much of the time, perhaps you might try reducing your smartphone use late at night, as Van den Bulck found a link between mobile phone use ‘after lights out’ and fatigue.7 If you’re paying more for your car insurance or claims excess because you’re a young driver, even though you’re a skilled, sensible and safe driver, perhaps it’s because insurance companies know about the increased risk of motor vehicle crashes involving drivers aged under 25. The Australian Bureau of Statistics conducted a study about motor vehicle accident risks and other risk-taking behaviour among young persons in Australia,8 and the Australian Institute of Health and Welfare released a report with plenty of epidemiological evidence about the health and wellbeing of young persons in Australia.9 Thanks to epidemiological research, sunbeds at commercial tanning salons are now banned in New South Wales.10
Systematic reviews are a specialised type of research in which a selection of existing quantitative studies is appraised and, often, their results are combined statistically to give a summary finding. Systematic reviews save practitioners the time and effort of locating, reading, appraising and integrating results from numerous research reports. They are conducted according to procedures designed to reduce the risk of bias compared with the alternative (and not so highly regarded), less formal style of literature review known as the narrative review. Meta-analysis is a specialised statistical technique for combining the results from a set of quantitative studies in a systematic review. The purpose of meta-analysis is to arrive at an overall statistical result from a set of studies, estimating the effect of a treatment or other influence on the health of a population. Not all systematic reviews include meta-analysis, although many do.
As an evidence-based practitioner, you are encouraged to explore systematic reviews, as they can quickly tell you what a large amount of research says about the effectiveness of treatments, saving you from reading a lot of separate research reports. Well-written systematic reviews also evaluate the quality of existing research, so you know whether the evidence about a clinical topic is dependable. Many systematic reviews are published in professional journals. Systematic reviews are also located in dedicated databases such as the Cochrane Library (www.cochranelibrary.com), where you can search for systematic reviews about topics that interest you. The Cochrane Library is a valuable resource that you should be familiar with; it is free of charge in Australia. Cochrane reviews are usually summarised in plain language as well as in the more technical abstract — see the Dieterich et al. review of intensive case management for severe mental illness as an example.11
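To illustrate the arithmetic behind ‘combining results statistically’, here is a minimal fixed-effect, inverse-variance sketch in Python. The effect estimates and standard errors are invented, and real meta-analyses (including Cochrane’s) involve much more, such as assessing heterogeneity between studies:

```python
# Fixed-effect, inverse-variance pooling: each study's effect estimate is
# weighted by the inverse of its variance, so larger (more precise) studies
# count for more in the pooled result. All values are invented.
studies = [  # (effect estimate, standard error)
    (0.40, 0.20),
    (0.25, 0.10),
    (0.55, 0.30),
]

weights = [1 / se**2 for _, se in studies]
pooled = sum(w * eff for (eff, _), w in zip(studies, weights)) / sum(weights)
pooled_se = (1 / sum(weights)) ** 0.5

print(f"Pooled effect: {pooled:.2f} (SE {pooled_se:.2f})")
# The middle study has the smallest standard error, so it dominates the pool.
```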

5.3 Measurement
LEARNING OBJECTIVE 5.3 What is the role of consistent and accurate measurement in quantitative research and healthcare?

Measurement is fundamental to modern healthcare as well as to quantitative research. If, as a practitioner, you record information about your clients such as their age, height, weight, functional ability, pain, range of motion, occupation, lifestyle or other habits — indeed, any information concerned with numbers or amounts (how much, how many, how often, how high or how low, more than or less than) — you’re working with quantitative measurement. It helps clinicians to know about basic principles of measurement, not only so they can understand quantitative research but also to work more effectively as practitioners. Health service administrators and policymakers rely on numerical information that comes from measurement.
Four main levels of measurement are recognised — nominal, ordinal, interval and ratio. Any of the measurement methods available to healthcare, from simple tick-a-box survey answers to a high-technology diagnostic system, produces results belonging to one of the four levels of measurement in table 5.4. Each measurement level is known as a scale. Thus, there are nominal, ordinal, interval and ratio scales. Do not confuse measurement scales with the entirely different levels of evidence described in the levels of evidence chapter.
Table 5.4 further classifies the four measurement levels as categorical or continuous. Categorical data refer to group membership or one’s place on a list ordered from low to high. Categorical data do not state exactly how high or how low a person’s score is on a measurement. It makes sense to talk about the number of males and females, the number of people passing an examination, or the number of people in the highest 10 per cent of scores, but not the ‘average gender’ or an average rank. Fractional values cannot happen with categorical data: no one comes 2.5th in a race or has ¾ of a religion or 1½ postcodes or a postcode of 2000.25.


TABLE 5.4  Levels of measurement

Nominal (categorical)
Definition: Classifies people, places, objects or events into named groups. Any numerical values applied to the groups, such as Group 1 or Group 2, or numbered categories of people (e.g. Male = 1, Female = 2), are labels only, not referring to amounts or quantities. Aside from counting numbers of group members, there is no information about magnitude on a measure.
Examples: any yes/no answer to a question; people’s characteristics that are not quantities: gender, language, religion; postcode, student number, telephone number, or a motor vehicle registration number.

Ordinal (categorical)
Definition: Records the rank order of items in a set: first, second, third, fourth and so on in a list. Differences between items in a series are not the same amount; spacing is uneven and irregular. There is no exact measure of magnitude.
Examples: gold, silver and bronze medals in a sporting competition; pass, credit, distinction and high distinction grades in a university subject; rich or poor — implies more or less wealth without quantifying the difference as an amount of money, not how much more.

Interval (continuous)
Definition: Every point on the scale is separated by the same amount — equal intervals. Going from 1 to 2 is the same increase as going from 2 to 3 and so on, at every location on the scale. Amounts on a measure are identified. Zero does not mean ‘nothing’ or ‘no amount’: a value of ‘0’ stands for an amount more or less than nothing, or ‘0’ on an interval scale is not defined at all.
Examples: temperature measured in degrees Celsius (zero degrees Celsius does not mean ‘no temperature’); a satisfaction scale from 1 (‘dissatisfied’) to 5 (‘satisfied’) without a zero value on the scale.

Ratio (continuous)
Definition: Similar to the interval scale; measures amounts with equal values between points on the scale. Has the extra advantage of a zero point defined as ‘nothing’ (i.e. a zero amount of what’s measured).
Examples: many physical measures such as height, weight, angle of limb flexion or extension, heart rate, blood pressure; most measures of time (minutes, seconds, etc.); measures of money as currency (number of dollars or cents); the number of questions correctly answered on a test.

Continuous data contain more information than categorical data, and so are preferred when continuous measurement is practicable. For continuous scales, summary statistics such as averages can be meaningfully calculated. Fractional values are possible on continuous scales. It’s possible to have $2.5 million, a diamond weighing ¾ of a carat, or be 1½ years old. Continuous measurement can be converted into categorical data. We can measure a person in centimetres — a continuous, ratio measurement scale — and convert that measurement into ordinal clothing sizes: ‘small’, ‘medium’, ‘large’, ‘extra large’, and so on, which is categorical measurement.
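A minimal Python sketch of the continuous-to-categorical conversion just described, using height and clothing size; the cut-off values are invented for illustration:

```python
# Converting a continuous (ratio) measurement into an ordinal category,
# as with height in centimetres -> clothing size. Cut-offs are invented.
def size_category(height_cm: float) -> str:
    if height_cm < 160:
        return "small"
    elif height_cm < 175:
        return "medium"
    elif height_cm < 190:
        return "large"
    return "extra large"

heights = [152.5, 168.0, 181.3, 193.7]        # continuous: fractions allowed
print([size_category(h) for h in heights])    # ordinal: ordered categories only
```

Notice that the conversion throws information away: from ‘medium’ alone you can no longer recover 168.0 cm, which is why continuous measurement is preferred when practicable.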

Variables — independent and dependent

In research and clinical practice, you may read or hear about variables. A variable is another name for a measurement. Measures such as height, weight, time, gender, personality, health, diagnostic category and survival are all variables because they change from one person to another or with time. Eventually you may hear about independent and dependent variables. These terms apply to experimental research. In studies testing the effects of interventions, the independent variable refers to the treatment and control or placebo conditions. Because interventions typically refer to groups such as treatment and control, the independent variable is usually a nominal variable standing for the groups. The dependent variable refers to the outcome measure, which could be categorical (e.g. patient’s condition improved or not improved) or continuous (e.g. body mass index calculated from individuals’ height and weight). Here is another way to remember what independent and dependent variables refer to, using the two short words of and on: ‘We are interested in the effect of the treatment on the outcome’. This is the same as: ‘We are interested in the effect of the independent variable on the dependent variable’.
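In a data set, this usually means the independent variable is stored as a group label and the dependent variable as the measured outcome. A minimal, hypothetical sketch (the records and pain scores are invented, not from any study):

```python
# The independent variable ('group', a nominal label) records the condition
# each person received; the dependent variable (the pain score, continuous)
# is the measured outcome. All values are invented for illustration.
records = [
    ("treatment", 3.5), ("treatment", 4.0), ("treatment", 2.5),
    ("control", 6.0), ("control", 5.5), ("control", 6.5),
]

for group in ("treatment", "control"):
    scores = [score for g, score in records if g == group]
    # the effect OF the independent variable ON the dependent variable
    print(group, sum(scores) / len(scores))
```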

Reliability and validity in measurement

Reliability in measurement refers to consistency over repeated measurement occasions — getting the same result from multiple measurements of an unchanging category or amount. Reliable measurements taken from the same unchanging person or thing are stable over repeated observations. Low reliability can happen in various ways; therefore, different types of reliability have been defined, as shown in table 5.5. Where reliability is high, there is agreement within a set of scores about the same, unchanging person or thing.

TABLE 5.5 Types of reliability

Intra-rater
• What it refers to: repeated measurements by the same person.
• What high reliability means: measurements by the same person measuring the same thing at the same time are consistent.
• Clinical example: one clinician measuring the height of the same person twice using the same tape measure gets the same result for both measurements.

Inter-rater
• What it refers to: measurements by at least two people.
• What high reliability means: measurements by two different people measuring the same thing at the same time are consistent.
• Clinical example: two clinicians measuring the height of the same person once each, using the same tape measure, get the same result for both measurements.

Test-retest
• What it refers to: repeated measurements over time; similar to intra-rater, only with a larger time gap between measurements.
• What high reliability means: measurements of the same unchanging thing using the same methods remain consistent over time.
• Clinical example: one clinician measuring the height of the same person twice, with measurements one week apart, using the same tape measure gets the same result for both measurements.

Alternate or parallel forms
• What it refers to: measurement instruments; refers to the measuring equipment rather than the person doing the measuring.
• What high reliability means: different methods or equipment for measuring the same unchanging thing give consistent results.
• Clinical example: one clinician measuring the height of the same person twice, once using a stadiometer and again with a tape measure, gets the same result for both measurements.

More than one source of unreliability can happen at the same time; for example, unreliable clinicians using unreliable equipment. Of course, if what’s being measured changes, measurements should also change. Measuring a child every year for five years and finding an increase in height is no evidence of unreliability: the child is growing taller.

Imagine first-year allied health students attempting a measurement task in tutorial classes. Volunteer students have their waist measured in centimetres independently by several of their classmates using a tape measure. Each measured student should receive the same value for their waistline, because waistlines don’t change appreciably during a one-hour class. To the class’s surprise, the measurements are inconsistent. A set of measurements, such as 82, 83, 80.5, 85, 80, 83, 81, can occur for the same person measured by several students. Measuring a person’s waistline consistently and accurately isn’t easy. The waistline measurements from the class show problems with inter-rater reliability.

While reliability is about the consistency or stability of measurements, validity refers to the accuracy of measurements. Consistency and accuracy may superficially appear to be the same concept. Technically they are different, yet related. With the classroom measurement exercise described above, the waistline measurements cannot be considered accurate because they are inconsistent. At least one of the measurements has to be wrong, and perhaps they all are wrong.
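One way to see the inconsistency at a glance is to compute the spread of the class’s seven measurements. A minimal sketch using Python’s standard statistics module:

```python
import statistics

# The seven students' waistline measurements (cm) of the same volunteer,
# quoted in the classroom example above.
waistlines = [82, 83, 80.5, 85, 80, 83, 81]

print(statistics.mean(waistlines))   # about 82.1 cm
print(statistics.stdev(waistlines))  # about 1.7 cm of rater-to-rater spread
```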


With further training about how to measure the waistline accurately, and better awareness of measurement principles, the students’ future clinical measurements should become more reliable and valid.

Validity is about whether a test really does measure what it claims to measure. We may ask whether intelligence tests measure only intelligence, or whether they measure other characteristics of a person, such as cultural factors or educational advantage. Western-style intelligence tests comprise language questions, general knowledge and logic puzzles that are unknown in many cultures. People raised and educated in a non-Western culture may not perform well on conventional tests of intelligence despite being very bright in the manner of their own society. Therefore, a person’s cultural and educational background may contaminate results, reducing the validity of intelligence scores for persons unaccustomed to the style of test.

The most fundamental type of validity is construct validity. A construct is a concept or idea. Examples from the long list of constructs of interest to allied health practitioners include:
• health (mental and physical)
• pain or discomfort
• gender
• anxiety or depression
• emotional intelligence
• self-esteem
• socioeconomic status (a complex measure of how well a person is advantaged socially and financially).
A test is construct valid if it really does measure the concept or idea that it’s supposed to measure and nothing else. Some clinical tests are designed only for specific populations, and should be used only with those patient groups. The Abbey pain scale is intended to detect pain among people with dementia who lack the verbal ability to report their pain using spoken words, and who could not meaningfully mark a point on a rating scale.12 Along with construct validity, other types of validity are defined, with examples, in table 5.6.

TABLE 5.6 Types of validity

Construct validity
• What it means: whether a test or measurement system measures only the concept or idea that its developers claim or intend it to measure in the relevant population.
• Examples: whether the Abbey pain scale reliably and accurately measures pain experienced by people with dementia who cannot speak for themselves; whether a survey measuring quality of life really does measure quality of life.

Content validity
• What it means: whether a measurement system measures all, or only part, of what it’s supposed to measure.
• Example: a test of arithmetic ability should examine skills in addition, subtraction, multiplication and division. If any of these four components is missing from the test, the test is content-deficient.

Criterion validity
• What it means: whether a measurement predicts results on another, relevant measurement. May provide evidence for construct validity.
• Example: how well secondary school final examination results predict school leavers’ subsequent performance at university. University grades serve as the ‘criterion’.

Face validity
• What it means: whether a measurement or test appears to measure what it’s expected or intended to measure; important if a test and its results are to be taken seriously by people using the test.
• Example: suppose final, clinical examinations at university consist only of trivia questions about Hollywood celebrities. Students will not take the exam seriously and may underperform on the test, even if their knowledge of popular culture accurately predicts their clinical knowledge.

Internal consistency
• What it means: whether questions on a test or scale measure the same concept.
• Example: whether the various questions in a final examination assess the same body of knowledge. Internal consistency is measured by a statistic known as Cronbach’s alpha.13
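As table 5.6 notes, internal consistency is usually measured with Cronbach’s alpha. As a rough illustration (not from the textbook), the standard formula, alpha = k/(k-1) x (1 - sum of item variances / variance of total scores), takes only a few lines of Python; the survey answers below are invented:

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for a 2-D array: rows = respondents, columns = items."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                          # number of items
    item_vars = scores.var(axis=0, ddof=1)       # variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of respondents' totals
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Invented answers: 5 respondents, 4 survey items on a 1-5 scale.
answers = [[4, 5, 4, 4], [2, 3, 2, 3], [5, 5, 4, 5], [3, 3, 3, 2], [4, 4, 5, 4]]
print(round(cronbach_alpha(answers), 2))  # values near 1 suggest high internal consistency
```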


It’s possible to measure inaccurately but consistently, in which case there is reliability without validity. On the other hand, if a test is unreliable, the test must be invalid. With low reliability there is no assurance of accuracy. To summarise:
• if a test is valid, it must be reliable
• if a test is unreliable, it must be invalid.
In other words, reliability is necessary (it must be present) but is not sufficient (not enough by itself) for validity. Validity requires not only consistent measurement, but also complete, accurate and relevant measurement for the intended purpose. Whether we are practitioners or researchers, our aim is to use tests that measure only what we intend them to measure. Without reliable and valid measurement, our work as health practitioners or researchers cannot be trusted.

Measurement validity relates to the separate concepts of internal validity and external validity of an entire research study. Internal and external validity refer to the truth, trustworthiness and generalisability of research findings. If the measurements in a research study are invalid, the results of the study as a whole cannot be trusted, which reduces internal validity. False results and conclusions cannot be expected to generalise to the wider population, which is a problem for external validity. In the worst case, invalid research can mislead well-intentioned practitioners into doing the wrong thing for their clients. It is the researcher’s responsibility to use reliable and valid measures, and to convincingly demonstrate that the measures used were consistent and accurate. The health practitioner’s responsibility is to look for evidence of reliability and validity in research reports, and to ensure that their own measurements in clinical practice reach a similarly high standard.


SUMMARY
In contrast to qualitative research, quantitative research collects numerical information and analyses results statistically. Quantitative research is classified according to its design. Intervention studies such as randomised controlled trials measure the effects of treatments on patients’ health. With observational studies, researchers allow events to occur naturally. Designs for observational studies include descriptive surveys, diagnostic accuracy studies and epidemiological studies. Systematic reviews combine results from selected quantitative studies, often using meta-analysis. Levels of measurement consist of nominal and ordinal measurement scales, defined as categorical measurement, and interval and ratio scales, classed as continuous measurement. Reliability refers to consistency of measurement whereas validity refers to accuracy.

KEY TERMS
dependent variable For intervention studies, the outcome measures expected to be affected by the intervention (independent) variable.
descriptive research Observational studies describing and optionally comparing the characteristics or responses of a sample.
diagnostic accuracy study A research design evaluating how well a clinical diagnostic or assessment procedure can identify the presence or absence of a health condition in a sample.
epidemiological research The scientific study of causes and other factors affecting the occurrence of illnesses, injuries, survival or recovery among populations and groups within a population.
experimental design An intervention study involving at least one treatment group and a control or placebo group, so the effect on outcomes with and without the intervention can be compared.
independent variable For intervention studies, the intervention or treatment expected to affect the outcome (dependent) variable.
intervention study Research where the researchers deliberately do something with participants with the intention of causing a change. Compare with ‘observational study’.
level of measurement How quantitative information is recorded, according to whether only group membership (nominal scale) or ranking (ordinal scale) is identified, or whether quantities or amounts are directly measured (continuous scales). Continuous scales (interval and especially ratio) are considered more informative levels of measurement than categorical (nominal or ordinal) scales.
meta-analysis Statistical procedure for combining results from existing quantitative studies with measured outcomes, effectively increasing sample size and precision.
observational study Research where the researchers do not intervene to bring about a change but instead observe and measure events as they happen naturally. Compare with ‘intervention study’.
PICO A way of listing the clinical population, intervention (treatment), comparison or control conditions and measured outcomes for an experiment; useful for identifying search terms for intervention studies.
reliability In measurement, the consistency and repeatability of measurement scores over time, measurement equipment and people doing the measuring.
systematic review Secondary research evidence that answers a focused clinical question through a structured and rigorous synthesis of original, primary studies chosen with strict and transparent inclusion and exclusion criteria.
validity In measurement, the accuracy of a measurement: whether a measurement system measures what it’s supposed to measure, and only what it’s supposed to measure; depends in part on reliability.
variable Anything measurable, either categorically or continuously, that can differ from one observation to another, such as among different individuals, groups of people, or across time.

WEBSITES
1 Abbey pain scale: www.health.gov.au/internet/publications/publishing.nsf/Content/triageqrg~triageqrg-pain~triageqrg-abbey
2 Research Methods Knowledge Base: Design: www.socialresearchmethods.net/kb/design.php
3 Making sense of Cronbach’s alpha: www.ijme.net/archive/2/cronbachs-alpha.pdf
4 Research Methods Knowledge Base: Measurement: www.socialresearchmethods.net/kb/measure.php


ENDNOTES
1. M. Gunner, ‘Going for broke: a heideggerian phenomenological study of problem gamblers,’ (unpublished PhD thesis, University of Western Sydney, 2000), http://researchdirect.uws.edu.au/islandora/object/uws.
2. C. J. Forrest, D. L. King, and P. H. Delfabbro, ‘The gambling preferences and behaviors of a community sample of Australian regular video game players,’ Journal of Gambling Studies (Online) 32, no. 2 (2016): 409–20.
3. M. P. R. Sales et al., ‘A novel dynamic exercise initiative for older people to improve health and well-being: a study protocol for a randomised controlled trial,’ BMC Geriatrics 15, no. 1 (2015): 1–17.
4. P. Kenny, M. T. King, and J. Hall, ‘The physical functioning and mental health of informal carers: evidence of care-giving impacts from an Australian population-based cohort,’ Health & Social Care in the Community 22, no. 6 (2014): 646–59.
5. P. Goossens et al., ‘Validity of the thessaly test in evaluating meniscal tears compared with arthroscopy: a diagnostic accuracy study,’ Journal of Orthopaedic & Sports Physical Therapy 45, no. 1 (2015): 18–24.
6. R. T. Gordon et al., ‘The use of canines in the detection of human cancers,’ Journal of Alternative & Complementary Medicine 14, no. 1 (2008): 61–7; H. Sonoda et al., ‘Colorectal cancer screening with odour material by canine scent detection,’ Gut 60 (2011): 814–19.
7. J. Van den Bulck, ‘Adolescent use of mobile phones for calling and for sending text messages after lights out: results from a prospective cohort study with a one-year follow-up,’ Sleep 30, no. 9 (2007): 1220–3.
8. Australian Bureau of Statistics, Australian social trends: risk taking by young people (Canberra: ABS, 2008).
9. Australian Institute of Health and Welfare, Young Australians: their health and well-being (Canberra: AIHW, 2011), www.aihw.gov.au/publication-detail/?id=10737419261.
10. Environmental Protection Agency NSW, Solaria (Sydney: NSW Government, 2015), www.epa.nsw.gov.au/your-environment/radiation/solaria.
11. M. Dieterich et al., ‘Intensive case management for severe mental illness,’ Cochrane Database of Systematic Reviews 10 (2010). doi:10.1002/14651858.CD007906.pub3.
12. J. Abbey et al., ‘The abbey pain scale: a 1-minute numerical indicator for people with end-stage dementia,’ International Journal of Palliative Nursing 10, no. 1 (2004): 6–13.
13. M. Tavakol and R. Dennick, ‘Making sense of Cronbach’s alpha,’ International Journal of Medical Education 2 (2011): 53–55.

ACKNOWLEDGEMENTS
Adapting author for this Australian edition: John Bidewell
This chapter has been adapted from K. Gerrish and J. Lathlean, eds., The research process in nursing, 7th ed. (Chichester: Wiley-Blackwell, 2015).
Photo: © Felix Furo / Shutterstock.com


CHAPTER 6

Levels of evidence

LEARNING OBJECTIVES
6.1 What are the major clinical questions in healthcare?
6.2 How are clinical questions matched to research designs and organised into levels of evidence according to their risk of bias?
6.3 How does bias threaten the validity of research evidence?
6.4 What are the grades of recommendation for evidence, leading to clinical practice guidelines?
6.5 How do levels of evidence fit into a general hierarchy of evidence?

Notice how the two clinicians spoke to the patient as if the patient himself were responsible for the decision to use the footstool. Rather than blaming the patient for someone else’s decision, both clinicians might instead have searched published evidence and found at least two systematic reviews, one by Faldini, Traina, De Fine, Pedrini, and Sambri (2015)1 and another by Fu, Tian, Li, Sun, and Ma (2016)2 about how limb position affects blood loss and range of motion after total knee-replacement surgery. The Faldini et al. and Fu et al. reviews show how research offers evidence-based answers to a clinical question about post-operative care after knee-replacement surgery.

Published results from research about thousands of other clinical questions are available from libraries, databases and the internet generally. It’s unnecessary for most healthcare practitioners to conduct their own research, even properly designed and approved research. A vast amount of it has already been done. The challenge lies in finding the highest quality and most useful research evidence from the large amount available on almost any health topic.

Suppose you were interested in the effects of strapping knees for athletes. In September 2018, a search of the CINAHL database (https://health.ebsco.com/products/the-cinahl-database/allied-health-nursing) for nursing and allied health found 467 hits for the terms knee and strap* (for strap, strapping, strapped) or tape, taping or taped. See figure 6.2 for an example. We could ask how many of these 467 hits are:
• current and recent, or old and possibly outdated?
• informed opinion based on research, or just opinion, possibly uninformed?
• well-conducted research, free from biases threatening the validity of results and conclusions?
• well reported, preferably peer reviewed, meaning the articles are independently checked for quality?
• consistent in their findings across several or more studies?
• relevant to your local clinical population and healthcare setting?
These questions can be asked about any evidence, from any source, including searches done with popular search engines on the internet. In short:
• how can we tell the difference between good- and poor-quality evidence?
• how can we identify evidence that should inform clinical practice, and distinguish it from poor-quality, biased or irrelevant evidence that should not guide practice?
The above two questions define the problem. Not all evidence is high quality. Some evidence is biased. Not all evidence is relevant. Health practitioners should avoid information that is biased, unclear, or simply false and misleading. From the remaining, high-quality evidence, clinicians can select what’s relevant to their local setting and patients. Fortunately, many experts have developed ways for evidence-based practitioners to sort through the almost unlimited amount of information available so practitioners can find information worth using.
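Returning to the CINAHL search above: database searches of this kind combine synonyms with OR and link distinct concepts with AND. A plausible reconstruction of that search’s structure (an illustration only, not the exact query the searchers used) is:

knee AND (strap* OR tape OR taping OR taped)

where the asterisk truncates strap so it matches strap, strapping and strapped, as the text notes.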

NHMRC and evidence-based practice

The National Health and Medical Research Council (NHMRC, www.nhmrc.gov.au) is Australia’s leading government organisation offering expert advice about health and medical research, ethical healthcare and health research, and putting health and medical research findings into practice. The NHMRC provides Australian medical and health practitioners with ways to evaluate published research for its quality and relevance. The NHMRC system looks at two aspects of research.
1. Clinical question. For any clinical problem, such as post-operative care for knee replacement, there is a clinical question and a way to answer that question through research. Ways to set up research are known as study designs or research designs.
2. Level of evidence. Some research designs offer better protection against bias than other designs. Superior designs are assigned a higher level of evidence. Research at a higher evidence level is preferred for guiding practice.
Evidence-based practice relies on asking the right type of question for the clinical problem and finding the best way to answer it. The best research design will answer the question using unbiased evidence. Next, we shall look at the types of clinical questions that health professionals ask, and match those questions with NHMRC levels of evidence for study designs. That helps us identify the most relevant and trustworthy evidence.

How researchers answer clinical questions

Let’s begin with how researchers answer clinical questions. Table 6.1 matches each clinical question with a research design that can answer it.


TABLE 6.1 Clinical questions and how they are answered

Clinical question: How effective is this treatment for this health condition?
How a researcher will find the answer: An intervention study such as a clinical trial. Treat a sample of patients and measure change in the patients’ condition. Compare results on the same measures from patients who were not treated.

Clinical question: How accurate is this assessment or diagnostic procedure?
How a researcher will find the answer: A diagnostic accuracy study. Measure how well an assessment identifies people with a specified health condition, and separates them from people without that condition.

Clinical question: What genetic, lifestyle or environmental hazards cause this health condition or make it more likely?
How a researcher will find the answer: An aetiological study (epidemiological research). Measure associations between people’s exposure to a suspected hazard and rates of a health condition among the same people.

Clinical question: How does this health condition progress if untreated?
How a researcher will find the answer: A prognostic study. Record what happens to people who have a specified health condition, especially in cases where the condition is untreated.

Clinical question: Will people be helped or harmed by community-wide testing for this health condition?
How a researcher will find the answer: A screening intervention study. Test a large number of people without signs, symptoms or diagnosis of a specified health condition. Where the screening test suggests the patient may have the condition, measure whether later follow-up procedures lead to a benefit or harm.

Some of the clinical questions in table 6.1 will relate more to your practice than others. Through all this, please remember the following.
• Researchers answer clinical questions by collecting information from real people, in the real world. Researchers do the original research and publish the results. The results are available for you to find in libraries and databases.
• As an evidence-based practitioner, you don’t have to do the original research. You need only find the research, then think about its quality and relevance to your work before putting the results and recommendations into practice.
Some evidence-based practitioners do conduct their own research, with formal approval, using proper methods. Read the author details in research reports and you’ll find many of them are clinicians. Many students will go on to do research higher degrees and become clinicians and researchers. You may become one of these clinician-researchers. For now, you are learning how to find and use research rather than how to conduct new, original research.

6.2 Matching clinical questions to NHMRC levels of evidence

LEARNING OBJECTIVE 6.2 How are clinical questions matched to research designs and organised into levels of evidence according to their risk of bias?

The next stage in understanding levels of evidence is to match research questions to study designs. Table 6.1 did this in a simple way: you’ll have noticed that clinical questions about treatments (‘How well does this treatment work?’) go with clinical trials — easy! However, there’s more than one study design for testing treatments. This is where levels of evidence enter the picture. Study designs offering better protection against bias have a higher level of evidence. Table 6.2 expands on table 6.1 by matching study designs to NHMRC (2009) levels of evidence, shown slightly simplified.3 For every type of study and every level of evidence, table 6.2 shows the study design.


Even researchers who want to be honest can produce biased research by choosing weaker designs (i.e. lower-level evidence), or through poor methods with higher-level designs. Table 6.3 lists biases that may affect intervention studies, such as clinical trials testing the effects of treatments. Better designed and conducted studies should be less susceptible to these biases. The biases change the results for one group compared to another in ways that say nothing about the effects of the treatment.

TABLE 6.3 Biases that may affect intervention studies

Allocation bias: differences between treatment and control conditions at the start of the experiment.
Detection bias: differences in how the treatment and control groups are assessed or measured.
Performance bias: events other than the intended treatment happen during the experiment.
Attrition bias: some types of participants leave the study, more so in one group than another, setting up unwanted differences between groups in the background characteristics of the participants.
Measurement bias: outcomes are measured inaccurately.

Random allocation reduces allocation bias only. Level II evidence should have a lower risk of allocation bias than Levels III and IV. All the other biases can affect randomised controlled trials (Level II) and pseudo- or non-randomised controlled trials (Level III). Measurement and performance bias can affect Level IV studies as well as Levels II and III. Evidence-based practitioners will look for how the researchers protected their study against these and other biases in the methods sections of the reports they read.

Level I systematic reviews are susceptible to all of the biases that affect original studies. If the original research is biased, the systematic review will inherit those biases. Authors of systematic reviews inspect the original studies for bias and eliminate biased studies from the review conclusions. Properly conducted systematic reviews should, therefore, be relatively unbiased. Additional biases may affect systematic reviews specifically; these are listed in table 6.4. Some of these biases affect whether the original studies are published at all, or whether the original studies are easy for reviewers to find. Reviewers can work only with studies they can locate. Other biases affect whether reviewers choose to include a study in their review even after they have found that study.

TABLE 6.4 Biases specific to systematic reviews

Publication bias: studies researching unpopular research topics or treatments don’t get published, so their evidence is unavailable to reviewers.
Time lag bias: delays in publication prevent the research from being found by practitioners or reviewers in time for their review.
Duplicated publication bias: the same results from the same studies are repeatedly published, suggesting there’s more evidence than there really is.
Outcome reporting bias: mainly desirable or large expected results, or statistically significant results, get published, even though other results are equally valid and informative; another type of publication bias.
Citation bias: if a study is cited a lot by many other authors, reviewers are more likely to find that research compared with studies that are rarely cited or not cited at all.
Database inclusion bias: studies are much more easily found if they are available from online databases.
Language bias: a preference among reviewers for studies published in the language they understand, commonly but not always English.
Reviewers’ personal biases: reviewers may unfairly exclude an article because they don’t like the topic or the results, even though the results are valid and relevant.


6.4 Evaluating the evidence — quality of evidence and grades of recommendations for practice guidelines

LEARNING OBJECTIVE 6.4 What are the grades of recommendation for evidence, leading to clinical practice guidelines?

The main purpose of health research is to inform clinical practice and policy decisions. A clinical practice guideline is an educational document with recommendations for patient care for a named health condition. A clinical practice guideline doesn’t tell healthcare professionals what to do. Instead, it gives an evidence-based suggestion. Guidelines are an aid to clinical decision-making. Clinical practice guidelines are found in databases such as the Australian Clinical Practice Guidelines portal (www.clinicalguidelines.gov.au), administered by the NHMRC, and the NHMRC’s Publications and resources (www.nhmrc.gov.au/aboutus/publications). Other databases for guidelines include Turning Research Into Practice (Trip, www.tripdatabase.com), and databases specific to professions, such as PEDro (www.pedro.org.au) for physiotherapists and OTseeker (http://www.otseeker.com) for occupational therapists. Evidence-based support for clinical decisions can also be found at Best Practice (http://bestpractice.bmj.com).

From what you’ve read so far in this chapter, you could assume that developing guidelines is as simple as finding the highest-level evidence for that type of clinical question, which always means Level I evidence, the systematic review of Level II studies. Unfortunately, developing guidelines is not so simple. While a higher level of evidence reduces the risk of bias, a high level alone is not enough to make a study a good one. A study must be done well, regardless of its evidence level. A good-quality randomised controlled trial (Level II evidence) with a large, representative sample, biases mostly eliminated, and results accurately reported and properly interpreted, could contain better information than a poor-quality systematic review containing a small and biased selection of mediocre studies that are misinterpreted.

Even good-quality systematic reviews may say there’s not enough evidence from existing research, or that the existing research is poor quality, even when that evidence is from Level II studies. Forrester et al. (para. 7), in the abstract for their systematic review, described the evidence for the value of aromatherapy for dementia as ‘equivocal’, meaning no definite conclusion was possible from the evidence.12 Only two out of seven studies reviewed had usable data. This was a problem with the quality of evidence about aromatherapy for dementia, not a fault with the systematic review. The Thieme et al. systematic review on mirror therapy after stroke raises similar doubts about the quality of evidence (figure 6.1).

The NHMRC has adopted the internationally recognised GRADE system for evaluating the quality of evidence used to develop clinical practice guidelines.13 The GRADE system works by rating the quality of evidence and the strength of recommendations from that evidence14; these are two separate but related concepts. The reasons for this rating system are that:
• correct recommendations for healthcare procedures depend on high-quality evidence
• a strong recommendation for a treatment is reasonable if good-quality evidence shows that benefits from the treatment greatly exceed the risk of harm from it
• if the balance between desirable and undesirable effects of a treatment is unknown from the evidence, because the evidence is not of high quality, then only a weak recommendation, or no recommendation at all, should be given.

Quality of evidence is measured according to the following GRADE criteria.
• Strengths and limitations. The strengths and limitations of the study design (i.e. level of evidence) and procedure are assessed, along with how the study was conducted.
• Consistency of results between and within studies. Agreement about findings as reported within a single research study, and consistency of results across various studies, demonstrate high-quality evidence.
• Directness of the evidence. It is better to compare two treatments within a single study than separately, across multiple studies.
• Precise results. Research results should estimate almost exactly the effects of a treatment on the target population of patients.
• Effect size. The size of the beneficial effect of the treatment should be large enough for the treatment to make a worthwhile improvement to health or quality of life. Benefits should be at least worth the effort of treating the patient.
• Bias. The possibility of unwanted other factors confusing the result, otherwise known as bias, is assessed.

GRADE ratings for quality of evidence refer to how likely it is that future research will change our opinions about the value of a healthcare procedure. Evidence scoring well on the above criteria for quality is more likely to withstand the test of time compared with weaker evidence. Low-quality evidence can expect to be quickly outdated by better evidence from superior research designs and methods. Evidence quality for research about treatments is rated as follows.
• High quality. Estimates of treatment effects are unlikely to change as a result of future research.
• Moderate quality. Future research will likely change our confidence in estimated treatment effects.
• Low quality. Changes to our confidence in estimated treatment effects are very likely from future research.
• Very low quality. There is considerable uncertainty in the estimates of treatment effects, which means we can’t be sure how well a treatment works when compared with its costs and risks.

GRADE ratings for strength of the recommendation about healthcare procedures are based on the:
• quality of evidence about desirable and undesirable results or effects
• patient’s values and preferences, which are part of good healthcare decision-making in general
• cost effectiveness of the procedure, which acknowledges that all healthcare procedures require time and money. High-quality healthcare more than justifies its costs.

The GRADE system recognises two levels of recommendation strength for healthcare treatments.
1. Strong. We can be reasonably sure that a treatment’s benefits well exceed its costs and associated risks. Given an informed choice, most patients would choose this treatment.
2. Weak, conditional or discretionary. There is a fine balance between costs and benefits of the treatment, or we cannot be certain that the measured treatment effects indicate how well the treatment works. Given an informed choice, patients may prefer other treatment options.

6.5 Levels within levels

LEARNING OBJECTIVE 6.5 How do levels of evidence fit into a general hierarchy of evidence?

Levels of evidence in table 6.2 are about the trustworthiness of the data, based on the risk of bias. We have so far seen that there are original studies, with their data collected from samples of patients, also known as primary evidence. Original studies make up Levels II, III and IV. Then there are Level I systematic reviews, secondary evidence, in which original Level II studies are located in databases by using planned search criteria, then selected and evaluated for quality, bias and consistency (‘homogeneity’), and the results summarised. Many systematic reviews use a procedure called meta-analysis, which combines results from individual studies statistically to reach an overall conclusion about what the research is saying. The Cochrane Collaboration maintains the online Cochrane Library of Systematic Reviews (www.cochranelibrary.com)15, one of the best places to find Level I evidence, and it’s free within Australia.

Beyond the levels of evidence, there is a larger hierarchy of evidence types, which includes, and goes beyond, the original studies and systematic reviews we’ve seen in table 6.2. This hierarchy is presented as a triangle in figure 6.5, where the levels of evidence from I to IV occupy the middle layers. Above secondary and primary studies are summaries of information collected and pre-appraised by experts, often with stated implications and guidance for practice, including the clinical practice guidelines mentioned above. This summarised information is available in dedicated databases, providing evidence-based practitioners such as you with a convenient resource for a huge range of topics. Below secondary and primary studies in the triangle are the professional opinions and anecdotal evidence that carry a high risk of bias. This lowest layer in the hierarchy is not recommended as evidence to guide your practice.

Table 6.5 presents this evidence hierarchy, along with databases and other resources where that evidence is found. Your university library will normally provide these resources free to you, without subscription, if you are logged in. Some of the online resources are free to anyone. In table 6.5, some databases such as Best Practice and CINAHL cover more than one type of evidence from the figure 6.5 hierarchy, which is highly convenient. Some databases are practically a one-stop shop across levels of evidence and the evidence hierarchy. ACCESSSS, SUMSearch and Trip search multiple databases, which makes them very useful. The ‘When to search’ column in table 6.5 includes where you should search first. One day, you’ll become a busy health professional with many patients needing treatment. You won’t have time to read hundreds of articles reporting clinical trials. You’ll want recommendations based on evidence that an expert has checked for quality. You’ll want to start at the upper levels of the figure 6.5 triangle.



TABLE 6.5 Where and when to find evidence in the hierarchy

Level in hierarchy: Information preappraised and summarised from primary and secondary evidence; pre-packaged advice relating to a healthcare topic

Type of evidence: Summaries of clinical topics
• Where to find: ACCESSSS (http://plus.mcmaster.ca/accessss); Best Practice (http://bestpractice.bmj.com); STAT!Ref (www.statref.com); Trip database (www.tripdatabase.com); UpToDate (www.uptodate.com)
• When to search: You need preappraised information quickly. The first choice for the busy professional who needs evidence-based advice about clinical practice and related topics.

Type of evidence: Practice guidelines and point of care resources
• Where to find: ACCESSSS (http://plus.mcmaster.ca/accessss); AHRQ: Agency for Healthcare Research and Quality (https://www.ahrq.gov/gam/index.html); Best Practice (http://bestpractice.bmj.com); Clinical Practice Guidelines Portal, Australia (www.clinicalguidelines.gov.au); DynaMed Plus (www.dynamed.com/home); Guidelines International Network (www.g-i-n.net); NICE: National Institute for Health and Care Excellence (www.nice.org.uk); NICE Evidence search (www.evidence.nhs.uk); NURSING+ (https://plus.mcmaster.ca/np); REHAB+ (https://plus.mcmaster.ca/rehab); SUMSearch (http://sumsearch.org); Trip database (www.tripdatabase.com)
• When to search: Search here first to save time.

Type of evidence: Synopses of systematic reviews
• Where to find: ACCESSSS (http://plus.mcmaster.ca/accessss); ACP Journal Club (www.acpjc.org); Bandolier (http://www.bandolier.org.uk/); DARE: Database of Abstracts of Reviews of Effects (https://www.crd.york.ac.uk/CRDWeb/) (the DARE database is no longer being updated and may be discontinued after 2021); EBM: Evidence-Based Medicine (https://www.wiley.com/en-au/Evidence+Based+Medicine+Guidelines-p-9780470011843); EBN: Evidence-Based Nursing (http://ebn.bmj.com); Trip database (www.tripdatabase.com)
• When to search: Search here if no summaries, practice guidelines or other preappraised information is available.

Level in hierarchy: Reviews of original studies — secondary evidence

Type of evidence: Level I evidence, other systematic reviews and other secondary evidence
• Where to find: CINAHL, the major evidence database for nursing and allied health (https://health.ebsco.com/products/the-cinahl-database/allied-health-nursing); Cochrane Database of Systematic Reviews (www.cochranelibrary.com/home/topic-and-review-group-list.html); Google Scholar (https://scholar.google.com.au); Health Evidence, public health (www.healthevidence.org); Joanna Briggs Institute, nursing (http://connect.jbiconnectplus.org); MEDLINE (www.nlm.nih.gov/bsd/pmresources.html), available through PubMed (www.ncbi.nlm.nih.gov/pubmed), EBSCO (https://health.ebsco.com/products/medline-with-full-text) and ProQuest (www.proquest.com/products-services/medline_ft.html); NURSING+, nursing (https://plus.mcmaster.ca/np); OTseeker, occupational therapy (www.otseeker.com); PEDro, physiotherapy (www.pedro.org.au); REHAB+, rehabilitation (https://plus.mcmaster.ca/rehab); speechBITE, speech pathology (http://speechbite.com); SUMSearch (http://sumsearch.org); Trip database (www.tripdatabase.com). Many scientific, technical and professional journals publish systematic reviews as well as original studies.
• When to search: When you want reviews of original studies; useful when summary information is not available for that topic.

Level in hierarchy: Original studies about interventions — primary evidence

Type of evidence: Level II randomised controlled trials; Level III quasi- or non-randomised controlled trials; Level IV non-controlled trials; cohort, case-control and all-or-none studies
• Where to find: Cochrane Central Register of Controlled Trials (www.cochranelibrary.com/about/central-landing-page.html); CINAHL (https://health.ebsco.com/products/the-cinahl-database/allied-health-nursing); Embase (www.elsevier.com/solutions/embase-biomedical-research); EvidenceAlerts, formerly known as EvidenceUpDates (https://plus.mcmaster.ca/EvidenceAlerts/); Google Scholar (https://scholar.google.com.au); OTseeker (www.otseeker.com); PEDro (www.pedro.org.au); Scopus, covering many disciplines and also good for searching an author’s list of publications (www.elsevier.com/solutions/scopus); speechBITE (http://speechbite.com); SUMSearch (http://sumsearch.org)
• When to search: Search for original studies when summary or review information is not available for that topic; search here if you can’t find anything at higher levels.

Level in hierarchy: Expert, professional opinions, single case studies and personal recollections about practice — may not be properly evidence-based

Type of evidence: Narrative reviews, comments and letters to journals, the internet
• Where to find: Professional and technical journals, the internet, other informal networks, your workplace
• When to search: Not recommended; information has too high a risk of bias.

SUMMARY
Clinical questions in healthcare ask about the effectiveness of treatments, diagnostic and screening procedures, the clinical course of health conditions and hazards to health. These clinical questions are matched to research designs. Research designs are assigned a level of evidence in accordance with Australian NHMRC guidelines. Levels are numbered, starting from Level I systematic reviews. Levels II, III and IV encompass original studies, with higher-numbered levels indicating increased risk of bias and thus weaker evidence. Evidence-based practitioners should prefer Level I and Level II evidence, or summaries of these, to inform their practice. The GRADE system appraises the quality of evidence leading to recommendations for clinical practice guidelines. NHMRC evidence levels reside within a larger hierarchy of evidence types, which includes summaries and practice guidelines as the most convenient sources of evidence for healthcare practice.

KEY TERMS
allocation bias Systematic differences between clinical trial groups that affect results beyond the treatments. Also known in some contexts as selection bias.
bias Systematic, consistent error in results or conclusions from evidence; renders the evidence invalid or likely to be misinterpreted where the bias is concealed.
clinical practice guideline Document providing evidence-based recommendations for the clinical management of patients with a specific, known health condition.
clinical question An inquiry about clinical practice, ideally answered from research evidence.
database An organised list that can be searched by computer. In evidence-based practice, a catalogue of research evidence retrievable according to study design, topic, clinical population or health condition, authorship, year of publication or other criteria relevant to the clinical question, and the quality and accessibility of the evidence.
evidence Information from research that may guide practice if that evidence is valid for the clinical scenario.
evidence-based practitioners Health professionals who use recent, high-quality, relevant evidence from research in combination with professional judgement and the patient’s preferences and values to inform healthcare practice and policy decisions.
GRADE system A method for evaluating the quality of evidence used to develop recommendations informing clinical practice guidelines.
hierarchy An ordered list, from highest (better) to lowest (not so good).
levels of evidence A list of study designs matched to clinical questions, and ranked in order of potential bias.
meta-analysis Statistical procedure for combining results from existing quantitative studies with measured outcomes, effectively increasing sample size and precision.
placebo An intervention resembling a real treatment but without any expected therapeutic benefit. Placebos are used as control interventions. The ‘sugar-coated pill’ that looks like real medication but contains no drug is the classic example. Any effect from a placebo comes from the patient’s expectations of receiving treatment rather than from the treatment itself. It’s important that patients not know whether they are receiving a real treatment or a placebo intervention, hence the need for blinded studies.
primary evidence Evidence from original studies collecting data from new research published for the first time.
secondary evidence Evidence compiled from existing, already available research results.
systematic review Secondary research evidence that answers a focused clinical question through a structured and rigorous synthesis of original, primary studies chosen with strict and transparent inclusion and exclusion criteria.

WEBSITES
1 ‘About evidence-based practice’, CareSearch palliative care network: www.caresearch.com.au/caresearch/tabid/327/Default.aspx
2 CareSearch: www.caresearch.com.au/caresearch/tabid/517/Default.aspx


3 CIAP eLearning Modules: www.ciap.health.nsw.gov.au/learning/modules
4 ‘Cochrane handbook for systematic reviews of interventions’ (you don’t have to read it all, but it’s nice to know how systematic reviews are done): http://handbook.cochrane.org
5 Cochrane Library: www.cochranelibrary.com/about/central-landing-page.html
6 Department of Health Library, Government of Western Australia: https://dohlibrary.health.wa.gov.au/c.php?g=500009&p=3423966
7 ‘Evidence based practice in health sciences — introduction’: https://libguides.library.curtin.edu.au/evidence-based-practice
8 ‘Evidence-based practice in health’, University of Canberra: https://canberra.libguides.com/evidence
9 GRADE working group: www.gradeworkinggroup.org/#pub
10 International Centre for Allied Health Evidence: http://unisa.edu.au/Research/Health-Research/Research/Allied-Health-Evidence
11 Joanna Briggs Institute: http://joannabriggs.org
12 National Health and Medical Research Council (NHMRC): www.nhmrc.gov.au
13 NHMRC additional levels of evidence and grades for recommendations for developers of guidelines (includes older, but still interesting system for evaluating evidence quality): www.mja.com.au/sites/default/files/NHMRC.levels.of.evidence.2008-09.pdf
14 Nursing Reference Center (you’ll need to log in via your university library to use this resource): www.ebscohost.com/nursing/products/nursing-reference-center/nursing-reference-center-content
15 ‘Oxford Centre for Evidence-based Medicine — levels of evidence (March 2009)’ (includes Level V, for expert opinion — other details may also differ slightly from NHMRC levels of evidence): www.cebm.net/oxford-centre-evidence-based-medicine-levels-evidence-march-2009
16 Physiotherapy Choices: www.physiotherapychoices.org.au
17 ‘What is evidence-based practice?’ library resource from the University of Notre Dame: http://libraryonline.nd.edu.au/evidencebasedpractice

ENDNOTES
1. C. Faldini et al., ‘Post-operative limb position can influence blood loss and range of motion after total knee arthroplasty: a systematic review,’ Knee Surgery, Sports Traumatology, Arthroscopy: Official Journal of the ESSKA 23, no. 3 (2015): 852–9. doi: 10.1007/s00167-013-2732-4.
2. X. Fu et al., ‘Postoperative leg position following total knee arthroplasty influences blood loss and range of motion: a meta-analysis of randomized controlled trials,’ Current Medical Research and Opinion 32, no. 4 (2016): 771–8.
3. NHMRC, NHMRC additional levels of evidence and grades for recommendations for developers of guidelines (Canberra: Commonwealth of Australia, 2009), www.mja.com.au/sites/default/files/NHMRC.levels.of.evidence.2008-09.pdf.
4. H. Thieme et al., ‘Mirror therapy for improving motor function after stroke,’ Cochrane Database of Systematic Reviews 7 (2018). doi: 10.1002/14651858.CD008449.pub3.
5. R. Guedes et al., ‘The effects of kinesiotaping on quadriceps muscle performance at different velocities: a randomized controlled trial,’ Isokinetics & Exercise Science 24, no. 2 (2016): 149–56.
6. NHMRC, NHMRC additional levels of evidence.
7. T. Hoffmann, S. Bennett, and C. Del Mar, Evidence-based practice across the health professions, 3rd ed. (Chatswood: Churchill Livingstone, 2017).
8. ‘Assessing risk of bias in included studies,’ Cochrane collaboration, 2015, http://bmg.cochrane.org/assessing-risk-bias-included-studies#Sources%20of%20Bias%20in%20Clinical%20Trials.
9. E. H. Cho, M.-Y. Lee, and M.-H. Hur, ‘The effects of aromatherapy on intensive care unit patients’ stress and sleep quality: a nonrandomised controlled trial,’ Evidence-Based Complementary & Alternative Medicine (eCAM) (2017): 1–10. doi: 10.1155/2017/2856592.
10. N. E. Knibbe et al., ‘An automatic repositioning system to prevent pressure ulcers: a case series,’ British Journal of Nursing 27, no. 6 (2018): S16–22.
11. Cochrane collaboration, paragraph 3.
12. L. Forrester et al., ‘Aromatherapy for dementia,’ Cochrane Database of Systematic Reviews 2 (2014). doi: 10.1002/14651858.CD003150.pub2.
13. GRADE, 2004–2018, www.gradeworkinggroup.org.
14. G. H. Guyatt et al., ‘GRADE: an emerging consensus on rating quality of evidence and strength of recommendations,’ BMJ: British Medical Journal (International Edition) 336, no. 7650 (2008): 924–6.
15. ‘Cochrane Library,’ www.cochranelibrary.com.


ACKNOWLEDGEMENTS
Adapting author for this Australian edition: John Bidewell
Photo: © Miriam Doerr / Shutterstock.com
Figure 6.1: © H. Thieme et al., ‘Mirror therapy for improving motor function after stroke,’ Cochrane Database of Systematic Reviews 7 (2018). doi: 10.1002/14651858.CD008449.pub3.
Figure 6.2: © Reprinted from Isokinetics and Exercise Science 24, no. 2 (2016): 149–56 with permission from IOS Press. The publication is available at IOS Press through http://dx.doi.org/10.3233/IES-160612.
Figure 6.3: © E. H. Cho, M.-Y. Lee, and M.-H. Hur, ‘The effects of aromatherapy on intensive care unit patients’ stress and sleep quality: a nonrandomised controlled trial,’ Evidence-Based Complementary & Alternative Medicine (eCAM) (2017): 1–10. doi: 10.1155/2017/2856592.
Figure 6.4: © British Journal of Nursing by MA Healthcare Ltd. Reproduced with permission of MA Healthcare Ltd in the format ‘book’ via Copyright Clearance Center.
Table 6.2: © Commonwealth of Australia 2018


CHAPTER 7

Statistics for the non-statistician

LEARNING OBJECTIVES
7.1 How are quantitative data stored in a data set?
7.2 How do descriptive statistics summarise the characteristics of a sample?
7.3 What assumptions apply to the results?
7.4 How sure can we be that results from the sample will generalise to the wider population?
7.5 How is the clinical importance of treatment effects decided?
7.6 How are treatment effects summarised from multiple studies of interventions in a systematic review?

Gender: Female (categorical)
Smoking status: Doesn’t smoke regularly (categorical)
Alcohol consumption: Not a regular drinker of alcohol (categorical)
Exercise frequency: Moderate exercise frequency (categorical)
Study group: Control group rather than experimental; sat for 1 minute after initial pulse-rate measurement (categorical)
First pulse-rate measurement, taken before running or sitting: 86 beats per minute (continuous)
Second pulse-rate measurement, taken after running or sitting: 88 beats per minute (continuous)

Source: Adapted from data from Smyth 2011.

Information about individual people may be interesting, but it tells us nothing about the sample as a whole, or how different groups compare. Statistical analysis enables researchers to:
• summarise the entire sample
• concisely describe and compare groups within a sample
• find evidence about the effectiveness of treatments, and the personal, lifestyle and environmental factors affecting people’s health
• make statements about the larger population, based on information about the sample.

7.2 Descriptive statistics for summarising sample characteristics

LEARNING OBJECTIVE 7.2 How do descriptive statistics summarise the characteristics of a sample?

Research reports usually begin their results section by summarising the sample characteristics, including demographic information such as age, gender, ethnicity, education and income. Further into the results, outcome measures are summarised and related to treatments or other variables predicted to affect those outcomes. Descriptive statistics refer only to the sample, not attempting to generalise beyond the sample.

Descriptive statistics for categorical variables

Most quantitative research reports summarise the characteristics of a sample using tables of descriptive statistics. We begin with categorical data from the pulse-rate study, in table 7.2, which shows results for the running and sitting groups on gender, smoking, alcohol consumption and exercise. Results for categorical variables show membership of groups rather than measured amounts. Because running and sitting are also grouping values, table 7.2 (see column headings) shows entirely categorical data. Percentages refer to the total number of students within each group, running or sitting. Looking at table 7.2, you may consider how similar the running and sitting groups were on background variables at the beginning of the study.

TABLE 7.2 Pulse-rate study descriptive statistics by group — categorical variables

Gender
• Male: running group N = 24 (52%); sitting group N = 35 (55%)
• Female: running group N = 22 (48%); sitting group N = 29 (45%)
(continued)
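The percentages in table 7.2 are just each count divided by its group total. A minimal sketch using the running group’s counts from the table:

```python
# Percentages in a categorical descriptive table are counts divided by the
# group total. Counts below are the running group's from table 7.2.
males, females = 24, 22
group_total = males + females  # 46 students in the running group

print(f"Male: {100 * males / group_total:.0f}%")      # 52%
print(f"Female: {100 * females / group_total:.0f}%")  # 48%
```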


the value of each score, the median is less vulnerable to excessive influence from extremely high or low scores than the mean is.
• Mode, the most commonly occurring score. Because only a few scores may contribute to the mode (see ‘Frequency of mode’, which counts the number of students scoring the mode value), the mode statistic is less informative than the mean or the median and is reported less often. Multiple modes for the same data are possible, as happened with age for the running group in table 7.3.

Dispersion refers to how spread out the scores are on a continuous variable. Statistical measures for dispersion are as follows.
• Standard deviation (SD) indicates how widely spread out the scores are in a sample. Results that are more spread out, with plenty of large differences between the scores, will have a higher SD compared with a sample of scores that are mostly similar and clustered tightly around the mean. In research reports, SDs often appear after the sign ±. From table 7.3, a research report could state that M = 174.57 ± 11.05 cm for the height of the running group.
• The minimum and maximum are the lowest and highest scores on that variable.
• The range is the difference between the minimum and maximum, indicating the extent of the scores from low to high. As with the maximum and minimum, the range shows how spread out the scores are, but is defined by only the two most extreme scores, telling us nothing about any of the other scores. The range is the least informative measure of dispersion.

Statistics in journal articles are almost never labelled ‘central tendency’ or ‘dispersion’. It’s assumed that readers know enough about statistics that these labels aren’t needed. Your understanding of central tendency and dispersion will prepare you for real journal articles.
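These summary statistics are simple to compute. A minimal sketch using Python’s standard statistics module; the heights below are invented, not the study’s data:

```python
import statistics

# Invented heights (cm); not the pulse-rate study's data.
heights = [160.5, 168, 172, 174, 174, 181, 190]

print(statistics.mean(heights))     # central tendency: mean
print(statistics.median(heights))   # central tendency: median
print(statistics.mode(heights))     # central tendency: mode (174 occurs twice)
print(statistics.stdev(heights))    # dispersion: sample standard deviation (SD)
print(max(heights) - min(heights))  # dispersion: range = maximum - minimum
```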

7.3 The researchers ‘setting the scene’
LEARNING OBJECTIVE 7.3 What assumptions apply to the results?

Have the researchers tested the assumption that their groups are comparable?
For a randomised controlled trial, descriptive statistics comparing groups when the study begins should show the intervention and control groups to be similar in their general characteristics (age and gender, etc.) and outcome variables, such as pulse rate in the pulse-rate study. Large differences in participants’ baseline characteristics across groups can complicate the meaningful interpretation of results. Tables 7.2 and 7.3 help us compare the groups and judge their similarity at the study outset.
We want the only difference between the groups to be that one group receives the treatment (running, in the pulse-rate study) while the other group receives the control condition (sitting). That way, the treatment remains the best explanation for any group differences in the outcome variable (pulse rate) at the end of the experiment. Tables 7.2 and 7.3 show the running and sitting groups to be roughly similar on all measures, though not exactly the same because, even with random allocation, group differences can arise simply through chance. However, with a large enough sample and random allocation, chance differences at study commencement are likely to be small and not enough to contaminate the results. That is the logic behind a randomised controlled trial design.

What assumptions apply to the shape of the data?
Another consideration is the shape of the data distribution. Figure 7.1 shows histograms of the four continuous variables (age, height, weight and resting pulse) from the pulse-rate study. The height of each column shows the number of students at that position on the measurement scale. Figure 7.1(a) shows that most students in the sample were aged from 18 to 21 years. Figure 7.1(d) shows that a resting pulse from 60 to 80 beats per minute was the most common range within the sample. Height and weight graphs in figure 7.1 are interpreted similarly — what would you conclude about the sample from these graphs?
Superimposed on the figure 7.1 histograms are ‘normal’ curves. These curves attempt to summarise the distributions of the data across the graphs. The term normal is used because many biological phenomena show this bell-shaped pattern, with the most common scores close to the mean, at the centre of the distribution. Extreme high or low scores are relatively uncommon, so the curves flatten into ‘tails’ at each end. In figure 7.1, the normal curve approximates the height, weight and resting pulse data fairly well. We can say that height, weight and resting pulse are normally distributed. The normal curve does not summarise the age data as well.
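The chapter judges normality by inspecting histograms against a superimposed normal curve; a formal test is another option. The sketch below, using simulated stand-in values rather than the study’s data, applies the Shapiro–Wilk test from SciPy. A small p value would be evidence against normality.

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
pulse = rng.normal(loc=75, scale=10, size=110)   # simulated resting pulse rates

w, p = stats.shapiro(pulse)   # null hypothesis: the data are normally distributed
print(f"Shapiro-Wilk W = {w:.3f}, p = {p:.3f}")
# A p value above .05 (likely here, since the data were generated from a
# normal distribution) gives no evidence against normality, so a normal
# curve would be a fair summary of this sample.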


The randomised controlled trial design has the advantage of ruling out many alternative, misleading explanations for associations between treatments and results. With a properly conducted randomised controlled trial, the treatment becomes the best causal explanation for differences on outcome variables between the treatment and control or placebo groups at the end of the study. In a randomised controlled trial, a causal connection between a treatment and its measured effect becomes a reasonable assumption.

7.4 Probability and confidence
LEARNING OBJECTIVE 7.4 How sure can we be that results from the sample will generalise to the wider population?

How are p values interpreted?
Readers of quantitative research reports soon discover the p value. P values confuse undergraduate students of health, possibly more than any other statistic. P values go with hypothesis testing and inferential statistics. Let’s start by contrasting inferential with descriptive statistics.
Descriptive statistics describe the sample, which they can do as accurately as the measurement methods allow. Conclusions from descriptive statistics are true for the sample, but they are precisely true only for the sample, which limits their usefulness in healthcare: in professional practice you will almost never meet or treat the participants in a research study you have read about. Instead, you’ll work with clients from the wider population who have the same health conditions as the research participants; the same type of people, but not the same individuals. As an evidence-based practitioner, you want evidence that the results from the samples you read about apply to your clients. Simply assuming, without evidence, that descriptive data from samples will apply to your clients is risky, and no smarter than wandering around with your eyes closed, expecting never to bump into anything.
Fortunately, statisticians have developed inferential statistics to explore whether data from samples apply more generally to the relevant population. There is a price to pay for using inferential statistics to make statements about a population based on sample data. That price is loss of precision. You can make definite statements about the characteristics of a sample and what happens to that sample after its patients are treated; however, you can’t be so definite in your conclusions about an entire population when you’re relying on sample data. While not providing absolute certainty, inferential statistics allow evidence-based practitioners to control and quantify the uncertainty, so you understand exactly the limitations to generalisations from sample to population, reducing the risk of false conclusions. For reasons of cost and practicality, research is almost never done using entire populations. Inferential statistics, applied to high-quality samples, give enough precision that population research is unnecessary. All of this makes inferential statistics especially useful.
Let’s see how inferential statistics and hypothesis testing go together; this will help to make sense of p values. Strange as it may seem, with hypothesis testing, researchers begin with the idea of no difference between groups of people, or between measurements over time, in the population. For a randomised controlled trial, that amounts to saying that treatment has no effect — the treatment does not work. For the pulse-rate study, saying ‘no effect’ is claiming that running has the same effect on resting pulse as sitting; that is, no difference in pulse rates between the two groups after running and sitting. The researchers are not yet concluding ‘no effect’ as an actual result. If the researchers knew before the experiment that the treatment doesn’t work, their research would be unnecessary. Instead, the ‘no effect’ null hypothesis serves as a starting point for comparison with the opposite conclusion, which the researchers are usually expecting the data to support. For a randomised controlled trial, supposing there really is no effect of the treatment in the population, we would expect no difference between treatment and control or placebo groups in the sample at any stage of the experiment.
On the other hand, if the treatment leads to a difference in results between people with and without the treatment in the population, we would predict a similar difference in the sample: a non-zero effect. This is the alternative hypothesis, logically opposite to the null hypothesis. Next, we turn to the data, the evidence we use to test the hypotheses, and see whether the null or alternative hypothesis better matches the data.
For the pulse-rate study, table 7.4 shows pulse-rate descriptive and inferential statistics for the running and sitting groups before and after the running group ran, and while the sitting group sat. Starting with the descriptive statistics: before the running group ran, while all students were sedentary in class, we would predict no difference in mean pulse rate between the two groups, and that’s what the data show. The group difference is so slight, only 0.40 beats per minute, it’s almost certainly accidental, a chance effect of no scientific or practical importance. More informative is the running versus sitting comparison after the running group ran while the sitting group sat. Average pulse rate was far higher in the running group than in the sitting group after the intervention.

Confidence intervals
Hypothesis testing answers only whether there’s a non-zero effect in the population. This effect-or-no-effect approach, although used in many research reports, has limited value for evidence-based practice (EBP). Most health professionals want to know the actual size of an effect in a population, more than whether the effect is zero or non-zero. For continuous data, the mean serves to estimate the population effect. Because the sample mean is a single number, it’s called the ‘point estimate’. However, we cannot assume that the sample mean will exactly equal the population mean.
For estimating population means from sample data, inferential statistics offers the confidence interval (CI). Being an inferential statistic, the CI acknowledges the uncertainties involved when generalising from sample data to populations. The CI for a mean specifies lower and upper boundaries for the estimated population mean at a given level of probability. The level of probability is almost always fixed at 95%. For a 95% CI, there’s a 5% chance that the population mean lies outside the upper and lower boundaries of that interval. Odd as it may seem, the CI allows us to know exactly how uncertain we are. A narrower CI is more helpful to EBP than a wider CI because narrow CIs improve the precision of the population estimate. Converting measurement scales to smaller numbers adds no further precision: if a CI of 200 to 500 centimetres is converted to an interval of 2 to 5 metres, there is no effective increase in precision despite the smaller numbers, because both intervals cover the same length.
Table 7.4 shows 95% CIs for pulse rate within the running and sitting groups before and after the running group ran. We can say that before the running group ran, we are 95% certain that the population mean under these conditions lies between 70.87 and 80.04 beats per minute, much the same as the 95% CI for the sitting group. Running for 1 minute increases pulse rates, so the 95% CI after running was lifted to a range of 119.38 to 134.32, well above the 95% CI for that group’s resting pulse and for the sitting group measured on both occasions. Again, we have evidence for the effect of exercise on immediate pulse rates.
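To see where a p value and a 95% CI come from computationally, here is a brief sketch using SciPy. The two groups are simulated to resemble the pattern described above (they are not the study’s actual data), and the CI is built from the t distribution.

import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
running = rng.normal(127, 15, 46)   # hypothetical post-exercise pulse rates
sitting = rng.normal(76, 10, 64)    # hypothetical resting pulse rates

# Two-sample t-test: p value for the null hypothesis of no group difference
t_stat, p_value = stats.ttest_ind(running, sitting, equal_var=False)

# 95% CI for the running group's mean
m = running.mean()
se = stats.sem(running)                       # standard error of the mean
ci_low, ci_high = stats.t.interval(0.95, len(running) - 1, loc=m, scale=se)

print(f"t = {t_stat:.2f}, p = {p_value:.2g}")
print(f"running group mean 95% CI: {ci_low:.2f} to {ci_high:.2f}")

With a simulated difference this large, the p value is tiny, so the null hypothesis of no effect is rejected; the CI, meanwhile, reports the plausible range for the population mean rather than a bare effect-or-no-effect verdict.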

7.5 Clinical importance of treatment effects
LEARNING OBJECTIVE 7.5 How is the clinical importance of treatment effects decided?

In healthcare practice and policy, we want to know how one variable influences another variable that measures health. These influences are known as effects. Examples include the effect of:
• treatments on patients’ health
• genetic factors on health
• environmental (where someone is) and behavioural (what someone does) factors on health
• time on health (people’s health changes as they age).
In this chapter, we concentrate on the effect of treatments on outcome measures about health.

Clinical importance
Descriptive and inferential statistics say nothing about the amount of treatment effect necessary to make a worthwhile improvement to people’s health, their ability to do what they want, their life expectancy or their quality of life. Clinicians are looking for more than statistically significant differences. Interpreting the clinical importance of statistical results is essential to EBP. An observed statistical effect isn’t necessarily enough to make a practical difference to patients’ lives. We must also know how large a difference is enough.
The smallest improvement that makes a healthcare procedure worth its time, effort, expense and risk is called the minimum important difference. If research shows an observed effect at least as large as the minimum important difference, and in a beneficial direction not causing harm, that effect is clinically important. If an observed effect is smaller than the minimum important difference, that effect is not worthwhile and the treatment need not be used in clinical practice. When reading a research report arguing the benefits of a treatment, you should look at the sizes of the effects presented as evidence, and decide whether the benefits and risks justify that treatment, compared to other treatment options or no treatment. These decisions are separate from hypothesis testing and statistical significance. The researchers should discuss clinical importance in the final section of their report.
Although CIs are inferential statistics, they are nevertheless useful for assessing clinical importance, because practitioners can compare the CI for an effect with the minimum important difference. Data can be presented in a tree plot (so-called because of the ‘branches’), showing at a glance the effects for each group and their clinical importance, and allowing for comparison of effects. Figure 7.2 is a tree plot for the pulse-rate study. The average change in pulse rate for each group is shown as the central square between bars marking the limits of the 95% CI.

5. Experiment 5. While most of the CI lies above the minimum important difference, there is a distinct possibility of a population mean below the minimum important difference.
6. Experiment 6. The mean is the highest of the six experiments; it lies far above zero and well in excess of the minimum important difference. The CI is relatively narrow and lies entirely well above zero and the minimum important difference. Chances are, on average, patients who receive this treatment will benefit by more than the minimum required to justify the treatment. This result offers stronger support for the treatment than any of the other five experiments.
Inferential statistics, and CIs especially, are valuable for EBP for these reasons.
1. Inferential statistics support conclusions about what may happen in a population of interest, expanding on results from a localised sample of individuals that practitioners will likely never encounter.
2. Evidence-based practitioners can see how precisely sample data can estimate effects in the wider population.
3. Inferential statistics enable evidence-based practitioners to determine, with known precision, whether treatments have any effect at all in the population, whether treatment effects are beneficial or adverse, and whether beneficial effects are sufficiently large to justify the treatment.
The validity of inferential statistics relies on assumptions. Among these assumptions is the representativeness of the sample. Inferences about populations from sample data will hold true only if the sample resembles the intended population. If the sample is biased, the results will be biased, and conclusions about those results will be biased. If the sample characteristics vary from your clients in ways that influence outcomes, do not expect the same outcomes from your clients as occurred in the sample. EBP depends on relevant evidence.
The validity of all evidence relies on the further assumption that the study is conducted to prevent bias from any source distorting the results. Checking for bias involves critically examining the research methods, not the statistical results. Very large effect sizes and statistically significant results, though superficially impressive, offer only weak or misleading evidence if these effects come from biased study procedures.
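The reasoning applied to experiments 5 and 6 (comparing the whole CI with zero and with the minimum important difference) can be written as a short decision rule. The function and its numbers below are illustrative only, not taken from the textbook’s figures.

def interpret_effect(ci_low, ci_high, mid):
    """Classify a beneficial-direction effect by comparing its 95% CI
    with the minimum important difference (MID). Illustrative logic only."""
    if ci_low > mid:
        return "entire CI above the MID: clinically important benefit likely"
    if ci_high < 0:
        return "entire CI below zero: the effect may be harmful"
    if ci_high < mid:
        return "entire CI below the MID: benefit too small to be worthwhile"
    return "CI straddles zero and/or the MID: clinical importance uncertain"

print(interpret_effect(ci_low=4.0, ci_high=14.0, mid=5.0))   # like experiment 5
print(interpret_effect(ci_low=9.0, ci_high=15.0, mid=5.0))   # like experiment 6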

7.6 Summarising treatment effects from multiple studies of interventions in a systematic review
LEARNING OBJECTIVE 7.6 How are treatment effects summarised from multiple studies of interventions in a systematic review?

The chapter on quantitative research described meta-analysis as a statistical technique for combining results across multiple quantitative studies. The combined results from meta-analysis are displayed in a graph similar to figure 7.3. Just as a large collection of trees is called a ‘forest’, so multiple tree plots combined in a single graph are called a ‘forest plot’.
Figure 7.4 shows a forest plot from a systematic review by Jurgens et al.3 evaluating catechins, a component of green tea, for weight loss measured in kilograms. As with figure 7.3, results for a set of studies are displayed in rows in this graph, in descending order of daily dose of the catechin active ingredient. The forest plot records the mean weights for each group, the treatment–control mean differences and the group sizes numerically in a table. In the graph to the right of the table, zero represents ‘no effect’. Differences above zero favour the control conditions over the catechin treatment, while negative differences in this forest plot show that the treatment group lost more weight than the control group, favouring catechin therapy. The horizontal lines indicate 95% CIs, with the small squares centred on the lines indicating the mean result for each study, the same layout as figure 7.3. Many of the studies achieved non-significant results, demonstrated by their CI crossing zero, which remains a tenable value for the population, favouring the null hypothesis of no population effect, according to those studies. Other studies showed significant results in favour of catechin treatment over control or placebo alternatives. Unfortunately, the minimum important difference is undefined and not shown, so clinical importance cannot be judged from the plot.
‘Heterogeneity’ looks at how diverse the results are for the studies in this review. The large and statistically significant (p < .05) value for the χ² statistic (‘chi-square’, pronounced ‘ky square’) indicates considerable diversity in the results, which makes this set of studies problematic to summarise as a collection. Nevertheless, the large diamond in the forest plot presents a ‘pooled effect size’ combining the effect sizes for all 14 studies. The upper and lower points of the diamond represent the overall effect size as a point estimate from all studies combined, approximately −1 in this analysis. The left and right points indicate the CI for the combined treatment effect, ranging from almost −2 to just below zero. Notice how the 95% CI for the pooled effect is narrower than for many of the individual studies, consistent with the greater precision gained by combining data from all of the studies.
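The pooled diamond comes from weighting each study by its precision. The sketch below implements a simplified fixed-effect, inverse-variance pooling of mean differences; the four studies and their standard errors are invented, and real meta-analyses (including Cochrane reviews such as this one) often use a random-effects model when heterogeneity is high.

import math

# (mean difference in kg, standard error) for each hypothetical study
studies = [(-1.6, 0.6), (-0.4, 0.5), (-2.1, 0.9), (0.2, 0.7)]

weights = [1 / se ** 2 for _, se in studies]        # precise studies weigh more
pooled = sum(w * md for (md, _), w in zip(studies, weights)) / sum(weights)
pooled_se = math.sqrt(1 / sum(weights))
ci_low, ci_high = pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se

print(f"pooled MD = {pooled:.2f} kg, 95% CI {ci_low:.2f} to {ci_high:.2f}")

Because the pooled standard error shrinks as studies are added, the diamond’s CI is narrower than the CIs of most individual studies.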


SUMMARY
When looking at quantitative research reports, it is important to establish whether the authors have set the scene correctly. Authors should determine whether their groups are comparable at the study outset and use statistical procedures appropriate to the distributional shape of the data. Authors and their evidence-based practitioner readers alike should be wary of undisciplined assumptions about the nature and direction of causality; a particular result may be associated with another variable but not caused by it. Descriptive statistics summarise sample characteristics and reveal associations between variables, such as group differences, change over time, and effects of treatments on health. Inferential statistics support generalisations beyond the sample, testing hypotheses about population effects, or estimating population means with a specified level of certainty and precision. P values denote the probability of effects observed in a sample having occurred by chance. A confidence interval (CI) estimates a population value such as a mean, typically with 95% certainty. The smallest improvement in health that makes a treatment worth the time, effort, expense and risk of that procedure is called the minimum important difference. Tree-plot graphs enable ready comparison of the effects of treatments in samples with the minimum important difference. In systematic reviews, meta-analysis combines the findings from a sample of studies, displaying the effect size and CI for each study and the overall, pooled result in a forest-plot graph. Irrespective of statistical effect sizes, significance and clinical importance, valid, usable evidence comes only from unbiased study methods.

KEY TERMS
case An individual member of a sample: a person, place, object or event; a participant in research.
central tendency The middle location between the highest and lowest scores on a continuous variable; relevant statistics are the mean, median and mode.
confidence interval [for a mean, also ‘CI’] From inferential statistics, a range of values in which a population mean is estimated to occur, with a specific level of chance, conventionally 95% confidence.
correlation Statistical association between one measurement and another, so the results on one variable are at least partly predictable from the other. Universities select students on the basis of their school results because of a correlation, with better performance at school presumably associated with higher grades in later university results.
descriptive statistics Statistics that summarise or otherwise describe the characteristics of a sample, groups within a sample, or relationships between variables in a sample.
dispersion How spread out the scores are on a continuous variable; relevant statistics are the standard deviation, minimum, maximum and range.
distribution Shape of the data on a graph; whether scores are symmetrically arranged or skewed.
effect In quantitative research, a difference or a change that happens to one measured variable, such as a measure of health, attributable to another variable such as a clinical treatment.
generalise Conclude or assume that results from an individual or sample will apply to a larger group or population.
hypothesis testing Using inferential statistics to estimate the probability that an effect observed in the sample is consistent with an effect, or no effect, occurring in the population.
inferential statistics Statistics that attempt to generalise results from a sample to the wider population from which the sample was taken.
minimum important difference The least amount of change in outcome measurements for a treatment or other intervention to be considered beneficial and worthwhile in clinical practice.
p value In hypothesis testing with inferential statistics, the probability of the effect observed in the data if there were no effect in the population. Conventionally, a p value of less than .05 (5%) works as evidence of a non-zero effect in the relevant population.
population In statistics, all possible cases that could be sampled. In research, the clinical population comprises the entire set of people with the same clinical characteristics as in a random sample from that population.
score One case’s result on one variable (e.g. one person’s height).
statistically significant An observed effect in a sample considered large enough to be unlikely (p value less than .05) as a chance result, so it is interpreted as representing a genuine effect in the population rather than a chance effect in the sample.

variable Anything measurable, either categorically or continuously, that can differ from one observation to another, such as among different individuals, groups of people, or across time.

WEBSITES
1 ‘Statistical language’ from the Australian Bureau of Statistics has further information about quantitative and qualitative data: www.abs.gov.au/websitedbs/a3121120.nsf/home/statistical+language++quantitative+and+qualitative+data
2 ‘Descriptive statistics’ from the Khan Academy allows you to go back to basics and refresh your memory on measures of central tendency and dispersion by watching short video clips: www.khanacademy.org/math/probability/descriptive-statistics
3 The Coursera short course ‘Design and interpretation of clinical trials’ from Johns Hopkins University covers the basic principles for the design of randomised clinical trials and how they should be reported: www.coursera.org/course/clintrials
4 ‘What is a systematic review?’ from Cochrane Consumer Network has further information about systematic reviews: http://consumers.cochrane.org/what-systematic-review
5 Online Statistics Education: A Multimedia Course of Study: http://onlinestatbook.com

ENDNOTES
1. The original data and background to the experiment are retrievable from G. Smyth, ‘Pulse rates before and after exercise,’ OzDASL (2011), www.statsci.org/data/oz/ms212.html.
2. G. Guyatt et al., ‘Basic statistics for clinicians: 1. Hypothesis testing,’ CMAJ: Canadian Medical Association Journal 152, no. 1 (1995): 27.
3. T. M. Jurgens et al., ‘Green tea for weight loss and weight maintenance in overweight or obese adults,’ Cochrane Database of Systematic Reviews 12 (2012). doi:10.1002/14651858.CD008650.pub2.

ACKNOWLEDGEMENTS
Adapting author for this Australian edition: John Bidewell
Photo: © Den Rise / Shutterstock.com
Figure 7.4: © T. M. Jurgens et al., ‘Green tea for weight loss and weight maintenance in overweight or obese adults,’ Cochrane Database of Systematic Reviews 12 (2012). doi:10.1002/14651858.CD008650.pub2.


CHAPTER 8

Mixed methods research
LEARNING OBJECTIVES
8.1 What is mixed methods research?
8.2 What are some different mixed methods designs?
8.3 Why is integration important in mixed methods research?
8.4 What are some common design considerations for mixed methods studies?

phase. With multiphase data, the researcher may conduct the analysis separately or at the same time, but will look for agreement and/or contradictions that emerge when findings from the two (or more) different groups are compared. Finally, if the mixed methods design is embedded, the researcher aims for the data analysis to expand the breadth of the mainly quantitative study with qualitative findings.
In most cases, analysis of qualitative and quantitative datasets occurs separately, using techniques suited to each form of data. Occasionally during analysis, different datasets can be combined or transformed into a single dataset. The process of transforming qualitative data into quantitative data has been referred to as quantitising, while the process of converting quantitative data into qualitative data has been referred to as qualitising.14 One way of quantitising qualitative data is transforming responses from a qualitative interview into variables for numerical analysis, as the sketch below illustrates. Similarly, quantitative data can be qualitised by being transformed into narrative data that can be analysed qualitatively.15
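As a concrete illustration of quantitising, the following Python sketch converts coded interview themes into 0/1 variables suitable for numerical analysis. The participants, themes and codes are all invented.

# Hypothetical interview codings: participant ID -> themes identified
interviews = {
    "P01": ["pain", "fatigue", "family support"],
    "P02": ["fatigue"],
    "P03": ["pain", "cost of care"],
}

themes = sorted({theme for codes in interviews.values() for theme in codes})

# One row per participant, one binary (0/1) column per theme
matrix = {pid: [int(theme in codes) for theme in themes]
          for pid, codes in interviews.items()}

print(themes)
for pid, row in matrix.items():
    print(pid, row)

The resulting matrix can then be analysed with the same descriptive and inferential statistics applied to any categorical data.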

Interpretation
A key stage in mixed methods research is interpreting the findings from both studies, which usually takes place in the final stages of the study. This can be referred to as ‘crystallisation’,16 where there is a purposeful search for how the results from the different methods used converge, diverge and/or differ from each other.

8.4 Mixed method design considerations
LEARNING OBJECTIVE 8.4 What are some common design considerations for mixed methods studies?

Weighting (dominance)
Morgan suggested that researchers should consider whether one method in their mixed methods study is dominant over another, so that one is the principal method used to collect data and the other is complementary or supplementary (e.g. in terms of resources devoted to them, the depth of analysis, and how study components are reported and disseminated).17 For example, a qualitative component may be added to a quantitative survey, but both methods would still be driven by the deductive assumptions of the quantitative paradigm. However, it is quite possible for researchers to design their study so that equal status is given to the quantitative and qualitative components, as in a sequential study. It is also possible that the relative weighting of different study components will not emerge until the later ‘interpretation’ phase of a research study.

General challenges associated with mixed methods studies
In addition to the abovementioned challenges, there are some general challenges associated with conducting mixed methods studies, whatever the design type chosen. Maintaining quality in mixed methods research is paramount — and, if both qualitative and quantitative methods are being used, there is a danger that neither approach will be conducted well. This is because mixed methods studies require a range of expertise and skill. It is unlikely that one researcher will have the range of necessary skills to ensure the quality of both methods within a mixed methods study. This means mixed methods studies are usually conducted by teams of researchers with complementary skills. Working in multidisciplinary or interdisciplinary teams has the potential to give rise to tensions and challenges. This might include dealing with differences between quantitative and qualitative approaches, trusting the ‘other’, handling conflicts and tensions, and establishing effective leadership roles.18 It is important to adopt respectful and collaborative practice to guard against ‘silos’ developing, where a series of separate ‘mini projects’ is conducted rather than an integrated mixed methods study. This is particularly pertinent where one method has been the dominant approach in a particular discipline and team members may have limited understanding of the other approach.


SUMMARY
Mixing methods almost always adds value to a study’s findings. There are a variety of different ways to mix methods (convergent, sequential, multiphase and embedded), each with its own advantages and challenges. To ensure that the research is conducted well, attention must be paid to the integration of qualitative and quantitative methods throughout the study. When planning a mixed methods study, attention needs to be paid to the weighting attributed to each method as well as the nature and experience of the research team.

KEY TERMS
convergent study Qualitative and quantitative arms of the study are run concurrently but separately, and then the results are ‘converged’ to enable the researcher to compare and contrast quantitative results with qualitative findings, or expand quantitative results with qualitative data.
embedded study A combination of quantitative and qualitative methods is used throughout a study.
mixed methods research The planned mixing of quantitative and qualitative components within a single study.
multiphase study Different study designs are used to answer the same question in a population that has inherently different sizes.
qualitising Converting quantitative data into qualitative data.
quantitising Transforming qualitative data into quantitative data.
sequential study A qualitative study is conducted first, followed by a quantitative study (or vice versa).

WEBSITES
1 If your institution is a member of the Joanna Briggs Institute, search for ‘mixed methods’ and you will have access to a variety of information on the topic (e.g. the methodology for a mixed methods systematic review and where a mixed methods study might sit on the levels of evidence): www.joannabriggs.org
2 The Australian and New Zealand Academy of Management (ANZAM) has a special interest group in mixed methods research. While it is a management rather than health-focused organisation, getting in contact with people who have expertise in a research method, even if not from your discipline, is often useful: www.anzam.org/research/special-interest-groups-sigs/mixed-methods-research
3 Journal of Mixed Methods Research: http://mmr.sagepub.com
4 The ESRC National Centre for Research Methods training centre for innovations in methods has hosted workshops on mixed methods that are detailed on the website. The site also includes a discussion paper about mixed methods written by Julia Brannen, and seminars within the individual research programmes relevant to mixed methods: www.ncrm.ac.uk

ENDNOTES
1. J. C. Greene and V. J. Caracelli, ‘Defining and describing the paradigm issue in mixed-method evaluation,’ New Directions for Evaluation (1997): 5–17. doi:10.1002/ev.1068.
2. A. O’Cathain, E. Murphy, and J. Nicholl, ‘Why, and how, mixed methods research is undertaken in health services research in England: a mixed methods study,’ BMC Health Services Research 7 (2007a): 85. doi:10.1186/1472-6963-7-85.
3. J. W. Creswell and V. L. Plano Clark, Designing and conducting mixed methods research, 2nd ed. (London: Sage, 2010).
4. G. Biesta, ‘Pragmatism and the philosophical foundations of mixed methods research,’ in Sage handbook of mixed methods in social and behavioral research, eds. A. Tashakkori and C. Teddlie, 2nd ed. (London: Sage, 2010), 95–117.
5. P. Bazeley, ‘Mixed methods data analysis,’ in Mixed methods research for nursing and the health sciences, eds. S. Andrew and E. J. Halcomb (Chichester: Wiley-Blackwell, 2009), 84–118; J. W. Creswell, A concise introduction to mixed methods research (Thousand Oaks: Sage, 2015); J. C. Greene, Mixed methods in social inquiry (San Francisco: Jossey-Bass, 2007); C. Teddlie and A. Tashakkori, Foundations of mixed methods research: integrating quantitative and qualitative techniques in the social and behavioral sciences (Thousand Oaks: Sage, 2009).
6. C. L. Orellana-Rios et al., ‘Mindfulness and compassion-oriented practices at work reduce distress and enhance self-care of palliative care teams: a mixed-method evaluation of an “on the job” program,’ BMC Palliative Care 17, no. 1 (2018): 3.
7. E. C. Rigg et al., ‘The role, practice and training of unregulated birth workers in Australia: a mixed methods study,’ Women and Birth 32, no. 1 (2019): e77–87. doi:10.1016/j.wombi.2018.04.007.
8. L. Segal et al., ‘A needs-based workforce model to deliver tertiary-level community mental health care for distressed infants, children, and adolescents in South Australia: a mixed-methods study,’ The Lancet 3, no. 6 (2018): e296–303.
9. C. Bellhouse et al., ‘The impact of sex work on women’s personal romantic relationships and the mental separation of their work and personal lives: a mixed-methods study,’ PLOS ONE 10, no. 10 (2015): e0141575. doi:10.1371/journal.pone.0141575.
10. A. O’Cathain, E. Murphy, and J. Nicholl, ‘Integration and publications as indicators of “yield” from mixed method studies,’ Journal of Mixed Methods Research 1 (2007b): 147–63; U. Östlund et al., ‘Combining qualitative and quantitative research within mixed method research designs: a methodological review,’ International Journal of Nursing Studies 48 (2011): 369–83.
11. O’Cathain, ‘Integration and publications as indicators,’ 147–63.
12. N. Khawaja and A. Hebbani, ‘Employment aspirations of former refugees settled in Australia: a mixed methods study,’ Journal of International Migration and Integration (in press) (2018).
13. Rigg, ‘Unregulated birth workers,’ e77–87.
14. P. Bazeley, ‘Integrative analysis strategies for mixed data sources,’ American Behavioral Scientist 56 (2012): 814–28.
15. J. P. Combs and A. J. Onwuegbuzie, ‘Describing and illustrating data analysis in mixed research,’ International Journal of Education 2, no. 2 (2010): E13.
16. O’Cathain, ‘Integration and publications as indicators,’ 147–63.
17. D. L. Morgan, Integrating qualitative and quantitative methods: a pragmatic approach (London: Sage, 2013).
18. L. A. Curry et al., ‘The role of group dynamics in mixed methods health sciences research teams,’ Journal of Mixed Methods Research 6 (2012): 5–20.

ACKNOWLEDGEMENTS Adapting author for this Australian edition: Jane Warland This chapter has been adapted from K. Gerrish and J. Lathlean, eds., The research process in nursing, 7th ed., (Chichester, UK: Wiley-Blackwell, 2015). Photo: © nito / Shutterstock.com


CHAPTER 9

Sampling
LEARNING OBJECTIVES
9.1 What do ‘sampling frames’, ‘populations’ and ‘samples’ mean?
9.2 What are the categories of sampling, and what are the types of probability and non-probability sampling techniques?
9.3 What does ‘sampling error’ mean?
9.4 When is the ‘sample size’ the right size?

possible to access the entire theoretical population (for example, women between the ages of 45 and 55, who have given birth in the last 6 months).

Sampling
Sampling, not to be confused with ‘sample’, is the process of selecting participants so that researchers can attempt to generalise their results back to a theoretical population.

Sample
The sample is the group of participants who have been chosen to be part of the current study.

Sampling frame
In order to identify a sample in advance, a sampling frame is required. Simply put, a sampling frame is a detailed list of all participants, which can include people, locations or events in a population. Usually, the list relates to the study population rather than the target population. Keep in mind that a robust study population is one that has similar characteristics to the theoretical population; if this is not the case, it is more difficult to generalise to the wider population. Examples might include women who:
• are between the ages of 45 and 55
• have had live births in the last 6 months
• had their first live birth
• reported no complications throughout the pregnancy or delivery
• are Australian citizens.

9.2 Types of sampling
LEARNING OBJECTIVE 9.2 What are the categories of sampling, and what are the types of probability and non-probability sampling techniques?

There are two broad categories or types of sampling design in research (outlined in table 9.1): probability and non-probability sampling. In quantitative research, where the aim is to recruit a sample that is representative of the theoretical population to permit generalisation, probability samples are favoured. In qualitative research, where the aim is to target the research on individual ‘cases’ that are data rich, non-probability samples are predominantly used.

TABLE 9.1 Sampling

Sampling category           Probability                              Non-probability
Most likely to be used in   Quantitative approaches                  Qualitative approaches
Distinguishing feature      Random selection of participants         Targeted selection of participants
Aim                         Recruit a sample that is representative  Target the research on individual
                            of the theoretical population to         ‘cases’ that are data rich
                            permit generalisation
Sampling approaches         • Simple random sampling                 • Convenience sampling
                            • Systematic random sampling             • Snowball sampling
                            • Stratified random sampling             • Quota sampling
                            • Cluster sampling                       • Purposive sampling
                            • Multistage sampling

Probability sampling
The fundamental characteristic of probability sampling is the random selection of participants from a population. This approach ensures that all members of a target population have a known chance of being selected. For this reason, probability sampling is accepted as the most rigorous type of sampling strategy. In some types of probability sampling, all people within a population have an equal chance of selection. Although probability sampling strategies do not guarantee the generation of a truly representative sample, the random selection of participants from a theoretical population means that any differences between the population and the sample are due to chance, except in the presence of sampling error (which will be discussed shortly). Compared to non-probability sampling, probability sampling is more likely to result in a representative sample with reduced sampling errors and bias. In addition, probability sampling removes the potential for researcher bias in the selection of potential participants. There are several types of probability sampling techniques: simple random sampling (SRS), systematic random sampling, stratified random sampling, cluster sampling and multistage sampling.

Non-probability sampling
Unlike probability sampling, non-probability sampling does not involve randomisation. Participants are chosen in a process that does not give all members of the population an equal chance of being selected. Typically, non-probability samples are used to disprove a hypothesis rather than to prove one. Non-probability samples are often used in qualitative research designs, as researchers are purposely seeking out particular participants to engage in the study. There are several types of non-probability sampling techniques: convenience sampling, snowball sampling, purposive sampling and quota sampling.

Sampling methods
There is a range of different types of probability and non-probability sampling methods. Tables 9.2 and 9.3 provide brief definitions, as well as the advantages and disadvantages of each method presented. Keep in mind that the list presented is not exhaustive; however, it will provide you with the basic understanding required for entry-level sampling knowledge. Remember that a key feature of probability sampling techniques is that they have an element of randomisation. Note that random sampling must not be confused with random allocation. The latter refers to the random assignment of participants to study groups in an experimental study, such as a randomised controlled trial. (A code sketch after table 9.3 illustrates two of these sampling techniques.)

TABLE 9.2 Probability sampling techniques

Simple random sampling (SRS)
Definition: Every participant has an equal chance of selection. There are several methods of selecting a random sample from the sampling frame, such as using a statistical software package or random number tables (found at the back of many statistics textbooks). Random number tables select samples by working either horizontally or vertically through the table. More information on these approaches can be found at www.randomizer.org.
Advantage: Easiest method and most commonly used; findings have high generalisability.
Disadvantage: Not as efficient as stratified random sampling.

Systematic random sampling
Definition: Participants are selected from a list at intervals predetermined by the researcher (for example, choosing every fourth participant). Ordinarily, this involves selecting every ‘nth’ unit (where n is a number) on the list until the desired sample size is reached.
Advantage: Very easy to use and implement.
Disadvantage: Systematic biases are possible. Some lists have their own biases that can be transferred into the sample (for instance, every seventh patient in the list may have certain characteristics, such as being admitted to hospital on a Sunday, thereby creating an organisational bias). This means that a systematic random sample can only be truly random if the ordering of participants on the sampling frame is also random.

Stratified random sampling
Definition: The population is divided into groups, also known as ‘strata’, and then SRS or systematic random sampling is implemented within each stratum.
Advantage: Ensures an adequate sample size for subgroups in the population of interest.
Disadvantage: Problematic if the strata are not clearly defined. Analysis is typically complicated and the technique is time consuming.

Cluster random sampling
Definition: The population is divided into clusters; clusters are randomly sampled, and then all people within the selected clusters are sampled.
Advantage: Cost effective.
Disadvantage: Less efficient, as a larger sample is needed.

Multistage random sampling
Definition: A sampling technique carried out in various stages. The sample has a primary population followed by sub-populations.
Advantage: Used when SRS, systematic sampling or stratified sampling would be too complex and expensive to conduct.

Remember that a key feature of non-probability sampling techniques is that they do not include randomisation; participants are chosen strategically.

TABLE 9.3 Non-probability sampling techniques

Convenience sampling (also known as accidental or haphazard sampling)
Definition: Participants are chosen because they are ‘convenient’ — they might be in close proximity to the researcher, etc.
Advantage: Easy access to participants. Cost effective. Can provide rich qualitative data.
Disadvantage: Does not produce representative samples. Results are hard to replicate.

Snowball sampling
Definition: You begin by identifying someone who meets the criteria for inclusion in your study. You then ask them to recommend others they may know who also meet the criteria.
Advantage: Useful for hard-to-reach participants who would typically be difficult to access. Cost effective.
Disadvantage: Not used for generalisations, except to similarly hard-to-locate participants. Relies on the participants themselves to increase the sample.

Purposive sampling (also known as judgemental, selective or subjective sampling)
Definition: The researcher looks for cases that will be able to provide rich or in-depth information about the issue being researched. There is a wide range of purposive sampling techniques that can be used.5
Advantage: Can provide the researcher with justifications to make generalisations (theoretical, analytical or logical) from the sample being studied.
Disadvantage: There is a high probability of researcher bias.

Quota sampling (also known as proportional and non-proportional sampling)
Definition: Participants are chosen according to prespecified quotas regarding demographics, attitudes, behaviours or some other criteria.
Advantage: Ensures selection of adequate numbers of subjects with appropriate characteristics for the study.
Disadvantage: Not used for generalisations.
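As a concrete illustration of two of the techniques in table 9.2, the sketch below draws a simple random sample and a systematic random sample from an invented sampling frame, using Python’s standard library only.

import random

random.seed(42)   # fixed seed so the example is reproducible

# Hypothetical sampling frame of 200 patients
frame = [f"patient_{i:03d}" for i in range(1, 201)]

# Simple random sampling: every member has an equal chance of selection
srs = random.sample(frame, k=20)

# Systematic random sampling: every nth member after a random start.
# (Per table 9.2, this is only truly random if the list order is random.)
n = len(frame) // 20              # sampling interval: every 10th patient
start = random.randrange(n)       # random starting point within the interval
systematic = frame[start::n]

print(srs[:5])
print(systematic[:5])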

9.3 Sampling error
LEARNING OBJECTIVE 9.3 What does ‘sampling error’ mean?

Don’t confuse sampling error with measurement error — the two are quite different. Measurement error refers to errors in the data themselves, whereas sampling error occurs when the group of participants chosen (the sample) is inadequate or not random enough. There are two types of sampling error: random errors and systematic errors; both introduce bias. Random errors are common and occur randomly in a sample as a result of under- or over-representation of certain groups. For instance, just by chance, a small sample might not include a sufficiently high proportion of people from minority ethnic groups; this would lead to under-representation of people from those groups in the sample, and the findings would not reflect the views or characteristics of the true population. The likelihood of random errors can usually be reduced by increasing the sample size. Systematic errors, on the other hand, are more difficult to handle because their likelihood cannot be reduced in this way. They usually occur as a result of inconsistencies or errors in the sampling frame, which is why researchers should take care when designing the sampling frame. This highlights the importance of an accurate sampling frame: inaccuracies will lead to systematic errors that cannot be corrected by increasing the sample size. For instance, sampling patients discharged from hospital with certain diagnoses relies on the accuracy of discharge codes entered onto the hospital recording system.

9.4 Calculating sample size
LEARNING OBJECTIVE 9.4 When is the ‘sample size’ the right size?

Quantitative research
While the rule of thumb that ‘bigger is better’ does apply to some degree to sample size in quantitative research, the required sample size should be calculated during the design and planning of a research study, rather than guessed or calculated retrospectively. There are several reasons why sample size is calculated in quantitative research — for example, a study that is too large may waste valuable resources, while a sample size that is too small can be unethical or inappropriate, as it might not be big enough to detect differences. Further, the probability of selecting a non-representative sample is increased if the sample size is small. This is because, as sample size increases, the ‘sample mean’ becomes closer to the ‘population mean’.
As a practical example, we may be interested in LDL cholesterol. If a researcher took a sample of people and calculated their mean (average) LDL cholesterol concentration, it may not be the same as the mean LDL cholesterol in the general population. A common cause of such an error is the size of the sample, because in a small sample the chance of recruiting people who are different to others in the population is quite high. If the researcher were to recruit another sample of the same size, the mean LDL cholesterol in the sample might again be different to the mean LDL cholesterol in the whole population. However, if the researcher were to identify a bigger sample, the likelihood of only recruiting people who are very different to the general population would reduce. Of course, even in big samples, there is still room for error. It is only through recruiting all people in the population (a census) that the sample mean would be the same as the population mean.
A quantitative research report should outline the required sample size and the way in which it was calculated. Reporting this increases confidence in the study’s findings for an important reason: it quantifies the likelihood that the results are a chance finding in that sample and not a ‘true’ finding in the target population. It also describes the power of the study to detect statistically significant differences between groups.
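The point that larger samples give means closer to the population mean can be demonstrated by simulation. The sketch below invents an LDL cholesterol population (mean 3.5 mmol/L, SD 1.0; illustrative values only) and shows how the spread of sample means shrinks as the sample size grows.

import numpy as np

rng = np.random.default_rng(0)
pop_mean, pop_sd = 3.5, 1.0   # hypothetical population parameters

for n in (10, 100, 1000):
    # 200 independent samples of size n, drawn from the same population
    sample_means = rng.normal(pop_mean, pop_sd, size=(200, n)).mean(axis=1)
    print(f"n = {n:>4}: spread of sample means (SD) = {sample_means.std():.3f}")

# Larger n: sample means cluster ever more tightly around 3.5, so the
# chance of a badly unrepresentative sample mean falls.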

Probability
The significance of research findings relates to probability, which is simply how we quantify chance. If an event is impossible, its probability is 0. If an event is certain, its probability is 1. This means that any event that is uncertain, although not impossible, will have a probability between 0 and 1. For brevity, probability is referred to as ‘p’, better known as the ‘p-value’.
The significance of research findings relates to the probability of detecting a statistically significant effect or difference between study groups in the sample when it does not exist in the target population. This is termed a type I (α) error and, in practice, means that the researchers will say that the intervention works or that a real difference between groups exists when it does not. In quantitative research, we often set the significance (α) level at 5 per cent. If the α level is set at 5 per cent, a significant result means that we can be 95 per cent confident that a real difference exists in the population, and there is a 5 per cent probability that the finding was due to chance alone. In some studies, such as clinical trials of medicines or interventions, the α level can be set more stringently at 1 per cent (a 1 per cent probability that the finding was due to chance alone).
A type II (β) error is the opposite: the probability of finding no effect or difference between groups in the sample when, in the population, there is a true effect or difference. The concept of statistical ‘power’ describes the probability of detecting a real difference that exists in the population. Consider the following example.
• Hypothesis (Ha): Watching television will make children obese.
• Null hypothesis (H0): Watching television will not make children obese.
• Type I error (also known as a false positive): The researchers conclude that watching television makes children obese, when in fact it does not. Here, the p-value would be p < 0.05 (therefore significant), so the error is less likely to be noticed by the researcher.
• Type II error (also known as a false negative): The researchers conclude that watching television does not make children obese, when in fact it does. Here, the p-value would be p > 0.05 (therefore not significant), so the error is more likely to be noticed by the researcher.
Researchers can adjust for the possibility of a type II error by increasing the sample size. The required sample size can be calculated using a power analysis.6 By convention, most studies use 80 per cent power, which means they will have an 80 per cent probability of detecting a real difference if it exists. Of course, this means there will be a 20 per cent probability of not detecting a real difference. If a power calculation is not conducted, confidence in the findings is reduced because it is unknown whether the sample was too small to detect differences that exist, or may detect differences that, in fact, do not exist in the population.
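The meaning of the 5 per cent α level can be checked by simulation. In the sketch below (invented data), both groups are drawn from the same population, so the null hypothesis is true by construction; about 5 per cent of trials nevertheless return p < .05, which is exactly the type I error rate.

import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
alpha, trials, false_positives = 0.05, 2000, 0

for _ in range(trials):
    a = rng.normal(0, 1, 30)   # both groups sampled from the same population,
    b = rng.normal(0, 1, 30)   # so any 'significant' difference is an error
    if stats.ttest_ind(a, b).pvalue < alpha:
        false_positives += 1

print(f"type I error rate ≈ {false_positives / trials:.3f}")   # close to 0.05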

Power analysis
Power analysis requires some information about the expected range of scores (variance or standard deviation) within the population being studied. If, for instance, a comparison study wanted to investigate the difference in blood sugar levels between people with diabetes and people without diabetes, the power calculation would be based upon the expected differences in blood sugar levels between the two groups. This information might come from clinical data, previous research or a pilot study. Although there are tools available to help researchers calculate sample size, often a statistician will conduct the power analysis using the information (expected values and their variance) provided by the researchers. The calculation of sample size also needs to consider attrition — people who drop out of a study. This means that the researcher should increase the sample size beyond that calculated through power analysis. Generally, when the expected differences between study groups are large, the required sample size is smaller than when expected differences are small. Ultimately, sample size calculations aim to ensure that a sample is as representative as it can be while minimising sampling error.
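Power analysis is usually delegated to a statistician or to software. As one example of the latter, the sketch below uses the statsmodels library to solve for the per-group sample size of a two-group comparison; the standardised effect size of 0.5 is a placeholder that would normally come from pilot data or previous research.

from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.5,  # assumed standardised effect
                                   alpha=0.05,       # significance level
                                   power=0.80)       # desired power
print(f"required sample: about {n_per_group:.0f} participants per group")

The answer (roughly 64 per group under these assumptions) would then be inflated to allow for the attrition discussed above.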

Qualitative research
A question often posed by novice researchers in particular is, ‘How many participants or “cases” are required for qualitative research to be “valid”?’ The literature sometimes suggests a range, or a minimum and maximum number of participants or cases. However, this is not always helpful, as the answer depends considerably on the type of topic being studied, the nature of the research questions, the homogeneity or heterogeneity of the population, the richness of the data once data collection and analysis are underway, and other factors such as feasibility and resources. Patton recommended minimum samples to achieve ‘reasonable’ coverage — a somewhat pragmatic approach.7 On the other hand, Silverman suggested that it is possible to derive theory from a fragment of naturally occurring data or one in-depth single case.8 In practice, most researchers aim for an initial sample that is considered adequate and of sufficient size to provide data detailed enough to address their research questions. If, at the point of analysis, the ability to theorise seems limited, they may include other examples to increase the sample size. Typically, in qualitative research, there is no set formula for determining the sample size. The sampling process is flexible: sampling continues until the researcher reaches a point of saturation — when little or no new data is generated from the participants and the sample size is believed to be adequate. In fact, sample sizes for qualitative research tend to be much smaller than those of quantitative research, owing to the lengthy data collection and analysis required in qualitative methods.


SUMMARY
Probability sampling is preferred in quantitative research, whereas non-probability sampling is preferred in qualitative research. Probability sampling incorporates randomisation, which means that each participant has a known (and, in some techniques, equal) chance of being chosen to participate in the study. Qualitative research tends to adopt methods of sampling whereby the sample is deliberately selected to provide in-depth and data-rich exemplars of the phenomenon under exploration. Whichever approach is followed, the technique for sampling should be rigorous, transparent and described accurately in the reporting of the research. Probability sampling techniques include the sub-categories of SRS, systematic random sampling, stratified random sampling, cluster random sampling and multistage random sampling. Non-probability sampling techniques include the sub-categories of convenience sampling, snowball sampling, purposive sampling and quota sampling. There are two types of sampling error (not to be confused with measurement error): random and systematic. Increasing the sample size can adjust for random error, whereas systematic errors are more difficult to adjust for. Sample sizes are often larger in quantitative research methods than in qualitative methods. Although there are statistical equations that can be used to determine the sample size for quantitative studies, expert advice from statisticians or experienced quantitative researchers should be sought before selecting a sample. Sample size in qualitative research is determined once saturation has occurred. Sample size is important to consider prior to the commencement of the study rather than retrospectively. It is also important to understand the impact of type I and type II errors and how to adjust for their possibility in the study.

KEY TERMS convenience sampling Participants are chosen because they are ‘convenient’ — for example, they might be in close proximity to the researcher. non-probability sampling Participants are chosen in a process that does not give everyone in the population an equal chance of being selected. probability sampling Participants are chosen randomly from a population. purposive sampling The researcher looks for cases that will be able to provide rich or in-depth information about the issue being researched. quota sampling Participants are chosen according to pre-specified quotas regarding demographics, attitudes, behaviours or some other criteria. saturation The point at which little or no new data is generated from the participants, and the sample size is believed to be adequate. simple random sampling (SRS) Every member of the population has an equal chance of selection. There are several methods of selecting a random sample from the sampling frame, such as using a statistical software package or random number tables. snowball sampling Someone who meets the criteria for inclusion in a study is identified, and they recommend others who also meet the criteria. stratified random sampling A population is divided into groups, known as ‘strata’, and SRS or systematic random sampling is then applied within each stratum.
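To illustrate the ‘statistical software package’ route described under simple random sampling above, the following Python sketch draws an SRS of 50 from a sampling frame of 500. The participant identifiers are invented for the example.

import random

# A hypothetical sampling frame of 500 participant identifiers.
sampling_frame = [f"participant_{i:03d}" for i in range(1, 501)]

random.seed(42)  # a fixed seed makes the draw reproducible for auditing
sample = random.sample(sampling_frame, k=50)  # every ID has an equal chance

print(len(sample), sample[:3])

Because random.sample selects without replacement, no participant can be drawn twice — which matches how a random number table would be used against a numbered sampling frame.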

WEBSITES
1 This site is a free research randomiser that can be used for a variety of applications — for example, randomising subjects to a single treatment or where subjects are to receive all treatments in random order: www.randomizer.org
2 Sex and Gender — Women’s Health Collaborative: Fostering a sex and gender approach to medical education and practice: http://sgwhc.org
3 Research Methods Knowledge Base: Sampling: www.socialresearchmethods.net/kb/sampling.php


ENDNOTES
1. Australian Bureau of Statistics, Census of population and housing (Canberra: ABS, 2016), www.abs.gov.au/websitedbs/censushome.nsf/home/Census.
2. A. McGregor, ‘Why medicine often has dangerous side effects for women,’ TED, 2014, www.ted.com/talks/alyson_mcgregor_why_medicine_often_has_dangerous_side_effects_for_women#t-8915.
3. Ambien, Drugs.com, 2015, www.drugs.com/ambien.html.
4. US Food and Drug Administration, ‘FDA drug safety communication: risk of next-morning impairment after use of insomnia drugs; FDA requires lower recommended doses for certain drugs containing zolpidem (Ambien, Ambien CR, Edluar, and Zolpimist)’ (Silver Spring, MD: FDA, 2016), www.fda.gov/Drugs/DrugSafety/ucm334041.htm.
5. For a complete list, see M. Q. Patton, Qualitative evaluation and research methods (Thousand Oaks, CA: Sage, 1990); M. Q. Patton, Qualitative research and evaluation methods, 3rd ed. (Thousand Oaks, CA: Sage, 2002); A. J. Kuzel, ‘Sampling in qualitative inquiry,’ in B. F. Crabtree and W. L. Miller, eds., Doing qualitative research (Thousand Oaks, CA: Sage Publications, 1999), 33–45.
6. J. Cohen, Statistical power analysis for the behavioral sciences, 2nd ed. (Hillsdale, NJ: Lawrence Erlbaum Associates, 1988).
7. Patton, Qualitative research and evaluation methods.
8. D. Silverman, Doing qualitative research: a practical handbook, 3rd ed. (London: Sage, 2010).

ACKNOWLEDGEMENTS Adapting author for this Australian edition: Amanda Lambros This chapter has been adapted from K. Gerrish and J. Lathlean eds., The research process in nursing, 7th ed. (Chichester, UK: Wiley-Blackwell, 2015). Photo: © fizkes / Shutterstock.com


CHAPTER 10

Ethics LEARNING OBJECTIVES 10.1 What are the main ethical principles health researchers are concerned with? 10.2 What is the role and function of institutional ethics committees? 10.3 How do you evaluate the ethical aspects of a published journal article?

Beneficence: acting to benefit humankind It is quite common for the principles of beneficence and autonomy to clash. A good example is the controversy that surrounded the fluoridation of drinking water in Australia in the 1960s and ’70s.5 There is no doubt that drinking fluoridated water protects children from dental cavities,6 but autonomy dictates that people should be able to make decisions about the kind of water they drink and whether or not to ingest chemicals. Despite this apparent conflict with individual autonomy, the Australian Government chose to act to benefit Australian society. This particularly benefited disadvantaged groups who are at increased risk of tooth decay but may not have been able to afford to take fluoride tablets regularly. Governments in some communities without fluoridated water provide free fluoride tablets; however, people still need to know about this free service and be able to access it. Interestingly, there is no legal requirement to add fluoride to the drinking water in any of the states in Australia, but it is done to benefit many Australians, following strict guidelines outlined in the National Health and Medical Research Council’s (NHMRC) Australian Drinking Water Guidelines.7

Justice: obligation to treat fairly When planning their research, researchers need to ensure that participants are not unfairly burdened, that there is fair distribution of research benefits and that there is no unfair exploitation (justice). Researchers can be in a position of authority or influence over potential participants, such that those people may feel obligated to participate — for example, a doctor researching his own patients, or a university lecturer recruiting her own students. It is usually still ethical for this kind of research to occur, but ethics committees (discussed shortly) would carefully assess the scope and objectives of the proposed research, the inclusion and exclusion criteria used, and the general benefits of the research to ensure that the participants were not being ‘over researched’, were treated fairly and had clear opportunities to exercise autonomy.

Non-maleficence: avoiding or minimising harm Most research has the potential to do harm. The harm doesn’t need to be major or long lasting to matter; it may be, for example: • physical harm, such as bruising, soreness of muscles, or side effects from drugs • psychological harm, such as distress associated with revealing personal, sensitive or embarrassing information • social harm, such as when potentially stigmatising information (for example, about mental illness) is shared with a researcher • economic harm, such as when participants incur costs associated with their participation • legal harm, such as when a person reveals a criminal history or information that then requires reporting (such as child abuse). When making an ethics application, researchers are usually asked several questions about how they will avoid or minimise harms such as these in order to ensure non-maleficence for their participants.

Respect for human dignity Researchers need to be particularly careful to uphold the principle of respect for human dignity where their potential participants have impaired ability to consent; for example, people in intensive care, people who are heavily sedated or unconscious, or people who are mentally ill. There are specific guidelines that ethics committees follow in these kinds of circumstances contained in the Australian National Statement on Ethical Conduct in Human Research. For example, when it comes to research with unconscious people, ethics committees need to understand that these participants are vulnerable because they cannot be informed about the research and their wishes to participate (or not) cannot be determined. Therefore: . . . those who are unconscious should only have minimally invasive research or be in research that is designed to be therapeutic for them or to improve treatment for the condition from which they suffer.8


Confidentiality: maintenance of privileged information, including the right to privacy and anonymity Confidentiality, privacy and anonymity are three words that are often used interchangeably. However, while they are certainly related, they are in fact quite different from each other. The right to privacy gives participants the right to choose and control what happens to their personal information. The right to confidentiality encompasses the responsibility of others to respect another person’s privacy and not reveal that information to anyone else. One way to ensure confidentiality is to anonymise the data by removing identifying information such as names and dates of birth. Researchers can protect privacy by collecting anonymous data, such as via a survey that does not ask for identifying information. Where that is not feasible, researchers should ‘de-identify’ their data as soon as possible by removing names and replacing them with numbers or pseudonyms.
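As a concrete illustration of this last point, the Python sketch below de-identifies a small set of participant records by replacing names with sequential codes. The records are invented, and in a real project the re-identification key would be stored securely and separately from the research data set.

# Replace names with codes; keep the name-to-code key apart from the
# analysis data so the records alone cannot identify anyone.
records = [
    {"name": "Alice Ng", "age": 34, "response": "agree"},
    {"name": "Brian Cole", "age": 51, "response": "disagree"},
]

reident_key = {}
deidentified = []
for i, rec in enumerate(records, start=1):
    code = f"P{i:03d}"
    reident_key[code] = rec["name"]  # stored separately, access-controlled
    deidentified.append({"id": code, "age": rec["age"], "response": rec["response"]})

print(deidentified)  # [{'id': 'P001', 'age': 34, 'response': 'agree'}, ...]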

Veracity: obligation to tell the truth Veracity is about ensuring that comprehensive, accurate and objective information that is as clear and understandable as possible is given to participants. This ethical principle is significantly linked to the principles of autonomy and respect, because in order to make an informed decision the participant needs to be given complete and truthful information. Most researchers would not deliberately engage in lying, but this does not mean that they are always entirely truthful either. For example, they may: • inflate the benefits of their research in order to gain participants • leave out, or minimise, aspects of their research that may put people off • deliberately cloak information by using jargon or medical terminology. Another aspect of veracity relates to the results of research — it is important for researchers not to overinflate the importance of results, not to leave out or minimise results that don’t support the original research hypothesis, and not to deliberately use jargon or statistical terminology to cloak or cloud the meaning of the results. One other interesting aspect of research ethics is purposeful deception. Deception is often employed in psychological research, even today.9 One relatively famous example of deception research is the Stanley Milgram obedience experiment, conducted in 1962.10 Milgram set out to understand why the German people had permitted the Holocaust, and designed an experiment that involved deceiving participants into thinking that they were administering electric shocks to another person. He showed that most participants continued to obey the authority figure (the researcher) and administer larger and larger shocks, even against their own conscience. While this study gave important insight into human behaviour, this kind of research would most likely not be allowed today because of the depth of the deception involved. The use of deception in research is now closely monitored by human research ethics committees, which typically have a range of criteria to be met if deception is to be used (see endnote reference 8, section 2.3.1, p. 20).

10.2 The role and function of human research ethics committees LEARNING OBJECTIVE 10.2 What is the role and function of institutional ethics committees?

Institutional ethics committees (usually called human research ethics committees or HRECs in Australasia) are responsible for upholding the principles of ethical research conduct. Most teaching hospitals and universities have their own HREC, and there are also some state-based HRECs. Regardless of where they are based, they are all required to operate within the NHMRC’s National Statement on Ethical Conduct in Human Research guidelines.11 These committees ensure, as far as is possible, that researchers conduct their research ethically. HREC membership includes core community members required under the NHMRC guidelines. A typical HREC will be made up of a chair, an executive officer and staff of the university/hospital with research expertise in disciplines from which ethics applications reviewed by that institution are received, as well as a(n): • lawyer • lay person • minister of religion


SUMMARY The main ethical principles that need to be upheld when conducting research are beneficence, non-maleficence, justice, respect for human dignity, autonomy and confidentiality. Institutional ethics committees (or human research ethics committees) uphold the principles of ethical research conduct. The criteria for evaluating the ethical aspects of a published journal article include value, scientific validity, fair subject selection, favourable risk–benefit ratio, independent review, informed consent and respect for enrolled subjects.

KEY TERMS anonymity A person’s information is anonymous if the information cannot identify the person because identifiers such as name, date of birth or address have been removed. autonomy Freedom to determine one’s own actions. beneficence Acting to benefit humankind. confidentiality Maintenance of privileged information. Declaration of Helsinki A landmark statement of ethical principles to govern research involving humans, first agreed to in 1964. HREC Human Research Ethics Committee. justice Obligation to treat fairly. NHMRC National Health and Medical Research Council — the peak funding and policy-making body for medical research in Australia. non-maleficence Avoiding or minimising harm. privacy The person’s right to choose and control what happens to personal information.

WEBSITES
1 The NHMRC website provides information on HRECs in Australia: https://nhmrc.gov.au/research-policy/ethics/human-research-ethics-committees
2 The ‘National Statement on Ethical Conduct in Human Research’ (2007) incorporates all updates as at May 2018: www.nhmrc.gov.au/guidelines-publications/e72
3 The ‘Guidelines Under Section 95 of the Privacy Act 1988’ set the current standard for the protection of privacy in the conduct of medical research involving human participants in Australia: https://nhmrc.gov.au/about-us/publications/guidelines-under-section-95-privacy-act-1988
4 R. Cash et al., eds., Casebook on ethical issues in international health research (Geneva: World Health Organization, December 2009), 124–5, http://whqlibdoc.who.int/publications/2009/9789241547727_eng.pdf

ENDNOTES
1. R. L. Berger, ‘Nazi science — the Dachau hypothermia experiments,’ New England Journal of Medicine 322, no. 20 (1990): 1435–40. doi: 10.1056/NEJM199005173222006; N. Schaefer, ‘The legacy of Nazi medicine,’ The New Atlantis 5 (2004): 54–60.
2. N. Ehrenfreund, The Nuremberg legacy: how the Nazi war crimes trials changed the course of history (New York: Palgrave Macmillan, 2007).
3. United Nations, ‘The universal declaration of human rights,’ www.un.org/en/universal-declaration-human-rights/index.html.
4. N. Phillips et al., ‘Effect of acute experimental hand pain on left-right discrimination response latency for hand recognition,’ Physiotherapy Theory and Practice (2019). doi: 10.1080/09593985.2019.1571133.
5. A. J. Spencer, G. D. Slade, and M. Davies, ‘Water fluoridation in Australia,’ Community Dental Health 13, Suppl. 2 (1996): 27–37.
6. G. Pizzo et al., ‘Community water fluoridation and caries prevention: a critical review,’ Clinical Oral Investigations 11, no. 3 (2007): 189–93.
7. National Health and Medical Research Council (NHMRC), ‘Australian drinking water guidelines,’ 2011, https://nhmrc.gov.au/about-us/publications/australian-drinking-water-guidelines.
8. National Health and Medical Research Council (NHMRC), ‘National statement on ethical conduct in human research,’ 2007 (updated in 2018), section 4.4.8: 71, www.nhmrc.gov.au/guidelines-publications/e72.


9. D. Barrera, ‘Deception and research,’ in Encyclopedia of business and professional ethics, eds. D. C. Poff and A. C. Michalos (Springer International Publishing AG, 2018). doi: 10.1007/978-3-319-23514-1_51-1. 10. S. Milgram, ‘Behavioral study of obedience,’ Journal of Abnormal and Social Psychology 67 (1963): 371–78. 11. NHMRC, ‘National statement on ethical conduct in human research’. 12. E. J. Emanuel, D. Wendler, and C. Grady, ‘What makes clinical research ethical?’ Journal of the American Medical Association 283, no. 20 (2000): 2701–11. doi: 10.1001/jama.283.20.2701.

ACKNOWLEDGEMENTS Adapting author for this Australian edition: Jane Warland This chapter has been adapted from K. Gerrish and J. Lathlean, eds., The research process in nursing, 7th ed. (Chichester, UK: Wiley-Blackwell, 2015). Photo: © United States Army Signal Corp / Harvard Law School Library


CHAPTER 11

Getting evidence into practice LEARNING OBJECTIVES 11.1 Why are health professionals slow to adopt evidence-based practice? 11.2 How can we encourage individuals to implement evidence-based practice? 11.3 How can organisations support evidence-based practice? 11.4 How can we include the client perspective in evidence-based practice?

Perhaps the strongest individual barriers, however, are personality based. Some clinicians dislike the entire concept of EBP and place little value on research.5 Like the RN in the opening scenario, they think that it’s unnecessary or irrelevant to their practice. Even for those who aren’t completely against EBP, readiness to adopt healthcare innovations can depend on several factors. These include their attitudes to change; for example, some people are trendsetters and like to try new things, others prefer to follow once a new practice has become established, and some will resist change under any circumstance. Change is risky, and it takes effort to move out of your comfort zone and learn new things. New evidence can also seem quite threatening on a personal level. Being told that the way you have been practising isn’t the best way can be very challenging to a person’s self-esteem and feelings of competence — and the more experienced you are, the more threatening it is. The RN or therapist who has a wealth of knowledge may greatly resent having this brushed aside for new ideas.

Organisational barriers While there are many individual barriers to using EBP, the main reasons why clinicians don’t use evidence tend to come down to the organisations they work for.6 The major organisational barriers to EBP include a lack of resources and support, and a workplace culture that devalues research. Time is again a problem; many facilities are short-staffed and workloads are heavy, so health professionals have little opportunity to research new ways to treat clients. While most workplaces are now computerised, staff don’t always have access to these resources, or to the databases that hold the evidence. Subscriptions to some databases are very expensive, and smaller organisations in particular may not be able to afford them. Even in workplaces with good resources, staff often need support — from knowledge workers and from management — to find and use evidence. If this support is missing, or if the organisation has no interest in becoming evidence-based, clinicians will find it hard, if not impossible, to implement EBP themselves.

11.2 Encouraging individuals to implement evidence-based practice LEARNING OBJECTIVE 11.2 How can we encourage individuals to implement evidence-based practice?

All healthcare degrees now include content on research and EBP. This helps to familiarise students and new practitioners with the research process, makes them more comfortable with reading and using research, and enables them to satisfy their professional organisation’s requirements upon graduation. There are also continuing professional education courses and workshops available through professional bodies and educational institutions to help already qualified practitioners upgrade their knowledge and skills. Again, this is mandated in the various codes of conduct and professional standards. Although these can address the barriers that come from lack of knowledge, we know that simply telling people to use evidence doesn’t necessarily change their practice. Taking an ‘instructional’ approach to promoting professional behaviour change is built on the flawed assumption that people behave in a particular way because (and only because) they lack knowledge, and that imparting knowledge will therefore change behaviour. However, this isn’t always the case; while information may be necessary for change, it is rarely, if ever, sufficient. While we can’t change a person’s character traits to make them happy and willing to accept change, we can make it simpler and more attractive for them to make those changes. One way is to ensure that good evidence is made readily available, for example through professional journals that focus on EBP for particular specialties and practice areas. These allow clinicians to find useful and relevant information quickly and easily. Abstraction journals such as Evidence-Based Nursing, which summarise high-level studies and systematic reviews into a single page, make understanding the research even more straightforward. Researchers are also moving away from the statistics and jargon-heavy format they once used in their articles, towards a more readable and understandable way of writing. Some databases, such as the Cochrane Library, insist that all systematic reviews include a summary in layman’s language, and most journals now require authors to make their conclusions and implications for practice explicit, so it’s easier for readers to put the findings into practice. Motivation is also an important factor in whether a clinician chooses to make changes. When they can see the benefits of the new evidence, whether it’s a clear improvement in client health outcomes, or a personal advantage such as reduction in workload or greater ease of use, they are more likely to adopt a new practice.


TABLE 11.1 An option grid for tonsillectomy: having the operation compared with no operation

How long does it take to recover?
• Tonsillectomy: About two weeks. During this time your child will need to stay at home to prevent infection.
• No operation: As with previous episodes, it will take 3–7 days to recover from each event.

Will my child stop having tonsillitis?
• Tonsillectomy: Yes, but they may still have episodes of sore throat. Tonsillectomy can reduce sore throats by up to 9 episodes in the 2 years following surgery.
• No operation: Possibly. 30 in every 100 children (30%) stop having tonsillitis without an operation. They may still have sore throats.

What is the risk from the procedure or process?
• Tonsillectomy: As with all operations there is a small risk of serious complications. More common problems include pain, vomiting, bleeding, infection and dental damage.
• No operation: Serious complications of tonsillitis are very rare.

What is the risk of bleeding?
• Tonsillectomy: 1 in every 100 children (1%) will have serious bleeding immediately after the operation; most of these will need to have another operation to stop the bleeding. 3 in every 100 children (3%) will have serious bleeding in the first two weeks after the operation; some will need to have another operation to stop the bleeding.
• No operation: Bleeding from tonsillitis (haemorrhagic tonsillitis) is very rare.

Will it reduce the number of days missed from school?
• Tonsillectomy: Possibly. It depends on how much time your child has been missing from school.
• No operation: Possibly. As children get older, they often have fewer attacks of tonsillitis.

Can I change my mind? Are other parents happy with their decisions?
• Tonsillectomy: Yes, you can be re-referred by your GP at any time. Studies show that parents are happy with the surgery and its effects on their child’s health.
• No operation: Yes, if their child gets better without the need for an operation.

Source: Trustees of Dartmouth College.


SUMMARY Despite a legal requirement for health professionals to use evidence to underpin practice, it can still take time for research findings to filter through to the workplace, due to personal and organisational barriers that prevent clinicians from implementing evidence-based practice. Personal barriers include lack of knowledge and skills to find and interpret research, lack of time, and issues of confidence, self-esteem and autonomy. Organisational barriers include lack of support or resources, and a disregard for the use of EBP. These barriers can be addressed through education, accessibility of information and support, and the use of tools such as integrated care pathways and clinical practice guidelines. Client decision aids can also be used to include the client more fully in their own care.

KEY TERMS clinical practice guidelines Systematically developed statements to assist practitioner decisions about appropriate healthcare for specific clinical circumstances.16 decision aids Evidence-based tools designed to help clients participate in making specific and deliberated choices among healthcare options. Client decision aids supplement (rather than replace) clinicians’ counselling about options.17 integrated care pathway A pre-defined plan of client care relating to a specific diagnosis (e.g. suspected fractured hip) or intervention (e.g. hernia repair), with the aim of making the management more structured, consistent and efficient.18 option grid A one-page table covering a single topic, to help clients and clinicians compare alternative treatment options to find the most suitable. patient-reported outcome measure (PROM) An outcome from a research study that matters most to the client/patient, rather than the outcomes the researchers think are most important; identified through qualitative research methods prior to the quantitative study.

WEBSITES
1 Australian Clinical Practice Guidelines (NHMRC): www.clinicalguidelines.gov.au
2 National Institute for Health and Care Excellence (UK): www.nice.org.uk
3 National Guideline Clearinghouse (USA): www.guideline.gov
4 The Effective Practice and Organisation of Care (EPOC): http://epoc.cochrane.org
5 Australian Commission on Safety and Quality in Health Care: www.safetyandquality.gov.au/our-work/shared-decision-making/other-resources
6 OHRI patient decision aids: http://decisionaid.ohri.ca
7 Option grids for client decisions: www.optiongrid.org

ENDNOTES 1. See the AHPRA (www.ahpra.gov.au) website for Codes of Conduct for 15 health practitioner groups, including Chinese medicine, occupational therapy, physiotherapy and podiatry, and the NMBA Registered Nurse Standards for Practice. 2. See the New Zealand Nurses Organisation website (www.nursingcouncil.org.nz) for the Code of Conduct and Standards of Professional Nursing Practice. 3. G. Malik, L. McKenna, and V. Plummer, ‘Perceived knowledge, skills, attitude and contextual factors affecting evidence-based practice among nurse educators, clinical coaches and nurse specialists,’ Journal of Nursing Practice 21, suppl. 2 (2015): 46–57. 4. H. Sadeghi-Bazargani, J. S. Tabrizi, and S. Azami-Aghdash, ‘Barriers to evidence-based medicine: a systematic review,’ Journal of Evaluation in Clinical Practice 20 (2014): 793–802; F. Timmins, C. McCabe, and R. McSherry, ‘Research awareness: managerial challenges for nurses in the Republic of Ireland,’ Journal of Nursing Management 20 (2012): 224–35; A. C. Pighills et al., ‘Positioning occupational therapy as a discipline on the research continuum: results of a cross-sectional survey of research experience,’ Australian Occupational Therapy Journal 60 (2013): 241–51; N. Rydland Olsen et al., ‘Evidence-based practice exposure and physiotherapy students’ behaviour during clinical placements: a survey,’ Physiotherapy Research International 19 (2014): 238–47; M. Stronge and M. Cahill, ‘Self-reported knowledge, attitudes and behaviour towards evidence-based practice of occupational therapy students in Ireland,’ Occupational Therapy International 19 (2012): 7–16.


5. K. E. Harding et al., ‘Not enough time or a low priority? Barriers to evidence-based practice for allied health clinicians,’ Journal of Continuing Education in the Health Professions 34, no. 4 (2014): 224–31. 6. K. Samuelsson and E. Wressle, ‘Turning evidence into practice: barriers to research use among occupational therapists,’ British Journal of Occupational Therapy 78, no. 3 (2015): 175–81; K. A. Abrahamson, R. L. Fox, and B. N. Doebbeling, ‘Facilitators and barriers to clinical practice guideline use among nurses,’ American Journal of Nursing 112, no. 7 (2012): 26–35. doi: 10.1097/01.NAJ.0000415957.46932.bf. 7. M. Laurant et al., ‘Nurses as substitutes for doctors in primary care,’ Cochrane Library of Systematic Reviews 7 (2018). doi: 10.1002/14651858.CD001271.pub3. 8. J. Rycroft-Malone, ‘Evidence-informed practice: from individual to context,’ Journal of Nursing Management 16 (2008): 404–8; A. Thomas and M. Law, ‘Research utilization and evidence-based practice in occupational therapy: a scoping study,’ American Journal of Occupational Therapy 67 (2013): e55–65. doi: 10.5014/ajot.2013.006395. 9. J. Berentson-Shaw, ‘Reducing inequality in health through evidence-based clinical guidance: is it feasible? The New Zealand experience,’ International Journal of Evidence-Based Healthcare 10 (2012): 146–53. 10. A. Edwards and G. Elwyn, Shared decision-making in health care: achieving evidence-based patient choice (New York: Oxford University Press, 2009). 11. J. Dawson et al., ‘The routine use of patient reported outcome measures in healthcare settings,’ BMJ: British Medical Journal (Clinical Research ed.) 340 (2009): c186; K. McGrail, S. Bryan, and J. Davis, ‘Let’s all go to the PROM: the case for routine patient-reported outcome measurement in Canadian healthcare,’ Healthcare Papers 11, no. 4 (2011): 8–18; K. A. Meadows, ‘Patient-reported outcome measures: an overview,’ British Journal of Community Nursing 16, no. 3 (2011): 146–51. 12. Edwards and Elwyn, Shared decision-making in health care; G. Makoul and M. L. Clayman, ‘An integrative model of shared decision making in medical encounters,’ Patient Education and Counseling 60, no. 3 (2006): 301–12. 13. G. Elwyn et al., ‘Shared decision making and the concept of equipoise: the competences of involving patients in healthcare choices,’ The British Journal of General Practice 50, no. 460 (2000): 892–9. 14. D. Stacey et al., ‘Decision aids for people facing health treatment or screening decisions (review),’ Cochrane Database of Systematic Reviews 4 (2017). doi: 10.1002/14651858.CD001431.pub5. 15. G. Elwyn et al., ‘Option grids: shared decision making made easier,’ Patient Education and Counseling 90 (2013): 207–12. 16. M. J. Field and K. N. Lohr, eds., Guidelines for clinical practice: from development to use (National Academies Press (US), 1992). 17. Stacey et al., ‘Decision aids,’ 7. 18. T. Rotter et al., ‘Clinical pathways: effects on professional practice, patient outcomes, length of stay and hospital costs,’ Cochrane Database of Systematic Reviews 3 (2010). doi: 10.1002/14651858.CD006632.pub2.

ACKNOWLEDGEMENTS


Adapting author for this Australian edition: Elaine Crisp Photo: © Miriam Doerr Martin Frommherz / Shutterstock.com Figure 11.1: © M. Laurant et al., ‘Nurses as substitutes for doctors in primary care,’ Cochrane Library of Systematic Reviews 7 (2018). doi: 10.1002/14651858.CD001271.pub3. Figure 11.3: © PubMed / Public Domain Figure 11.4: © D. Stacey et al., ‘Decision aids for people facing health treatment or screening decisions (review),’ Cochrane Database of Systematic Reviews 4 (2017). doi: 10.1002/14651858.CD001431.pub5. Table 11.1: © Trustees of Dartmouth College


CHAPTER 12

Challenges to evidence-based practice LEARNING OBJECTIVES 12.1 What’s wrong with evidence-based practice when it’s done badly? 12.2 What’s wrong with evidence-based practice when it’s done well? 12.3 Is evidence-based practice really possible?

12.1 When evidence-based practice is done badly LEARNING OBJECTIVE 12.1 What’s wrong with evidence-based practice when it’s done badly?

What does ‘EBP done badly’ look like? First, bad EBP cites evidence derived from studies but asks no upstream questions about those studies. For example, did the evidence come from a single trial, or a high-quality and recent meta-analysis of all available trials? Does the sample population match your own circumstances or context? Using evidence from a study without considering how strong the results are or how applicable they are to your clients is not good practice. Second, bad EBP considers the world of published evidence to equate to the world of client and patient need. Hence, it commits two fallacies: it assumes that if (say) a randomised controlled trial (RCT) exists that tested a treatment for a ‘disease’, that disease is necessarily a real medical problem requiring treatment; and it also assumes that if ‘methodologically robust’ evidence does not exist on a topic, that topic is unimportant. This leads to a significant bias. The evidence base will accumulate in conditions that offer the promise of profit to the pharmaceutical and medical device industries — such as the detection, monitoring and management of risk factors for cardiovascular disease; or the creation and treatment of non-diseases such as ‘female hypoactive sexual desire’.3 Evidence will also accumulate in conditions that governments choose to recognise and prioritise for publicly funded research, such as dementia, or those that have a high public profile, such as breast cancer. However, it will fail to accumulate (or will accumulate much more slowly) in Cinderella conditions that industry and/or government deem unimportant, non-profit-making or ‘non-medical’ — such as multi-morbidity, physical activity in cardiovascular prevention, domestic violence or age-related frailty — or those that are less visible, such as ovarian cancer.4 Bad EBP has little regard for the client perspective and fails to acknowledge the significance of clinical judgement. As stated in the section ‘The client perspective in evidence-based practice’ in the chapter on getting evidence into practice, the ‘best’ treatment is not necessarily the one shown to be most effective in RCTs, but the one that fits a particular set of individual circumstances and aligns with the client’s preferences and priorities. Finally, bad EBP draws on bad research — for example, research that has used weak sampling strategies, unjustified sample sizes, inappropriate comparators, statistical trick-cycling, unsubstantiated conclusions, and so on. This sort of research produces evidence that is invalid, unreliable and possibly even harmful. Each year, thousands of people are injured or die from treatments that are based on bad evidence, and the economic damage to individuals, the health system and society as a whole is enormous.5 This is why health professionals need to keep their critical-thinking skills sharp and be able to appraise and evaluate papers for themselves (see the chapter on reviewing the literature).

12.2 When evidence-based practice is done well LEARNING OBJECTIVE 12.2 What’s wrong with evidence-based practice when it’s done well?

Fortunately, it is easy today to find good evidence to underpin health practice. Peer-reviewed journals, well-indexed and well-managed databases such as Trip, and evidence-based organisations — such as the Joanna Briggs Institute and the Cochrane Collaboration — ensure that clinicians have access to quality research based on rigorous methods. High-level evidence resources — such as systematic reviews, meta-analyses and clinical practice guidelines — are readily available to help health professionals to carry out EBP well. There are still good reasons, though, why EBP will never be the fount of all knowledge — all evidence has its limitations.6 A significant criticism of EBP is the extent to which it is a formalised method for imposing an unjustifiable degree of standardisation and control over clinical practice. In the modern healthcare sector, EBP can be more or less equated with the production and implementation of clinical practice guidelines. There are a number of challenges to these worth considering.

Guidelines devalue professional expertise Perhaps the most common challenge to EBP is that it downgrades and devalues the expertise of the individual clinician, replacing their local and specialised knowledge with generic, ‘cookie-cutter’ clinical practice guidelines and protocols. Research, the critics say, values the measurable over the intangible, empiricism over humanism, and bears little resemblance to practice in the real world. Many clinicians resent the imposition of a guideline or pathway, claiming clinical freedom (‘I’m not having anyone telling me how to manage my clients’) or resisting change (‘When I trained we were always taught to use antiseptic to wash wounds’). The clinician blundering blithely through their work still using the same techniques learned in training 40 years earlier makes this argument very weak, and it also represents a misunderstanding of what EBP actually is. Research evidence is not the only evidence used in EBP; the health professional’s expertise is actually an important factor and carries a great deal of weight in the clinical decisions they make. A genuine problem arises when guidelines and best practice recommendations are written with no input from the profession that will be carrying out the treatments. When, for example, a group of medical doctors writes a nursing guideline, it may put far more emphasis on the biomedical model and take too little notice of the holistic nature of the nursing discipline. In this case, the guideline may emphasise aspects of the treatment that are irrelevant or inappropriate, and many nurses may feel that their own disciplinary expertise is being ignored — and rightly so. This challenge can be addressed by ensuring that guidelines are adapted or designed locally, and that the committee doing this includes representatives from all affected disciplines, who all have a voice in the finished product. Another challenge to EBP is when governmental or organisational policy demands that guidelines are followed exactly, in an often well-meaning effort to control resources and standardise practice. While this may primarily be intended to result in equitable treatment for all clients or patients, it again discounts the individual expertise of the clinicians, who may also be concerned that costs will become central to the consideration of what treatments to use, rather than what will produce best client outcomes. Cost is an important consideration, given the financial constraints of the health system, but it is vital to also take into account the benefits of a treatment, as these often offset or even outweigh the expenses.7 Above all, remember that a good guideline won’t force you to abandon common sense or judgement; it simply flags a recommended course of action for you to consider.

The guidelines are too narrow (or too broad) Standardisation of practice may result in ‘average’ rather than best treatment being given to all clients or patients. Following a guideline can inhibit innovation and prevent individual cases from being dealt with discretely and sensitively. Humans naturally like to classify people as part of some sort of group, and once we do this, we tend to use the general characteristics of that group to inform the way we treat them. Guidelines help us to do this, by making those characteristics pre-eminent and permanent.8 Guidelines also tend to look at one condition or treatment in isolation. This can cause difficulties when working with clients who have multiple morbidities, as many do, or who don’t match the population covered by the guideline in some way (such as age or ethnicity). This is a very valid criticism: what if one of the recommended actions to treat condition A is contraindicated for condition B, which the client also has? Finding an appropriate guideline can be difficult and confusing. Similarly, trying to use a guideline developed for one context in another — for example, an acute care guideline in a primary care clinic — can result in inappropriate treatments. Again, this illustrates the importance of adapting guidelines to the specific context in which practice is taking place.

The guidelines are out of date One nasty side effect of the standardisation of practice is that yesterday’s best evidence drags down today’s guidelines and clinical pathways. Most guidelines are implemented with a built-in ‘review’ period — often three to five years — but these aren’t always followed, for various reasons (often guidelines that were due for renewal five years earlier might still be in use). Even when they are updated on time, what happens when new evidence comes to light just as the new guideline is released? This evidence can take some time to trickle down to the individual organisation level, and staff may be left using obsolete EBP for many years — unless the staff themselves take a stand. The various professional codes of conduct and standards for practice9 require clinicians to stay up to date with developments in their own area through continuous professional development, and we can use this knowledge to draw attention to an outdated guideline (and, if necessary, take action to have it changed).

The client’s perspective is ignored Perhaps the most powerful challenge to EBP is that, if misapplied, it dismisses the client’s own perspective on the illness in favour of an average effect on a population sample or a column of figures calculated by a statistician. If standardised care is imposed, it reduces the clinician’s ability to respond to the idiosyncratic, here-and-now issues emerging for a particular client. The very core of the EBP approach is to combine the evidence with clinical expertise and the client’s own preferences and context, not to replace them.

SUMMARY There are many challenges to EBP, based on its limitations or on misconceptions of what EBP actually is. Many of these involve the beliefs that EBP downgrades or sidelines the practitioner, and results in mediocre or inappropriate care for patients and clients. The argument against the compulsory imposition of clinical practice guidelines and integrated care pathways is a powerful one, and every practising clinician shares the fear that politicians and health service managers who have jumped on the EBP bandwagon will use these to decree the treatment of conditions rather than of clients.16 They may make judgements about people and their illnesses based on published evidence that an intervention is effective ‘on average’. These challenges can be addressed by adapting, interpreting and engaging with EBP and the research that underpins it. As health professionals, we need to be able to find research, read it, understand it, interpret and evaluate it, and apply it. We also need to incorporate it with the other aspects of EBP — clinical expertise, client preferences and context — to ensure we are providing our patients and clients with the best possible treatment, for the best possible outcomes.

KEY TERMS biomedical The application of biological science to clinical medicine. clinical practice guidelines Systematically developed statements to assist practitioner decisions about appropriate healthcare for specific clinical circumstances.17 empiricism A theoretical viewpoint that privileges knowledge gained from the senses, particularly observation, and from experiments. From the Greek empeiría, meaning ‘experience’. holistic Concerned with all aspects of a person’s wellbeing, including physical, psychological, spiritual and social factors. humanism A theoretical viewpoint that emphasises the importance of human values, beliefs and attitudes. meta-analysis Statistical procedure for combining results from existing quantitative studies with measured outcomes, effectively increasing sample size and precision. protocol Official procedure or set of rules that must be followed without variation. reliable How consistent and repeatable a study’s results are. valid How well the results of a study measure the phenomenon being researched.

WEBSITES
1 What’s the harm? An interesting website that lists deaths and injury from following questionable or unsupported treatments and health practices: www.whatstheharm.net
2 Ben Goldacre discusses the dangers that come from bad evidence in his TED talk ‘What doctors don’t know about the drugs they prescribe’: https://www.ted.com/talks/ben_goldacre_what_doctors_don_t_know_about_the_drugs_they_prescribe
3 Australian Health Practitioner Regulation Agency: www.ahpra.gov.au
4 Information on how to develop, evaluate and implement clinical practice guidelines, National Health and Medical Research Council: https://nhmrc.gov.au/about-us/publications/guide-development-evaluation-and-implementation-clinical-practice-guidelines
5 The Appraisal of Guidelines for Research and Evaluation (AGREE) Instrument can help you to evaluate how a guideline has been developed, to assess its quality: www.agreetrust.org/resource-centre

ENDNOTES
1. A. P. Verhagen et al., ‘Conservative interventions for treating work-related complaints of the arm, neck or shoulder in adults,’ Cochrane Database of Systematic Reviews 12 (2013). doi: 10.1002/14651858.CD008742.pub2.
2. D. Spence, ‘Why evidence is bad for your health,’ BMJ: British Medical Journal 341 (2010): c6368; S. Timmermans and M. Berg, The gold standard: the challenge of evidence-based medicine and standardization in health care (Philadelphia: Temple University Press, 2003); S. Timmermans and A. Mauck, ‘The promises and pitfalls of evidence-based medicine,’ Health Affairs 24, no. 1 (2005): 18–28; T. Agoritsas and G. H. Guyatt, ‘Evidence-based medicine 20 years on: a view from the inside,’ The Canadian Journal of Neurological Sciences 40, no. 4 (2013): 448–9; B. Goldacre, Bad pharma: how drug companies mislead doctors and harm patients (London: Random House Digital Inc., Fourth Estate, 2013); E. J. Mullen and D. L. Streiner, ‘The evidence for and against evidence-based practice,’ Brief Treatment and Crisis Intervention 4, no. 2 (2004): 111–21.
3. P. M. Saukko et al., ‘Beyond beliefs: risk assessment technologies shaping patients’ experiences of heart disease prevention,’ Sociology of Health & Illness 34, no. 4 (2012): 560–75; A. Jutel, ‘Framing disease: the example of female hypoactive sexual desire disorder,’ Social Science & Medicine 70, no. 7 (2010): 1084–90.
4. M. Lugtenberg et al., ‘Current guidelines have limited applicability to patients with comorbid conditions: a systematic analysis of evidence-based guidelines,’ PLOS ONE 6, no. 10 (2011): e25987; F. C. Bull and A. E. Bauman, ‘Physical inactivity: the “Cinderella” risk factor for non-communicable disease prevention,’ Journal of Health Communication 16, suppl. 2 (2011): 13–26; C. Garcia-Moreno and C. Watts, ‘Violence against women: an urgent public health priority,’ Bulletin of the World Health Organization 89, no. 1 (2011): 2; A. Clegg et al., ‘Frailty in elderly people,’ The Lancet 381 (2013): 752–62.
5. J. M. Grimshaw et al., ‘Knowledge translation of research findings,’ Implementation Science 7 (2012): 50, www.implementationscience.com/content/7/1/50; ‘What’s the harm?,’ www.whatstheharm.net.
6. P. Shekelle et al., ‘Developing clinical practice guidelines: reviewing, reporting, and publishing guidelines; updating guidelines; and the emerging issues of enhancing guideline implementability and accounting for comorbid conditions in guideline development,’ Implementation Science 7, no. 1 (2012): 62; A. P. Gurses et al., ‘Using an interdisciplinary approach to identify factors that affect clinicians’ compliance with evidence-based guidelines,’ Critical Care Medicine 38 (2010): S282–91; A. G. Gagliardi et al., ‘How can we improve guideline use? A conceptual framework of implementability,’ Implementation Science 6, no. 1 (2011): 26; S. Evans-Lacko et al., ‘Facilitators and barriers to implementing clinical care pathways,’ BMC Health Services Research 10, no. 1 (2010): 182; S. Michie and M. Johnston, ‘Changing clinical behaviour by making guidelines specific,’ BMJ: British Medical Journal 328, no. 7435 (2004): 343; R. Grol et al., ‘Attributes of clinical guidelines that influence use of guidelines in general practice: observational study,’ BMJ: British Medical Journal 317, no. 7162 (1998): 858–61.
7. Mullen and Streiner, ‘Evidence for and against,’ 111–21.
8. T. Greenhalgh, ‘Of lamp posts, keys, and fabled drunkards: a perspectival tale of 4 guidelines,’ Journal of Evaluation in Clinical Practice 24, no. 5 (2018): 1132–8.
9. See the AHPRA website (www.ahpra.gov.au) for Codes of Conduct for 13 health professions (Standard 7: Maintaining professional performance), and the NMBA Registered Nurse Standards for Practice (Standard 3).
10. D. Allen and K. Harkins, ‘Too much guidance?,’ The Lancet 365, no. 9473 (2005): 1768.
11. C. A. Smith et al., ‘Relaxation techniques for pain management in labour,’ Cochrane Database of Systematic Reviews 3 (2018). doi: 10.1002/14651858.CD009514.pub2.
12. T. Greenhalgh, ‘What is this knowledge that we seek to “exchange”?,’ The Milbank Quarterly 88, no. 4 (2010): 492–99. doi: 10.1111/j.1468-0009.2010.00610.x.
13. S. Porter, ‘Fundamental patterns of knowing in nursing: the challenge of evidence-based practice,’ Advances in Nursing Science 33, no. 1 (2010): 3–14; D. Holmes et al., ‘Nursing best practice guidelines: reflecting on the obscene rise of the void,’ Journal of Nursing Management 16 (2008): 394–403.
14. A. Jutel, ‘Beyond evidence-based nursing: tools for practice,’ Journal of Nursing Management 16 (2008): 417–21.
15. G. C. S. Smith and J. P. Pell, ‘Parachute use to prevent death and major trauma related to gravitational challenge: systematic review of randomised controlled trials,’ BMJ: British Medical Journal 327 (2003): 1459–61.
16. K. E. Sleeman et al., ‘“It doesn’t do the care for you”: a qualitative study of health care professionals’ perceptions of the benefits and harms of integrated care pathways for end of life care,’ BMJ Open 5 (2015): e008242. doi: 10.1136/bmjopen-2015-008242.
17. M. J. Field and K. N. Lohr, eds., Guidelines for clinical practice: from development to use (National Academies Press [US], 1992).

ACKNOWLEDGEMENTS Adapting author for this Australian edition: Elaine Crisp Photo: © Olena Yakobchuk / Shutterstock.com Figure 12.1: © A. P. Verhagen et al., ‘Conservative interventions for treating work-related complaints of the arm, neck or shoulder in adults,’ Cochrane Database of Systematic Reviews 12 (2013). doi: 10.1002/14651858.CD008742.pub2. Figure 12.2: © C. A. Smith et al., ‘Relaxation techniques for pain management in labour,’ Cochrane Database of Systematic Reviews 3 (2018). doi: 10.1002/14651858.CD009514.pub2. Figure 12.3: © G. C. S. Smith and J. P. Pell, ‘Parachute use to prevent death and major trauma related to gravitational challenge: systematic review of randomised controlled trials,’ BMJ: British Medical Journal 327 (2003): 1459–61.

132 Understanding research methods for evidence-based practice in health

INDEX Note: Figures and tables are indicated by italic f and t, respectively, following the page reference. Abbey pain scale 51 abstract 23 accessible population see study population ACCESSSS 66 accountability culture 118 advanced search 14 AHPRA see Australian Health Practitioner Regulation Agency (AHPRA) alert service, use of 12 allocation bias 62, 64t alternate form reliability 50t anecdote (storytelling) 6 anonymity 109 Ask, Acquire, Appraise, Apply and Evaluate model 9 attrition bias 64t audio/visual materials 36 Australian Bureau of Statistics (ABS) 48, 98 Australian Commission on Safety and Quality in Health Care website 120 Australian drinking water guidelines 108 Australian Health Practitioner Regulation Agency (AHPRA) 22, 115 Australian Institute of Health and Welfare 48 Australian National Statement on Ethical Conduct in Human Research 108 autonomy 107 barriers to evidence-based practice 115 individual barriers 115–16 organisational barriers, 116 baseline characteristics of groups 77 beneficence 108 Best Practice (database) 65 bias 56, 64 affecting intervention studies 64t definition of 63 sampling error and 102 specific to systematic reviews 64, 64t biomedical model 127 Black, Nick 33 body of evidence 66, 67 Boolean operators 14, 15f British Medical Journal 35, 129 case 74 case-control studies 47 case series 47, 63 case studies 35t categorical data 48 categorical variable 75 causality, nature and direction of

78–9

central tendency 76 CHAIN (Contact, Help, Advice and Information Network) 17 chi-square 84 CI see confidence interval (CI) CINAHL (database) 66 citation bias 64t citation tracking 16 clients assessment, considerations in 4 clients’ illness narratives 6 clinical decision making 6 by anecdote 6 by cost minimisation 7 by expert opinion 6–7, 7t by press cutting 6 Clinical Evidence (database) 65 clinical expertise 3 clinical guidelines 118–26 benefits of 118 clinical practice guideline 65, 118–26 Clinical Practice Guidelines Portal (database) 65 clinical questions 56, 57 components of 8 how researchers answer 57–8, 58t in healthcare 56–8 matching of, to NHMRC levels of evidence 58–63 clinical research, ethical 111, 111f clinical significance 26, 30 clinical trials, comparative 82 cluster random sampling 100, 101t Cochrane Collaboration 62, 66, 126 Cochrane Library 13, 48, 66, 116, 117 cohort studies 47 comparative clinical trials 82 confidence interval (CI) 81 and clinical importance 81–4 narrower 81 confidentiality 109 construct validity 51, 51t content validity 51t continuous data 49 continuous variable 76–7 convenience sampling 100, 101t convergent study 90–1, 91f correlation coefficient 78 cost minimisation 7 costs consideration 127 counting-and-measuring approach, 33 see also quantitative approach cross-sectional studies 47 crystallisation 94 data collection in qualitative research 36–7, 36f data distribution 77 database inclusion bias 64t databases 57, 65

deception 109 decision aids 119 decision making see clinical decision making Declaration of Helsinki 107 deduction 34 dependent variables 49 descriptive research 47 descriptive statistics 75 for categorical variables 75 for continuous variables 76–7 detection bias 64t diagnostic accuracy studies 47 difference score 77 discourse analysis 36t dispersion 76 distribution, data 77 documents 36 duplicated publication bias 64t EBP see evidence-based practice (EBP) Eccleston, John 74 economic harm 108 effect-or-no-effect approach 81 effects 81 minimum important difference 83 embedded study 90, 92, 93f, 94 empiricism 126 epidemiological research 47 equipoise 119 error, sampling 102 ethical principles 107–9 autonomy 107 beneficence 108 confidentiality 109 justice 108 non-maleficence 108 respect for human dignity 108 veracity 109 ethnography 35t evidence 56 evidence-based care, 5 see also evidence-based practice (EBP) Evidence-Based Nursing 116 evidence-based practice (EBP) 2, 57, 64t, 115 achieving 129–30 adoption of 115–16 apprehension towards 5–7 client perspective in 120 context-sensitive checklist for 4 definition of 2 done badly 126 encouraging individuals to adopt 116–17 individual barriers to 115–16 limitations/criticism of 126–9 meaning of 2–4 NHMRC and 57 organisational barriers to 116

INDEX 133

evidence-based practice (EBP) (Cont.) organisational support of 117 steps in 4 triad approach to 2, 3 evidence-based practitioners 57, 58 experimental designs 46 expert opinion 6–7 harmful practices strongly supported by 7t face validity 51t federated search engine 16–17 filters, use of 15–16, 16f flexible sampling process 103 focus groups 36t, 37, 89 forest plot 84, 85f generalisation 75 Google alert 12 Google Scholar 12, 14, 16 GRADE system 65 grades of recommendation 65–6 Greenhalgh, Trisha 6, 33 grey literature 27 grounded theory 35t Gunner, Michelle 44 health librarian 13, 17 Helman, Cecil 33 herd immunity 22 heterogeneity 84 hierarchy of evidence 66, 67 holistic nature 127 homogeneity of studies 28, 30 HRECs see human research ethics committees (HRECs) human dignity, respect 108 human research ethics committees (HRECs) 109 constitution of 109 role and function of 109–10 humanism 126 hypothesis 103 hypothesis testing 79 illness scripts 6 independent variables 49 inductive reasoning 34 inferential statistics 79 evidence-based practice and 81 validity of 84 institutional ethics committees see human research ethics committees (HRECs) integrated care pathways 117 internal consistency 51t inter-rater reliability 34, 50t interval (measurement level) 49t intervention studies 46–7 hierarchy of evidence for 68t NHMRC evidence levels for 60–3 interviews 36f, 37 for qualitative data collection 37 semi-structured 37 unstructured 37 intra-rater reliability 50t introduction, methods, results and discussion (IMRAD) format 23 134 INDEX

Joanna Briggs Institute 126
journal clubs 117
justice 108
knowledge 4
Lancet, The 22
language bias 64t
legal harm 108
Level I evidence — systematic reviews of Level II evidence 60–1
Level II evidence — randomised controlled trials 61–2
Level III evidence — controlled trials that are not randomised 62–3
Level IV evidence — trials without control groups 63
level of measurement 48, 49t
levels of evidence 3, 17, 56, 66
  in general hierarchy of evidence 66–7, 68t
  grades of recommendations and practice guidelines 65–6
literature, searching 13
logistical/practical constraints 128
longitudinal studies 47
matching clinical questions to NHMRC levels of evidence 59t
maximum variable 77
McGregor, Alyson 98
mean 76
Measles-Mumps-Rubella (MMR) vaccination 22
measurement 48–52
  bias 63, 64t
  error 102
  levels of 48, 49t
  reliability 50–2, 50t
  validity 50–2, 51t
  variables in 49
median 76
MEDLINE (database) 27, 117
meta-analysis 13, 24, 26, 28, 29f, 48, 66, 84, 126
meta-syntheses 24, 28–9
minimum important difference 81–3
minimum variable 77
mixed methods research 89–90
  analysis of data in 93
  challenges associated with 94
  convergent study 90–1, 91f
  design considerations 94
  embedded study 90, 92, 93f
  integration in 93–4
  multiphase study 90–1, 92f
  sampling in 93
  sequential study 90–1, 91f
  use of 90
mode 77
motivation 116
multiphase study 90–1, 92f, 94
multistage random sampling 100, 101t
narrative interview 36t

National Health and Medical Research Council (NHMRC) 57, 108, 118
  Guidelines and Publications (database) 65
NHMRC levels of evidence 57
  for intervention studies 60–3
  Level I evidence 60–1
  Level II evidence 61–2
  Level III evidence 62–3
  Level IV evidence 63
  matching clinical questions to 58–63
NMBA Registered Nurse Standards for Practice 115
nominal (measurement level) 48, 49t
non-experimental designs 46
non-maleficence 107–8
non-normal (skewed) data 78, 78f
non-probability sampling 36, 99t, 99–100
  techniques 101t
non-randomised controlled trials 46, 62–3
normal distribution 78, 78f
noxious stimuli 107
null hypothesis 79, 103
Nuremberg Code 107
observational descriptive study 47
observational studies 47–8
observations 47–8
  'being native' and 'going native', difference between 37
  embedded 37
  for qualitative data collection 37
  passive 37
observer bias in qualitative research 39
option grid 120
ordinal (measurement level) 48, 49t
organisational barriers 116
outcome measure 119
outcome reporting bias 64t
parallel form reliability 50t
participant 98
participant observation 35t, 38
passive observation 35t
patient-reported outcome measures (PROMs) 119
patient values and preferences 3
PEDro (database) 65
peer review 23, 30
performance bias 64t
personal barriers 116
personal clinical experiences 6
phenomenology 35t
physical harm 108
PICO (population, intervention, comparison and outcome measures) 46, 46t
placebo 60
pooled effect size 84
population 75
population mean 102
power analysis 103

Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 26
press cutting, decision making by 6
primary evidence 66
primary papers, considerations in reading of 24
  data analysis 26
  data collection methods 26
  sample and setting 25–6
primary research 13
  vs. secondary research 13
privacy 109
probability see p-value
probability sampling 99–100, 99t
  techniques 100t
problem, formulation of 8
progressive focusing 35, 40
PROMs see patient-reported outcome measures (PROMs)
prospective studies 47
protocols 126
psychological harm 108
publication bias 27, 64t
published journal article, ethical aspects of 64t, 111f
PubMed (database) 13, 14f
purposive sampling 40, 101
p-value 79
  in non-significant range 80
qualitative data 26, 39
qualitative research 34, 37
  and quantitative research, difference between 34–5, 34t
  conclusions of 40
  criticisms of 40
  data collection in 36–7, 36f
  methods 35, 35t
  nine-point question list as guide for 38–40
  non-probability sampling 36
  paper, evaluation of 37–40
  sample size in 102
  validity of 34
qualitative researcher 33, 39
qualitising 94
quantitative data, storage of 74–5
quantitative research 34, 44–6
  aims and reasons for 45t
  and qualitative research, difference between 34–5, 34t, 45t
  designs for 46–8
  measurement in 48–52
  sample size in 102–3
quantitative researcher 33
quantitative results 26
quantitising 94
quota sampling 101
random errors 102
randomised controlled trials (RCTs) 46, 126
  Level II evidence 61–2
random sample 38
range 77
ratio (measurement level) 48, 49t
reflexivity 34

Registered Nurse Standards for Practice 22
reliability, in measurement 50–2
  types of 50t
reliability, quantitative research 34
reliable evidence 129
research, definition of 3
research design 24
research evidence 3
research paper 17
  general questions for 23–4
  introduction, methods, results and discussion (IMRAD) format 23
  peer review 23
  primary papers 24–6
  quality and utility of, assessment of 23
  reviewing of, need of 25–6
  secondary papers 24, 26–9
research question 24
  integrated 93
research reports, quantitative 75
Researchgate 17
results of study 24
retrospective longitudinal cohort design 5
retrospective studies 47
reviewers' personal biases 64t
right to privacy 109

sample 99
sample mean 102
sample size 102
  calculation of 102–3
  in qualitative research 102–3
  in quantitative research 102–3
sampling 102–3
  definition of 99
  importance of 102
  methods 100–2, 100t, 101t
  non-probability 99–100, 99t
  probability 99t, 99–100
  random errors 102
  related terminology 98–9
  systematic errors 102
  types of 99–102, 99t
sampling frame 99
saturation 34, 103
scale (measurement level) 48, 49t
score 74
search for evidence 17–19
  different types of 12–13
  effective search strings 13–17
  focused, looking for answers 13
  health librarians, consultation with 17
  informal 12
  searching the literature 13
  social media, use of 13, 19
  Web 2.0 and social media tools 17t
search string 15
secondary evidence 66
secondary papers, reviewing methods of 26–9
  database(s) and other sources 27–8
  focused clinical question 27
  numerical result 26
  reviewers and methods 27
  sensitivity analysis 28
secondary research 13
selection bias 62
semi-structured interviews 35t, 37
sequential study 90–1, 91f, 93
shared decision making 119
simple random sampling (SRS) 100, 100t
skew distribution 78, 78f
skewed data 78
snowball sampling 101
social harm 108
social media, searching for information using 19
standard deviation 77
Stanley Milgram obedience experiment 109
statistical analysis 75
statistical power, concept of 102
statistical significance 26
statistical tests, knowledge about 74
statistically significant 80
statistics in journal articles 77
stratified random sampling 100, 101t
study population 98–9
systematic errors 102
systematic random sampling 100, 100t
systematic reviews 24, 26, 48, 57, 116
  biases specific to 64, 64t
  Level I evidence 60–1

target population see theoretical population
test-retest reliability 50t
thematic analysis 39
theoretical population 98
time lag bias 64t
title 23
tonsillectomy option grid 120
tree plot 81
triangulation 34, 90
Trip (database) 16, 17f, 65
truncation 15, 15f
Twitter, use of 12, 19
type I error 103
type II error 103
UK National Institute of Health and Care Excellence (NICE) 118
United Nations 107
unstructured interviews 36t, 37
US Food and Drug Administration (USFDA) 98
US National Guideline Clearinghouse 118
valid evidence 129
validity 129
  of measurement 50–2, 51t
  of qualitative research 34
values 74
variables 49, 74, 74t
veracity 109
Web 2.0 and social media tools 17t
weighting (dominance) 94
wildcards, use of 15
Wilson, Richard 74