Systems, Functions and Safety: A Flipped Approach to Design for Safety 3031158229, 9783031158223

This textbook provides up-to-date content in the fields of system engineering, system safety and functional safety, with


English Pages 195 [196] Year 2023


Table of contents :
A Letter From Your Instructor
Acknowledgement
Contents
Chapter 1: Safety-Critical Systems
Introduction
Video Lesson
Lecture Notes
Exercise 1
Exercise 1 Template
Exercise 1 Sample Solutions
Solution 1
Solution 2
Solution 3
Solution 4
Key Recap Questions
Self-assessment
Self-assessment Key
Chapter 2: System Requirements and Functions
Introduction
Video Lesson
Lecture Notes
Exercise 2
Exercise 2 Template
Exercise 2 Sample Solutions
Solution 1
Solution 2
Solution 3
Solution 4
Key Recap Questions
Self-assessment
Self-assessment Key
Chapter 3: System Safety
Introduction
Video Lesson
Lecture Notes
Calculation Examples
Task 1
Exercise 3
Exercise 3 Template
Exercise 3 Sample Solutions
Solution 1
Solution 2
Solution 3
Solution 4
Key Recap Questions
Self-assessment
Self-assessment Key
Chapter 4: System Safety Process
Introduction
Video Lesson
Lecture Notes
Exercise 4
Exercise 4 Template
Exercise 4 Sample Solutions
Solution 1
Solution 2
Solution 3
Solution 4
Key Recap Questions
Self-assessment
Self-assessment Key
Chapter 5: Functional Safety
Introduction
Video Lesson
Lecture Notes
Exercise 5
Exercise 5 Sample Solutions
Solution 1
Solution 2
Solution 3
Solution 4
Key Recap Questions
Self-assessment
Self-assessment Key
Chapter 6: Defining Safety Functions
Introduction
Video Lesson
Lecture Notes
Your First Safety Project!
Required Output
Submission Deadline
Assessment
Sample Solution to the Project
Chapter 7: Safety Integrity and Random Failures
Introduction
Video Lesson
Lecture Notes
Calculation Examples
Task 1
Task 2
Exercise 7
Exercise 7 Solution
Key Recap Questions
Self-assessment
Self-assessment Key
Chapter 8: Safety Integrity of Composite Systems
Introduction
Video Lesson
Lecture Notes
Calculation Examples
Task 1
Task 2
Exercise 8
Exercise 8 Sample Solutions
Solution 1
Solution 2
Solution 3
Solution 4
Key Recap Questions
Self-assessment
Self-assessment Key
Chapter 9: Safety Integrity Improvement Methods
Introduction
Video Lesson
Lecture Notes
Calculation Examples
Task 1
Task 2
Task 3
Exercise 9
Exercise 9 Solution
Key Recap Questions
Self-assessment
Self-assessment Key
Chapter 10: Proving the Safety Integrity
Introduction
Video Lesson
Lecture Notes
Calculation Examples
Task 1
Task 2
Exercise 10
Exercise 10 Solution
Key Recap Questions
Self-assessment
Self-assessment Key
Chapter 11: Practical SIL Calculation
Introduction
Video Lesson
Lecture Notes
Now Try for Yourself!
Required Output
Assessment
Sample Solution to the Project
Required Evidence for the Safety Case
Architectural Block Diagram
Reliability Block Diagram
Chapter 12: System Safety Checklist
Video Lesson
Lecture Notes
Self-assessment
Self-assessment Key
Bibliography
Index


Milan Z. Bjelica

Systems, Functions and Safety

A Flipped Approach to Design for Safety

Milan Z. Bjelica
University of California, San Diego
San Diego, CA, USA

ISBN 978-3-031-15822-3
ISBN 978-3-031-15823-0 (eBook)
https://doi.org/10.1007/978-3-031-15823-0

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2023

Request lecturer material: sn.pub/lecturer-material

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

A Letter From Your Instructor

Dear Reader,

I am so happy you have selected this book, and perhaps also taken the corresponding course on systems, functions, and safety! This means you have realized that any technical wizardry around new achievements of humanity may come at a huge cost: harming humans, damaging our environment, or damaging our property. By reading this book you acknowledge that it is not enough to be just any engineer (hardware engineer, computer engineer, software engineer, or mechanical engineer); it is important to look at your designs holistically, and to understand what consequences your designs, implementations, and overall work may have and what you can specifically do about them.

Yes, this book is about system engineering with a strong focus on reliability and the accompanying metrics. However, it is also a book that takes you on a journey of how these aspects can be applied to the modern, twenty-first-century endeavors that you might be undertaking these days. Even if you come from reliability-heavy disciplines, such as mechanical engineering, it is still good that you have encountered this book. It will be easy for you to contrast your knowledge with the many real-world examples, exercises, and applications listed: a great opportunity to recap and reestablish your perspective on complex system designs that now mostly depend on new suspects, namely software and high-performance computing hardware.

The purpose of this book is not to lay down intensive theoretical constructs and background of system engineering and reliability theory; this part is intentionally left lightweight. You can always refer to the bibliography at the end of this book to find additional great, in-depth material in those areas. Instead, the purpose of this book is to take you by the hand and lead you step by step, so that you understand and practice whatever is important to become ready for the world of system safety and functional safety in the context of your next engineering project.

You will feel much more relaxed, but cautious, when you return to your desk after you complete what I have prepared for you here.


I need you to be ready for an experiment with this book. It is not like most other books you might have read.

Start with Chapter 1. Read the introduction to get yourself motivated. It is only a two-page read! After this, sit back and relax while you watch the accompanying lecture video. You can access the videos by following the links in the book. Each video lasts about 30 minutes and briefly explains whatever you need to understand in the chapter.

After watching the video, take a short break. Have a coffee. Think a bit. Then come back and read the lecture notes section of the book. Pay attention to the bold and underlined parts. The lecture notes are short but essential: they contain only what is most relevant for the topic. You can keep them as your vital reference for the future.

After going through the lecture this way, I need you to practice. Dive into the calculation examples and exercise sections. Do whatever is required of you. It is excellent if you can bring in a peer or two to work out the exercise together. Fill in the given sheets or make the required drawings. Think and discuss along the way. If you are in a live course, discuss further with your instructor. Take about 1 hour to complete the exercise, not more.

Then, look at some examples of how other people solved the exercise. Their solutions are not ideal; they might be exactly like yours! All the shortcomings are addressed in the review text that you can read for each exercise solution. This is an actual review provided by me, the instructor. Learn from those examples and compare them with your work. There will be many similarities! If you have a chance, give your solution to your instructor for an original review.

When you are done, you need to assess your knowledge. Take another look at the key recap points and questions. Try to remember what you did by thinking about those points out loud. Spend a few minutes doing this.

Finally, take a short quiz. You have ten questions requiring a simple yes or no answer. It seems easy at first, but it's not. You need to understand the chapter to get every question right. After writing down your answers, check the key at the end of the chapter. If you got something wrong, go back to the text and try to understand why. If you have your instructor nearby, ask for additional clarification. It is very important to get every answer right and to understand why you got it right. Once you do that, you are done with the chapter and can move to the next one.

If you do what I suggest, you will participate in the “flipped approach” to the design for safety. You will need 1 to 2 days per chapter, and in a couple of weeks, you will have not only read the book but also understood it, and you will be able to apply what I wanted to teach you.

Please evangelize this topic further. This book is about the safety culture. The safety culture is what we need the most if we want to be safe and sustainable with our next-generation tech. Have great fun comprehending this book and see you in each and every chapter along the way!

Yours sincerely,
Prof. Dr. Milan Z. Bjelica

Acknowledgement

The work on this book was partially supported by the autonomous province Vojvodina of Republic of Serbia, Province Secretariat for High Education, Science, and Research, under Grant 142-451-2339/2022-01/02.


Chapter 1

Safety-Critical Systems

Introduction

Most industries nowadays struggle with the ever-increasing complexity of their products as well as of the tools and equipment involved. Wherever we look, we are overwhelmed with an abundance of new possibilities, many of which are enabled by emerging technical functionalities. Take a look at your smartphone, for example. It allows you to communicate at any distance; it reveals worldwide events to you and provides you with a live broadcast from any place almost instantly; the wealth of the world's knowledge is a click away; you are kept entertained, informed, and connected at all times. All this comes at the cost of the huge complexity of smartphones today, including the most sophisticated chips and millions of lines of code in the software. It is not rare, however, that such a complex system fails: at times, we have a poor Internet connection, our applications hang, or the battery drains. Also, we usually replace the device every 2 or 3 years.

Complexity is usually the consequence of the digital (r)evolution. In the 1990s, computers started to take over media broadcasting, starting from satellite TV, on to set-top boxes and TV sets, to digital flat TV screens. Then, the twenty-first century brought us the miniaturized computer in the form of a smartphone, and digitization continued and consumed all communications and broadcast. We further witnessed the revolution in mobility, which brought complex applications, autonomous driving algorithms, and modern driving assistance functions to vehicles. The aerospace sector, railway, production plants, energy sector, and others are getting more connected and computerized every year. The digitization momentum has started to grasp all the remaining industries, putting all things, and people, in the global communication network.

Enabling a machine to communicate with any other machine, with a powerful computer behind each device, allows the utilization of a multitude of new algorithms and the introduction of an unprecedented set of new functions. Automatic optimization of busy traffic across intersections, unmanned processes in production plants, and the car that takes you from point A to point B while you watch a live broadcast of a tennis match from across the globe: all this is slowly becoming reality.

All the aforementioned examples are based on complex systems, which require various interacting elements to achieve their end purpose. The listed new functions, which are enabled by communications, complex processors, and software, are making the systems even more complex, which may affect their reliability. If the systems control high energy, such as kinetic, electrical, or chemical energy, they may harm people and the environment or damage property. We call those systems safety-critical, and they range from simple battery-powered devices, through passenger vehicles, aerospace, or rail, all the way to factories, power plants, or chemical plants. Digitization of such systems introduces higher complexity, which can lower their reliability and increase the risk of mishaps that cause harm or damage.

Safety-critical systems are not new; they have been around for many decades, and even centuries. However, such complex, hardware- and software-rich installations require a new look at how those systems need to be defined, designed, implemented, maintained, and decommissioned. Some areas, such as software, bring together engineers who have little or no previous insight into the processes and methods required for safety-critical system design, whereas the role of those engineers is becoming dominant in safety-critical systems.

Therefore, we start the journey in this book by introducing safety-critical systems and understanding their key properties and aspects. Apart from the main terminology, such as system, critical system, and system of systems, this chapter sheds light on the required engineering processes in system engineering, emphasizing system delineation, requirement elicitation, and the required process models and traceability.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 M. Z. Bjelica, Systems, Functions and Safety, https://doi.org/10.1007/978-3-031-15823-0_1

Video Lesson

This chapter has a corresponding video lesson: sfs1.nit-institute.com


Lecture Notes

A system is a combination of different interacting elements which are organized to achieve one or more stated purposes. The system fulfills its purpose through the provision of system functions. Correct orchestration of system elements and system functions is of vital interest for safe system design. Looking at individual parts instead of taking the system view may lead to unforeseen component and function interactions (remember: Uber Tempe accident, safety driver vs the autopilot; Ariane 5 disaster, self-destruct mechanism vs position sensing; power production vs safety test in Chernobyl).

The system needs to be evaluated regarding its effects on the environment. Therefore, a system must be delineated from the environment, with the system boundary clearly defined (Fig. 1.1).

Fig. 1.1 System delineation: a clear split is needed between the system and its environment, via the definition of the system boundary

Technical systems provide functions using hardware, software, and mechanical parts. The system is safety-critical if it can cause harm (to users, people, the environment) or damage (to property). The system may cause harm or damage if it produces or controls energy, such as kinetic energy, electrical energy, or chemical/radioactive processes. Also, systems providing critical support (such as life support or decision support) may be safety-critical.

The Swiss cheese model illustrates how hazards may materialize into accidents (causing damage and/or harm) through a chain of events that passes through the “cheese holes” of many slices of cheese, each slice depicting one protective barrier of the system (e.g., system design and organization, implementation with respect to faults, exploitation with respect to failures, etc.) (Fig. 1.2).

Fig. 1.2 Swiss cheese model: hazards may propagate and become failures causing damage or harm
(Figure labels: hazards; system “layers”; poor organization, design, decisions…; latent faults; active failures; damage or harm)

A system of systems (SoS) is a system comprising other systems which are usually heterogeneous, distributed, and equipped with well-defined interfaces. One of the systems within the SoS is our system in focus, but it must not be overlooked that it interacts with other systems, which may provide or consume critical information based on the functions of our system (Fig. 1.3).

System delineation is performed by starting with the immediate environment and listing all the participating components which are obviously present. Then, the analysis “zooms out” to adjacent systems (as in SoS); then operators, users, and operation modes; and then other related systems (such as IT), as the last layer before the system boundary (delineation line). Outside of the boundary, the environment elements shall be listed (elements affecting the system, and those affected by the system). Finally, the intended functions, use cases, and legislation (standards) must be contrasted with the design.
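The delineation steps above can be sketched as a small data structure. This is only an illustration under stated assumptions: the layer names, the `Delineation` class, and the automatic sliding door example are hypothetical and do not come from the book.

```python
from dataclasses import dataclass, field

# Layers listed from the inside out, in the order the notes walk through them.
# The identifiers are this sketch's own labels (assumption).
INNER_LAYERS = (
    "components",        # obviously present parts of the system
    "adjacent_systems",  # neighbours in the system of systems
    "people_and_modes",  # operators, users, operation modes
    "related_systems",   # e.g. IT, the last layer before the boundary
)


@dataclass
class Delineation:
    """Records which elements sit inside or outside the delineation line."""
    inside: dict = field(
        default_factory=lambda: {layer: [] for layer in INNER_LAYERS})
    environment: list = field(default_factory=list)

    def add(self, layer, element):
        if layer not in self.inside:
            raise ValueError(f"unknown layer: {layer}")
        self.inside[layer].append(element)

    def add_environment(self, element):
        self.environment.append(element)

    def is_inside(self, element):
        return any(element in items for items in self.inside.values())

    def unfilled_layers(self):
        """Layers with no entries yet: a hint that the analysis is incomplete."""
        return [layer for layer, items in self.inside.items() if not items]


# Hypothetical example: an automatic sliding door.
door = Delineation()
door.add("components", "motor controller")
door.add("components", "presence sensor")
door.add("adjacent_systems", "building fire alarm")
door.add("people_and_modes", "maintenance mode")
door.add_environment("pedestrians")
door.add_environment("rain and dust")

print(door.is_inside("presence sensor"))  # True
print(door.is_inside("pedestrians"))      # False
print(door.unfilled_layers())             # ['related_systems']
```

Keeping the layers in a fixed order makes a skipped step visible: here the empty `related_systems` layer signals that the last zoom-out step has not been done yet.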

Fig. 1.3 System of systems: our regarded system (system-in-focus) may not be alone!
[Figure: the SYSTEM IN FOCUS with its subsystems, next to sibling systems, all inside a containing system]

Fig. 1.4 Some important system engineering process groups and processes
• Enterprise processes: Enterprise Environment Management, Investment Management, System Life Cycle Management, Resource Management, Quality Management
• Project processes: Project Planning, Project Assessment, Project Control, Decision-making, Risk Management, Configuration Management, Information Management
• Agreement processes: Acquisition, Supply
• Technical processes: Stakeholder Requirements Definition, Requirements Analysis, Architectural Design, Implementation, Integration, Verification, Transition, Validation, Operation, Maintenance, Disposal

System engineering (SE) concentrates on the design and application of the whole as distinct from the parts (regarding the "added value" of the whole), therefore providing all means to define, design, and utilize systems. SE regards the entirety of the design, including technical aspects, but also social and environmental aspects (Fig. 1.4).

The system engineering process provides a top-down decomposition approach, starting from the requirements definition and analysis, architectural design, and implementation (left side of the V model), and then assembling and exploiting "the whole" through integration, verification ("is the system implemented correctly"), validation ("is the right system implemented"), operation, maintenance, and disposal processes (right side of the V model) (Fig. 1.5).

Fig. 1.5 V model with the notion of traceability

Starting from the requirements, processes produce artifacts (items, documentation) that need to be clearly linked ("used by," "derived from") so that traceability is inherently provided. Agile development (e.g., SCRUM), where the implementation is organized in periodic sprints in which items are implemented based on a periodically updated backlog, can be used, but all processes, phases, and traceability aspects still need to be respected (Fig. 1.6).
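The "derived from" and "used by" links can be sketched as a small graph check. The artifact IDs and the rule that every requirement must lead to at least one test are assumptions made for this illustration; the text itself only requires that artifacts are clearly linked.

```python
# Each artifact lists the artifacts it is derived from (hypothetical IDs).
derived_from = {
    "REQ-1": [],
    "ARCH-1": ["REQ-1"],
    "IMPL-1": ["ARCH-1"],
    "TEST-1": ["IMPL-1"],
    "REQ-2": [],
    "ARCH-2": ["REQ-2"],
}


def used_by(links):
    """Invert 'derived from' links into 'used by' links."""
    result = {artifact: [] for artifact in links}
    for artifact, parents in links.items():
        for parent in parents:
            result[parent].append(artifact)
    return result


def reaches_test(artifact, users):
    """True if a chain of 'used by' links leads from the artifact to a test."""
    if artifact.startswith("TEST-"):
        return True
    return any(reaches_test(child, users) for child in users[artifact])


users = used_by(derived_from)
untested = [
    a for a in derived_from
    if a.startswith("REQ-") and not reaches_test(a, users)
]
print(untested)  # ['REQ-2']
```

Here REQ-2 has an architecture item but no implementation or test, so the chain from requirement to verification is broken and the gap is reported.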

Exercise 1

Electric scooters are increasingly used today and are the subject of controversy. They are classified as different vehicle types in different countries; in some countries they are banned from traffic, and in others they are restricted to certain lanes or speeds. You are selected by one of the countries where legislation is still pending to analyze this system, delineate it from the environment, and discuss its criticality (Fig. 1.7). Current situation per country: sfs1.link.nit-institute.com

Fig. 1.6 Agile methodology (SCRUM) shall not compromise traceability
[Figure: Product Backlog, Sprint Planning, Sprint Backlog, Daily Scrum, Sprint Review, Sprint Retrospective, Release increment]

Fig. 1.7 Your system – an electric scooter with geofencing

Your Tasks for the Exercise
• Discuss the functions of the scooter. Consider modes of operation, user interfaces, components, and the environment. Make rough notes.
• Think about using geofencing to prevent riders from operating the scooter in specific areas, and discuss it. Which additional functions and components would be required for this feature?
• Using the system delineation sheet, decompose the scooter into the immediate components, adjacent systems, operators/users, other related systems, and the environment. Make sure to clearly define the system boundary and to place all components, users, operators, and subsystems within their respective layers. Make sure you identify the system of systems!
• For each of the components, deduce whether it is safety-critical.
• Is the electric scooter as a whole safety-critical? Which assumptions can support or refute this claim?

To-Do List
• Perform the exercise individually or with your peers. One person can share the screen and keep notes while all contribute.
• Create presentables (e.g., drawing, Excel calculation).
• Discuss your solution and share it with others.

Note: Digital files for this exercise are available at sfs1.ex.nit-institute.com

Exercise 1 Template


Exercise 1 Sample Solutions See several exemplary solutions to the exercise:

Solution 1


Review Comments by the Instructor
• The delineation of the system is mostly correct. Please note that in the iterations within a pre-project or concept phase, the system boundary might "shrink" to exclude items that would not impact the design or safety as much (e.g., "paying system").
• In the decomposition, however, it would be good to keep the level of detail higher in the narrower shells (e.g., in the Z-shell). For example, the power chain (powertrain) might need to be further decomposed so that the electric motor is clearly visible (as well as, e.g., the electronic brake). A power supply system, similarly, would need to expose the existence of a battery pack (with high fire hazards!).
• Please note that software, although important to consider, is worthless if noted in a generic way, so specificity here is essential. Make sure to distinguish software elements that are important to regard in their own right from those that are inherent parts of other controllers.
• Notably absent are HMI units, which show information to and accept commands from operators/users. They are frequently overlooked in the design, but their complexity and non-intuitiveness may cause (un)foreseeable misuse.
• The other systems shell usually holds IT systems (Internet-based/cloud-based), such as the map update IT system for the geofencing function, but also the SW upgrade back-end.
• In the environment, it would be good to add more granularity (e.g., pedestrians, types of obstacles, etc.) as opposed to using generic notions, such as "Traffic."

Solution 2

[Figure: Electric Scooter with Geofencing – System View and Delineation (Group name: Group2). Shells: Z-Shell, Adjacent systems, Operators / Users, Other systems (System of Systems?), ENVIRONMENT. Elements shown: Engine, Battery, Brakes, Controller Unit, Speaker, Weight control unit, Comm. Module, GNSS, Sensors, Info Display, Folding and Carrying, Security, Power Source, Wheels, Rental operator, People, Internet, Satellite System, Weather Forecast, Traffic (type...), Road Infrastructure, Weather. A legend marks the safety-critical elements.]

Review Comments by the Instructor
The view is mostly correct, with several notes to consider:
• Although it is good to include the information display, other (input) types of HMIs would also need to be considered, and their criticality carefully assessed (note: some units, although showing only information, can be critical with regard to the decision-making of operators and to the misuse that can happen when, e.g., the operator misjudges the state of the system).
• The operator shell could include Maintenance and Service personnel. Maintenance/Service operation modes can be overlooked, although those special modes can uncover important hazards.
• The notion of "Security" shall be elaborated more closely, e.g., which operator or (sub)system is especially security-critical.
• Placing the Internet within the system boundary is usually too much even for this phase; communication issues are usually addressed as failures of communication modules/adapters instead.
• In the listing on page 2, modes of operation are good to have, but notably, here we miss modes such as "In-service," "Folded," or "Transport."


Solution 3


Review Comments by the Instructor
• It is very good to have the leg (stand) analyzed, since, e.g., the stand unfolding while driving may pose a serious hazard.
• Make sure to deepen the analysis with respect to the Maintenance/Service mode and the corresponding personnel and equipment.


Solution 4

[Figure: Electric Scooter with Geofencing – System View and Delineation (Group name: GROUP 4). Shells: Z-Shell, Adjacent systems, Operators / Users, Other systems (System of Systems?), ENVIRONMENT. Elements shown: Control unit, Brake control module, Power supply, Mobile (un)locking, User interface, Location module, SW update module, Light control, Module for vehicle speed, Steering module, Tire inflating system, Charging station, Driver, Passenger, Maintenance, Geo mapping server, SW update server, Geofencing planning and operations system, Pedestrians, Other vehicles, Animals on the way, Traffic conditions, Weather conditions, Road conditions.]

Notes from the solution sheet:
- Geofencing features: location modules; slow down before traffic crossing; no max speed in pedestrian zone
- Energy: kinetic (could potentially harm driver and pedestrians) -> scooter is safety-critical


Review Comments by the Instructor
The analysis is mostly correct, with some outstanding remarks:
• The Z-shell mostly includes controllers; however, actual actuators and other components are very important for safety, such as the brakes themselves, the battery, knobs/levers, etc.
• The breakdown of potentially affected/affecting entities seems OK, although it is not clear why animals are considered separately.
• Some overlooked "environmental" effects, such as the weight limit, can be added. Also, the descriptions shall be as precise as possible, and generic notions such as "conditions" shall be avoided.

Key Recap Questions
To make sure you fully understood safety-critical systems, reflect on the following:
• Think about a system of choice.
• Think about users, system environment, and boundary.
• Is your system critical?
• Which components does it have?
• Are there subsystems?
• Think about hazards.

Self-assessment
Now take the time to self-assess your knowledge by taking the quiz below. Each listed statement is either correct or incorrect. Please mark your answer and then check in the key at the end of the book.
1. A system engineer in a company focuses on the implementation of the system components which are assigned to his company, with little or no regard to the use cases and functions that would be provided by the end system or product.
2. When analyzing systems for safety, direct users and operators of the system are usually located within the system boundary.
3. The system is safety-critical only if it can kill or injure people while performing its functions.
4. The system is always safety-critical if it controls a chemical process that may cause a fire.
5. If the amount of kinetic energy produced by the system is somewhat reduced (e.g., vehicle speed is limited), the safety-criticality of the system may change as a result.
6. Systems that only provide information to the operators via screens are usually not regarded as safety-critical.
7. One layer of protection in a Swiss cheese model (e.g., one system component, or one single process), if failed, can only be a major factor for an accident in case other slices (layers) also fail (exhibit "holes").
8. When analyzing the safety of an aircraft, an air traffic control system must also be regarded for total safety assessment since air travel is actually a system of systems.
9. In a system engineering process, a verification test plan must always consist of test cases that reference implementation units, which in turn reference system architecture documentation and the respective requirements.
10. The correctly implemented system, verified through the complete test plan with perfect coverage, is always a safe system.

Self-assessment Key
1. False
2. True
3. False
4. True
5. True
6. False
7. False
8. True
9. True
10. False

Chapter 2

System Requirements and Functions

Introduction

If you try to remember your last project, you will likely recall a document with a title similar to "requirements specification." Unfortunately, many would remember this document as an incomplete set of things the product should do, with many aspects left unsaid and decided during the development phases. Some requirements were vaguely defined, leaving the developers puzzled about how to decompose and implement them. Frequently, some things were known but not written down, or the final design deviated from the actual desires of stakeholders, which were omitted or simply not formulated correctly.

Now let us put that situation in the context of safety-critical systems. The problems become obvious. If we do not know exactly what the system shall do, and exactly which functions it shall perform, we will end up not knowing what kind of harm or damage the system may produce as a side effect. Incorrect, incomplete, or missing requirements specifications, according to research, actually account for about 40% of all problems with system safety. This is truly an alarming indicator!

This is why we dedicate this chapter to system requirements and system functions and try to understand how requirements engineering and the corresponding processes should be performed in any design, and most importantly in safety-critical system designs. It is essential to fully scope the list of clearly described functions the system shall perform, since only by listing functions can we analyze potential failures and failure modes of those functions. That way we can figure out if, and to what extent, our system behavior may result in a mishap.

Requirements are all about communication. In the same way that two people exchange information through speech, requirements communicate the intended functions of the system to all the stakeholders and make sure that everybody is aligned. Since stakeholders are people, the requirements must be expressed in a way that is easily understood and unambiguous. Since each stakeholder looks at the system from his or her unique angle, much important information can be left out in communication, regarded as "goes without saying." This is why a careful requirements engineering process is required.

This process starts with requirement elicitation. The need to elicit requirements, instead of expecting them to be specified for you by some "invisible external entity," changes how we think. We (i.e., the requirements engineers!) shall gather and figure out requirements by interacting with all the stakeholders in any way required, so that we draw out every single bit and document each requirement correctly. It is also very important to know who your stakeholders are and to understand that the list of stakeholders is much broader than we immediately think. Our stakeholders are not only people from management and the customers. The list is much larger, including end users, suppliers, developers, shareholders, competitors, and standardization bodies, but also society: people that might be affected by the system, governments and legislators, trade unions, and associations.

Requirements engineering equally requires the requirements to be expressed, formulated, and documented in a specific, unambiguous, and consistent way. Several processes must also be meticulously followed, specifying how the requirements shall be negotiated, validated, coordinated, updated, tracked, traced, and changed. This chapter will give you a much-needed introduction to all these aspects, to make you fully aware of the available practices so that you do not miss this critical first step in the design of a safety-critical system.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 M. Z. Bjelica, Systems, Functions and Safety, https://doi.org/10.1007/978-3-031-15823-0_2

Video Lesson This chapter has a corresponding video lesson: sfs2.nit-institute.com


Lecture Notes

A correct definition of system requirements is essential to correctly define system functions and afterward assess hazards, making the requirements definition one of the most important phases with respect to system safety. Bad requirements specifications are frequently encountered in practice (~40%), which is alarming from the safety perspective (Figs. 2.1 and 2.2).

A requirement is a needed condition or capability of the system. Functional requirements define system functions that satisfy the user's need to achieve a goal. Requirements need to be elicited (gathered, figured out) since they are not given directly by the stakeholders. Then, requirements need to be correctly documented (formulated, expressed), communicated (negotiated, validated, coordinated), and finally managed (updated, tracked, traced). Requirements are hierarchically organized so that they support traceability (e.g., high-level requirements -> system requirements -> functional requirements -> technical/hardware/software requirements), and they also reference the design specification and the items which specify, implement, and verify them.

Requirements are elicited from various stakeholders: end users, customers, shareholders, competitors, managers, developers, and suppliers, but also government/legislators, trade unions, and other affected groups in society. Our stakeholders may also be in the environment (beyond the system boundary!).

Requirements need to be clearly documented (in a semiformal or formal way, in writing). They define what shall be done, instead of how it shall be implemented. Ambiguity in wording shall be avoided, and completeness shall be sought. Consistency and traceability throughout the requirements specification are essential, as well as procedures for maintaining the specification and agreeing on requirement definitions and changes.

Fig. 2.1 Distribution of problems with incorrect, incomplete, or missing requirements specifications. (Source: Fraunhofer study on functional safety in automotive – ISO 26262, 2013)
[Figure: three panels (INCORRECT, INCOMPLETE, MISSING), each showing the share of answers "never," "rarely," and "frequently"]


Fig. 2.2 An infamous “Project Tree” illustration of what happens with the requirements specified incorrectly

The most important requirement groups are functional requirements ("The vehicle shall drive autonomously in a traffic jam on a highway"), quality requirements ("The display shall be readable in bright sunlight"), and constraint requirements ("The power consumption of the controller shall not exceed 10W"). A special case of constraint requirements is safety requirements ("Autonomous driving shall be disabled if the driver becomes inattentive").

Requirements engineering is a part of system engineering, dealing with requirements development (elicitation, documentation) and requirements management (change management, coordination and communication, tracking and monitoring, escalation).

Methods to develop and elicit requirements include the Kano model, which classifies the initial set of requirements according to the level of goal (need) fulfillment and the level of satisfaction the inquired stakeholder expresses about this fulfillment. This method requires a questionnaire in which the stakeholder uses a Likert scale ("I like that," "I expect that," "I am neutral," "I tolerate that," "I dislike that") to answer the positive question ("What if the requirement/feature is there/working") and the negative question ("What if the requirement/feature is not there/not working") for each of the requirements. Based on the answers, the requirements are classified as "Performance" (key differentiator for the system), "Must have" (mandatory), "Delighter" (not expected, but increasing value), "Questionable" (needs further elicitation), "Indifferent" (not perceived as important), and "Reverse" (shall be removed or redefined) (Figs. 2.3 and 2.4).

Fig. 2.3 Kano model

Rows: answer to "What if the feature is there / working". Columns: answer to "What if the feature is not there / not working".

| Feature there \ not there | I like that | I expect that | I am neutral | I tolerate that | I dislike that |
| I like that | Questionable | Delighter | Delighter | Delighter | Performance |
| I expect that | Reverse | Questionable | Indifferent | Indifferent | Must have |
| I am neutral | Reverse | Indifferent | Indifferent | Indifferent | Must have |
| I tolerate that | Reverse | Indifferent | Indifferent | Indifferent | Must have |
| I dislike that | Reverse | Reverse | Reverse | Reverse | Questionable |

Fig. 2.4 The table used in a Kano requirement elicitation model questionnaire

Elicitation is carried out in brainstorming sessions (in workshops with stakeholders) or through wider inquiries via questionnaires. The requirement is formulated in writing, in natural language: "The system" + "shall/should/will/may" + <process verb> / "provide <whom> with the ability to <process verb>" / "be able to <process verb>" + <object>. All the requirements together form a system requirements specification (SRS). This specification has several formal notation elements, such as ID (with clear nomenclature), Name, Description (requirements definition), Author,

Version, Traceability (Derived from/Used by/Related to), Priority, Note, etc. (Fig. 2.5).

| ID | Name | Description | Author | Ver. | Derived from | Used by | Prio | Note |
| FR_001 | Traffic Jam Pilot (TJP) Activation | The vehicle shall enable the TJP function to be activated if the speed is below 40mph, driver is attentive and the guard rail is detected. | Bjelica | 0.5 | HLR_001 | TR_001 TR_002 TR_003 TR_004 | 1 | To be discussed. |
| FR_002 | Cruise control start | The vehicle shall maintain the current speed when activated by the driver. | Bjelica | 1.0 | HLR_002 | TR_005 TR_006 | 2 | - |
| FR_003 | Cruise control stop | The vehicle shall stop maintaining the current speed if the driver makes any action upon vehicle pedals or the steering wheel. | Bjelica | 1.0 | HLR_002 | TR_007 TR_008 | 2 | Discuss other means of stopping. |
| QR_001 | ISO 26262 | Digital Cockpit Domain Controller shall adhere to ISO 26262 ASIL B. | Valls | 0.7 | HLR_037 | All TR | 2 | Under review |

Fig. 2.5 Example excerpt from a system requirements specification, showing traceability (the "Derived from" and "Used by" columns)

With the requirements at hand (especially the functional ones), we can try to figure out what happens if a function fails and whether there is a hazard (potential for damage or harm) to people, the environment, or property, i.e., whether the failure is dangerous. Safety requirements would then state the identified constraints that shall prevent this dangerous failure from happening. Further, safety functions could be defined which implement the safety requirements.
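The Kano evaluation table (Fig. 2.4) can be read as a simple lookup from a pair of questionnaire answers to a category. The sketch below encodes the table as printed; the short answer keys and the function name are assumptions introduced for this example.

```python
# Short keys for the five Likert answers, in the order used by the table.
ANSWERS = ["like", "expect", "neutral", "tolerate", "dislike"]

# Rows: answer when the feature is there / working.
# Columns: answer when the feature is not there / not working.
KANO_TABLE = [
    ["Questionable", "Delighter", "Delighter", "Delighter", "Performance"],
    ["Reverse", "Questionable", "Indifferent", "Indifferent", "Must have"],
    ["Reverse", "Indifferent", "Indifferent", "Indifferent", "Must have"],
    ["Reverse", "Indifferent", "Indifferent", "Indifferent", "Must have"],
    ["Reverse", "Reverse", "Reverse", "Reverse", "Questionable"],
]


def kano_category(present, absent):
    """Classify one requirement from one stakeholder's two answers."""
    return KANO_TABLE[ANSWERS.index(present)][ANSWERS.index(absent)]


print(kano_category("like", "dislike"))    # Performance
print(kano_category("expect", "dislike"))  # Must have
print(kano_category("like", "neutral"))    # Delighter
```

A stakeholder who likes having the feature and dislikes its absence marks it as "Performance," while one who merely expects it but dislikes its absence marks it as "Must have."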

Exercise 2

For the electric scooter with geofencing, discussed in Chapter 1, we now need to construct a brief system requirements specification (on the functional level). In the first part of the exercise (20 min), your team shall construct a requirements specification of at least ten requirements, denoting the requirements that present system functions (functional requirements) and other requirements related to constraints (safety!) or quality. In the second part of the exercise (10 min), the requirements shall be reviewed with the stakeholders, who are from your counterpart team, using the Kano model. In the last 10 min of the exercise, your team shall prioritize the requirements according to the outputs of the Kano analysis (Fig. 2.6).

| ID | Name | Description | Author | Ver. | Derived from | Used by | Prio | Note |
| FR_001 | Traffic Jam Pilot (TJP) Activation | The vehicle shall enable the TJP function to be activated if the speed is below 40mph, driver is attentive and the guard rail is detected. | Bjelica | 0.5 | HLR_001 | TR_001 TR_002 TR_003 TR_004 | 1 (P) | - |
| FR_002 | Cruise control start | The vehicle shall maintain the current speed when activated by the driver. | Bjelica | 1.0 | HLR_002 | TR_005 TR_006 | 2 (D) | - |
| FR_003 | Cruise control stop | The vehicle shall stop maintaining the current speed if the driver makes any action upon vehicle pedals or the steering wheel. | Bjelica | 1.0 | HLR_002 | TR_007 TR_008 | 3 (I) | Discuss other means of stopping. |
| FR_004 | Fast acceleration | The vehicle shall accelerate upon pedal press using the highest torque available. | Stevic | 0.5 | HLR_005 | TR_012 | (R) | Removed after Kano. |
| QR_001 | ISO 26262 | Digital Cockpit Domain Controller shall adhere to ISO 26262 ASIL B. | Valls | 0.7 | HLR_037 | All TR | 2 | Under review |

Fig. 2.6 System requirements specification after Kano analysis (prioritized)

Your Tasks for the Exercise
• Based on the discussion from Chap. 1, fill in the requirements table correctly. To show traceability, consider some high-level requirements but do not spend time writing them in. Also, use dummy technical requirements for reference at this point.
• Mark all functional requirements with the prefix FR, quality with QR, constraint with CT, and safety with SF.
• Make sure to follow the formulation wording correctly (e.g., "The system shall . . ."), being as clear, unambiguous, and complete as possible.
• Present the table to the counterpart group. Split all requirements into the categories (Delighter, Performance, Must Have, Questionable, Indifferent, or Reverse) according to the Kano analysis, prioritize them (Performance, then Delighters, then Indifferent), and remove the Reverse ones.
• What about the completeness of the stakeholder list? Can you actually remove some requirements based on any stakeholder response? Get ready to discuss!

To-Do List
• Perform the exercise individually or with your peers. One person can share the screen and keep notes while all contribute.
• Create presentables (e.g., drawing, Excel calculation).
• Discuss your solution and share it with others.

Note: Digital files for this exercise are available at sfs2.ex.nit-institute.com
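The prioritization step of the exercise can be sketched as a short sorting routine. It follows the order stated in the task (Performance, then Delighters, then Indifferent) and drops Reverse requirements. Returning the remaining categories (Must have, Questionable) in a separate list for discussion is an assumption of this sketch, as is the SF_001 entry in the sample data.

```python
# Rank of the Kano categories that the exercise orders explicitly.
RANK = {"P": 1, "D": 2, "I": 3}


def prioritize(kano_by_id):
    """Return (ranked IDs, IDs left for discussion); Reverse items are removed."""
    kept = {rid: cat for rid, cat in kano_by_id.items() if cat != "R"}
    ranked = sorted(
        (rid for rid, cat in kept.items() if cat in RANK),
        key=lambda rid: (RANK[kept[rid]], rid),
    )
    undecided = sorted(rid for rid, cat in kept.items() if cat not in RANK)
    return ranked, undecided


# Kano letters as in Fig. 2.6, plus one hypothetical Must-have item.
ranked, undecided = prioritize(
    {"FR_001": "P", "FR_002": "D", "FR_003": "I", "FR_004": "R", "SF_001": "M"}
)
print(ranked)     # ['FR_001', 'FR_002', 'FR_003']
print(undecided)  # ['SF_001']
```

FR_004 is classified as Reverse and therefore disappears from both lists, which matches the "Removed after Kano" note in Fig. 2.6.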


Exercise 2 Template

Electric scooter with geofencing – Requirements specification

| ID | Name | Description | Author | Ver. | Derived from | Used by | Priority | Kano (P/M/D/I/Q/R) | Note |
| FR_001 | | | | | | | | | |
| QR_001 | | | | | | | | | |
| SF_001 | | | | | | | | | |

Exercise 2 Sample Solutions See several exemplary solutions to the exercise:

Solution 1

Electric scooter with geofencing – Requirements specification

| ID | Name | Description | Author | Ver. | Derived from | Used by | Priority | Kano (P/M/D/I/Q/R) |
| FR_001 | Acceleration | The vehicle shall accelerate when accelerate lever is pulled on. | Vanja Arbutina | 0.5 | HLR_001 | TR_001 TR_011 | 1 | M |
| FR_002 | Braking | The vehicle shall slow down when brake lever is pulled on. | Nebojsa Cvijic | 0.5 | HLR_002 | TR_002 TR_022 | 1 | M |
| FR_003 | Folding | The vehicle shall be able to be folded. | Vladimir Pavlovic | 0.5 | HLR_003 | TR_003 | 2 | D |
| FR_004 | Steering | The vehicle shall change direction according to steering handle. | Caner Dur | 0.5 | HLR_004 | TR_004 | 1 | M |
| FR_005 | Charging | The vehicle shall be able to be recharged. | Dario Peric | 0.5 | HLR_005 | TR_010 | 1 | M |
| FR_006 | Battery indicator | The vehicle shall have battery state indication. | Vanja Arbutina | 0.5 | HLR_006 | TR_009 | 3 | D |
| FR_007 | GPS | The vehicle shall have communication with GPS. | Nebojsa Cvijic | 0.7 | HLR_007 | TR_005 | 2 | M |
| CR_001 | Power consumption | The vehicle shall not consume more than 50W at average. | Vladimir Pavlovic | 0.5 | HLR_005 | TR_006 | 3 | D |
| SR_001 | Geofencing | The vehicle shall indicate when leaving safe zone. | Caner Dur | 0.7 | HLR_007 | TR_015 | 2 | M |
| QR_001 | Load weight | The vehicle shall be able to carry 120kg | Dario Peric | 0.5 | HLR_010 | TR_008 | 2 | M |

Note (row not identifiable in the extracted layout): To be more clarified in future.


Review Comments by the Instructor
• FR_007 seems more like a technical requirement than a functional requirement. Functional requirements describe system functions (goals): the system provides these functions to its users/operators, and the effects of those functions (as well as the failures of those functions!) manifest toward the environment, at the system boundary. Instead, say, the functional requirement might cover the need for geofencing (as in SR_001), whereas the current FR_007 might become TR_XXX derived from SR_001.
• Generally, when deriving functional requirements, it is good to first list the operation modes of the system (e.g., driving, parked, rolling, carrying) and then list various functions for each of the modes.

Solution 2

Electric scooter with geofencing – Requirements specification

| ID | Name | Description | Author | Ver. | Derived from | Used by | Priority | Kano (P/M/D/I/Q/R) | Note |
| CR_001 | Maximum velocity | The maximum velocity of the system shall not exceed 30 km/h. | Jelić | 1 | HLR_001 | TR_001 TR_002 | 2 | D | |
| QR_001 | ISO 26262 | The system shall be developed according to ISO 26262 ASIL B | Miškulin | 1 | HLR_002 | ALL TR | 1 | M | Discuss other standards? |
| CR_002 | Battery life in idle state | The system battery shall be able to provide power for at least 2 hours in idle mode | Brkljač | 1 | HLR_001 | TR_003 | 1 | M | |
| FR_001 | Folding mode | The system shall be foldable for carry | Lubina | 1 | HLR_004 | TR_004 TR_005 | 5 | M | |
| SR_001 | Full stop | The system must be able to fully stop within 10m at maximum velocity and load. | Jelić | 1 | HLR_001 | TR_006 TR_007 TR_008 | 1 | M | |
| FR_002 | Acceleration | The system shall be able to accelerate when holding the acceleration lever | Miškulin | 1 | HLR_001 | TR_006 TR_007 TR_008 | 1 | M | |
| FR_003 | Braking | The system shall be able to decelerate when holding the brakes | Ilić | 1 | HLR_001 | TR_006 TR_007 TR_008 | 2 | M | |
| FR_004 | Steering | The system shall be able to change direction when steered. | Ilić | 1 | HLR_001 | TR_009 | 3 | M | Not understandable. Which direction and how steered? |
| FR_005 | Shutdown when geofenced | The system will be able to safely shutdown when exiting geofenced area. | Brkljač | 1 | HLR_003 | TR_010 | 4 | M | How to shutdown when exiting area? Can be elaborated as technical requirement. |
| SR_002 | Low battery warning | The system shall indicate low battery warning when the percentage drops below 10% | Jelić | 1 | HLR_004 | TR_011 | 3 | M | |
| SR_003 | Leg stand during driving | The system shall provide that the leg stand shall not be opened during driving | Lubina | 1 | HLR_005 | TR_006 TR_007 | 2 | M | |
| QR_002 | Display adaptiveness | The system shall provide that the display is adaptable to brightness in the environment | Brkljač | 1 | HLR_002 | TR_020 | 2 | D | Display remains clearly visible |

Review Comments by the Instructor
• Usually, functional requirements come first, stemming directly from high-level requirements. Constraint and quality requirements are very important, but in the presentation to the stakeholders, they shall be put second since they sound more technical. Generally, the requirements engineer shall act as a facilitator of the communication among the stakeholders; therefore, easy-to-understand requirements (unambiguous, agreeable) are desired.
• Kano analysis yielded Delighter for CR_001, which states the maximum allowed velocity of the scooter. Usually, this requirement delights safety engineers or legislators/auditors, so it is indeed possible. Please note that, depending on the stakeholders, Kano analysis may give different results, and sometimes the results must be averaged or weighted according to the importance of a stakeholder. In the end it is all about the argumentation, so many approaches to prioritization and agreement are possible, as long as the problem has been analyzed from various angles and among all relevant stakeholder groups.
• Make sure not to have a vague requirement definition. For example, QR_002 states "adaptable to brightness." This is unclear. What kind of adaptation shall be provided? Instead, "display shall remain visible for all daylight brightness levels" would be clearer. Also, "change direction when steered" in FR_004 is vague; better: "change the direction according to the position of the steering handle."
• FR_005 is rather a safety requirement; however, it must also have additional requirements (maybe derived from it) which would specify in more detail how such a critical operation must be handled (e.g., producing an alarm, slowing down, etc.).
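Weighting Kano results by stakeholder importance, as mentioned in the comments above, could look like the following sketch. The weights, the vote data, and the rule of picking the category with the largest total weight are assumptions for illustration; the text does not prescribe a particular aggregation method.

```python
from collections import defaultdict


def weighted_kano(votes):
    """votes: list of (category, stakeholder_weight) pairs.

    Returns the category with the largest total weight. On a tie,
    the category that was voted first wins.
    """
    totals = defaultdict(float)
    for category, weight in votes:
        totals[category] += weight
    return max(totals, key=totals.get)


# Hypothetical votes for CR_001: one safety engineer (weight 3.0)
# and two riders (weight 1.0 each).
votes_cr_001 = [("Delighter", 3.0), ("Indifferent", 1.0), ("Indifferent", 1.0)]
print(weighted_kano(votes_cr_001))  # Delighter
```

With equal weights the two riders would outvote the safety engineer, so the choice of weights is itself something to argue and agree on with the stakeholders.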

Solution 3 Electric scooter with geofencing – Requirements specification ID

Name

FR_001

Increasing speed

FR_002

Decrease velocity

FR_003

HMI information

FR_004

Lights

Description The scooter shall increase speed when the driver pulls the gas lever The scooter shall decrease speed when the driver release the gas lever The scooter shall present information about the vehicle speed, battery percentage, driving mode and mileage on display The scooter shall turn on the lights when the visibility (light conditions) drops below a level The scooter should provide function to be folded for easier transport

Author

Ver.

Derived from

Used by

Manic

1

HLR_001

TR_001 TR_0011

Beric

1

HLR_001

TR_004

Basta

1

HLR_001

TR_005 TR_012 TR_013

Basta

1

HLR_002

TR_006

Priority

Kano (P/M/D/I/Q/R)

1

M

1

M

Note

LIGHT_LEVEL is referenced in the definition part of the requirements

FR_005

Fold

Basta

1

HLR_002

TR_007

SF_001

Maximum velocity

The scooter shall not exceed maximum velocity of 25 km/h

Videnovic

1

HLR_002

TR_003

2

P

SF_002

Maximum acceleration

The scooter shall not exceed the maximum acceleration of

Basta

1

HLR_002

TR_008

1

M

MAX_ACC_VALUE is referenced in the definition part of the requirements

SF_003

Battery level

The scooter shall not be able to drive if battery level below 10%

basta

1

HLR_002

TR_009

Q

Rejected after review

basta

1

HLR_002

TR_010

1

M

Karan

1

HLR_002

TR_002

3

I

SF_004

Weight limit

QR_001

Constant velocity

The scooter shall not be able to drive if there is weight below 25kg The scooter should remain constant velocity per certain drive mode (SPORT, ECO, CITY)

Review Comments by the Instructor • Again, different stakeholders may respond differently to requirements (e.g., Must-Have for SF_004 could change if we are to make inquiries among children). • SF_003 is removed after Kano; however, it is questionable whether the safety engineer was among the stakeholders (to bring additional perspective to this requirement).


• In FR_005, make sure not to use vague terms, such as “easier transport.” Better: “to enable the carrying with one hand.” • Make sure to have variables resolved in the same document, in the definitions section, and that they are kept together with the specification so that those values cannot be arbitrarily interpreted.

Solution 4 Electric scooter with geofencing – Requirements specification ID

Name

Description

Author

Ver.

Derived from

Used by

Priority

Kano (P/M/D/I/Q/R)

FR_001

Acceleration

The electric scooter shall enable acceleration and deceleration.

Bojkić

1.0

HLR_001

TR_001 TR_002

1

M

FR_002

Steering

The electric scooter shall provide steering control for the driver.

Bojkić

1.0

HLR_001

TR_003

1

M

Braking

The electric scooter shall provide braking control for the driver.

Bojkić

1.0

HLR_001

TR_004

1

M

HLR_002

TR_005 TR_006 TR_007

2

D

HLR_002

TR_008 TR_009

2

D

FR_003

FR_004

FR_005

Weight limit

Geofencing

FR_006

Battery alert

FR_007

Mobile control

FR_008

Sound signaling

QR_001

Light signaling

SF_001

Velocity control

The electric scooter shall prevent starting if the weight exceeds the limit. The electric scooter should prevent driving outside of the defined geofencing zone. The electric scooter should give the driver an alert if the battery level is insufficient for the travel distance. The electric scooter should provide controls for turning on and off via the mobile application. The electric scooter may be able to produce sound alerts. The electric scooter shall have lights bright enough so that the driver can navigate safely on the road with insufficient visibility. The electric scooter shall prevent powering off unless it is not stationary.

Barić

Mihić

1.0

1.0

Mihić

1.0

HLR_002

TR_010 TR_011 TR_012

Simić

1.0

HLR_003

TR_013

Simić

1.0

HLR_004

TR_014 TR_015

Kaštelan

1.0

HLR_004

TR_016 TR_017

Mihić

1.0

HLR_037

TR_125

Note

Safety requirement?

To the TR level?

Group 4

Review Comments by the Instructor • FR_008 is rather a technical requirement that can be derived from a functional requirement in which sound alerts may be utilized (e.g., battery low alert, etc.). Functions at the scooter level shall clearly describe the goals of the complete system toward the users/operators and with respect to the environment. • Geofencing may even be stated as a safety goal (SG), which is defined together with high-level requirements (HLRs). Then, safety requirements (SFs) are derived from SGs, and FRs are derived from HLRs, whereas SFs are related to FRs. • FR_007 might be made clearer by stating the connection with the use case (e.g., renting), which can be set up in the HLR from which FR_007 was derived. • The weight limit shall be set in FR_004 or in an additional FR, since the configurability of this limit (e.g., in the maintenance phase) shall be enabled, although this is unlikely. • QR_001 is impossible to implement as it is currently formulated.


Key Recap Questions In this chapter, you have been reading about and practicing exemplary requirements engineering practices through requirement elicitation:
• Think about a recent system you worked with.
• Remember its requirements specification.
• Were you involved in the elicitation?
• How would YOU perform the elicitation if given the task?
• Imagine how bad (or nonexistent) requirements impact safety.

Self-assessment Now take the time to self-assess your knowledge by taking the quiz below. Each listed statement is either correct or incorrect. Please mark your answer and then check in the key at the end of the book. 1. Requirement elicitation is performed among the project team members in a company developing a system. 2. Social groups outside the system boundary (in the system environment) are among the stakeholders in the requirements development process. 3. Failing to fulfill a constraint requirement may bear the risk to system safety. 4. The requirement engineer is not in charge of requirements communication, coordination, and escalation. 5. To apply a Kano analysis, the initial requirements specification needs to be assembled as a proposal to the stakeholders. 6. If a stakeholder expects that a listed feature is not present in the system, but would like it to be present, then this feature is assessed to be a Delighter according to Kano analysis. 7. Kano analysis helps us to detect features that “go without saying” and are usually not directly stated by the stakeholders. 8. System requirements specification allows a hierarchical view of the requirements, allowing traceability from, e.g., the functional requirements to technical requirements, but traceability to other artifacts (design specification, items, tests) is not maintained. 9. Safety requirements can only be defined by assessing failures of functions implementing functional requirements. 10. Dangerous failures of system functions, which implement functional requirements of the system, always lead to system hazards and potential accidents; therefore, the requirements specification needs to be complete before the safety assessment.

Self-assessment Key
1. False
2. True
3. True
4. False
5. True
6. True
7. True
8. False
9. False
10. True

Chapter 3

System Safety

Introduction To set the stage for understanding system safety, how it is defined, and how it can be provided, we first needed to understand how systems are defined and specified. Having done this in Chaps. 1 and 2, through the demystification of safety-critical systems and their requirements and functions, we can now knowingly dive into the challenging world of system safety. We deem a system safe if it can be reasonably shown that it cannot cause unacceptable harm to people or the environment, or damage to property. It is immediately obvious that safety is not an absolute category; we always accept some level of risk involved with using the system. Therefore, it is important to be able to express the level of risk which is acceptable. To express this risk, we first need to be able to quantify it. Undoubtedly, methods to assess and quantify the risk involved with a system are a big consideration in the field of system safety, and they are addressed in this chapter. To be able to correctly assess risk, we need to understand what our system is doing. This means we need to go back to the system requirements and functions, in order to assess what happens in case of failures of those functions. Further, the system boundary tells us which system components may exhibit failures, and which targets from the system environment might be affected by those failures. This allows us to elicit hazards, which are all potential situations causing the system to fail in a harmful or dangerous way. Only by drawing out all imaginable hazards can we start quantifying the risk of each hazard and evaluating the extent of harm or damage each hazard may cause if activated. Notions such as accident, incident, hazard, risk, fault, error, failure, severity, and probability are all very important and precisely defined in the field of system safety, so they are carefully addressed in this chapter as well.
Learning the quantitative and qualitative methods to express the risk of a hazard allows us to figure out the acceptability of the risk and to treat it adequately to bring it down to the acceptable level. Active and passive safety measures are introduced, which allow us to act starting from the earliest design phases and inherently build safety into the systems we design. Active measures allow the safety mechanism designed for the system to react when an operational anomaly or abnormality is detected and bring the system to a state in which the hazard can no longer activate (e.g., cutting off the power supply, applying brakes, sounding alarms, opening gates, turning off a function). Passive measures may help reduce the risk by introducing protective gear or equipment (e.g., safety helmets, safety belts in cars, warning signs or labels), or by providing additional safety training. Active measures are more interesting since they include technical implementations which potentially require electronics (hardware), software, and mechanical parts, and they represent subsystems that are critical with regard to their reliability. The field dealing with the prescription of active measures and their correct design and assessment is called functional safety, and it is specifically considered from Chap. 5 onward. Finally, process models required in system engineering are now extended with the respective processes of safety engineering. Those processes include systematic ways of how hazards are analyzed and assessed, how they can be removed, and what needs to be done with the remaining (residual) failures in the context of risk acceptability and the requirements set forth by the safety standards, the legislators, and finally, us all as societies.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 M. Z. Bjelica, Systems, Functions and Safety, https://doi.org/10.1007/978-3-031-15823-0_3

Video Lesson This chapter has a corresponding video lesson: sfs3.nit-institute.com


Lecture Notes Safety is a condition of being protected from danger, risk, or injury. Safety, therefore, is freedom from conditions causing death, injury, occupational illness, damage or loss of property, or damage to the environment. Safety is not absolute; we want our system to function with an acceptable minimum of accidental loss. Various areas of safety exist in general (food safety, occupational health and safety, public safety, job safety, drug safety, etc.). System safety deals with the safety of technical systems, which, when deployed, shall not cause harm or damage. Functional safety is related to system safety, regarding only active measures which are needed to keep or bring the system to the safe state (Fig. 3.1).

Fig. 3.1 Types of safety (labels: Safety, Food Safety, Occupational Health & Safety, Public Safety, Environment Safety, Drug Safety, Job Safety, System Safety (technical systems), Functional Safety)

A chain of events with a system may lead to an accident if the last event in the chain causes damage or harm to a target (person, environment, property). If the chain only partially executes, with damage or harm avoided, we regard it as an incident (a near accident). A hazard is the potential for an accident (“Driving on the icy road”). It is an imagined situation that may materialize as an accident if actuated (e.g., “braking when driving”). When assessing safety, we need to list all the hazards and evaluate them: which conditions would allow a hazard to materialize into an accident. Each hazard, therefore, has a causal factor (“icy road, braking”) and an associated probability (“chance of braking on the icy road”). It also has targets that can be harmed or damaged, as well as the severity of the harm or damage caused (assessing consequences). Within a safety analysis and associated processes, we seek to remove hazards, thereby removing the potential accidents/incidents as well (Fig. 3.2).

Fig. 3.2 Hazard and its components leading to incidents and accidents (labels: Causal factors, Probability, Actuation (Event), Hazard, Target(s), Severity, Harm, Damage, Incident, Accident)

Risk combines the probability (P) that the causal factors of the hazard will materialize into an accident with the severity (S) of that accident. Each hazard, therefore, has an associated risk as a quantitative or qualitative measure to assess the seriousness of the hazard with regard to its required consideration and removal. Quantitative expression of risk is a multiplication of probability [0,1] and severity (e.g., number of fatalities):

$R_H = P \cdot S$

Risk can also be expressed in “units per hour”:

$R_H^{\Delta t} = R_H \cdot \frac{1}{\Delta t} \left[ \frac{U_{sev}}{h} \right]$

Qualitative risk is a more usual definition, where the risk is expressed as a descriptive term (e.g., “Low,” “Medium,” “Serious,” or “High”), or as a letter or number (A–D, I–IV) with gradation of seriousness of the risk maintained. Each standard prescribes a risk assessment matrix (table), where probability category is stated descriptively, in the first column (“Frequent,” “Probable,” “Occasional,” “Remote,” “Improbable,” “Incredible”). Likewise, the severity category is stated descriptively, in the first row (“Catastrophic,” “Critical,” “Marginal,” “Negligible”). The intersection of values for probability and severity gives the resulting risk. Each category is described to help the assessor correctly select each value, but very often their selection is based on argumentation (Fig. 3.3). Once the risk is defined, its acceptance is discussed. If the risk is too high, the associated hazard needs to be treated and reassessed until the risk envelope drops below the decided threshold (Fig. 3.4).

Category | Range (failures per year) | Catastrophic | Critical | Marginal | Negligible
Frequent | > 10^-3 | I | I | I | II
Probable | 10^-3 to 10^-4 | I | I | II | III
Occasional | 10^-4 to 10^-5 | I | II | III | III
Remote | 10^-5 to 10^-6 | II | III | III | IV
Improbable | 10^-6 to 10^-7 | III | III | IV | IV
Incredible | < 10^-7 | IV | IV | IV | IV

Fig. 3.3 Risk assessment matrix as per IEC 61508
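The risk assessment matrix of Fig. 3.3 lends itself to a simple lookup. The following Python sketch is one possible encoding, not a normative implementation: the function names are hypothetical, and the treatment of values lying exactly on a range boundary (assigned to the lower category here) is an assumption, since the figure does not specify it.

```python
SEVERITIES = ("Catastrophic", "Critical", "Marginal", "Negligible")

# Rows follow Fig. 3.3; each tuple is ordered like SEVERITIES.
RISK_MATRIX = {
    "Frequent":   ("I",   "I",   "I",   "II"),
    "Probable":   ("I",   "I",   "II",  "III"),
    "Occasional": ("I",   "II",  "III", "III"),
    "Remote":     ("II",  "III", "III", "IV"),
    "Improbable": ("III", "III", "IV",  "IV"),
    "Incredible": ("IV",  "IV",  "IV",  "IV"),
}

# Lower bounds of the probability ranges, in failures per year.
_LOWER_BOUNDS = (
    (1e-3, "Frequent"),
    (1e-4, "Probable"),
    (1e-5, "Occasional"),
    (1e-6, "Remote"),
    (1e-7, "Improbable"),
)


def probability_category(failures_per_year):
    """Map a failure rate to a descriptive probability category."""
    for lower, name in _LOWER_BOUNDS:
        if failures_per_year > lower:
            return name
    return "Incredible"


def risk_class(probability, severity):
    """Return the risk class (I to IV) for a probability/severity pair."""
    return RISK_MATRIX[probability][SEVERITIES.index(severity)]


print(risk_class("Remote", "Catastrophic"))
print(risk_class(probability_category(5e-4), "Negligible"))
```

In practice, the selected categories still need to be backed by argumentation; the lookup only makes the intersection step repeatable.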

Fig. 3.4 Risk envelope and risk acceptability

Hazards can be prevented by addressing various phases of system development and execution. An internal fault is a condition in the system introduced during design or development; such faults are inherent (they are in the system “all the time”: bugs, wrong design decisions, inadequate material properties, etc.). An external fault is a lack of consideration for an external effect in the design (e.g., not considering the ice on the road). Faults are dormant until activated by a causal factor. In that case, errors in the system appear (wrong internal state, calculations, variable values, specific internal behavior). Errors may propagate all the way to the system boundary, materializing as failures of system functions (functions not producing the prescribed effects or fulfilling goals set in the design). Failures may be dangerous, meaning that under specific conditions, they materialize into incidents or accidents, with potentially harmful consequences (Fig. 3.5).

Fig. 3.5 Failure chain

Hazard and risk analysis always starts by analyzing failures of system functions and their various failure modes (how the function may fail). If a dangerous failure is detected, a new hazard is defined and its risk is assessed. A hazard can be removed by strict adherence to the process model in the system life cycle (fault prevention), by removing faults once detected (through redesign: fault removal), by detecting and correcting errors (during runtime: fault tolerance), or by addressing failures (via passive safety measures, e.g., protective equipment). Failures are therefore classified as dangerous, detectable, and undetectable, while dangerous undetectable failures are considered remaining and their respective hazards and risk need to be assessed and discussed with regard to acceptability.
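The classification of failures described above can be sketched as a small data structure. This is an illustrative Python sketch only: the class, the field names, and the example failures are hypothetical and are not taken from any standard or from the exercise solutions.

```python
from dataclasses import dataclass


@dataclass
class Failure:
    """One failure of a system function at the system boundary."""
    description: str
    dangerous: bool    # may materialize into an incident or accident
    detectable: bool   # a mechanism exists that can detect it at runtime


def residual_failures(failures):
    """Return dangerous undetectable failures, i.e., the remaining ones
    whose hazards and risk must still be assessed for acceptability."""
    return [f for f in failures if f.dangerous and not f.detectable]


# Hypothetical examples for a braking function.
failures = [
    Failure("Brake always applied (vehicle cannot move)", dangerous=False, detectable=True),
    Failure("No braking, reported by a sensor check", dangerous=True, detectable=True),
    Failure("No braking, no diagnostic coverage", dangerous=True, detectable=False),
]

for failure in residual_failures(failures):
    print(failure.description)
```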

Calculation Examples Task 1 Calculate the quantitative risk for a hazard: “a fault in a power press machine in the factory causes an unintentional start during maintenance and kills maintenance workers.” The maintenance crew, on average, has five workers, and the death or serious injury is imminent (unavoidable) in case of an unintentional start. Factory records show that unintentional starts happen once in every 100 maintenance cycles, with one maintenance per year on schedule.

Solution

$R_H = P \cdot S$

Severity can be expressed as a number of fatalities: $S = 5$ fatalities. Probability can be expressed as:

$P = \frac{1}{100} = 0.01$

Therefore:

$R_H = P \cdot S = 0.05 = \frac{5\ \text{fatalities}}{100\ \text{maintenance cycles}}$

$R_H^{\Delta t} = R_H \cdot \frac{1}{\Delta t} = \frac{0.05\ \text{fatalities}}{1\ \text{year}} = \frac{0.05\ \text{fatalities}}{365 \cdot 24\ \text{h}} = 5.7 \cdot 10^{-6}\ \frac{\text{fatalities}}{\text{h}}$
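The calculation in Task 1 can be reproduced with a few lines of Python. This is a minimal sketch; the function names are hypothetical, and the year is taken as 365 days of 24 hours, as in the solution above.

```python
import math

HOURS_PER_YEAR = 365 * 24  # one maintenance cycle per year


def quantitative_risk(probability, severity):
    """R_H = P * S, with P in [0, 1] and S in severity units."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be within [0, 1]")
    return probability * severity


def risk_per_hour(risk, interval_hours):
    """Express a risk per interval as risk per hour (R_H divided by delta t)."""
    return risk / interval_hours


# Power press example: 1 unintentional start per 100 maintenance cycles,
# 5 fatalities per unintentional start.
r_h = quantitative_risk(1 / 100, 5)
r_h_dt = risk_per_hour(r_h, HOURS_PER_YEAR)

print(f"R_H = {r_h:.2f} fatalities per maintenance cycle")
print(f"R_H per hour = {r_h_dt:.1e} fatalities/h")
```

Keeping the interval as an explicit parameter makes it easy to repeat the calculation for a different maintenance schedule.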

Exercise 3 For the electric scooter with geofencing, discussed in Chaps. 1 and 2, we now need to identify hazards and assess the risk for each of them. Hazards can be identified by assessing the failures of functions of the electric scooter, based on the requirements specification developed in Chap. 2. To assess the failures correctly, consider as many failure modes as possible for each function. You may also use guidewords, such as no, always (stuck), reverse (opposite), more, less, early, and late to help you figure out failure modes. For each of the identified failures, assess whether it is dangerous or not. In case a failure is dangerous, mark it as being a hazard. Then assess the risk for the hazard by using the IEC 61508 risk assessment matrix. Mark and briefly describe causal factors for each hazard in the form of a failure chain according to the class of cause (external fault, internal fault, error) (Fig. 3.6).

Function | Failure mode | Is a hazard? | Probability | Severity | Risk category | Failure chain (fault – error – failure)
FR_001 | Traffic Jam Pilot (TJP) activates on a highway with vehicle speed above 60 mph | Yes | Remote | Catastrophic | II | Fault in the speed sensor leads to error (wrong speed value) in the algorithm for TJP activation, leading to TJP turning on in inappropriate situation.

Fig. 3.6 Example of a hazard and risk evaluation sheet

Your Tasks for the Exercise
• Based on the functional requirements from Chap. 2, now analyze hazards and fill in the hazard and risk evaluation sheet.
• To fill in the sheet, analyze failure modes for at least two functions of your choice. Use guidewords to help you pinpoint the possible failure modes. For each failure mode, create a new row in the sheet. If you determine that the failure is dangerous, mark it as a hazard and assess the risk.
• Risk assessment is done according to the IEC 61508 risk assessment matrix.
• Discuss and describe the failure chain leading to the potential accident. Distinguish between faults (external, internal), errors, and the failure in your description.

To-Do List
• Perform the exercise individually or with your peers. One can share the screen and keep notes, all contribute.
• Create presentables (e.g., drawing, Excel calculation).
• Discuss your solution and share it with others.

Note: Digital files for this exercise are available at sfs3.ex.nit-institute.com
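To avoid skipping a guideword, the empty rows of the hazard and risk evaluation sheet can be generated up front. The following Python sketch is only one way to do this; the dictionary keys mirror the sheet columns, while the function name and the example function ("FR_003", "Braking") are illustrative assumptions.

```python
GUIDEWORDS = (
    "no", "always (stuck)", "reverse (opposite)",
    "more", "less", "early", "late",
)


def blank_sheet_rows(function_id, function_name):
    """Create one empty evaluation row per guideword for a function."""
    return [
        {
            "function_id": function_id,
            "function_name": function_name,
            "guideword": guideword,
            "failure_mode": "",     # to be described by the analyst
            "hazard": None,         # True / False once assessed
            "probability": None,    # only filled in if hazard is True
            "severity": None,
            "risk_category": None,
            "failure_chain": "",    # fault - error - failure
        }
        for guideword in GUIDEWORDS
    ]


rows = blank_sheet_rows("FR_003", "Braking")
for row in rows:
    print(row["function_id"], row["function_name"], "-", row["guideword"])
```

Rows that turn out not to be hazards keep empty probability, severity, and risk fields, since no risk needs to be assessed where no hazard is identified.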

Exercise 3 Template Power Press – Preliminary Hazard List (PHL)

Hazard ID | Hazard description | P | S | R | R( . ) | MEM (risk acceptable?) | Safety measure description | P | S | R | R( . ) | MEM (risk acceptable?)
H1 | | | | | | | | | | | |


Exercise 3 Sample Solutions See several exemplary solutions to the exercise:

Solution 1 Electric scooter with geofencing – Hazard and risk evaluation sheet

Function ID | Function name | Failure mode description | Hazard? | Probability | Severity | Risk category | Failure chain
FR_002 | Braking | No Reaction when brake Lever is pull down | Y | Remote | Catastrophic | II | When cable is cut, leads to error in brake system, and it can cause a hazard incident/accident – undetectable failure on boundary of system
FR_002 | Braking | Brake Lever Stuck, and it is not possible to brake | Y | Frequent | Critical | I | When brake lever is stuck, leads to error in brake system, and it can cause a detectable failure on boundary of system
FR_002 | Braking | Reverse Reaction, when we press brake, scooter accelerate | Y | Remote | Critical | III | When brake lever is pressed, accelerate system due to bug in design, and it can cause a detectable failure on boundary of system
FR_002 | Braking | When we press brake lever, braking is more aggressive | Y | - | - | - | Fault? Error? Failure?
FR_002 | Braking | When we press brake lever, braking is not sufficient | Y | - | - | - | Fault? Error? Failure?
FR_002 | Braking | When we press brake lever, reaction is delayed | Y | - | - | - | Fault? Error? Failure?
FR_002 | Braking | When we press brake lever, reaction is too early | N | Remote | Negligible | IV | Fault? Error? Failure?
FR_005 | Charging | Without possibility to charge the vehicle's battery | N | Occasional | Marginal | III | Fault? Error? Failure?
FR_005 | Charging | Charging continues even when battery is full | Y | - | - | - | Error
FR_005 | Charging | Instead, Battery charge, battery goes to discharging | N | - | - | - | Fault? Error? Failure?
FR_005 | Charging | Battery is charged less then maximum battery capacity | N | - | - | - | Fault? Error? Failure?

Review Comments by the Instructor
• When assessing the probability of a risk, argumentation needs to be provided, targeting failure rates (statistics) based on reports from previous system versions or similar systems. At the very least, the rationale needs to be consistent throughout the PHL (e.g., probability of misuse is always higher than the probability of the failure of mechanical components, which is again higher than the probability of failure of electronics).
• Severity consideration may be impacted by the operation mode; therefore, in the usual PHL, the failure mode of the function is combined with each of the operating modes, to draw out all potential hazards related to that function failing. For example, the driving speed of the scooter in various operating modes (e.g., pedestrian mode or speed mode) might affect the severity of the risk (e.g., in pedestrian mode, when the scooter is being driven at around 5 km/h, we can argue that the driver might “jump off” and prevent the accident in many cases, and that the severity of impact may also be lower in some cases).
• Please make sure to use the guidewords correctly; e.g., if the function is “stuck,” this means it is “always-on,” e.g., always braking (vs. brake lever stuck, which is a misinterpretation of the function).


• Please note that in “stuck” failure modes, it is possible not to have a hazard (e.g., always braking means the function is not available, but the state is indeed safe since the vehicle is not moving). This helps decompose the failure rate of the braking subsystem since not all failures are dangerous (all safe failures, and also all dangerous detectable failures, which we will see in subsequent lectures and courses, can be disregarded in the final SIL consideration).

Solution 2 Electric scooter with geofencing – Hazard and risk evaluation sheet

Function ID | Function name | Failure mode description | Hazard? | Probability | Severity | Risk category | Failure chain
FR_001 | | no, always (stuck), reverse (opposite), more, less, early, late | Y/N | | | I-IV | Fault? Error? Failure?
FR_001 | Folding mode | Folding happens during driving. | Y | Remote | Critical | III | Fault: Metal pin is worn out. Error: Metal pin breaks. Failure: Scooter starts folding during driving.
FR_002 | Acceleration | Acceleration lever gets stuck and scooter accelerates out of control. | Y | Remote | Catastrophic | II | Fault: Jacket gets stuck between lever and steering wheel. Error: Acceleration lever unable to return to normal. Failure: Scooter accelerates out of control.
FR_002 | Acceleration | Acceleration lever reaction time is late. (Latency in response) | N | - | - | - | -
FR_002 | Acceleration | Acceleration lever gives more speed when used. | Y | Probable | Critical | I | Fault: Acceleration lever not calibrated. Error: Control unit gives wrong acceleration value. Failure: Stability while driving lost.
FR_003 | Braking | Braking works less when heavy rain. | Y | Occasional | Catastrophic | I | External Fault: Heavy rain. Internal Fault: Wires not isolated properly. Error: Brakes do not receive braking signals. Failure: Not managing to brake in time.
FR_003 | Braking | Braking handle brakes with more force than it should. | Y | Occasional | Marginal | III | Fault: Braking lever not calibrated. Error: Control unit gives wrong braking values. Failure: Braking discs brake with more force.

Review Comments by the Instructor • Make sure to properly analyze the environment; harm and damage are not only to the vehicle and the driver but also potentially to other traffic participants. For example, in a sudden folding case, if the scooter is allowed on the roads, severe consequences can occur for many traffic participants, potentially harming many persons. • Generally, the output you created is really good!


Solution 3 Electric scooter with geofencing – Hazard and risk evaluation sheet

Function ID | Function name | Failure mode description | Hazard? | Probability | Severity | Risk category | Failure chain
FR_001 | Increasing speed | Velocity increases rapidly | Y | Improbable | Critical | III | Fault in speed control unit leads to error (wrong acceleration value), that further leads to speed increase that is higher than wanted.
FR_001 | Increasing speed | Velocity does not increase when the driver pulls the gas lever | Y | Remote | Critical | III | Fault? Error? Failure?
FR_002 | Decrease velocity | Velocity does not decrease if driver releases the gas lever | Y | Remote | Critical | III | Mechanical fault in gas lever leads to losing control over speed control unit, the scooter maintains speed and causes traffic accident.
FR_002 | Decrease velocity | Velocity abruptly jumps to maximum value after reaching 0 level of acceleration | Y | Occasional | Critical | II | Negative value of speed is interpreted as maximum speed, causing the scooter to increase speed.
FR_002 | Decrease velocity | Velocity decreases rapidly | Y | Improbable | Marginal | IV | Fault? Error? Failure?
FR_005 | Fold | It is impossible to open scooter if it is folded | N | Remote | Negligible | IV | Fault? Error? Failure?
FR_005 | Fold | Scooter unfolded by its own | N | Remote | Marginal | III | Fault? Error? Failure?

Review Comments by the Instructor • Be careful about the faults originating from software (bugs). Those are systematic faults and are hard (and usually impossible) to model probabilistically. Instead, specific measures are prescribed for the development (by the standard) to prevent and remove systematic faults. It is good, however, to analyze these kinds of faults and point out the importance of systematic development, but make sure to balance your PHL so that faults due to misuse, environment (external faults), and hardware/mechanical wear are also taken into account with enough weight.


Solution 4 Electric scooter with geofencing – Hazard and risk evaluation sheet

Function ID | Function name | Failure mode description | Hazard? | Probability | Severity | Risk category | Failure chain
FR_003 | Braking | Brake never works when trying to brake while driving at high speeds? | Y | Improbable | Catastrophic | III | -
FR_003 | Braking | Brake accidentally activates when driving at high speed. | Y | Remote | Catastrophic | II | Fault in the user's driving skills leads to a driving error in terms of pressing the brake which causes an accident.
FR_003 | Braking | Brake activates late, seconds after manually pressing the brake. | Y | Occasional | Marginal | III | Fault in the register overflow leads to false values in the variables that control braking which causes an error in slowing down the vehicle in time.
FR_005 | Geofencing | Geofencing zone does not detect the scooter leaving the zone, so that the driver can keep driving the scooter even if he is outside of the zone. | N | Improbable | Marginal | IV | Fault in the geofencing server leads to an error when the scooter tries updating the zone where driving is permitted, allowing the driver to drive in zones where it is not allowed.
FR_005 | Geofencing | Geofencing falsely flags the scooter as being outside of the zone, so that the scooter cannot be driven. | Y | Improbable | Catastrophic | III | Fault in the geofencing mapping system leads to an error in calculating coordinates which leads to potential accidents possibly during high speeds.
FR_004 | Weight limit | Weight limit falsely senses that the driver is over the required weight limit. | Y | Probable | Critical | I | Fault in the weight sensor software leads to an error in the weight calculation that allows a child that is under the weight limit to operate the scooter.
FR_004 | Weight limit | Weight limit falsely senses that the driver's weight is not heavy enough. | N | Probable | Negligible | III | Fault in the control software leads to an error when flagging the measured weight which leads to a failure where the scooter cannot be driven.

Review Comments by the Instructor • Please see comments to other groups, especially for hazards related to systematic faults in software. • In case a hazard is not identified, there is no need to assess the risk (no hazard – no risk). • Improper training and faults due to misuse are great to always consider because many hazards are actuated by a human factor. • I see you have corrected my other comments from the live session, so now this analysis looks pretty good (in the parts which are completed, of course).

Key Recap Questions In this chapter, you have been exploring the main concepts of system safety and performing initial steps to identify system hazards. Now, with regard to the system you discussed:
• For each system function, discuss potential failures.
• Are there any hazards involved? If so, what are the risks?
• Try to quantify the risk!
• What caused the failure? Go back following the failure chain!
• What can we do about faults and errors?


Self-assessment Now take the time to self-assess your knowledge by taking the quiz below. Each listed statement is either correct or incorrect. Please mark your answer and then check in the key at the end of the book. 1. System safety deals with methods that need to provide absolute protection from harm or damage. 2. A safety belt in the vehicle is a safety measure prescribed by functional safety. 3. Reported incidents are an indicator of an imminent accident. 4. Hazards always have the same usual targets (people, environment, property) regardless of the safety standard applied for their assessment. 5. Hazard A is more serious and shall be prioritized over Hazard B if Hazard A, if actuated, causes the death of 100 people, and Hazard B, if actuated, causes the death of 50 people. 6. Let us say that for System A, the maintenance phase, which happens once a year, is considered hazardous. One of the ways to remove this hazard is to decrease the frequency of system maintenance. 7. A fault in the system is always considered a causal factor for a hazard. 8. Errors in the system may lead to dangerous failures of system functions. 9. System faults can be detected and removed during the system operation. 10. If we detect a dangerous failure of a system function, we may prevent the accident.

Self-assessment Key 1. False 2. False 3. True 4. False 5. False 6. True 7. False 8. True 9. False 10. True

Chapter 4

System Safety Process

Introduction In Chap. 3, we have introduced all the important terminology of system safety and understood how safety can generally be assessed and ensured. One big consideration in that field is the systematicity of all processes and procedures involved. Therefore, process models for overall system development now need to be extended with appropriate process models for safety-specific and safety-relevant activities. It is also necessary to position all safety activities in the context of the overall system life cycle to understand how those activities are interlinked and when and how they are appropriately performed. Including safety in all phases of a system development project is providing an inherent view of safety, and designing all safety prescriptions from the start in a proactive way. Being proactive about safety is an essential consideration in system safety engineering today. Opposite to proactive safety, reactive safety as a traditional concept is based on merely logging incidents and accidents once they occur, and performing safety “fixes” subsequently. It is already understood that reactive safety is in some programs too costly with regard to caused damages, and in some programs, it cannot at all be tolerated as a primary method of safety assurance (e.g., aerospace programs, transportation of people, etc.). Instead, proactive safety makes sure that all design decisions which are taken from the start consider existing or introduced hazards. Then, together with the requirement analysis phase, appropriate safety requirements are also defined. Based on the safety requirements, a system architecture is appropriately defined, including safety-related subsystems and safety functions. System implementation immediately includes all safety measures, and subsequent verification and validation phases seek to validate the safety and provide an argumentation in the form of the final safety case. 
Proactive system safety engineering includes some very specific activities. For example, hazard identification is performed very early, immediately from a so-called pre-project phase, in which we usually only estimate the scope of work, provide rough effort estimation, and sketch the top-level system architecture. The list of hazards is provided as one of the key inputs for the requirements elicitation process, in which functional hazard evaluation is performed to evaluate the risk for each hazard. Based on the quantification of risk, appropriate safety goals are formulated, which are then decomposed into specific safety requirements. Those requirements can be allocated to specific affected elements in the system technical concept during the system design. This way, safety is conceptualized before any specific implementation of the system has even commenced. In this chapter, the system safety process is discussed and practiced in more detail, with the goal of understanding all additional safety-related phases, the artifacts they produce, and the traceability that needs to be maintained throughout all artifacts across the various processes involved. Additionally, important sub-processes are laid out algorithmically, with specific actions prescribed to bring the risks down to the tolerable zone.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 M. Z. Bjelica, Systems, Functions and Safety, https://doi.org/10.1007/978-3-031-15823-0_4

Video Lesson This chapter has a corresponding video lesson: sfs4.nit-institute.com


Lecture Notes Proactive safety considers a safety process that includes all safety-related sub-processes, follows the system life cycle from its inception until decommissioning, and is aligned with system engineering processes and project management. Proactive safety is inherent to the system design (considered from day zero), as opposed to reactive safety (“fly-crash-fix-fly”), which was traditionally used to analyze incidents and accidents and to implement safety measures based on them. The ISAPro model splits the system life cycle into the problem space (pre-project phase, conceptualization, ideation, usually to create the proposal to the customer), model space (requirements development and system design), solution space (implementation, construction, integration, and testing), and maintenance (during system operation). Each engineering phase or project management phase is intertwined with an appropriate safety process (Fig. 4.1).

Fig. 4.1 ISAPro process model. (Source: ISaPro®: A Process Model for Safety Applications)

Preliminary hazard identification (PHI) is a safety process in which the team, based on the rough concept and the initial listing of high-level requirements, identifies a preliminary hazard list (PHL) and performs an early assessment (risk evaluation) of each hazard, considering all phases of the system life cycle. The output of PHI is the list of overall safety goals (high-level safety requirements, e.g., “The system shall avoid collision with other vehicles in the proximity”). The risk associated with each hazard in the PHL can be allocated to the respective element in the rough technical concept, to give an early indication of the criticality and the required reliability/redundancy of system components (this helps the early estimation of time and cost!), and even to create the preliminary safety concept.

The functional hazard evaluation (FHE) phase is where specific safety requirements are defined based on the initially defined safety goals, in close connection with the requirements development phase, extending the PHL by performing a complete hazard identification at this point. This phase is based on the requirement analysis process running in parallel (based on, e.g., functional requirements), with specific safety requirements produced as output. Safety requirements are derived from safety goals and high-level requirements and are related to functional requirements. FHE further helps the requirement analysis to correctly define all technical safety requirements (TSR), which specify safety features (and safety functions) for the upcoming system design phase.

In PHI and FHE, hazards are identified via workshops with a multitude of stakeholders and experts of various relevant backgrounds, and risk analysis and evaluation are performed for each hazard according to the appropriate safety standard, to correctly quantify or categorize the risk (e.g., “ASIL B,” “SIL 2,” etc.). Then, it is judged whether the risk is acceptable or not, based on the associated failure rates (number of dangerous failures/fatalities per year/hour). According to the Minimum Endogenous Mortality (MEM) principle, people die in accidents at a rate of 2 · 10^-4 fatalities per person per year. MEM states that any new technical system must not have mortality higher than 1/20 of 2 · 10^-4, that is, it must prove to have only up to 10^-5 fatalities per person per year (it is assumed that one person uses ~20 technical systems every day). GAMAB (globalement au moins aussi bon – globally at least as good) allows referencing of our system to the equivalent system already in use, judging that our system is at least as good as the existing (and approved!) system. The ALARP (as low as reasonably practicable) method further allows us to argue that measures to decrease risk are grossly disproportionate to the benefits the system brings (e.g., too costly or impractical), splitting the risk into intolerable, tolerable, and acceptable levels.
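The MEM arithmetic above can be written out as a small check. This is only a sketch: the function names are hypothetical, and the baseline of 2 · 10^-4 and the share of 1/20 are simply the figures quoted in the text.

```python
# Illustrative sketch of the MEM acceptance check described in the text.
MEM_BASELINE = 2e-4       # fatalities per person per year (figure quoted in the text)
SYSTEMS_PER_PERSON = 20   # assumed number of technical systems one person uses


def mem_limit(baseline=MEM_BASELINE, systems=SYSTEMS_PER_PERSON):
    """Tolerable mortality contribution of a single technical system."""
    return baseline / systems


def is_acceptable_by_mem(fatality_rate):
    """True if the rate (fatalities per person per year) stays within the MEM limit."""
    return fatality_rate <= mem_limit()


if __name__ == "__main__":
    print(f"MEM limit per system: {mem_limit():.0e}")
    print("1e-6 acceptable:", is_acceptable_by_mem(1e-6))
    print("1e-3 acceptable:", is_acceptable_by_mem(1e-3))
```

Rates that sit exactly on the limit are sensitive to floating-point rounding, so a real assessment would argue the margin explicitly rather than rely on such a comparison.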
We can then judge tolerable risk and make argumentation in this regard (which is usually challenging and needs appropriate referencing as in GAMAB). If the risk is judged to be too high (intolerable), we need to implement risk reduction measures. Respecification/redesign may be needed so that the technical concept is changed to allow the reduction of risk (e.g., adding safety functions to bring the system to the safe state in case dangerous failures/states are detected, or adding other safety elements – warnings, alarms). This shall be done in FHE (together with the design phase) since later on it can be very costly. Additional measures may be to (re)define procedures (e.g., for operation) and to enhance user/operator training. Finally, passive measures may be employed, such as protective gear, enclosures/barriers, area demarcation, safety labels, etc. After prescribing the safety measures, hazard identification needs to be repeated (maybe we introduced new hazards!) and risks reevaluated until all risk is judged to be acceptably low and all hazards closed (Fig. 4.2).

Fig. 4.2 Hazard identification and risk evaluation process steps

The preliminary system safety evaluation (PSSE) process is used to check whether the system design phase (system architecture and technical design) produces outputs that comply with the safety requirements, to make sure that the design is safe before the implementation starts. The system safety evaluation (SSE) process extends the verification and validation phases of the V-model to verify whether all safety-related elements are correctly implemented and whether the system fulfills the overall safety goals. Safety is continuously evaluated during the operation of the system as well, in the operational system safety evaluation (OpSSE), logging and tracking all safety-relevant system behavior (incidents, accidents), which can help in maintenance (repair) and further system (re)design. Standards usually allow process tailoring, to match the standard provisions more closely with the specific project, which can be used to find a trade-off between the level of processes required for safety compliance and the cost needed to implement and maintain them (Fig. 4.3).

Fig. 4.3 Process tailoring. (Source: INCOSE)
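The loop of Fig. 4.2 (evaluate the risk, prescribe a measure, reevaluate, close the hazard) can be sketched in code. Everything here is a hypothetical illustration: the `Hazard` class, the numeric risk value, and the idea that each measure scales the risk by a fixed factor are assumptions, not something a standard prescribes. The sketch also leaves out the repeated hazard identification that the text requires after each new measure.

```python
from dataclasses import dataclass, field


@dataclass
class Hazard:
    name: str
    risk: float                              # assumed estimate, dangerous failures per year
    measures: list = field(default_factory=list)
    closed: bool = False


def close_hazard(hazard, tolerable_risk, candidate_measures):
    """Apply risk reduction measures one by one until the risk is acceptable.

    candidate_measures is a list of (description, factor) pairs; the factor is an
    assumed multiplier applied to the risk estimate when the measure is prescribed.
    """
    remaining = list(candidate_measures)
    while hazard.risk > tolerable_risk:
        if not remaining:
            return hazard                    # still open: respecification/redesign needed
        description, factor = remaining.pop(0)
        hazard.measures.append(description)
        hazard.risk *= factor                # risk is reevaluated after the measure
    hazard.closed = True
    return hazard


if __name__ == "__main__":
    h = Hazard("Press starts unintentionally during maintenance", risk=5e-4)
    close_hazard(h, tolerable_risk=1e-5, candidate_measures=[("lockable power isolator", 0.01)])
    print(h.closed, h.measures, h.risk)
```

A hazard that runs out of candidate measures stays open, which mirrors the point in the text that the technical concept may then have to be respecified or redesigned.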


Exercise 4 Your team is in the middle of the preliminary hazard identification (PHI) phase, with the preliminary hazard list identified for the power press machine in the factory (Fig. 4.4). Your goal is first to analyze and evaluate the risk according to IEC 61508. Consider Catastrophic severity only in case of multiple fatalities; Critical in case of a single fatality or irreversible injury; Marginal in case of a nonfatal, reversible injury; and Negligible in case of minor or no injuries. Consider probability according to the failure range given in the table (see M3 lecture note). Then, argue the risk acceptance according to MEM (10^-5 deaths per person per year). In case the risk is too high, prescribe a safety measure and reevaluate the risk until you can close the hazard.
H1: Operator places his hand in a pressing area due to misuse and gets injured.
H2: Press starts unintentionally during maintenance, with press mechanism active, potentially harming maintenance workers.
H3: Press moves or topples due to imbalance, potentially harming the operator.
Fig. 4.4 Power press machine


Your Tasks for the Exercise • Analyze and evaluate the risk associated with hazards H1, H2, and H3 according to IEC 61508 and the description above. • Express the risk quantitatively (in failures per year). • Evaluate the risk acceptability according to MEM. • In case of unacceptable risk, prescribe a suitable safety measure. • Reevaluate the risk and reassess its acceptability to close the hazard. To-Do List • Perform the exercise individually or with your peers. One person can share the screen and keep notes while all contribute. • Create presentables (e.g., drawing, filled-in sheet). • Discuss your solution and share it with others. Note: Digital files for this exercise are available at sfs4.ex.nit-institute.com

Exercise 4 Template Power Press – Preliminary Hazard List (PHL)

Hazard ID | Hazard description | P | S | R | R ( . ) | MEM (risk acc?) | Safety measure description | P | S | R | R ( . ) | MEM (risk acc?)
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
H1 | Operator places his hand in a pressing area due to misuse and gets injured | | | | | | | | | | |
H2 | Press starts unintentionally during maintenance, with press mechanism active, potentially harming maintenance workers | | | | | | | | | | |
H3 | Press moves or topples due to imbalance and potentially harming the operator | | | | | | | | | | |


Exercise 4 Sample Solutions See several exemplary solutions to the exercise:

Solution 1 Power Press – Preliminary Hazard List (PHL)

Hazard ID | Hazard description | P | S | R | R ( . ) | MEM (risk acc?) | Safety measure description | P | S | R | R ( . ) | MEM (risk acc?)
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
H1 | Operator places his hand in a pressing area due to misuse and gets injured | Frequent | Critical | 1 | >10^-3 | no | Sensor shall detect human hand when the machine comes into contact with the skin. When the sensor detects human skin it will not allow operating. | Remote | Critical | 3 | 10^-5 to 10^-6 | yes
H2 | Press starts unintentionally during maintenance, with press mechanism active, potentially harming maintenance workers | Improbable | Critical | 3 | 10^-6 | yes | - | - | - | - | - | -
H3 | Press moves or topples due to imbalance and potentially harming the operator | Probable | Catastrophic | 1 | 10^-3 to 10^-4 | no | Press machine shall be nailed down to the floor so that it shall not be moved without the use of special equipment. | Incredible | Catastrophic | 4 | |

all technical safety requirement derived from SR004 should inherit

Send braking signal in 0.02s from detection

Review Comments by the Instructor • Requirements definition is usually one field composed of all the required aspects. Currently, you have split it into Description and Intervention, which can be done while brainstorming, but the final SRS shall be made in a standard requirements specification format. • One note: in case you have a very high-level safety requirement (e.g., Folding shall not happen during driving) which does not prescribe any measure, this is usually called a safety goal (SG) and placed at the same level as HLRs. • Make sure to address freedom from interference aspects as well. • ASIL levels are inherited by TSRs from the SRs, based on, e.g., PHI. It is strange to have different ASIL levels at this point. It seems you have started to prematurely think about the implementation and ASIL allocation to functional blocks or perhaps even ASIL decomposition (which is possible in theory but here not convincingly performed).


Solution 3 Electric scooter with geofencing – Requirements specification

ID | Name | Description | Author | Ver. | Derived from | Used by | Priority | Kano (P/M/D/I/Q/R) | Note
--- | --- | --- | --- | --- | --- | --- | --- | --- | ---
FR_001 | Increasing speed | The scooter shall increase speed when the driver pulls the gas lever | Manic | 1 | HLR_001 | TR_001 TR_0011 | | |
FR_002 | Decrease velocity | The scooter shall decrease speed when the driver release the gas lever | Beric | 1 | HLR_001 | TR_004 | | |
FR_003 | HMI information | The scooter shall present information about the vehicle speed, battery percentage, driving mode and mileage on display | Basta | 1 | HLR_001 | TR_005 TR_012 TR_013 | | |
FR_004 | Lights | The scooter shall turn on the lights when the visibility (light conditions) drops below a level | Basta | 1 | HLR_002 | TR_006 | | | LIGHT_LEVEL is referenced in the definition part of the requirements
FR_005 | Fold | The scooter should provide function to be folded for easier transport | Basta | 1 | HLR_002 | TR_007 | | |
SF_001 | Maximum velocity | The scooter shall not exceed maximum velocity of 25 km/h | Videnovic | 1 | HLR_002 | TR_003 | 2 | P |
SF_002 | Maximum acceleration | The scooter shall not exceed the maximum acceleration of | Basta | 1 | HLR_002 | TR_008 | 1 | M | MAX_ACC_VALUE is referenced in the definition part of the requirements
SF_003 | Battery level | The scooter shall not be able to drive if battery level below 10% | basta | 1 | HLR_002 | TR_009 | | Q | Rejected after review
SF_004 | Weight limit | The scooter shall not be able to drive if there is weight below 25kg | basta | 1 | HLR_002 | TR_010 | 1 | M |
QR_001 | Constant velocity | The scooter should remain constant velocity per certain drive mode (SPORT, ECO, CITY) | Karan | 1 | HLR_002 | TR_002 | 3 | I |
SR_001 | Lever engaged control | Control unit shall slow down scooter when hand is not present on the lever | Manic | 1 | HR_008 | TR_006, TR_007, TR_008, TR_009 | 1 | |
TR_006 | Lever pressure detection | The contact sensor shall detect pressure on the lever | Karan | 1 | SR_001 | | | |
TR_007 | Lever contact pressure | Sensor monitor shall detect that lever is released within 250ms | Basta | 1 | SR_001 | | | |
TR_008 | Sensor monitor signal detection | The sensor monitor shall generate output signals within 50ms. | Videnovic | 1 | SR_001 | | | |
TR_009 | Slowing down safe state | Speed control unit shall slow down system within 3s if sensor monitor doesn’t detect pressure | Beric | 1 | SR_001 | | | |

Two further Priority/Kano entries (1, M and 1, M) belong to rows among FR_001–FR_004.
Note: ASIL A. According to ISO 26262, although high exposure is present, at the same time user has high controllability due to low speed

Review Comments by the Instructor • It is not obvious where the SR_001 was derived from. It is fine for the exercise but usually, it is related to the FR or derived from HLR based on PHI. The definition of the safety function seems sane. • Aspects such as freedom from interference and SIL allocation to requirements need to be also provided.


Solution 4 Electric scooter with geofencing – Requirements specification

ID | Name | Description | Author | Ver. | Derived from | Used by | Priority | Kano (P/M/D/I/Q/R) | Note
--- | --- | --- | --- | --- | --- | --- | --- | --- | ---
FR_001 | Acceleration | The electric scooter shall enable acceleration and deceleration. | Bojkić | 1.0 | HLR_001 | TR_001 TR_002 | | |
FR_002 | Steering | The electric scooter shall provide steering control for the driver. | Bojkić | 1.0 | HLR_001 | TR_003 | 1 | M |
FR_003 | Braking | The electric scooter shall provide braking control for the driver. | Bojkić | 1.0 | HLR_001 | TR_004 | 1 | M |
FR_004 | Weight limit | The electric scooter shall prevent starting if the weight exceeds the limit. | Barić | 1.0 | HLR_002 | TR_005 TR_006 TR_007 | 2 | D |
FR_005 | Geofencing | The electric scooter should prevent driving outside of the defined geofencing zone. | Mihić | 1.0 | HLR_002 | TR_008 TR_009 | 2 | D |
FR_006 | Battery alert | The electric scooter should give the driver an alert if the battery level is insufficient for the travel distance. | Mihić | 1.0 | HLR_002 | TR_010 TR_011 TR_012 | | |
FR_007 | Mobile control | The electric scooter should provide controls for turning on and off via the mobile application. | Simić | 1.0 | HLR_003 | TR_013 | | |
FR_008 | Sound signaling | The electric scooter may be able to produce sound alerts. | Simić | 1.0 | HLR_004 | TR_014 TR_015 | | |
QR_001 | Light signaling | The electric scooter shall have lights bright enough so that the driver can navigate safely on the road with insufficient visibility. | Kaštelan | 1.0 | HLR_004 | TR_016 TR_017 | | |
SF_001 | Velocity control | The electric scooter shall prevent powering off unless it is not stationary. | Mihić | 1.0 | HLR_037 | TR_125 | | |

One further Priority/Kano entry (1, M) and two Note entries (“Safety requirement?” and “To the TR level?”) belong to rows of this table.

Exercise 5

ID | Name | Description | Ver. | Derived from | Used by
--- | --- | --- | --- | --- | ---
SF_101 | Braking safety | The electric scooter shall not allow unintentional braking while driving. | 1.0 | HLR_001 | TR_101 TR_102 TR_103 TR_104
TR_101 | Mechanical system blockade | The mechanical braking system shall deactivate after the driver releases the brake lever. | 1.0 | SF_101 |
TR_102 | Dependence on pressure | The braking level shall be linearly dependent on the pressure on the brake lever. | 1.0 | SF_101 |
TR_103 | Response time | The response time of the braking system shall be between and sec. | 1.0 | SF_101 |
TR_104 | SIL | The electric scooter braking system shall have SIL 2. | 1.0 | SF_101 |

EUC: braking system
ECS: electric scooter control system
Safe state: the electric scooter brakes at 50% intensity when the problem with the braking system is detected.
Safe operation: the electric scooter shall limit the maximum speed to 10 kmph after it enters the safe state.

Review Comments by the Instructor • It seems you have added TRs with the intention to enhance the quality of the overall functions (as additional quality requirements) rather than prescribing the safety function, which was the goal of the exercise. • The safety function addressing SF_101 shall probably be able to detect (by MONITORING) whether the braking was unintentional (by, e.g., a pressure sensor on the handle or a similar method) and then perform INTERVENTION (e.g., by signaling the malfunction and entering the safe state – the question is what the safe state could be if we need to prevent the braking, which probably should not be done). • It is good to detail EUC/ECS, etc. for the exercise, but, e.g., the safe state requires the use of the braking system as a remedy, whereas the braking system itself is the cause of the problem in the first place. The safety function, therefore, needs to provide freedom from interference, by finding an alternative way to brake (having, e.g., a redundant/additional braking system – e.g., another mechanical brake).


Key Recap Questions In this chapter, you have been analyzing the prescription of active safety measures (safety functions). Now, with regard to the system you discussed:
• Think up a safety function.
• Which safety requirement/safety goal does it address?
• How is the safe state defined?
• What is the safety integrity level of the function?
• Discuss freedom from interference!
• Can you make your system fail-operational?

Self-assessment Now take the time to self-assess your knowledge by taking the quiz below. Each listed statement is either correct or incorrect. Please mark your answer and then check in the key at the end of the book. 1. Safety functions always address hazards exhibited by the functions of EUC, but not the functions of ECS. 2. Safety functions execute on top of the safety-related system (SRS) which can be made an integral part of the ECS only if the freedom from interference is provided. 3. A safe state is a state in which a system induces no harm or damage, and in which all system functions are always turned off. 4. The safety function shall monitor the system only for dangerous failures of its functions. 5. In case SRS fails, the system is immediately considered unsafe. 6. The electronic control unit within SRS shall always comply with the safety integrity level determined from the risk category of the hazard which is actuated by a dangerous failure of an EUC function. 7. A safety function operating in an on-demand mode can be made of less reliable components than the equivalent safety function operating in a continuous mode. 8. Safety integrity level is among the most important requirements for the safety function. 9. Methods of implementation of the safety function (e.g., system engineering processes, software development techniques) shall always be at the highest possible quality according to the standard regardless of the SIL allocated to that safety function. 10. Hazard A is identified, with risk evaluated according to ISO 26262 as ASIL C. The function of the system exhibiting this hazard then always needs to be reworked so that it satisfies the requirements for ASIL C according to the standard.

Self-assessment Key 1. False 2. True 3. False 4. False 5. True 6. True 7. True 8. True 9. False 10. False

Chapter 6

Defining Safety Functions

Introduction In Chaps. 1 through 5 we have introduced all top-level aspects of system safety and functional safety. To make these considerations applicable, it is now important to analyze specific exemplary functional safety standards and an example system, to see which provisions the standards give with respect to functional safety. In this chapter, we will only work out the part of the process which results in the definition of safety functions and their safety integrity requirements. We could say that this process yields correct inputs to the system requirements specification, in terms of additional safety requirements (including the top-level safety requirement/safety goal, but also technical safety requirements and quality safety requirements/safety integrity requirements). All the general provisions in functional safety stem from the originating standard, IEC 61508. This standard deals with electrical, electronic, or programmable electronic (E/E/PE) systems and specifically deals with safety functions including sensors, controllers, and actuators. Functional safety standards for different specific areas are derivatives (profiles) of IEC 61508, with the goal of defining aspects more closely according to the specificities of systems in the regarded area (e.g., power plants, heavy machinery, automotive, etc.). For example, heavy machines (such as loading trucks, cranes, forklifts, harvesters, etc.) need to be compliant with either IEC 62061 or ISO 13849, and also ISO 26262 in case they are driven on regular roads. IEC 62061 focuses on electrical, electronic, and programmable electronic control systems specifically for machinery, whereas ISO 13849 provides a broader look at the safety life cycle, including safety requirements and guidance on the principles for the design and integration of safety-related parts of complex mechanical/electrical systems of machinery. These standards currently apply simultaneously to the same systems and are therefore good learning examples. They have different provisions, e.g., for risk assessment, which is similar in nature but revealing in the sense of how different standards cater to essentially the same practice.

The example in this chapter starts by laying out an idea for a heavy machinery system. The analyzed system is a wrecking ball crane. Although a bit outdated, this system evokes our childhood/cartoon memories and is easy to understand for people with no specific technical background in construction and demolition projects. The initial step is inevitably understanding our system. We need to be able to define the system, its scope, main components, and the system boundary. By doing this, we set the stage for the hazard and risk analysis (HARA) phase, in which we need to list all thinkable hazards: dangerous events, hazardous situations, dangerous faults, malevolent or unauthorized activities, and foreseeable misuse. Then, we analyze each hazard with respect to its probability and severity. We here apply specific provisions from IEC 62061, in which probability is a composite of three distinctly evaluated categories (frequency and duration of use, probability of hazard event, and avoidance), which are summed up and then used with the severity category to look up the table and obtain the resulting safety integrity level (SIL): SIL 1, SIL 2, or SIL 3, SIL 3 being the most strict. In ISO 13849, however, three categories are used to assess risk (severity of injury, frequency of exposure to hazard, and probability of avoiding harm), and based on their combination, the resulting performance level (PL) is deduced: PL a, PL b, PL c, PL d, or PL e, PL e being the most strict. SIL and PL are used for the same thing: they effectively define the safety integrity requirements that safety-related systems (and safety functions within them) must fulfill if they are designed and used to mitigate the respective hazard, according to the selected standard.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 M. Z. Bjelica, Systems, Functions and Safety, https://doi.org/10.1007/978-3-031-15823-0_6
Safety functions are designed to monitor the events leading to hazards (e.g., via specific sensors) and then intervene in case those events are observed (e.g., by bringing the system to the safe state). Safety functions are then prescribed by their safety requirements, defining these monitoring and intervention phases, which shall obviously fully eliminate the risk of the hazard or reduce it to the tolerable zone. Safety functions inherit the SIL or PL resulting from the hazard, and this SIL or PL effectively becomes their safety integrity requirement. This further means that safety functions must be designed so that all processes, procedures, and architectural constraints must be met as prescribed by the standard for the given SIL or PL. Further, safety integrity against random failures must be provided, showing that the residual failures are low enough with regard, again, to the SIL or PL prescribed and their respective quantitative failure metrics. Let’s now do this interesting analysis for our wrecking ball crane!

Video Lesson This chapter has a corresponding video lesson: sfs6.nit-institute.com


Lecture Notes The procedure can be graphically described with the analysis sheet shown in the figure below. We will then walk through this procedure step by step until safety functions are correctly defined (Fig. 6.1). The wrecking ball crane has, among others, two hazards identified in this example.

Hazard 1: Machine tipping due to imbalance and harming operator/workers
Affecting: Whole machine
Operation mode: In operation
Countermeasure: Balance monitoring

Hazard 2: Wrecking ball hits the operator
Affecting: Whole machine
Operation mode: In operation
Countermeasure: Ball position monitoring

According to IEC 62061, the probability of occurrence of harm resulting from an identified hazard is expressed via three categories using the tables below.

Probability of occurrence of harm

Frequency and duration | F
--- | ---
≤1 h | 5
>1 h – ≤1 day | 5
>1 day – ≤2 weeks | 4
>2 weeks – ≤1 year | 3
>1 year | 2

Probability of hazard event | P
--- | ---
Very high | 5
Likely | 4
Possible | 3
Rarely | 2
Negligible | 1

Avoidance | A
--- | ---
Impossible | 5
Possible | 3
Likely | 1

Class CI is calculated as a sum of all parameters, CI = F + P + A.


Fig. 6.1 The overall procedure for the wrecking ball crane example

For Hazard 1, we can judge the frequency of use of the wrecking ball crane to be less than each day (say, it is used once per week). This results in F = 4. The probability of wrecking ball crane tipping due to imbalance is judged to be caused by operator misuse and only when brought to the extreme position, rendering this situation to happen rarely. This results in P = 2.


Finally, if this happens, this harm is impossible to avoid, that is, the operator cannot somehow escape and evade this accident. This results in A = 5. Final CI is therefore CI = 4 + 2 + 5 = 11. Note how the selection of those categories can be subjective, so the proper argumentation for the reviewer is extremely important. The final selection of the safety integrity level (SIL) as per IEC 62061 is conducted by using CI and the evaluated severity of harm, according to the table below.

Severity of hazard consequences | S | Class CI (F + P + A): 4 | 5–7 | 8–10 | 11–13 | 14–15
--- | --- | --- | --- | --- | --- | ---
Death, losing eye or arm | 4 | SIL 2 | SIL 2 | SIL 2 | SIL 3 | SIL 3
Permanent, losing fingers | 3 | | | SIL 1 | SIL 2 | SIL 3
Reversible, medical attention | 2 | | | | SIL 1 | SIL 2
Reversible, first aid | 1 | | | | | SIL 1
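The class calculation and the table lookup can be sketched together in code. The sketch only transcribes the table above; the names `hazard_class` and `required_sil` are hypothetical, and `None` stands for the cells that carry no SIL entry in this table. The example values are those judged for Hazard 1 (F = 4, P = 2, A = 5) with a severity of 3 (permanent injury).

```python
# CI bands (inclusive) in the column order of the SIL assignment table.
CI_BANDS = [(4, 4), (5, 7), (8, 10), (11, 13), (14, 15)]

# Rows indexed by severity S; None marks cells without a SIL entry in the table.
SIL_TABLE = {
    4: [2, 2, 2, 3, 3],
    3: [None, None, 1, 2, 3],
    2: [None, None, None, 1, 2],
    1: [None, None, None, None, 1],
}


def hazard_class(f, p, a):
    """Class CI as the sum of frequency, probability, and avoidance scores."""
    return f + p + a


def required_sil(severity, ci):
    """Look up the SIL for a severity (1-4) and a class CI (4-15)."""
    if severity not in SIL_TABLE:
        raise ValueError(f"unknown severity: {severity}")
    for column, (low, high) in enumerate(CI_BANDS):
        if low <= ci <= high:
            return SIL_TABLE[severity][column]
    raise ValueError(f"class CI out of range: {ci}")


if __name__ == "__main__":
    ci = hazard_class(f=4, p=2, a=5)
    print("CI =", ci, "-> SIL", required_sil(severity=3, ci=ci))
```

A `None` result only means the table assigns no SIL for that combination; how such cases are handled should be checked against the standard itself.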

For Hazard 1, consequences can result in a permanent injury of the crane operator in the cabin, but not necessarily death (the cabin is not expected to be damaged to the extent that fatal harm could occur, especially taking into account the passive safety measures, and the operation happens at a construction site where co-workers can seek medical attention). This results in S = 3. Finally, for S = 3 and CI = 11, the table yields the resulting SIL 2. It is often denoted as the required SIL, SILr 2.

For Hazard 2, we can attempt the analysis according to ISO 13849. Three categories are prescribed here. The severity of injury resulting from a hazard, denoted as S, can be classified as S1 (slight, usually reversible injury) or S2 (severe, usually irreversible injury, including death). Frequency and/or duration of stay (exposure to hazard), denoted as F, can be classified as F1 (rare to often and/or short exposure to hazard) or F2 (frequent to continuous and/or long exposure to hazard). Finally, the probability of avoiding or limiting harm, denoted as P, can be classified as P1 (possible under certain conditions) or P2 (hardly possible). The final performance level (PL) is obtained by following the tree as in the figure below (Fig. 6.2). Hazard 2 is judged to bear severity S = S2 (injury sustained can be irreversible), and the exposure to the hazard is expected to be short (not continuous), with F = F1. Finally, the probability of avoiding or limiting harm is very low (hardly possible, only under extreme circumstances), with P = P2. Now, S2 - F1 - P2 results in PLr d.

For each of the hazards, safety functions can now be defined to effectively remove the hazards altogether. The safety function for Hazard 1 can be defined by the following requirements for the safety-related system (SRS):
Safety requirement 1.1 (SR 1.1): SRS shall monitor the balance of the crane in order to predict a near-tipping event.

Fig. 6.2 Performance level decision tree as per ISO 13849 (branches from Start: S1-F1-P1 → PL a; S1-F1-P2 and S1-F2-P1 → PL b; S1-F2-P2 and S2-F1-P1 → PL c; S2-F1-P2 and S2-F2-P1 → PL d; S2-F2-P2 → PL e)

Technical safety requirement 1.1.1 (TR 1.1.1, derived from SR 1.1): SRS shall use gyroscopes positioned at the crane top, cabin, and the back of the vehicle to detect imbalance.

Safety requirement 1.2 (SR 1.2): SRS shall, in case of imbalance over the threshold (near-tipping), cut off the controls and command the vehicle to sound an alarm and bring up the crane.

Quality safety requirement 1.3 (QSR 1.3): SRS shall achieve SILr 2.

The safety function for Hazard 2 can be defined by the following requirements for the safety-related system (SRS):

Safety requirement 2.1 (SR 2.1): SRS shall monitor ball position to prevent the wrecking ball from reaching the vehicle.

Technical safety requirement 2.1.1 (TR 2.1.1, derived from SR 2.1): SRS shall use two position sensors (at crane arm and ball pulley) to monitor crane height and ball descent and to detect possible collision if swung.

Safety requirement 2.2 (SR 2.2): SRS shall disallow dangerous positioning by continuously limiting the extent of the controls.

Quality safety requirement 2.3 (QSR 2.3): SRS shall achieve PLr d.
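The ISO 13849 decision tree used for Hazard 2 can be sketched as a simple lookup. This is a minimal illustration, assuming the branch-to-PL mapping depicted in Fig. 6.2; the names `RISK_GRAPH` and `required_pl` are hypothetical and do not come from the standard.

```python
# Sketch of the ISO 13849 risk graph (Fig. 6.2) as a lookup table.
# Keys are (severity, frequency/exposure, possibility of avoidance);
# values are the required performance level (PLr).
# RISK_GRAPH and required_pl are illustrative names, not part of the standard.
RISK_GRAPH = {
    ("S1", "F1", "P1"): "a",
    ("S1", "F1", "P2"): "b",
    ("S1", "F2", "P1"): "b",
    ("S1", "F2", "P2"): "c",
    ("S2", "F1", "P1"): "c",
    ("S2", "F1", "P2"): "d",
    ("S2", "F2", "P1"): "d",
    ("S2", "F2", "P2"): "e",
}


def required_pl(severity: str, frequency: str, avoidance: str) -> str:
    """Return the required performance level for one branch of the tree."""
    key = (severity, frequency, avoidance)
    if key not in RISK_GRAPH:
        raise ValueError(f"unknown risk parameters: {key}")
    return RISK_GRAPH[key]


# Hazard 2 of the crane example: S2, F1, P2
print("PLr", required_pl("S2", "F1", "P2"))
```

Keeping the tree as data rather than nested conditions makes it easy to review each branch against the figure.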

Your First Safety Project!
Using the knowledge and the exercise materials from all chapters so far, you now need to work on and submit your assignment. If possible, carry out this assignment in a group of four or five. Each group first needs to define the workflow for project execution. Discuss this workflow with your instructor, if possible. The workflow shall define the amount of common work (e.g., workshops and brainstorming sessions, integration of the work, preparation of the presentation) and individual work (e.g., preparing parts of the required items for submission).


Together with your team, you need to choose a safety-critical system. It can be the electric scooter we used during exercises, but also any other system of interest. As assistance, you can consider, for example, any of the following:
• Specialty vehicles, e.g., forklift truck
• Scooters and hoverboards
• Special machines, e.g., excavator
• Factory equipment, e.g., wood cutting

Upon selection of the system, you need to perform a quick system delineation and decomposition of the system into components, considering users/operators and the environment (see Exercise 1).

After that, you need to perform requirements elicitation. First, define a few high-level requirements (HLRs) for the selected system, focusing only on the top-level goals of the system. Then, decompose each HLR into several functional requirements (FRs). If you come across some quality/constraint requirements at this point, feel free to include them as well. You do not need to document the elicitation process (e.g., using the Kano model) or perform it systematically. You may use Exercise 2 as a reference.

Based on the functional requirements, perform the preliminary hazard identification (PHI), discussing and enumerating all the hazards (creating the preliminary hazard list, PHL). Evaluate the risk and define the appropriate safety measures. Perform the risk assessment according to whichever of the standards discussed during lectures you feel best suits your system (IEC 61508, IEC 62061, or ISO 13849). Devise safety measures for each hazard. Make sure to prescribe at least two active safety measures, using the concepts of functional safety. You do not need to quantify the risk or provide formal acceptance argumentation, but make sure to decrease the SIL after reassessment to the lowest level existing in the standard. You may use Exercises 3 and 4 as a reference.

Update the requirement specification by defining top-level safety requirements according to the identified hazards. For at least two safety requirements of your choice, prescribe active safety measures and derive their technical safety requirements (TSRs) further. You may use Exercise 5 as a reference.

To finalize, illustrate your safety concept in a draft drawing, focusing on at least one safety function. The format of this illustration is free; however, make sure to consider either the system state diagram (depicting relevant events and the safe state) or the architectural block diagram (depicting the data flow which describes fault-error-failure propagation as well as "trapping" the error/failure with a safety function).

Create a presentation in which you describe the process you followed, as well as the most important (interesting) details from the documentation you produced. Please make sure that the presentation includes both safety functions you defined. It would be beneficial to present your findings to a live audience, since this would simulate your argumentation to a reviewer. Use 15 minutes for the presentation in total, and please strictly adhere to the time. Make sure you split the presentation so that each group member can contribute and present a part.

Required Output
• System delineation/decomposition sheet (may be free-form)
• System requirements specification (SRS), in the Exercise 2 template
• Preliminary hazard list (PHL) including the reassessment after safety measures (adapt the template starting from either Exercise 3 or Exercise 4)
• Safety concept illustration (drawing, free-form)
• Final presentation (in ppt)

Submission Deadline
Take 1 or 2 weeks to work the project out.

Assessment
You or your peers (or your instructor) can assess your work. Take 20 points as the total project score. Up to 10 points shall be allocated to your submission alone. Each released document artifact brings you 2 points (if complete and mostly correct), 1 point (if incomplete and/or partially incorrect), or 0 points (if not provided, mostly incomplete, or mostly incorrect). The other 10 points are assigned based on your presentation: up to 5 points for a convincing presentation (proof) with solid, believable argumentation, and up to 5 points for the presentation content, specifically addressing the clarity of communication and the understandability of the presented material. Note that you need to score at least 14 points to pass this review simulation!

Sample Solution to the Project
The team selected a forklift truck as their system (also available in digital form at sfs6.ex.nit-institute.com). They gave the presentation outlined in Fig. 6.3. The group analyzed and delineated the system as in Fig. 6.4. A system requirements specification for the forklift truck was provided, as well as a preliminary hazard list (PHL).

Fig. 6.3 Project presentation sent for a review (outline). Slides shown: title slide (Defining Safety Functions: Forklift, Project 1, Group 4: Maja Baric, Ivan Kastelan, Nebojsa Cvijic, Filip Mihic, Miroslav Videnovic); Procedure (research about the system; system delineation: defining the system components, system boundary and environment; high-level requirements; functional requirements; preliminary hazard identification; preliminary hazard list; safety requirements; technical safety requirements; safety concept); High Level Requirements; Functional Requirements; Preliminary Hazard List (H01 and H08); Safety requirements (SR_001 and SR_006); Technical safety requirements derived from SR_001; Technical safety requirements derived from SR_006; Forklift system state diagram regarding SR_001.

Fig. 6.4 System delineation sheet (forklift truck)

High-level requirements

| ID | Name | Description | Author | Ver. | Derived from | Used by |
|---|---|---|---|---|---|---|
| HLR_001 | Lifting | The forklift shall allow lifting and lowering of the load | Group 4 | 1.0 | – | FR_001, FR_002, FR_003, CR_001 |
| HLR_002 | Driving | The forklift shall provide the ability to be driven | Group 4 | 1.0 | – | FR_004, FR_005, FR_006, FR_007, FR_008, FR_009, FR_010, FR_011, CR_002 |
| HLR_003 | Tilting | The forklift shall allow tilting of the forks | Group 4 | 1.0 | – | FR_012, FR_013, FR_014, CR_003 |

Functional requirements

| ID | Name | Description | Author | Ver. | Derived from | Used by |
|---|---|---|---|---|---|---|
| FR_001 | Lifting the forks | The forklift shall move the forks up when the lever is pushed up | Group 4 | 1.0 | HLR_001 | – |
| FR_002 | Lowering the forks | The forklift shall move the forks down when the lever is pulled down | Group 4 | 1.0 | HLR_001 | – |
| FR_003 | Fixed fork height | The forks shall not move up or down when the lever is in a neutral position | Group 4 | 1.0 | HLR_001 | – |
| FR_004 | Acceleration | The forklift shall accelerate when the gas pedal is pressed | Group 4 | 1.0 | HLR_002 | – |
| FR_005 | Deceleration | The forklift shall decelerate when the brake pedal is pressed | Group 4 | 1.0 | HLR_002 | – |
| FR_006 | Parking | The forklift shall not allow movement when the parking lever is pulled | Group 4 | 1.0 | HLR_002 | – |
| FR_007 | Steering left | The forklift shall turn left when the steering wheel is turned left | Group 4 | 1.0 | HLR_002 | – |
| FR_008 | Steering right | The forklift shall turn right when the steering wheel is turned right | Group 4 | 1.0 | HLR_002 | – |
| FR_009 | Forward movement | The forklift shall move forward when the forward-reverse lever is in the forward position | Group 4 | 1.0 | HLR_002 | – |
| FR_010 | Reverse movement | The forklift shall move in reverse when the forward-reverse lever is in the reverse position | Group 4 | 1.0 | HLR_002 | – |
| FR_011 | Power supply | The engine shall burn propane gas to power the forklift | Group 4 | 1.0 | HLR_002 | – |
| FR_012 | Tilt forward | The forks shall tilt forward when the tilt control lever is pushed forward | Group 4 | 1.0 | HLR_003 | – |
| FR_013 | Tilt backward | The forks shall tilt backward when the tilt control lever is pulled backward | Group 4 | 1.0 | HLR_003 | – |
| FR_014 | Fixed tilt angle | The tilt angle of the forks shall remain unchanged when the tilt control lever is in the neutral position | Group 4 | 1.0 | HLR_003 | – |

Constraint requirements

| ID | Name | Description | Author | Ver. | Derived from | Used by |
|---|---|---|---|---|---|---|
| CR_001 | Load limit | The forklift shall lift the load with mass up to 2 metric tons | Group 4 | 1.0 | HLR_001 | – |
| CR_002 | Maximum speed | The forklift shall not exceed the maximum speed of 10 km/h | Group 4 | 1.0 | HLR_002 | – |
| CR_003 | Maximum tilt angle | The tilt angle of the forks shall not exceed ±10 degrees | Group 4 | 1.0 | HLR_003 | – |

Safety requirements

| ID | Name | Description | Author | Ver. | Derived from | Used by |
|---|---|---|---|---|---|---|
| SR_001 | Blocked view | The collision warning system shall activate when driving forward while the driver's view is blocked by the load | Group 4 | 1.0 | FR_009 | TSR_001, TSR_002, TSR_003, TSR_004, TSR_006, QSR_001 |
| SR_002 | Blind spot preview | The driver assistance system shall display areas around the vehicle which are obstructed from the view of the driver while driving in reverse | Group 4 | 1.0 | FR_010 | TSR_003, TSR_004, TSR_005, TSR_006, QSR_001 |
| SR_003 | Forklift imbalance prevention | The forklift shall prevent vehicle tipping by monitoring and correcting the forklift balance | Group 4 | 1.0 | FR_001, FR_004, FR_005, FR_007, FR_008, FR_009, FR_010, FR_012, FR_013 | TSR_007, TSR_008, TSR_009, TSR_010, TSR_011, TSR_012, QSR_002, QSR_003, QSR_004 |
| SR_004 | Load balancing system | The forklift shall have a load balancing system that shall detect the load position relative to the forks and correct the position if over the threshold | Group 4 | 1.0 | FR_004, FR_005, FR_007, FR_008, FR_012, FR_013 | TSR_013, TSR_014, TSR_015, QSR_005 |
| SR_005 | Lift and tilt hydraulics safety | The forklift shall prevent potential loss of hydraulic pressure by having a backup hydraulic hose | Group 4 | 1.0 | FR_001, FR_002, FR_012, FR_013 | TSR_016, TSR_017, TSR_018, QSR_006 |
| SR_006 | Forklift fire safety | The forklift shall provide a fire extinguishing system that releases noncorrosive fire-extinguishing fluid in case of a potential fire near the engine | Group 4 | 1.0 | FR_011 | TSR_019, TSR_020, TSR_021, QSR_007 |
| SR_007 | Wearing protective gear | All operators within the system boundary shall wear protective gear | Group 4 | 1.0 | HLR_001, HLR_002, HLR_003 | – |
| SR_008 | Passing forklift safety training | All operators within the system boundary shall pass forklift safety training | Group 4 | 1.0 | HLR_001, HLR_002, HLR_003 | – |
| SR_009 | Seat belt | The forklift shall have a brightly colored seat belt that shall prevent the driver from falling out of the seat while driving | Group 4 | 1.0 | HLR_002 | – |

Technical safety requirements

| ID | Name | Description | Author | Ver. | Derived from | Used by |
|---|---|---|---|---|---|---|
| TSR_001 | Forward driving visual alarm | The visual alarm in the cabin shall activate when driving forward with a speed above 3 km/h while the driver's view is blocked by the load | Group 4 | 1.0 | SR_001 | – |
| TSR_002 | Forward driving sound alarm | The sound alarm system shall activate when driving forward with a speed above 5 km/h while the driver's view is blocked by the load | Group 4 | 1.0 | SR_001 | – |
| TSR_003 | Near object detection | The radar shall measure the distance from an object or pedestrian while driving | Group 4 | 1.0 | SR_001, SR_002 | – |
| TSR_004 | Environment preview | A monitor of the driver assistance system shall display the environment around the vehicle | Group 4 | 1.0 | SR_001, SR_002 | – |
| TSR_005 | Reverse driving visual alarm | A monitor for driver assistance shall display visual warning signals during reverse movement if the distance from the object is less than 3 m | Group 4 | 1.0 | SR_002 | – |
| TSR_006 | Collision avoidance | The forklift shall apply brakes and stop automatically if the driver assistance system detects a possible collision | Group 4 | 1.0 | SR_001, SR_002 | – |
| QSR_001 | Radar detection range | The radar shall monitor 360 degrees around the forklift | Group 4 | 1.0 | SR_001, SR_002 | – |
| TSR_007 | Forklift load position monitoring | The forklift shall be equipped with two gyroscopes, one positioned just between the rear wheels and one just between the forks | Group 4 | 1.0 | SR_003 | – |
| TSR_008 | Forklift load mass monitoring | The forklift forks shall be equipped with force sensors to monitor the force due to load | Group 4 | 1.0 | SR_003 | – |
| TSR_009 | Forklift balance monitoring | The forklift shall be able to continuously monitor balance by using data from gyroscopes and force sensors | Group 4 | 1.0 | SR_003 | – |
| TSR_010 | Forklift imbalance detection | By monitoring the balance, the forklift shall be able to detect near-tipping events | Group 4 | 1.0 | SR_003 | – |
| TSR_011 | Forklift safe state transition due to imbalance detection | The forklift shall be put into the safe state if a near-tipping event is detected | Group 4 | 1.0 | SR_003 | – |
| TSR_012 | Forklift safe state due to imbalance protection | The forklift safe state shall cut off the lift and tilt controls from the user, take over lowering and backward tilting to the initial position, and turn on the emergency alarm | Group 4 | 1.0 | SR_003 | – |
| QSR_002 | Forklift imbalance prevention performance level | The forklift balance monitoring and imbalance prevention shall be developed according to PL e | Group 4 | 1.0 | SR_003 | – |
| QSR_003 | Forklift lifting performance level | The forklift lifting and lowering function shall be developed according to PL e | Group 4 | 1.0 | SR_003 | – |
| QSR_004 | Forklift tilting performance level | The forklift tilting function shall be developed according to PL e | Group 4 | 1.0 | SR_003 | – |
| TSR_013 | Load imbalance monitoring | The load balancing system shall monitor the load tilt relative to the forks with sensors placed at each fork | Group 4 | 1.0 | SR_004 | – |
| TSR_014 | Load imbalance detection | The load balancing system shall detect an imbalance of the load if the tilt of the load is above the threshold angle of 5 degrees | Group 4 | 1.0 | SR_004 | – |
| TSR_015 | Load imbalance correction | The load balancing system shall correct load imbalance by activating the fork extensions that shall push the load toward the equilibrium position | Group 4 | 1.0 | SR_004 | – |
| QSR_005 | Load balancing system level | The load balancing system shall be developed according to PL d | Group 4 | 1.0 | SR_004 | – |
| TSR_016 | Forklift hydraulic hose monitoring | The forklift shall be equipped with a sensor that detects when a hydraulic hose bursts | Group 4 | 1.0 | SR_005 | – |
| TSR_017 | Forklift hydraulic hose burst alert | The forklift shall provide a visual alert in the form of a glowing red light informing the driver that the primary hose has burst, that the backup hose is in use, and that maintenance is due | Group 4 | 1.0 | SR_005 | – |
| TSR_018 | Forklift backup hydraulic hose activation | The forklift shall allow the use of a backup hydraulic hose in case the system has detected that the primary hose has burst | Group 4 | 1.0 | SR_005 | – |
| QSR_006 | Forklift hydraulics performance level | The forklift hydraulics system shall be developed according to PL d | Group 4 | 1.0 | SR_005 | – |
| TSR_019 | Forklift fire sensor | The forklift shall be equipped with a sensor that detects a fire | Group 4 | 1.0 | SR_006 | – |
| TSR_020 | Forklift fire suppression display | The forklift shall be equipped with an interactive display that allows the driver to activate the fire suppression manually | Group 4 | 1.0 | SR_006 | – |
| TSR_021 | Automatic fire suppression activation | The forklift shall automatically activate the release of fire-extinguishing fluid once a potential fire is detected by the sensor | Group 4 | 1.0 | SR_006 | – |
| QSR_007 | Forklift fire suppression performance level | The forklift fire suppression system shall be developed according to PL d | Group 4 | 1.0 | SR_006 | – |
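The "Derived from" and "Used by" columns of the specification above form a two-way trace between requirements. A minimal sketch of how such links could be checked for consistency is given below, using only the SR_006 rows as an excerpt; the dictionary layout and the function name `trace_gaps` are assumptions made for illustration and are not part of the team's submission.

```python
# Excerpt of the forklift specification: ID -> (derived_from, used_by).
# The data layout and the helper name are hypothetical, for illustration only.
requirements = {
    "SR_006": (["FR_011"], ["TSR_019", "TSR_020", "TSR_021", "QSR_007"]),
    "TSR_019": (["SR_006"], []),
    "TSR_020": (["SR_006"], []),
    "TSR_021": (["SR_006"], []),
    "QSR_007": (["SR_006"], []),
}


def trace_gaps(reqs):
    """Return (parent, child) pairs whose link is recorded on one side only.

    Requirements that are referenced but not present in `reqs`
    (e.g., FR_011 in this excerpt) are skipped.
    """
    gaps = set()
    for rid, (derived_from, used_by) in reqs.items():
        # Parent lists a child that does not point back to it
        for child in used_by:
            if child in reqs and rid not in reqs[child][0]:
                gaps.add((rid, child))
        # Child names a parent that does not list it
        for parent in derived_from:
            if parent in reqs and rid not in reqs[parent][1]:
                gaps.add((parent, rid))
    return sorted(gaps)


print(trace_gaps(requirements))  # an empty list means the excerpt is consistent
```

A check like this only confirms that the links agree with each other; it says nothing about whether the derived requirements are sufficient.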

Forklift – preliminary hazard list (PHL)

| Hazard ID | Hazard description | Severity (S) | Frequency (F) | Probability (P) | PL | Acc? | Safety measure description | Severity (S), reassessed | Frequency (F), reassessed | Probability (P), reassessed | PL, reassessed | Acc?, reassessed |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| H01 | The driver's view is blocked by the load while driving forward, potentially hitting and irreversibly harming pedestrians | S2 | F1 | P2 | d | NO | Active safety measures: install collision warning system; install lighting alert when driving forward with restricted visibility; install fixed sensors with a 360-degree detection zone to detect a pedestrian | S2 | F1 | P1 | c | YES |
| H02 | While driving in reverse, due to increased driving difficulty, the forklift potentially hits and irreversibly harms pedestrians | S2 | F1 | P2 | d | NO | Active safety measures: install forklift driver assistance systems; install fixed sensors with a 360-degree detection zone to detect a pedestrian | S2 | F1 | P1 | c | YES |
| H03 | The forklift tips forward while carrying the load due to imbalance, potentially injuring the driver or pedestrian/crew | S2 | F2 | P2 | e | NO | Active safety measures: balance monitoring and imbalance prevention. Passive safety measures: wearing a seat belt; wearing a protective helmet; passing forklift safety training | S2 | F1 | P1 | c | YES |
| H04 | The forklift tips to the side while making a sharp turn due to imbalance, potentially injuring the driver or pedestrian/crew | S2 | F2 | P2 | e | NO | Active safety measures: balance monitoring and imbalance prevention. Passive safety measures: wearing a seat belt; wearing a protective helmet; passing forklift safety training | S2 | F1 | P1 | c | YES |
| H05 | The load falls off the forks, potentially harming the driver in the cabin or pedestrians/crew | S2 | F1 | P2 | d | NO | Active safety measures: install the load balancing system that will detect load imbalance and correct its position. Passive safety measures: install the locks that will hold the load firmly attached to the forks | S2 | F1 | P1 | c | YES |
| H06 | The driver falls out of the cabin while driving, potentially dying or sustaining an irreversible injury | S2 | F2 | P2 | e | NO | Active safety measures: install the automatic cabin door locking system. Passive safety measures: install the brightly colored seat belt | S2 | F1 | P1 | c | YES |
| H07 | Forklift hydraulic hose bursts, leading to a loss of pressure which drops the forks, potentially killing multiple people | S2 | F1 | P2 | d | NO | Active safety measures: install backup hydraulic hose. Passive safety measures: regular hose maintenance; crew training (don't walk under/near forks and safety gear) | S2 | F1 | P1 | c | YES |
| H08 | Escaping propane gas is ignited by the hot engine, potentially irreversibly harming multiple people | S2 | F1 | P2 | d | NO | Active safety measures: automatic fire detection and suppression system with the ability of manual activation that releases noncorrosive fluid to extinguish the fire. Passive safety measures: crew training (fire safety) | S2 | F1 | P1 | c | YES |
| H09 | The forklift tips backward while carrying the load on an inclined surface, potentially harming or killing the driver or pedestrians nearby | S2 | F1 | P2 | d | NO | Active safety measures: monitor the balance of the forklift and activate an alarm signal if the forklift becomes out of balance; the forklift should sound an alarm and use light signaling while carrying a heavy load, warning the pedestrians around to keep a safe distance | S2 | F1 | P1 | c | YES |
| H10 | The forklift hits a power cable while the forks are in an elevated position, causing electrical harm to or killing the driver | S2 | F1 | P2 | d | NO | Active safety measures: install a camera with object detection on top of the forks, so the driver can have a better view of the position and detect the surrounding objects. Passive safety measures: isolate the cabin of the driver from the forks, such that the electric current will not endanger the driver | S2 | F1 | P1 | c | YES |

Fig. 6.5 State transition diagram depicting the draft initial safety concept for the forklift truck

Finally, the team provided their safety concept in the form of a state transition diagram, depicting the safe state (Fig. 6.5).

Chapter 7

Safety Integrity and Random Failures

Introduction
The first half of the textbook covered all the aspects required to understand a safety-critical system, specify its requirements, perform hazard and risk assessment, devise potential safety precautions in the form of functional safety prescriptions (safety functions), and allocate safety integrity level requirements to them. The second half of the textbook addresses verification, that is, how we can show that our safety function design and implementation conform with the requirements prescribed for the designated safety integrity level.

Safety integrity as a notion is related to the resilience of the system to any dangerous failure. Safety functions, as key functional safety prescriptions, have the goal of preventing accidents by detecting anomalies and/or failures that may lead to accidents, and bringing the system to a safe state before the accident has a chance to materialize. Failures of the safety functions themselves shall also be avoided, since their correct operation is directly responsible for accident prevention. Furthermore, safety integrity levels are allocated to safety functions, defining the extent of quality their design and implementation need to respect to make them appropriate for the fulfillment of their tasks.

In this chapter, we dissect the specific properties of safety integrity that safety functions need to show. The first set of properties is related to systematic safety integrity, making sure that the processes and procedures we selected to design and implement safety functions minimize the chance of faults in the design and implementation. The second set of properties is related to random failures, usually attributed to physical aspects of the parts used in the safety function implementation. Those parts shall show high robustness against wear and tear and shall demonstrate durability. Random failures are analyzed using probabilistic models, and this chapter deals with the respective metrics for quantifying random failures.

Safety integrity against random failures, therefore, can be a deal-breaker for the compliance of your safety-related system with the required functional safety standard, even if your design and implementation were flawless. Each component that is selected needs to fulfill the reliability targets defined by the standard. Reliability is formally defined as the probability that the system or its component is still operational after a certain elapsed runtime ("survival" probability). In this chapter, we explore the related metrics which are usually provided by the component manufacturers, such as the failure rate and the mean time to failure (MTTF), which can be used to assess the reliability of that component. Based on the calculations available within reliability theory, it is possible to assess the composite reliability of the complete safety-related system and therefore provide formal proof that the safety integrity against random failures is in line with the required safety integrity level allocated to that system.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 M. Z. Bjelica, Systems, Functions and Safety, https://doi.org/10.1007/978-3-031-15823-0_7

Video Lesson
This chapter has a corresponding video lesson: sfs7.nit-institute.com

Lecture Notes
Safety integrity is resistance against dangerous failures. It is among the requirements for safety functions, expressed as a required safety integrity level (e.g., SIL, ASIL, PL) stemming from the risk evaluation of the hazard the safety function mitigates. Safety integrity is verified by quantifying the failures of the safety function and comparing them with the required values from the functional safety standard, and also by verifying whether all the procedures prescribed by the standard have been followed for the safety function according to its safety integrity level.

Safety integrity is assessed against random failures, depending on the system and hardware reliability (e.g., IEC 61508 requires the frequency of dangerous failures for SIL 4 to be anywhere from $10^{-9}$ to $10^{-8}$ per hour) (Fig. 7.1).

| Safety integrity level (SIL) | Average frequency of a dangerous failure of the safety function [h$^{-1}$] (PFH) |
|---|---|
| 4 | ≥ $10^{-9}$ to < $10^{-8}$ |
| 3 | ≥ $10^{-8}$ to < $10^{-7}$ |
| 2 | ≥ $10^{-7}$ to < $10^{-6}$ |
| 1 | ≥ $10^{-6}$ to < $10^{-5}$ |

Fig. 7.1 Required failure rates for each safety integrity level as defined by IEC 61508

Safety integrity against systematic failures, which are not random in nature, is also regarded: faults in specifications and design, faulty processes and documentation, software bugs, and other design-time faults which can be latent (hidden). Artificial intelligence (AI) is also prone to systematic faults, but due to the pseudorandom nature of these faults and the inability to discover them systematically (they originate in learning processes happening in training phases), its failures can be modeled probabilistically through testing.

Random failures happen due to manufacturing material fatigue, wear and tear, as well as environmental influences (interferences), which mostly affect mechanical and hardware (electronic) components. A safety function usually consists of both electronic components (controllers executing the logic, and sensors) and electromechanical components (actuators which bring the system to the safe state, e.g., relays, switches, brake pads, valves), so random failures must be quantified for it and compared with the required values from the functional safety standards.

Random failures are quantified using the failure probability F(t). This is the cumulative distribution function of a random variable T (time to failure) evaluated at t, expressing the probability that the failure happens within the time interval [0, t]. Therefore:

$$F(0) = 0, \quad \lim_{t \to \infty} F(t) = 1, \quad F(t) \in [0, 1].$$

Another important quantification is the reliability R(t). As opposed to the failure probability, reliability is the probability that the system survives until the time t. Therefore:

$$R(t) = 1 - F(t), \quad R(0) = 1, \quad \lim_{t \to \infty} R(t) = 0, \quad R(t) \in [0, 1].$$

Please note that both F(t) and R(t) are conditional probabilities (Fig. 7.2).

Fig. 7.2 Graphs of failure probability F(t) and reliability R(t) of a system over the time t

The trend of failures at a time moment is expressed with the failure rate:

$h(t) = \frac{\frac{d}{dt}F(t)}{R(t)} = \frac{\text{units failed in } [t, t + dt]}{\text{units survived until } t} \cdot \frac{1}{dt}$

The fundamental relation between the reliability and the failure rate is:

$R(t) = e^{-\int_0^t h(u)\,du}$

The failure rate is usually higher right after the system production, due to process faults and manufacturing errors (early “infant mortality” failures), and then it decreases (decreasing failure rate, DFR) and becomes nearly constant. After a sufficient system runtime, the failure rate starts to increase again, due to wear-out failures (increasing failure rate, IFR). This typical failure rate graph is called the bathtub curve. In practice, the system is released after it is stress-tested (“skipping” the DFR zone) and is regarded only during the constant failure rate period, which allows simplifications and brings many benefits for reliability calculation and comparisons (Fig. 7.3). These systems are called e-systems, for which a constant failure rate is defined as:

$h(t) = \text{const.} = \lambda \implies R(t) = e^{-\lambda t}$

Manufacturers of components usually release λ values in the documentation (or they can be calculated using handbooks such as MIL, FIDES, etc.). Another typical value that is used is the mean time to failure (MTTF), which is the expected value of the time to failure of the system, for which the following holds:

Fig. 7.3 Bathtub curve (left) and failure probability F(t) and reliability R(t) for an e-system (right)

$MTTF = \int_0^{\infty} R(t)\,dt = \int_0^{\infty} e^{-\lambda t}\,dt = \frac{1}{\lambda} \; [\mathrm{h}]$

Interestingly, for e-systems the failure probability at MTTF is always ~0.63 (not 0.5!), since $F(MTTF) = 1 - e^{-1}$. The failure rate is usually expressed as “failures per hour” ($\mathrm{h}^{-1}$) or “failures in time” (FIT), for which the following holds:

Fig. 7.4 Failure probability and reliability for the e-system for different failure rate values, λ = 0.2, λ = 0.05, and λ = 0.01 (MTTF is expressed as 1/λ); at t = 1/λ, F(t) is ~0.63 and R(t) is ~0.37

$\lambda = x \; \mathrm{h}^{-1} = x \cdot 10^{9} \; \mathrm{FIT} \implies \lambda = y \; \mathrm{FIT} = y \cdot 10^{-9} \; \mathrm{h}^{-1}$

More reliable systems have smaller λ and larger MTTF and vice versa (Fig. 7.4).
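The relations for an e-system can be checked numerically with a few lines of Python. This is only a minimal sketch: the function names are made up for illustration, and the three λ values are the ones used in Fig. 7.4.

```python
import math

FIT_PER_HOUR = 1e9  # 1 FIT corresponds to 1e-9 failures per hour


def reliability(lam: float, t: float) -> float:
    """R(t) = exp(-lambda * t) for a constant failure rate (e-system)."""
    return math.exp(-lam * t)


def failure_probability(lam: float, t: float) -> float:
    """F(t) = 1 - R(t)."""
    return 1.0 - reliability(lam, t)


def mttf(lam: float) -> float:
    """MTTF = 1 / lambda, in the same time unit as 1 / lambda."""
    return 1.0 / lam


def fit_to_per_hour(fit: float) -> float:
    """Convert a failure rate from FIT to failures per hour."""
    return fit / FIT_PER_HOUR


def per_hour_to_fit(lam: float) -> float:
    """Convert a failure rate from failures per hour to FIT."""
    return lam * FIT_PER_HOUR


if __name__ == "__main__":
    # Whatever lambda is chosen, F(MTTF) stays at 1 - 1/e.
    for lam in (0.2, 0.05, 0.01):
        t = mttf(lam)
        print(f"lambda={lam}: MTTF={t:.0f}, F(MTTF)={failure_probability(lam, t):.3f}")
```

Evaluating the failure probability at t = 1/λ gives the same value for every λ, which is why the marker on the curves in Fig. 7.4 sits at the same height for all three failure rates.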


Calculation Examples

Task 1

A company released 1000 units into pilot deployment on January 5, 2021. Each month, the company made a record of the number of units still operational, as shown in the table (Table 7.1). Calculate failure probability (F(t)), reliability (R(t)), and failure rate (h(t)) after each month of runtime.

Solution

Reliability is the “probability of survival”; therefore, it is a direct proportion between the number of surviving (remaining) units after a monthly inspection $x_m$ and the total number of units:

$R(m) = \frac{\text{units surviving}}{\text{total units}} = \frac{x_m}{1000}$

Failure probability is then:

$F(m) = 1 - R(m)$

The failure rate can be expressed as:

$h(m) = \frac{\text{units failed in } [m_{prev}, m]}{\text{units survived until } m_{prev}} = \frac{x_{m_{prev}} - x_m}{x_{m_{prev}}}$

Therefore:

| Date of inspection | Units surviving ($x_m$) | Units failed since last inspection ($x_{m_{prev}} - x_m$) | F(m) | R(m) | h(m) |
|---|---|---|---|---|---|
| February 5, 2021 | 798 | 202 | 0.202 | 0.798 | 0.2 |
| March 5, 2021 | 694 | 104 | 0.306 | 0.694 | 104/798 = 0.13 |
| April 5, 2021 | 638 | 56 | 0.362 | 0.638 | 0.08 |
| May 5, 2021 | 607 | 31 | 0.393 | 0.607 | 0.05 |
| June 5, 2021 | 577 | 30 | 0.423 | 0.577 | 0.05 |
| July 5, 2021 | 548 | 29 | 0.452 | 0.548 | 0.05 |
| August 5, 2021 | 494 | 54 | 0.506 | 0.494 | 0.1 |
| September 5, 2021 | 373 | 121 | 0.627 | 0.373 | 0.24 |
| October 5, 2021 | 221 | 152 | 0.779 | 0.221 | 0.41 |
| November 5, 2021 | 0 | 221 | 1 | 0 | 1 |

Table 7.1 Units remaining in operation after each inspection

| Date of inspection | Units remaining in operation |
|---|---|
| February 5, 2021 | 798 |
| March 5, 2021 | 694 |
| April 5, 2021 | 638 |
| May 5, 2021 | 607 |
| June 5, 2021 | 577 |
| July 5, 2021 | 548 |
| August 5, 2021 | 494 |
| September 5, 2021 | 373 |
| October 5, 2021 | 221 |
| November 5, 2021 | 0 |
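The Task 1 solution can be cross-checked with a short script. The sketch below takes the monthly survivor counts from Table 7.1 and applies the three formulas from the solution; the function and variable names are illustrative only.

```python
TOTAL_UNITS = 1000
# Units still operational at each monthly inspection (Table 7.1).
SURVIVORS = [798, 694, 638, 607, 577, 548, 494, 373, 221, 0]


def monthly_metrics(total, survivors):
    """Return one (x_m, failed, F, R, h) tuple per inspection."""
    rows = []
    previous = total  # units alive at the previous inspection
    for x_m in survivors:
        failed = previous - x_m
        r = x_m / total                      # R(m): share of units still alive
        f = 1.0 - r                          # F(m) = 1 - R(m)
        # h(m): failures in the interval relative to units alive at its start
        h = failed / previous if previous else float("nan")
        rows.append((x_m, failed, f, r, h))
        previous = x_m
    return rows


if __name__ == "__main__":
    for x_m, failed, f, r, h in monthly_metrics(TOTAL_UNITS, SURVIVORS):
        print(f"{x_m:4d} {failed:4d} F={f:.3f} R={r:.3f} h={h:.2f}")
```

The printed failure rate first falls, stays near 0.05 for three months, and then rises again, which is the bathtub shape on a small sample.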

Task 2

The constant failure rate for a power supply component, according to the manufacturer, is rated at λ = 200 FIT. Calculate the failure probability and reliability of the power supply after a runtime of 5 years, as well as the MTTF in hours.

Solution

First we need to convert the failure rate to $\mathrm{h}^{-1}$:

$\lambda = 200 \; \mathrm{FIT} = 200 \cdot 10^{-9} \; \mathrm{h}^{-1} = 2 \cdot 10^{-7} \; \mathrm{h}^{-1}$

We need to convert the requested runtime to hours:

$t = 5 \; \mathrm{a} = 5 \cdot 365 \cdot 24 \; \mathrm{h} = 43\,800 \; \mathrm{h}$

Failure probability can be expressed as:

$F(t) = 1 - e^{-\lambda t}$

$F(43\,800 \; \mathrm{h}) = 1 - e^{-2 \cdot 10^{-7} \; \mathrm{h}^{-1} \cdot 43\,800 \; \mathrm{h}} = 0.0087$

Reliability can be expressed as:

$R(t) = e^{-\lambda t} = 1 - F(t)$

$R(43\,800 \; \mathrm{h}) = 1 - 0.0087 = 0.9913$

Mean time to failure:

$MTTF = \frac{1}{\lambda} \implies MTTF = \frac{1}{2 \cdot 10^{-7} \; \mathrm{h}^{-1}} = 5\,000\,000 \; \mathrm{h}$
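The arithmetic of Task 2 can be reproduced in Python as follows. This is a sketch using only the values given in the task; the function name and constants are chosen here for illustration.

```python
import math

HOURS_PER_YEAR = 365 * 24  # the task counts a year as 365 days


def power_supply_metrics(rate_fit: float, years: float):
    """Return (F, R, MTTF in hours) for a constant failure rate given in FIT."""
    lam = rate_fit * 1e-9          # FIT -> failures per hour
    t = years * HOURS_PER_YEAR     # runtime in hours
    r = math.exp(-lam * t)         # R(t) = exp(-lambda * t)
    f = 1.0 - r                    # F(t) = 1 - R(t)
    return f, r, 1.0 / lam


if __name__ == "__main__":
    f, r, mttf_hours = power_supply_metrics(200, 5)
    print(f"F = {f:.4f}, R = {r:.4f}, MTTF = {mttf_hours:.0f} h")
```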


Exercise 7

One of the safety functions prescribed for your system needs a relay switch as an actuator to cut off the power and bring your system into the safe state. Your company evaluated the relay switch component for reliability by performing accelerated life testing (ALT) on a sample of 10,000 switches over the course of 3 weeks. The switch is expected to be activated once per hour during the system runtime, because the function of the system for which the safety function is prescribed executes in the on-demand mode, once per hour. During ALT, the relay switch is activated once every 10 seconds, and the total number of failed switches is logged every 10,000 cycles, as shown in the ALT log.

Cycles tested: 10,000 20,000 30,000 40,000 50,000 60,000 70,000 80,000 90,000 100,000 110,000 120,000 130,000 140,000 150,000

0.99): $A = \frac{MUT}{MUT + MDT} = \frac{MUT}{MTBF}$

Calculation Examples

Task 1

An excerpt from a pseudo-FMEDA for the microcontroller component is given in the table below (F = 1 marks a dangerous failure; D = 1 marks an undetectable failure; R is the failure rate).

| Component | Failure code | Failure cause | Safety measure | F | D | R |
|---|---|---|---|---|---|---|
| uC | 101 | Power cutoff | Redesign, safety PMIC | 1 | 1 | 20 FIT |
| uC | 102 | Time desync | Redesign, backup RTC | 1 | 1 | 5 FIT |
| uC | 103 | Faulty data, output stuck at 1 | Systematic safety integrity assurance | 0 | 1 | 10 FIT |
| uC | 104 | Faulty data, output stuck at 0 | Plausibility check, cross-correlation of outputs | 1 | 0 | 10 FIT |
| uC | 105 | Memory corruption | Diagnosis (BIST) | 1 | 0 | 10 FIT |
| uC | 106 | Bond wires detachment | Manufacturing process reassessment, soldering QA | 1 | 1 | 5 FIT |
| LVDS deser. | 201 | Wire break | Redesign, PCB reassessment | 1 | 1 | 1 FIT |
| LVDS deser. | 202 | Stuck at last frame | Plausibility check at uC | 1 | 0 | 1 FIT |


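Calculate the diagnostic coverage and the safe failure fraction for the microcontroller.

Solution

First, we need to classify the failures of the microcontroller from FMEDA:

• Safe failures: 103
• Dangerous failures: 101, 102, 104, 105, 106
• Undetectable failures: 101, 102, 103, 106
• Dangerous undetectable failures: 101, 102, 106

Now we can calculate failure rates for each failure class:

$\lambda_s = \lambda_{103} = 10 \; \mathrm{FIT}$

$\lambda_d = \lambda_{101} + \lambda_{102} + \lambda_{104} + \lambda_{105} + \lambda_{106} = (20 + 5 + 10 + 10 + 5) \; \mathrm{FIT} = 50 \; \mathrm{FIT}$

$\lambda = \lambda_s + \lambda_d = 10 \; \mathrm{FIT} + 50 \; \mathrm{FIT} = 60 \; \mathrm{FIT}$

$\lambda_{du} = \lambda_{101} + \lambda_{102} + \lambda_{106} = (20 + 5 + 5) \; \mathrm{FIT} = 30 \; \mathrm{FIT}$

$\lambda_{dd} = \lambda_d - \lambda_{du} = 50 \; \mathrm{FIT} - 30 \; \mathrm{FIT} = 20 \; \mathrm{FIT}$

Diagnostic coverage is:

$DC \, [\%] = \frac{\lambda_{dd}}{\lambda_d} = \frac{20}{50} = 40\%$

Safe failure fraction is:

$SFF \, [\%] = \frac{\lambda_s + \lambda_{dd}}{\lambda_s + \lambda_d} = \frac{10 + 20}{10 + 50} = 50\%, \quad SFF \, [\%] = 1 - \frac{\lambda_{du}}{\lambda} = 1 - \frac{30}{60} = 50\%$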
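The classification and the two metrics can also be computed directly from the FMEDA rows. The sketch below encodes only the microcontroller rows of the table; the `FailureMode` type and the function names are hypothetical helpers, not part of any standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    code: int
    dangerous: bool      # column F of the pseudo-FMEDA
    undetectable: bool   # column D of the pseudo-FMEDA
    rate_fit: float      # column R, in FIT


# Microcontroller (uC) rows of the pseudo-FMEDA excerpt.
UC_FAILURE_MODES = [
    FailureMode(101, True, True, 20),
    FailureMode(102, True, True, 5),
    FailureMode(103, False, True, 10),
    FailureMode(104, True, False, 10),
    FailureMode(105, True, False, 10),
    FailureMode(106, True, True, 5),
]


def class_rates(modes):
    """Return (lambda_s, lambda_d, lambda_du, lambda_dd) in FIT."""
    lam_s = sum(m.rate_fit for m in modes if not m.dangerous)
    lam_d = sum(m.rate_fit for m in modes if m.dangerous)
    lam_du = sum(m.rate_fit for m in modes if m.dangerous and m.undetectable)
    return lam_s, lam_d, lam_du, lam_d - lam_du


def diagnostic_coverage(modes):
    """DC = lambda_dd / lambda_d."""
    _, lam_d, _, lam_dd = class_rates(modes)
    return lam_dd / lam_d


def safe_failure_fraction(modes):
    """SFF = (lambda_s + lambda_dd) / (lambda_s + lambda_d)."""
    lam_s, lam_d, _, lam_dd = class_rates(modes)
    return (lam_s + lam_dd) / (lam_s + lam_d)


if __name__ == "__main__":
    print(f"DC  = {diagnostic_coverage(UC_FAILURE_MODES):.0%}")
    print(f"SFF = {safe_failure_fraction(UC_FAILURE_MODES):.0%}")
```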

Task 2 A log of the operation of three units of your system is given, depicting the uptimes and downtimes of the system. Calculate the availability of the system.


Solution

We can count the number of all uptimes and downtimes for all units directly from the log:

• Number of uptimes for Unit A: 4
• Number of downtimes for Unit A: 4
• Number of uptimes for Unit B: 5
• Number of downtimes for Unit B: 5
• Number of uptimes for Unit C: 4
• Number of downtimes for Unit C: 4
• Number of all uptimes: 13
• Number of all downtimes: 13

We can sum up all durations of uptimes and downtimes for all units directly from the log:

• Duration of all uptimes for Unit A: 12
• Duration of all downtimes for Unit A: 4
• Duration of all uptimes for Unit B: 11
• Duration of all downtimes for Unit B: 5
• Duration of all uptimes for Unit C: 9
• Duration of all downtimes for Unit C: 7
• Duration of all uptimes: 32
• Duration of all downtimes: 16

$MUT = \frac{\sum \text{uptimes of all units}}{\text{No. of uptimes of all units}} = \frac{32}{13} = 2.46$

$MDT = \frac{\sum \text{downtimes of all units}}{\text{No. of downtimes of all units}} = \frac{16}{13} = 1.23$

Now we can calculate the availability of the system:

$A = \frac{MUT}{MUT + MDT} = \frac{2.46}{2.46 + 1.23} = 0.67 = 67\%$
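The same availability calculation can be sketched in Python. The operation log itself is not reproduced in the text, so the per-unit counts and total durations from the solution are used as input; the data layout and names are assumptions made for this example.

```python
# Per unit: (number of uptimes, total uptime, number of downtimes, total downtime),
# as counted from the log in the solution above.
UNIT_LOG = {
    "A": (4, 12, 4, 4),
    "B": (5, 11, 5, 5),
    "C": (4, 9, 4, 7),
}


def availability(units):
    """Return (MUT, MDT, A) pooled over all units."""
    n_up = sum(u[0] for u in units.values())
    t_up = sum(u[1] for u in units.values())
    n_down = sum(u[2] for u in units.values())
    t_down = sum(u[3] for u in units.values())
    mut = t_up / n_up        # mean up time
    mdt = t_down / n_down    # mean down time
    return mut, mdt, mut / (mut + mdt)


if __name__ == "__main__":
    mut, mdt, a = availability(UNIT_LOG)
    print(f"MUT = {mut:.2f}, MDT = {mdt:.2f}, A = {a:.0%}")
```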


Exercise 10 Continue the exercise from Chap. 9, now attempting to increase reliability by analyzing the failures of system components and attempting to assess dangerous undetectable failures. The initial pseudo-FMEDA sheet is given below:

Additionally, try to respecify the detectability of failures for the diversified configuration of the system. Finally, calculate safe failure fraction (SFF) and diagnostic coverage (DC), and compare all safety integrity metrics against the requirements from the functional safety standard (see further table), considering that the required SIL for the described SRS is SIL 2:


Safe failure fraction of an element SIL 2 SUB2: SIL 2 SUB3: $(1 - \beta)^2 \cdot \lambda_d^2 \cdot T + \beta \cdot \lambda_d = 2.4 \cdot 10^{-9} + 1.08 \cdot 10^{-7} = 1.104 \cdot 10^{-7}$ => SIL 2 SRS1 => SIL 2 = SILr 2

Once the SRS is finally validated, the system shall be put in operation and monitored in the field trial. All potential issues (incidents, accidents, errors) need to be logged and carefully investigated.

We must also consider the potential failures due to systematic faults in programmable components. Those faults may be removed only in the design phase, or, if known, may be tolerated via a fault tolerance mechanism by using measures for HW/SW prescribed in standards.

To consider systematic faults in the example, it is possible either to fully document the process used to develop software for logic components (L), by using ASPICE L2 and measures for HW/SW prescribed by the standards, or to resort to using components which are proven in use.

SRS 1 acc. IEC 62061: Gyro 1, Gyro 2, and Gyro 3 with Diagnostics form SUB1 (Subsystem C); L and I/O form SUB2 (Subsys. A); the two relays (R) with CCF form SUB3 (Subsys. B); SUB1, SUB2, and SUB3 in series form SRS1 (Subsys. A).

SRS 2 acc. ISO 13849: I (Pos), L, O (I/O), and TE in Category 2.

• Pos. sensor (rotary): B10 = 10,000,000 op, hop = 8, dop = 90, t = 2 s
• nop = (dop * hop * 3600) / t = 1,296,000
• SFF = 50% => B10d = B10 / 0.5 = 20,000,000
• I/O uC: MTBF = 300a
• Logic uC: MTBF = 300a
• DC = 95% (medium)
• MTTFd pos: B10d / (0.1 * nop) = 154a
• 1/MTTFchannel = 1/154 + 1/300 + 1/300, so MTTFchannel = 76a (high)
• DC medium & MTTF high & Category 2 => PL d = PLr d

Let us remember the first safety function defined for the wrecking ball crane in our example, by looking at its requirements. Safety requirement 1.1 (SR 1.1): SRS shall monitor the balance of the crane in order to predict a near-tipping event.

Fig. 11.1 Technical safety concept for the SRS serving SR 1.1 (blocks: Gyro 1 (top), Gyro 2 (cabin), Gyro 3 (back), Logic, Diag, Relay 1, Relay 2, I/O)

Technical safety requirement 1.1.1 (TR1.1.1, derived from SR1.1): SRS shall use gyroscopes positioned at the crane top, cabin, and the back of the vehicle to detect imbalance.

Safety requirement 1.2 (SR 1.2): SRS shall, in case of imbalance over the threshold (near-tipping), cut off the controls and command the vehicle to sound an alarm and bring up the crane.

Quality safety requirement 1.3 (QSR 1.3): SRS shall achieve SILr 2.

By following the guidelines from IEC 62061, we can create a technical safety concept including an architecture containing three distinct gyroscope sensors as inputs (top, cabin, and back: Gyro 1, Gyro 2, and Gyro 3, respectively) which feed their readings to a logic component (L). The logic component implements the algorithm required in SR 1.2, sending the cutoff command to two redundant relays at the same time (Relay 1 and Relay 2) to disable the controls. Also, the alarm command is sent to the appropriate I/O block, which consists of the sound actuators (e.g., speakers). To enhance the safety integrity, a separate diagnostics unit is used to detect failures on any of the gyroscopes to prevent misjudgments of the system (Fig. 11.1).

Now we can decompose our technical safety concept using the guidelines from IEC 62061. We can notice that our concept matches three design patterns with the prescribed RBDs. Input gyroscopes are equivalent to IEC 62061 subsystem C. Logic and I/O can be represented as a simple series (as prescribed by subsystem A), whereas redundant relays represent subsystem B. All three subsystems finally compose a final series, allowing us to calculate the final reliability of the defined SRS (Figs. 11.2 and 11.3).

The next step is to consult the specification of each selected component according to the information released by the manufacturer (or results from a previously conducted FMExA analysis). For example:

• The gyroscope model is ADiS16385, for which the manufacturer specified $\lambda_d = 0.86 \cdot 10^{-6} \; \mathrm{h}^{-1}$.
• Relays are identical, with $\lambda_d = 600$ FIT. Common cause failures for relays can be modeled by means of the beta-factor model, β = 18%. Additionally, the manufacturer suggests a proof test interval to check and replace relays every T = 10,000 h.
• The combined I/O block is based on a microcontroller with MTBF = 300a, whereas the logic microcontroller is declared as SIL 2 compliant; we estimate this as being at the “middle” of the range: ~$5 \cdot 10^{-7}$.
• Diagnostic coverage for detecting dangerous failures on gyroscopes is 90%.

Fig. 11.2 Possible design templates available with IEC 62061

Fig. 11.3 RBD templates for the technical safety concept of SR 1.1 as per IEC 62061 (SUB1, Subsystem C: Gyro 1, Gyro 2, Gyro 3 with Diagnostics; SUB2, Subsys. A: L and I/O; SUB3, Subsys. B: two relays R with CCF; SRS1, Subsys. A: SUB1, SUB2, and SUB3 in series)

By knowing this information, we can now perform the calculations. IEC 62061 gives ready-made formulae for each of the subsystems, although they all stem from the calculations devised in Chaps. 7, 8, 9, 10, and 11. For example, for Subsystem C, IEC 62061 gives the following final failure rate formula:

$\lambda_F = N \cdot \lambda_d \cdot (1 - DC)$

If we remember from before, failure rates of the series configuration can be summed, meaning:

$\lambda_F = \lambda_1 + \lambda_2 + \ldots + \lambda_N$

When having the diagnostics in place, only the undetectable portion of the failure rate remains in the calculation:

$\lambda_{du} = \lambda_d \cdot (1 - DC)$

When we plug in the actual values:

$\lambda_F = 3 \cdot 0.86 \cdot 10^{-6} \cdot (1 - 0.9) = 0.26 \cdot 10^{-6}$

IEC 62061 gives the requirements on which targets need to be achieved for each SIL (Fig. 11.4).

Fig. 11.4 Safety integrity levels and their required failure rate targets as per IEC 62061

We can therefore see that for SUB1, the obtained value falls within SIL 2. For Subsystem B, IEC 62061 also gives a final formula to use:

$\lambda_F = (1 - \beta)^2 \cdot \lambda_d^2 \cdot T + \beta \cdot \lambda_d$

The same result is obtained by following the “classic” calculation approach for T:

$R_k(T) = 1 - (1 - R(T))^2$

$e^{-\lambda_k T} = 1 - 1 + 2R(T) - R^2(T) = 2e^{-\lambda T} - e^{-2\lambda T}$

$\lambda_k = -\frac{\ln\left(2e^{-\lambda T} - e^{-2\lambda T}\right)}{T}$

$\lambda_F = -\frac{\ln\left(2e^{-(1-\beta)\lambda_d T} - e^{-2(1-\beta)\lambda_d T}\right)}{T} + \beta \cdot \lambda_d$

When plugging in the values for SUB3:

$\lambda_F = 1.104 \cdot 10^{-7}$ => SIL 2

Finally, for SUB1, SUB2, and SUB3 as a series, we get:

$\lambda_{TOT} \approx (1 + 2.6 + 5) \cdot 10^{-7}$ => SIL 2
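The subsystem calculations above can be reproduced with the following sketch. The component values are those quoted in the text; the SUB2 rate of 5e-7 per hour is the value used in the series sum above; the SIL bands in `sil_for_rate` are an assumption standing in for Fig. 11.4, which is not reproduced here. All function names are illustrative.

```python
import math


def subsystem_c(n: int, lam_d: float, dc: float) -> float:
    """N sensors in series with diagnostics: N * lambda_d * (1 - DC)."""
    return n * lam_d * (1.0 - dc)


def subsystem_b(lam_d: float, beta: float, t_proof: float) -> float:
    """Two redundant channels with common cause failures (ready-made formula)."""
    return (1.0 - beta) ** 2 * lam_d ** 2 * t_proof + beta * lam_d


def subsystem_b_classic(lam_d: float, beta: float, t_proof: float) -> float:
    """Same subsystem via the parallel reliability R_k(T) = 1 - (1 - R(T))^2."""
    x = (1.0 - beta) * lam_d * t_proof
    lam_k = -math.log(2.0 * math.exp(-x) - math.exp(-2.0 * x)) / t_proof
    return lam_k + beta * lam_d


def sil_for_rate(rate_per_hour: float):
    """Assumed bands for dangerous failures per hour; None if outside them."""
    if 1e-8 <= rate_per_hour < 1e-7:
        return 3
    if 1e-7 <= rate_per_hour < 1e-6:
        return 2
    if 1e-6 <= rate_per_hour < 1e-5:
        return 1
    return None


SUB1 = subsystem_c(n=3, lam_d=0.86e-6, dc=0.9)             # gyroscopes
SUB2 = 5e-7                                                # value used in the text
SUB3 = subsystem_b(lam_d=600e-9, beta=0.18, t_proof=1e4)   # redundant relays
TOTAL = SUB1 + SUB2 + SUB3

if __name__ == "__main__":
    for name, rate in (("SUB1", SUB1), ("SUB2", SUB2), ("SUB3", SUB3), ("TOTAL", TOTAL)):
        print(f"{name}: {rate:.3e} per hour -> SIL {sil_for_rate(rate)}")
```

For these values the ready-made formula and the classic approach differ only in the redundant term, and only slightly, because $(1 - \beta)\lambda_d T$ is much smaller than 1.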

Fig. 11.5 Available architectural categories as templates given by ISO 13849

Fig. 11.6 Defined technical safety concept for the SRS 2 (Category 2: I (Pos), L, O (I/O), and TE)

We can see that the obtained values conform to the safety integrity requirement QSR1.3 of SILr 2.

The second safety function for the wrecking ball crane was defined using the following requirements:

Safety requirement 2.1 (SR 2.1): SRS shall monitor ball position to prevent the wrecking ball from reaching the vehicle.

Technical safety requirement 2.1.1 (TR2.1.1, derived from SR2.1): SRS shall use two position sensors (at crane arm and ball pulley) to monitor crane height and ball descent and to detect possible collision if swung.

Safety requirement 2.2 (SR 2.2): SRS shall disallow dangerous positioning by limiting the controls extent continuously.

Quality safety requirement 2.3 (QSR 2.3): SRS shall achieve PLr d.

We now create a technical safety concept according to the guidelines from ISO 13849. We first look at the suggested design templates from this standard (Fig. 11.5). Our system requires a set of input sensors (nonredundant!) that provide continuous input to the logic, which in turn continuously decides whether to limit further control possibility by sending appropriate signals to the output. We can include a “test” module (diagnostics) to introduce some diagnostic coverage and therefore decrease the output failure rate. This architecture is most similar to the Category 2 design (Fig. 11.6).

By consulting manufacturer information, we get that the rotary position sensors are declared with their B10 = 10,000,000 operations (clicks). For the use case of the


wrecking ball crane, based on the worktime logs we can extract the yearly number of operations: the machine is operated 8 hours each day, 90 days each year on average. If one cycle of the position sensor (time between clicks when used) is 2 seconds, we can calculate the total yearly number of sensor operations:

$h_{op} = 8, \quad d_{op} = 90, \quad t = 2 \; \mathrm{s}$

$n_{op} = \frac{d_{op} \cdot h_{op} \cdot 3600}{t} = 1\,296\,000$

We have also performed an FMEDA and decided that out of all failures, 50% are safe failures which can be excluded (safe failure fraction, SFF = 50%). In this case, we have an updated B10 value that counts only the dangerous half of the failures:

$B_{10d} = \frac{B_{10}}{1 - SFF} = \frac{10\,000\,000}{0.5} = 20\,000\,000$

The manufacturer declares MTBF values for I/O and logic controllers to be 300a. Diagnostic coverage of the configuration is 95%. Category 2 is a simple series configuration, for which the following holds:

$\lambda_F = \lambda_{POS} + \lambda_L + \lambda_{IO}$

$\frac{1}{MTTF_{channel}} = \frac{1}{MTTF_{dPOS}} + \frac{1}{MTTF_L} + \frac{1}{MTTF_O}$

We can calculate MTTF for dangerous failures of the position sensors:

$MTTF_{dPOS} = \frac{B_{10d}}{0.1 \cdot n_{op}} = 154 \; \mathrm{a}$

Finally:

$\frac{1}{MTTF_{channel}} = \frac{1}{154 \; \mathrm{a}} + \frac{1}{300 \; \mathrm{a}} + \frac{1}{300 \; \mathrm{a}}$

$MTTF_{channel} = 76 \; \mathrm{a}$

To decide the achieved performance level as per ISO 13849, we need to use the obtained MTTF for one series in a configuration (denoted as “channel”) as well as


the overall DC value and the selected system category. We can then plug those values into appropriate tables given by the standard to obtain the achieved PL. For MTTF of the channel, we deduce the value as high:

| MTTFD denotation of each channel | Range of each channel |
|---|---|
| Low | 3a ≤ MTTFD < 10a |
| Medium | 10a ≤ MTTFD < 30a |
| High | 30a ≤ MTTFD < 100a |

For the DC of the system, we deduce the value medium:

| Diagnostic coverage (DC) denotation | Range |
|---|---|
| None | DC < 60% |
| Low | 60% ≤ DC < 90% |
| Medium | 90% ≤ DC < 99% |
| High | DC ≥ 99% |

Then we can use the final table to determine the PL:

| Category | B | 1 | 2 | 2 | 3 | 3 | 4 |
|---|---|---|---|---|---|---|---|
| DCavg | None | None | Low | Med. | Low | Med. | High |
| MTTFD of each channel: Low | a | n/a | a | b | b | c | n/a |
| MTTFD of each channel: Medium | b | n/a | b | c | c | d | n/a |
| MTTFD of each channel: High | n/a | c | c | d | d | d | e |

We can see that the obtained performance level is PL d, which meets the requirement QSR 2.3 of PLr d.
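The whole ISO 13849 walk-through for SRS 2 can be sketched as a small lookup. The ranges and the PL table are copied from the tables above, and the component values are those from the text; function names and the dictionary layout are assumptions made for illustration.

```python
def yearly_operations(days_per_year, hours_per_day, cycle_seconds):
    """n_op = d_op * h_op * 3600 / t."""
    return days_per_year * hours_per_day * 3600 / cycle_seconds


def mttfd_from_b10(b10, dangerous_fraction, n_op):
    """B10d = B10 / dangerous fraction; MTTFd = B10d / (0.1 * n_op), in years."""
    b10d = b10 / dangerous_fraction
    return b10d / (0.1 * n_op)


def channel_mttf(*mttf_years):
    """Series channel: the reciprocals of the MTTF values add up."""
    return 1.0 / sum(1.0 / m for m in mttf_years)


def classify_mttfd(years):
    if 3 <= years < 10:
        return "low"
    if 10 <= years < 30:
        return "medium"
    if 30 <= years < 100:
        return "high"
    raise ValueError("MTTFd outside the ranges listed in the table")


def classify_dc(dc):
    if dc < 0.60:
        return "none"
    if dc < 0.90:
        return "low"
    if dc < 0.99:
        return "medium"
    return "high"


# (category, DCavg) -> {MTTFd class: PL}; None stands for "n/a".
PL_TABLE = {
    ("B", "none"): {"low": "a", "medium": "b", "high": None},
    ("1", "none"): {"low": None, "medium": None, "high": "c"},
    ("2", "low"): {"low": "a", "medium": "b", "high": "c"},
    ("2", "medium"): {"low": "b", "medium": "c", "high": "d"},
    ("3", "low"): {"low": "b", "medium": "c", "high": "d"},
    ("3", "medium"): {"low": "c", "medium": "d", "high": "d"},
    ("4", "high"): {"low": None, "medium": None, "high": "e"},
}

N_OP = yearly_operations(days_per_year=90, hours_per_day=8, cycle_seconds=2)
MTTFD_POS = mttfd_from_b10(b10=10_000_000, dangerous_fraction=0.5, n_op=N_OP)
MTTF_CHANNEL = channel_mttf(MTTFD_POS, 300, 300)
ACHIEVED_PL = PL_TABLE[("2", classify_dc(0.95))][classify_mttfd(MTTF_CHANNEL)]

if __name__ == "__main__":
    print(f"n_op = {N_OP:.0f}, MTTFd(pos) = {MTTFD_POS:.0f}a, "
          f"MTTF(channel) = {MTTF_CHANNEL:.0f}a, PL {ACHIEVED_PL}")
```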

Now Try for Yourself!

Based on the outputs from Project 1, now you need to provide argumentation and evidence for your safety case, related to the safety integrity of your safety functions and other important aspects.

As a group work, revisit your system requirements specification, and, if needed, extend it so that you have at least five safety requirements and corresponding derived safety requirements (technical, quality) which describe five active risk mitigation measures (safety functions). In a group, for each of the safety functions, define the list of required written evidence for the safety case (just the list, not the evidence itself!) which are in your view required to close the safety case with respect to those safety functions. One of


the pieces of evidence would surely be the evidence on the fulfillment of the prescribed safety integrity level with regard to random failures and other quantitative safety integrity metrics.

After the group part, each of the team members shall select one safety function and provide evidence, through calculation, that the safety function fulfills the respective safety integrity level with regard to quantitative failure metrics (as defined in the Chap. 10 exercise). Remember to:

• Define a rough architectural block diagram for the corresponding safety-related system (SRS)
• Draw a corresponding reliability block diagram (RBD)
• Find out failure quantifiers from relevant online sources, handbooks, or references for each of the components, and be able to argue your choice (it does not need to be exact, since this is just an exercise, but the choice needs to be reasonable)
• Make sure to use only dangerous undetectable failures: base the numbers on the Chap. 10 exercise FMEDA, guesstimates, or rule of thumb (e.g., consider 50% of all failures to be dangerous, and 90% of all dangerous failures to be detectable); you do not need to derive any specific FMEDA
• Perform the calculation to find out the reliability of the complete SRS after 10 years of runtime and decide on the output SIL; use tables as in the Chap. 10 exercise
• Discuss the SIL with regard to safety integrity requirements
• In case the SIL is not met, apply any of the safety integrity improvement methods you find suitable (e.g., derating, hot spare, item-level redundancy, majority voting, etc.), update the RBD, and recalculate reliability so that the SIL is met
• In case the SIL was met and no safety integrity improvement was needed, apply the improvement anyway so that the MTTF of the SRS is increased by 30%, using the same methods as in the previous point
• In case the changes violate the system requirement specification, backtrace and perform the required changes in the specification

As a group, create the final presentation in which you will describe the updates to the specification, discuss the list of evidence, and introduce the safety functions and their requirements. Then, each team member shall describe his/her safety claim for the corresponding safety function and present his/her findings. It would be beneficial if you can present your findings live to your peers or your instructor. Spend 15 minutes on the presentation in total. Please strictly adhere to the time. For example, you can spare 5 minutes for the overall group part presentation, and 2 minutes per team member to describe particular claims around each of the five safety functions. Feel free to organize the presentation of the overall group part however you find suitable (one or more people can present this part).


Required Output

• Updated system requirement specification, in the Chap. 2 template.
• PDF with the claim for your safety function, including:
(a) Architectural block diagram of the SRS
(b) Corresponding RBD
(c) Calculation steps and results
(d) Discussion around the achieved SIL
(e) Application of the SIL improvement method (revisited architecture, RBD, etc.)
(f) Recalculation steps and final results
• If you are using a calculation Excel sheet, feel free to provide it instead of writing all calculation details in PDF (this is not mandatory, but then the calculation shall be presented in detail in the PDF).
• Final presentation (in ppt).

Submission Deadline

Take 1 or 2 weeks to work the project out.

Assessment

You or your peers (or your instructor) can assess your work. Take 20 points as the total possible score.

Up to 5 points can be allocated based on the group part of your submission and presentation, and will be the same for each group member. Up to 2 points can be given for the safety function requirements (2, mostly complete and correct; 1, incomplete or partially incorrect; 0, major flaws or not provided). Up to 3 points can be given for the safety case composition and the presented claims (3, mostly complete and correct; 2, at most one notable missing artifact in the list; 1, several aspects incorrect and/or missing; 0, mostly incomplete or incorrect).

Ten points can be assigned based on the individual submission part, with up to 2 points for each of the following aspects (2, mostly complete and correct; 1, incomplete or partially incorrect; 0, completely incorrect or missing): architectural block diagram, RBD, SIL calculation, the prescribed SIL improvement methods, and the recalculation and final SIL achievement or reliability improvement.

Finally, 5 points can be assigned based on your presentation and argumentation (up to 3 points for convincing presentation proof and up to 2 points for the presentation content, specifically addressing the communication clarity and understandability of the presented material).


Sample Solution to the Project

The team selected a forklift truck as their considered system and continued the previously started project (the full solution is available in digital form at sfs11.ex.nitinstitute.com). They gave the following presentation:


Required Evidence for the Safety Case

• System functions are defined, and hazards are identified and evaluated according to the functional safety standard (ISO 13849 in our case).
• Safety functions are defined to mitigate the hazards of system functions.
• The performance level that is allocated to the system function is inherited from the risk of the hazard.
• SRS is realized by respecting restrictions by architecture according to the functional safety standard.
• Dangerous undetectable random failures of SRS are quantified by failure rate and compared with values required by the functional safety standard.


• In case PL is not met, redundancy or other improvement methods may be applied; dangerous undetectable random failures are then re-quantified, and it is proved that the SRS meets the PL according to the functional safety standard.
• Traceability is established from safety requirements through technical safety requirements, with regard to the functional requirements they are related to, and then through safety design and implementation.
• Safety functions are implemented according to all provisions of the appropriate functional safety standard, including the process of its definition, implementation, and verification, PL requirements, as well as specifically prescribed implementation techniques (at the system, hardware, and software level). This especially concerns failures due to systematic faults, which cannot be quantified as random failures.
• SRS is verified on each level against test cases for compliance with the technical safety requirements.
• Compliance with quality safety requirements is demonstrated through various failure metrics derived from the used architectural elements and their composition (failure probability, reliability, failure rate, MTTF, safe failure fraction, diagnostic coverage, etc.), as well as from failure statistics from the field trials and stress tests.
• Traceability from test cases toward safety architecture and design is established.
• Safety processes and procedures defined by the company, which are aligned with the functional safety standard, are followed:
(a) Documentation is reviewed, and reports are provided as evidence of the completeness of documentation and correct traceability.
(b) Implementation is reviewed, and reports are provided as evidence of compliance with the functional safety standard.
(c) It is proved that there are no open items that can be traced to the hazards which exhibit intolerable risk.
• It is proved that the performance level of the system security cannot be compromised.
• Training for employees regarding safety processes and procedures defined by the company is periodically organized.
• Internal audits and audits of external authorities are performed periodically regarding safety processes and procedures that are defined within the company.

High-level requirements

| ID | Name | Description | Author | Ver. | Derived from | Used by |
|---|---|---|---|---|---|---|
| HLR_001 | Lifting | The forklift shall allow lifting and lowering of the load | Group 4 | 1.0 | - | FR_001, FR_002, FR_003, CR_001 |

| ID | Name | Description | Author | Ver. | Derived from | Used by |
|---|---|---|---|---|---|---|
| HLR_002 | Driving | The forklift shall provide the ability to be driven | Group 4 | 1.0 | - | FR_004, FR_005, FR_006, FR_007, FR_008, FR_009, FR_010, FR_011, CR_002 |
| HLR_003 | Tilting | The forklift shall allow tilting of the forks | Group 4 | 1.0 | - | FR_012, FR_013, FR_014, CR_003 |

Functional requirements

| ID | Name | Description | Author | Ver. | Derived from |
|---|---|---|---|---|---|
| FR_001 | Lifting the forks | The forklift shall move the forks up when the lever is pushed up | Group 4 | 1.0 | HLR_001 |
| FR_002 | Lowering the forks | The forklift shall move the forks down when the lever is pulled down | Group 4 | 1.0 | HLR_001 |
| FR_003 | Fixed fork height | The forks shall not move up or down when the lever is in a neutral position | Group 4 | 1.0 | HLR_001 |
| FR_004 | Acceleration | The forklift shall accelerate when the gas pedal is pressed | Group 4 | 1.0 | HLR_002 |
| FR_005 | Deceleration | The forklift shall decelerate when the brake pedal is pressed | Group 4 | 1.0 | HLR_002 |
| FR_006 | Parking | The forklift shall not allow movement when the parking lever is pulled | Group 4 | 1.0 | HLR_002 |
| FR_007 | Steering left | The forklift shall turn left when the steering wheel is turned left | Group 4 | 1.0 | HLR_002 |
| FR_008 | Steering right | The forklift shall turn right when the steering wheel is turned right | Group 4 | 1.0 | HLR_002 |
| FR_009 | Forward movement | The forklift shall move forward when the forward-reverse lever is in the forward position | Group 4 | 1.0 | HLR_002 |
| FR_010 | Reverse movement | The forklift shall move in reverse when the forward-reverse lever is in the reverse position | Group 4 | 1.0 | HLR_002 |

| ID | Name | Description | Author | Ver. | Derived from |
|---|---|---|---|---|---|
| FR_011 | Power supply | The engine shall burn propane gas to power the forklift | Group 4 | 1.0 | HLR_002 |
| FR_012 | Tilt forward | The forks shall tilt forward when the tilt control lever is pushed forward | Group 4 | 1.0 | HLR_003 |
| FR_013 | Tilt backward | The forks shall tilt backward when the tilt control lever is pulled backward | Group 4 | 1.0 | HLR_003 |
| FR_014 | Fixed tilt angle | The tilt angle of the forks shall remain unchanged when the tilt control lever is in the neutral position | Group 4 | 1.0 | HLR_003 |

Constraint requirements

| ID | Name | Description | Author | Ver. | Derived from |
|---|---|---|---|---|---|
| CR_001 | Load limit | The forklift shall lift the load with mass up to 2 metric tons | Group 4 | 1.0 | HLR_001 |
| CR_002 | Maximum speed | The forklift shall not exceed the maximum speed of 10 km/h | Group 4 | 1.0 | HLR_002 |
| CR_003 | Maximum tilt angle | The tilt angle of the forks shall not exceed ±10 degrees | Group 4 | 1.0 | HLR_003 |

Safety requirements

| ID | Name | Description | Author | Ver. | Derived from | Used by |
|---|---|---|---|---|---|---|
| SR_001 | Blocked view | The collision warning system shall activate when driving forward while the driver view is blocked by the load | Group 4 | 1.0 | FR_009 | TSR_001, TSR_002, TSR_003, TSR_004, TSR_006, QSR_001, QSR_002 |
| SR_002 | Blind spot preview | Driver assistance system shall display areas around the vehicle which are obstructed from the view of the driver while driving in reverse and safely stop the vehicle if a probable collision is detected | Group 4 | 1.0 | FR_010 | TSR_003, TSR_004, TSR_005, TSR_006, QSR_001, QSR_003 |

| ID | Name | Description | Author | Ver. | Derived from | Used by |
|---|---|---|---|---|---|---|
| SR_003 | Forklift imbalance prevention | The forklift shall prevent vehicle tipping by monitoring and correcting the forklift balance | Group 4 | 1.0 | FR_001, FR_004, FR_005, FR_007, FR_008, FR_009, FR_010, FR_012, FR_013 | TSR_007, TSR_008, TSR_009, TSR_010, TSR_011, TSR_012, QSR_004 |
| SR_004 | Load balancing system | The forklift shall have a load balancing system that shall detect the load position relative to the forks and correct the position if over the threshold | Group 4 | 1.0 | FR_004, FR_005, FR_007, FR_008, FR_012, FR_013 | TSR_013, TSR_014, TSR_015, QSR_005 |
| SR_005 | Lift and tilt hydraulics safety | Wireless Hose Diagnostic Unit (HDU) shall continuously monitor hose assemblies using a 433 MHz frequency communication protocol and alert the user when the hose is damaged | Group 4 | 1.0 | FR_001, FR_002, FR_012, FR_013 | TSR_016, TSR_017, TSR_018, QSR_006 |
| SR_006 | Forklift fire safety | The forklift shall provide a fire extinguishing system that releases noncorrosive fire-extinguishing fluid in case of a potential fire near the engine | Group 4 | 1.0 | FR_011 | TSR_019, TSR_020, TSR_021, QSR_007 |
| SR_007 | Wearing protective gear | All operators within the system boundary shall wear protective gear | Group 4 | 1.0 | HLR_001, HLR_002, HLR_003 | - |
| SR_008 | Passing forklift safety training | All operators within the system boundary shall pass forklift safety training | Group 4 | 1.0 | HLR_001, HLR_002, HLR_003 | - |
| SR_009 | Seat belt | The forklift shall have a brightly colored seat belt that shall prevent the driver from falling out of the seat while driving | Group 4 | 1.0 | HLR_002 | - |

176

Technical safety requirements

| ID | Name | Description | Author | Ver. | Derived from | Used by |
|---|---|---|---|---|---|---|
| TSR_001 | Forward driving visual alarm | The visual alarm in the cabin shall activate when driving forward with a speed above 3 km/h while the driver's view is blocked by the load | Group 4 | 1.0 | SR_001 | |
| TSR_002 | Forward driving sound alarm | The sound alarm system shall activate when driving forward with a speed above 5 km/h while the driver's view is blocked by the load | Group 4 | 1.0 | SR_001 | |
| TSR_003 | Near object detection | The radar shall measure the distance from an object or pedestrian while driving | Group 4 | 1.0 | SR_001, SR_002 | |
| TSR_004 | Environment preview | A monitor of the driver assistance system shall display the environment around the vehicle | Group 4 | 1.0 | SR_001, SR_002 | |
| TSR_005 | Reverse driving visual alarm | A monitor for driver assistance shall display visual warning signals during reverse movement if the distance from the object is less than 3 m | Group 4 | 1.0 | SR_002 | |
| TSR_006 | Collision avoidance | The forklift shall apply the brakes and stop automatically if the driver assistance system detects a possible collision | Group 4 | 1.0 | SR_001, SR_002 | |
| QSR_001 | Radar detection range | The radar shall monitor 360 degrees around the forklift | Group 4 | 1.0 | SR_001, SR_002 | |
| QSR_002 | Blocked view performance level | The collision warning system shall be developed according to PL d | Group 4 | 1.0 | SR_001 | |
| QSR_003 | Blind spot view performance level | The driver assistance system shall be developed according to PL d | Group 4 | 1.0 | SR_002 | |
| TSR_007 | Forklift load position monitoring | The forklift shall be equipped with two gyroscopes, one positioned just between the rear wheels and one just between the forks | Group 4 | 1.0 | SR_003 | |

| ID | Name | Description | Author | Ver. | Derived from | Used by |
|---|---|---|---|---|---|---|
| TSR_008 | Forklift load mass monitoring | The forklift forks shall be equipped with force sensors to monitor force due to load | Group 4 | 1.0 | SR_003 | – |
| TSR_009 | Forklift balance monitoring | The forklift shall be able to continuously monitor balance by using data from gyroscopes and force sensors | Group 4 | 1.0 | SR_003 | |
| TSR_010 | Forklift imbalance detection | By monitoring the balance, the forklift shall be able to detect near-tipping events | Group 4 | 1.0 | SR_003 | |
| TSR_011 | Forklift safe state transition due to imbalance detection | The forklift shall be put into the safe state if a near-tipping event is detected | Group 4 | 1.0 | SR_003 | |
| TSR_012 | Forklift safe state due to imbalance protection | In the safe state, the forklift shall cut off the driver's lift and tilt control, take over lowering and backward tilting to the initial position, and turn on the emergency alarm | Group 4 | 1.0 | SR_003 | |
| QSR_004 | Forklift imbalance prevention performance level | The forklift balance monitoring and imbalance prevention shall be developed according to PL d | Group 4 | 1.0 | SR_003 | |
| TSR_013 | Load imbalance monitoring | The load balancing system shall monitor the load tilt relative to the forks with sensors placed at each fork | Group 4 | 1.0 | SR_004 | |
| TSR_014 | Load imbalance detection | The load balancing system shall detect an imbalance of the load if the tilt of the load is above the threshold angle of 5 degrees | Group 4 | 1.0 | SR_004 | |
| TSR_015 | Load imbalance correction | The load balancing system shall correct load imbalance by activating the fork extensions that shall push the load toward the equilibrium position | Group 4 | 1.0 | SR_004 | |

| ID | Name | Description | Author | Ver. | Derived from | Used by |
|---|---|---|---|---|---|---|
| QSR_005 | Load balancing system level | The load balancing system shall be developed according to PL d | Group 4 | 1.0 | SR_004 | – |
| TSR_016 | Hydraulic hose state updates | Wireless Hose Diagnostic Unit shall transmit performance data regularly to provide SMS text and email messages to signal impending hose failure | Group 4 | 1.0 | SR_005 | |
| TSR_017 | Hydraulic hose sensors | Sensors shall monitor and detect potential issues and transmit data to the HDU | Group 4 | 1.0 | SR_005 | |
| TSR_018 | Wireless HDU | The wireless HDU shall update the server when the sensors have signaled impending hose failure | Group 4 | 1.0 | SR_005 | |
| QSR_006 | Hydraulic hose monitoring system performance level | The forklift hydraulics system shall be developed according to PL d | Group 4 | 1.0 | SR_005 | |
| TSR_019 | Forklift fire sensor | The forklift shall be equipped with a sensor that detects a fire | Group 4 | 1.0 | SR_006 | |
| TSR_020 | Forklift fire suppression display | The forklift shall be equipped with an interactive display that allows the driver to activate the fire suppression manually | Group 4 | 1.0 | SR_006 | |
| TSR_021 | Automatic fire suppression activation | The forklift shall automatically activate the release of fire-extinguishing fluid once a potential fire is detected by the sensor | Group 4 | 1.0 | SR_006 | |
| QSR_007 | Forklift fire suppression performance level | The forklift fire suppression system shall be developed according to PL d | Group 4 | 1.0 | SR_006 | |
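
The "Derived from" and "Used by" columns of the requirement tables record the same traceability links from two directions, so they can be cross-checked mechanically. The following is a minimal sketch in Python, assuming the links are transcribed by hand into dictionaries. The names `derived_from`, `used_by`, and `check_traceability` are illustrative and do not come from the book. Only the rows for SR_001 and SR_002 are included.

```python
# Traceability links transcribed from the SR_001 / SR_002 rows and the
# requirements derived from them. The dictionary layout is an assumption
# of this sketch, not a format prescribed by the book.
derived_from = {
    "TSR_001": {"SR_001"},
    "TSR_002": {"SR_001"},
    "TSR_003": {"SR_001", "SR_002"},
    "TSR_004": {"SR_001", "SR_002"},
    "TSR_005": {"SR_002"},
    "TSR_006": {"SR_001", "SR_002"},
    "QSR_001": {"SR_001", "SR_002"},
    "QSR_002": {"SR_001"},
    "QSR_003": {"SR_002"},
}
used_by = {
    "SR_001": {"TSR_001", "TSR_002", "TSR_003", "TSR_004",
               "TSR_006", "QSR_001", "QSR_002"},
    "SR_002": {"TSR_003", "TSR_004", "TSR_005", "TSR_006",
               "QSR_001", "QSR_003"},
}


def check_traceability(derived_from, used_by):
    """Return sorted (parent, child, problem) tuples for one-directional links."""
    problems = []
    # Every "Derived from" entry should appear in the parent's "Used by" cell.
    for child, parents in derived_from.items():
        for parent in parents:
            if child not in used_by.get(parent, set()):
                problems.append((parent, child, "missing in 'Used by'"))
    # Every "Used by" entry should appear in the child's "Derived from" cell.
    for parent, children in used_by.items():
        for child in children:
            if parent not in derived_from.get(child, set()):
                problems.append((parent, child, "missing in 'Derived from'"))
    return sorted(problems)


issues = check_traceability(derived_from, used_by)
print(issues if issues else "All links are consistent in both directions")
```

A check of this kind only confirms that the two columns agree with each other. It says nothing about whether a derived requirement is a sound refinement of its parent, which remains a review task.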



Figure: Architectural Block Diagram, original and improved designs. Blocks shown: gyroscopes (Gyro), force sensors (Force Sens), Logic, I/O, relays, TE, lift hydraulic cylinder, and tilt hydraulic cylinder. The improved design shows additional force sensors and relays.
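
The architecture appears to feed gyroscope and force-sensor readings into a logic block that acts on the lift and tilt hydraulic cylinders through relays. The sketch below illustrates how such a logic block could implement TSR_009 to TSR_012 in Python. It is not the group's design. The near-tipping angle `MAX_TILT_DEG` is a hypothetical value, since the requirements give no number. The load limit is taken from CR_001 (2 metric tons), and the rule of combining the sensors by taking the worst reading is an assumption of this example.

```python
from dataclasses import dataclass

MAX_TILT_DEG = 8.0            # hypothetical near-tipping threshold (not in the source)
MAX_LOAD_N = 2000.0 * 9.81    # CR_001: load up to 2 metric tons, expressed as a force


@dataclass
class SensorFrame:
    gyro_rear_deg: float      # gyroscope between the rear wheels (TSR_007)
    gyro_forks_deg: float     # gyroscope between the forks (TSR_007)
    fork_forces_n: tuple      # readings of the fork force sensors (TSR_008)


def near_tipping(frame: SensorFrame) -> bool:
    """TSR_009 / TSR_010: evaluate balance from gyroscope and force data."""
    worst_angle = max(abs(frame.gyro_rear_deg), abs(frame.gyro_forks_deg))
    total_force = sum(frame.fork_forces_n)
    return worst_angle > MAX_TILT_DEG or total_force > MAX_LOAD_N


def logic_step(frame: SensorFrame) -> dict:
    """Return actuator commands for one evaluation cycle."""
    if near_tipping(frame):
        # TSR_011 / TSR_012: enter the safe state, cut off driver control,
        # lower and tilt back automatically, and raise the emergency alarm.
        return {"driver_lift_tilt_enabled": False, "auto_lower": True,
                "auto_tilt_back": True, "alarm": True}
    return {"driver_lift_tilt_enabled": True, "auto_lower": False,
            "auto_tilt_back": False, "alarm": False}


print(logic_step(SensorFrame(1.0, -9.5, (4000.0, 4100.0))))
```

Taking the worst of the two gyroscope readings is a conservative choice. A real design would also need to handle a failed or disagreeing sensor, which is what the redundant blocks in the improved architecture are for.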

Figure: Reliability Block Diagram, original and improved designs. Blocks shown: gyroscopes (Gyro), force sensors (Force Sens), Logic, relays, I/O, lift hydraulic cylinder, and tilt hydraulic cylinder, with common cause failure (CCF) blocks placed after the redundant groups.
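
To make the effect of the redundant blocks and the CCF blocks in a reliability block diagram concrete, here is a minimal Python sketch of the series and parallel rules, with a beta-factor treatment of common cause failures. It assumes constant failure rates. The failure rate, mission time, and beta value are hypothetical numbers chosen for illustration, and the function names are not from the book. It does not reproduce the exact structure of the group's diagram.

```python
import math


def reliability(failure_rate_per_hour: float, hours: float) -> float:
    """R(t) = exp(-lambda * t) for a constant failure rate."""
    return math.exp(-failure_rate_per_hour * hours)


def series(*blocks: float) -> float:
    """Series structure: every block must work."""
    result = 1.0
    for r in blocks:
        result *= r
    return result


def parallel(*blocks: float) -> float:
    """Parallel structure: at least one block must work."""
    all_fail = 1.0
    for r in blocks:
        all_fail *= (1.0 - r)
    return 1.0 - all_fail


def redundant_pair_with_ccf(rate: float, hours: float, beta: float) -> float:
    """Two identical channels in parallel, in series with a CCF block.

    Beta-factor assumption: a fraction beta of the failure rate is common
    to both channels; the remaining (1 - beta) fails independently.
    """
    independent = reliability((1.0 - beta) * rate, hours)
    common = reliability(beta * rate, hours)
    return series(parallel(independent, independent), common)


T_HOURS = 8760.0        # one year of operation (hypothetical mission time)
LAMBDA_GYRO = 2e-6      # hypothetical gyroscope failure rate per hour
BETA = 0.1              # hypothetical common cause fraction

single = reliability(LAMBDA_GYRO, T_HOURS)
pair = redundant_pair_with_ccf(LAMBDA_GYRO, T_HOURS, BETA)
print(f"single gyroscope:        {single:.6f}")
print(f"redundant pair with CCF: {pair:.6f}")
```

Setting `beta` to 1.0 makes the pair no better than a single channel, which shows why the CCF block limits the gain from redundancy.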

Chapter 12

System Safety Checklist

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
M. Z. Bjelica, Systems, Functions and Safety, https://doi.org/10.1007/978-3-031-15823-0_12

Video Lesson

This chapter has a corresponding video lesson: sfs12.nit-institute.com

Lecture Notes

To make sure that our organization is capable of developing a safe system, and that our technical system will be safe in the end, this lecture provides the final checklist needed to close the safety circle. The safety circle consists of six top-level areas (Fig. 12.1). Three are expertise- and knowledge-based: knowing the system, knowing how to perform hazard and risk evaluation correctly, and knowing the various safety prescriptions which assure the safety of the technical system implementation. Three are organizational- and process-based: applying an inherent safety process, performing audits and relevant certifications, and living safety by nurturing the safety culture in the company.

Fig. 12.1 Safety circle with all the items which need to be closed to achieve safety. Labels in the figure: Knowing the system, Hazards and Risk Evaluation, Safety prescriptions, Inherent safety process, V&V, Certification and audits, Living safety, FuSa, SySa.

Knowing the system requires knowledge of the system environment as well as knowledge of the system requirements. System delineation splits the system from the environment. Only if we know what the system does, in the form of requirements, can we know what can go wrong with it. By knowing the environment, we understand who and what is affected by the operation of our system.

Hazard and risk evaluation requires knowledge of the methods used to identify hazards and to evaluate the risk for acceptability, as completely as possible (e.g., FFA, HAZOP, FMEA, FTA, ETA). Risk acceptability is defined by the standards or by the justification of tolerable risk. Verifiability needs to be assured, so that the analysis can be argued by using references/sources and that the
selected values, e.g., risk categories, are plausible. Human factors must be considered when hazards are identified: misuse, including reasonably foreseeable misuse, must be treated as an actuating factor for a hazard, and it often stems from the complexity of the user interface or a lack of training. Humans tend to habituate actions and disregard new and changed procedures. Humans are often inattentive, and inattentiveness can come from paying too much attention to one aspect while completely overlooking another; therefore, balance in the human interface design must be sought.

Safety prescriptions include the knowledge of basic concepts and terminology from system and safety engineering, as well as functional safety with respect to active safety measures. Which prescription needs to be applied, and how, is usually regulated by the authorities of the area or country where the system is deployed. Many fields have prescribed safety standards for the purpose, which need to be examined and, if relevant and/or required, applied to the system design.

Security aspects deal with freedom from attack (intrusions), which needs to be provided to assure safety (freedom from harm). Any technical element or process in the system, if compromised by an attacker, can decrease the safety integrity of the system and therefore make the system unsafe. Confidentiality in access to all system artifacts and elements must be assured to make the system both safe and secure. Security does not always imply safety (e.g., security measures might disable or delay some safety-critical access in case of incidents). Conversely, safety measures can compromise security (e.g., unlocking gates to prevent accidents due to fire hazards allows intruders to access the premises without authentication). These aspects need simultaneous regard, since security and safety go hand in hand.

Software usually participates in a safety function implementation, and we must make sure that its design is given proper attention in the safety sense. System safety evaluation needs to include the evaluation of software safety integrity, making sure that all software engineering and system engineering processes are strictly followed for the prevention of systematic faults (bugs).

An inherent safety process is required to deal with safety in a proactive way. The safety life cycle needs to be followed along the V-model, tapping into the project management and system engineering processes (PHI, FHE, SSE, OpSSE). General quality management principles (standards and process models, such as ISO 9001 and ASPICE) are usually required and strictly monitored in safety-critical system development, including quality assurance (QA) methods through traceability, as well as explicit verification and validation phases. Companies using the inherent safety process need to define safety roles, appoint the appropriate personnel, and give them essential decision powers. Safety roles by themselves are not sufficient for safety; they only complement the well-established safety processes in the company!

Audits are important, meaning that we have processes to check whether we continuously perform all the aforementioned procedures and prescriptions for our system and within our company. Finally, external audit and certification
would allow us to prove to others our capabilities and the safety of the released systems.

Last but not least, companies need to nurture a safety culture. First, the management of the company needs to understand all safety aspects and to be educated in that regard, not prioritizing the company's monetary performance without paying meticulous attention to impacts on safety. Management shall empower the people in safety roles and encourage them to question all management decisions and to escalate properly. Safety culture shall propagate throughout the company, requiring the definition of safety objectives and having safety procedures/principles disseminated and adopted by all employees, not only by the safety and QA departments! A proactive approach to safety, even in thinking and attitude (will this action of mine impact safety?), is essential to company success: safety is everyone's responsibility!

Self-assessment

Now take the time to self-assess your knowledge about all the required safety aspects by taking the quiz below. Each listed statement is either correct or incorrect. Please mark your answer and then check in the key at the end of the book.

1. If the system is removed from one original environment and placed into another environment, its safety properties remain the same.
2. A user interface element that requires too much user attention to be operated properly can be an actuating factor for a hazard.
3. Misuse of the system does not need to be regarded if we define specific training and prescribe procedures for system operation.
4. Safety prescriptions are first applied according to the safety standards, and only after that with regard to the regulations of the authorities in the area in which the system shall be deployed.
5. When proving the safety integrity of the system, we must prove that the security of the system cannot be compromised.
6. Together with the enforced system integrity, security requires confidentiality and availability of the system to be maintained.
7. Software debugging and bug reporting by users after the system release is an essential practice to make sure the system is safe.
8. Quality management and the application of process models, as in ISO 9001 and ASPICE, are required together with the inherent safety process, to make sure that the company is capable of developing safe systems.
9. The safety manager in the company is fully responsible for the safety of the developed system.
10. Safety culture in the company needs to be nurtured and starts from the company management, who shall not prioritize the monetary performance of the company over the unacceptable impact that this might have on safety.

Self-assessment Key

1. False
2. True
3. False
4. False
5. True
6. True
7. False
8. True
9. False
10. True

Bibliography

An Introduction to Functional Safety and IEC 61508, Application Note, AN9025, MTL Instruments Group, 2002.

Automotive SPICE, Process Reference Model, Process Assessment Model, v3.1, VDA QMC Working Group, 2017.

Bjelica, Milan. My Big, Fat, Safe Software Stack: Functional Safety for Complex Software for Next-Generation Vehicles. 7th Conference on the Engineering of Computer Based Systems. 2021.

Blanchard, Benjamin S. System engineering management. Wiley, 2004.

CASE Editorial Board. 2014. The Guide to the Systems Engineering Body of Knowledge (SEBoK), v. 1.3. R.D. Adcock (EIC). Hoboken, NJ: The Trustees of the Stevens Institute of Technology.

Dubrova, Elena. Fault-tolerant design. New York: Springer, 2013.

Ericson, Clifton A. Hazard Analysis Techniques for System Safety, 2nd Edition, 2015.

FIDES guide 2009: Reliability Methodology for Electronic Systems, Edition A, FIDES Consortium, September 2010.

Gell-Mann, Murray. What is complexity? Complexity and industrial clusters. Physica-Verlag HD, 2002. 13–24.

IEC 61508: Functional Safety of Electrical/Electronic/Programmable Electronic Safety-related Systems, IEC, Geneva, 2010.

IEC 62061:2021, Safety of machinery - Functional safety of safety-related control systems, IEC, Geneva, 2021.

IEEE/ISO/IEC 15288-2008, ISO/IEC/IEEE International Standard - Systems and software engineering - System life cycle processes, 2008.

INCOSE Systems Engineering Handbook v. 3.2, INCOSE-TP-2003-002-03.2, January 2010.

ISO 13849-1:2015, Safety of machinery - Safety-related parts of control systems, ISO, Geneva, 2015.

ISO 26262: Road vehicles - Functional safety. 2nd ed., parts 01-12. ISO, Geneva, 2018.

ISO 9001:2015-11: Quality management systems - Requirements; ISO, Geneva, 2015.

Le Guen, Jean, and Risk Assessment Policy Unit. Reducing risks, protecting people. 1999.

Manson, Steven M. Simplifying complexity: a review of complexity theory. Geoforum 32.3 (2001): 405-414.

MIL-STD-882E: Department of Defense Standard Practice - System Safety; 2012.

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 M. Z. Bjelica, Systems, Functions and Safety, https://doi.org/10.1007/978-3-031-15823-0


Rausand, M. and Hoyland, A. 2004. System Reliability Theory: Models, Statistical Methods, and Applications, Wiley, 2nd edition.

Reason, James. Human error: models and management. BMJ 320.7237 (2000): 768-770.

Roland, Harold E., and Brian Moriarty. System safety engineering and management. Wiley, 1991.

Sebron, Walter, Hans Tschürtz, and Peter Krebs. The shell model: a method for system boundary analysis. European Conference on Software Process Improvement. Springer, Cham, 2018.

Index

A
Accident
As low as reasonably practicable (ALARP)
Availability

B
Burn in testing

C
Claim
Composite systems

D
Dangerous undetectable failures
Diagnostic coverage (DC)
Diversity
Dynamic redundancy

E
Equipment under control (EUC)
Error

F
Failure
Failure chain
Failure probability
Failure rates
Faults
Fit
Functional safety (FuSa)
Functions

G
Globalement au moins aussi bon (GAMAB)

H
Hazard
Hazard identification
Hazard list

I
Incidents
Inherent safety
ISApro

K
Kano model

M
Mean downtime (MDT)
Mean time between failures (MTBF)
Mean time to failure (MTTF)
Mean uptime (MUT)
Minimum endogenous mortality (MEM)

R
Random failures
Reactive safety
Reliability
Reliability block diagram (RBD)
Requirements
Requirements engineering
Risk analysis
Risk assessment

S
Safe failure fraction (SFF)
Safety
Safety case
Safety concept
Safety critical system
Safety culture
Safety function requirements
Safety functions
Safety integrity
Safety integrity improvement
Safety integrity level (SIL)
Safety life cycle
Safety related system (SRS)
Safety requirements
SIL calculations
Static redundancy
Swiss cheese model
System
System of system (SoS)
System safety
System safety process
Systematic failures
Systematic safety integrity

T
Traceability