Software Development, Design and Coding: With Patterns, Debugging, Unit Testing, and Refactoring [2 ed.] 148423152X, 978-1484231524

Learn the principles of good software design, and how to turn those principles into great code. This book introduces you

1,889 240 6MB

English Pages 344 [330] Year 2017

Report DMCA / Copyright

DOWNLOAD FILE

Polecaj historie

Software development, design and coding: with patterns, debugging, unit testing, and refactoring [2nd edition] 9781484231524, 9781484231531, 148423152X

1,399 279 2MB Read more

Software Development, Design and Coding: With Patterns, Debugging, Unit Testing, and Refactoring 9781484231531, 2017961306

1,105 186 2MB Read more

Refactoring for Software Design Smells 9780128013977, 2014029955

1,077 248 23MB Read more

Microprocessor 5: Software and Hardware Aspects of Development, Debugging and Testing - The Microcomputer [1 ed.] 1786306514, 9781786306517

The final volume presents the hardware and software aspects of the development cham for a digital system as well as the

1,333 173 15MB Read more

Software Development and Design The Preliminary Course

6,458 1,294 9MB Read more

C++ Software Design : Design Principles and Patterns for High-Quality Software [1 ed.] 9781098113162

4,408 812 6MB Read more

API Testing and Development with Postman 9781800569201

4,067 660 7MB Read more

Adaptive Code: Agile coding with design patterns and SOLID principles [2 ed.] 2016936931, 9781509302581

5,232 1,095 35MB Read more

Sustainable Software Development Patterns and Best Practices 9781484259498

1,150 171 5MB Read more

Inclusive design patterns : coding accessibility into web design 9783945749432, 3945749433

1,508 300 5MB Read more

Software Development, Design and Coding: With Patterns, Debugging, Unit Testing, and Refactoring [2 ed.]
148423152X, 978-1484231524

Author / Uploaded
John F. Dooley

Categories
Computers
Programming

Commentary
True PDF

Table of contents :
Front Matter ....Pages i-xxii
Introduction to Software Development (John F. Dooley)....Pages 1-5
Software Process Models (John F. Dooley)....Pages 7-27
Project Management Essentials (John F. Dooley)....Pages 29-38
Requirements (John F. Dooley)....Pages 39-51
Software Architecture (John F. Dooley)....Pages 53-63
Design Principles (John F. Dooley)....Pages 65-74
Structured Design (John F. Dooley)....Pages 75-94
Object-Oriented Overview (John F. Dooley)....Pages 95-106
Object-Oriented Analysis and Design (John F. Dooley)....Pages 107-120
Object-Oriented Design Principles (John F. Dooley)....Pages 121-140
Design Patterns (John F. Dooley)....Pages 141-166
Parallel Programming (John F. Dooley)....Pages 167-189
Parallel Design Patterns (John F. Dooley)....Pages 191-209
Code Construction (John F. Dooley)....Pages 211-234
Debugging (John F. Dooley)....Pages 235-251
Unit Testing (John F. Dooley)....Pages 253-269
Code Reviews and Inspections (John F. Dooley)....Pages 271-282
Ethics and Professional Practice (John F. Dooley)....Pages 283-304
Wrapping It all Up (John F. Dooley)....Pages 305-309
Back Matter ....Pages 311-320

Citation preview

Software Development, Design and Coding With Patterns, Debugging, Unit Testing, and Refactoring — Learn the principles of good software design, and how to turn those principles into great code — Second Edition

— John F. Dooley

Software Development, Design and Coding With Patterns, Debugging, Unit Testing, and Refactoring Second Edition

John F. Dooley

Software Development, Design and Coding John F. Dooley Galesburg, Illinois, USA ISBN-13 (pbk): 978-1-4842-3152-4 https://doi.org/10.1007/978-1-4842-3153-1

ISBN-13 (electronic): 978-1-4842-3153-1

Library of Congress Control Number: 2017961306 Copyright © 2017 by John F. Dooley This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. Trademarked names, logos, and images may appear in this book. Rather than use a trademark symbol with every occurrence of a trademarked name, logo, or image we use the names, logos, and images only in an editorial fashion and to the benefit of the trademark owner, with no intention of infringement of the trademark. The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights. While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein. Cover image designed by Vexels.com Managing Director: Welmoed Spahr Editorial Director: Todd Green Acquisitions Editor: Todd Green Development Editor: James Markham Technical Reviewer: Michael Thomas Coordinating Editor: Jill Balzano Copy Editor: Corbin Collins Compositor: SPi Global Indexer: SPi Global Artist: SPi Global Distributed to the book trade worldwide by Springer Science+Business Media New York, 233 Spring Street, 6th Floor, New York, NY 10013. Phone 1-800-SPRINGER, fax (201) 348-4505, e-mail [email protected], or visit www.springeronline.com. Apress Media, LLC is a California LLC and the sole member (owner) is Springer Science + Business Media Finance Inc (SSBM Finance Inc). SSBM Finance Inc is a Delaware corporation. For information on translations, please e-mail [email protected], or visit www.apress.com/ rights-permissions. Apress titles may be purchased in bulk for academic, corporate, or promotional use. eBook versions and licenses are also available for most titles. For more information, reference our Print and eBook Bulk Sales web page at www.apress.com/bulk-sales. Any source code or other supplementary material referenced by the author in this book is available to readers on GitHub via the book's product page, located at www.apress.com/9781484231524. For more detailed information, please visit www.apress.com/source-code. Printed on acid-free paper

For Diane, who is always there, and for Patrick, the best son a guy could have.

Contents About the Author��xv About the Technical Reviewer��xvii Acknowledgments��xix Preface��xxi ■Chapter ■ 1: Introduction to Software Development�� 1 What We’re Doing�� 2 So, How to Develop Software?�� 2 Conclusion�� 4 References�� 5 ■Chapter ■ 2: Software Process Models�� 7 The Four Variables�� 8 A Model That’s not a Model At All: Code and Fix�� 8 Cruising over the Waterfall�� 9 Iterative Models�� 11 Evolving the Iterative Model�� 12 Risk: The Problem with Plan-Driven Models�� 13 Agile Methodologies�� 13 Agile Values and Principles�� 14 eXtreme Programming (XP)�� 15 XP Overview�� 15 The Four Basic Activities�� 16 Implementing XP: The 12 Practices�� 16 v

■ Contents

Scrum�� 18 Scrum Roles�� 19 The Sprint�� 19 Scrum Artifacts�� 19 Sprint Flow�� 20

Lean Software Development�� 21 Principle 1: Eliminate Waste�� 22 Principle 2: Build Quality In�� 22 Principle 3: Create Knowledge�� 22 Principle 4: Defer Commitment�� 23 Principle 5: Deliver Fast�� 23 Principle 6: Respect People�� 23 Principle 7: Optimize the Whole�� 24

Kanban�� 24 The Kanban board, WIP, and Flow�� 24 Lead Time�� 26

Conclusion�� 26 References�� 27 ■Chapter ■ 3: Project Management Essentials�� 29 Project Planning�� 30 Project Organization�� 30 Risk Analysis�� 31 Resource Requirements�� 33

Task Estimates�� 33 Project Schedule�� 34 Velocity�� 35 Project Oversight�� 36

Status Reviews and Presentations�� 36 Defects�� 37

vi

■ Contents

The Retrospective�� 38 Conclusion�� 38 References�� 38 ■Chapter ■ 4: Requirements�� 39 What Types of Requirements Are We Talking About?�� 39 User Requirements�� 40 Domain Requirements�� 40 Non-Functional Requirements�� 40 Non-Requirements�� 41

Requirements Gathering in a Plan-Driven Project�� 41 But I Don’t Like Writing!�� 41 Outline of a Functional Specification�� 42 Design and New Feature Ideas�� 43 One More Thing�� 44

Requirements Gathering in an Agile Project�� 44 The Three Cs�� 44 INVEST in Stories�� 45 Product Backlog�� 47 SMART Tasks�� 47 Sprint/Iteration Backlog�� 48

Requirements Digging�� 48 Why Requirements Digging Is Hard�� 49

Analyzing the Requirements�� 50 Conclusion�� 51 References�� 51 ■Chapter ■ 5: Software Architecture�� 53 General Architectural Patterns�� 54 The Main Program—Subroutine Architectural Pattern�� 54 Pipe-and-Filter Architecture�� 55

vii

■ Contents

An Object-Oriented Architectural Pattern�� 56 An MVC Example: Let’s Hunt!�� 58

The Client-Server Architectural Pattern�� 60 The Layered Approach�� 61 Conclusion�� 62 References�� 63 ■Chapter ■ 6: Design Principles�� 65 The Design Process�� 68 Desirable Design Characteristics (Things Your Design Should Favor)�� 69 Design Heuristics�� 70 Designers and Creativity�� 72 Conclusion�� 73 References�� 74 ■Chapter ■ 7: Structured Design�� 75 Structured Programming�� 75 Stepwise Refinement�� 76 Example of Stepwise Refinement: The Eight-Queens Problem�� 77

Modular Decomposition�� 84 Example: Keyword in Context�� 86

Conclusion�� 94 References�� 94 ■Chapter ■ 8: Object-Oriented Overview�� 95 An Object-Oriented Analysis and Design Process�� 96 Requirements Gathering and Analysis�� 98 Design�� 98 Implementation and Testing�� 98 Release/Maintenance/Evolution�� 98

viii

■ Contents

Doing the Process�� 98 The Problem Statement�� 98 The Feature List�� 99 Use Cases�� 99 Decompose the Problem�� 100 Class Diagrams�� 100 Code Anyone?�� 101

Conclusion�� 105 References�� 106 ■Chapter ■ 9: Object-Oriented Analysis and Design�� 107 Analysis�� 108 An Analytical Example�� 109

Design�� 111 Change in the Right Direction�� 112 Recognizing Change�� 112 Songbirds Forever�� 113 A New Requirement�� 113

Separating Analysis and Design�� 115 Shaping the Design�� 116 Abstraction�� 117 Conclusion�� 119 References�� 120 ■Chapter ■ 10: Object-Oriented Design Principles�� 121 List of Fundamental Object-Oriented Design Principles�� 122 Encapsulate Things in Your Design That Are Likely to Change�� 122 Code to an Interface Rather Than to an Implementation�� 123 The Open-Closed Principle�� 126 The Don’t Repeat Yourself Principle�� 127 The Single Responsibility Principle�� 128 ix

■ Contents

The Liskov Substitution Principle�� 129 The Dependency Inversion Principle�� 136 The Interface Segregation Principle�� 138 The Principle of Least Knowledge�� 138 Class Design Guidelines�� 139 Conclusion�� 140 References�� 140 ■Chapter ■ 11: Design Patterns�� 141 Design Patterns and the Gang of Four�� 142 The Classic Design Patterns�� 143

Patterns We Can Use�� 144 Creational Patterns�� 144 Structural Patterns�� 151 Behavioral Patterns�� 157

Conclusion�� 166 References�� 166 ■Chapter ■ 12: Parallel Programming�� 167 Concurrency vs. Parallelism�� 168 Parallel Computers�� 170 Flynn’s Taxonomy�� 170

Parallel Programming�� 171 Scalability�� 172 Performance�� 172 Obstacles to Performance Improvement�� 173

How to Write a Parallel Program�� 174 Parallel Programming Models�� 174 Designing Parallel Programs�� 175 Parallel Design Techniques�� 175

x

■ Contents

Programming Languages and APIs (with examples)�� 177 Parallel Language Features�� 177 Java Threads�� 178 OpenMP�� 184

The Last Word on Parallel Programming�� 188 References�� 189 ■Chapter ■ 13: Parallel Design Patterns�� 191 Parallel Patterns Overview�� 191 Parallel Design Pattern Design Spaces�� 192

A List of Parallel Patterns�� 199 Embarrassingly Parallel�� 199 Master/Worker�� 200 Map and Reduce�� 200 MapReduce�� 202 Divide & Conquer�� 204 Fork/Join�� 205

A Last Word on Parallel Design Patterns�� 209 References�� 209 ■Chapter ■ 14: Code Construction�� 211 A Coding Example�� 213 Functions and Methods and Size�� 214 Formatting, Layout, and Style�� 214 General Layout Issues and Techniques�� 215 White Space�� 217 Block and Statement Style Guidelines�� 217 Declaration Style Guidelines�� 218 Commenting Style Guidelines�� 220 Identifier Naming Conventions�� 222

xi

■ Contents

Refactoring�� 224 When to Refactor�� 224 Types of Refactoring�� 226

Defensive Programming�� 228 Assertions Are Helpful�� 229 Exceptions�� 230 Error Handling�� 230 Exceptions in Java�� 232

The Last Word on Coding�� 234 References�� 234 ■Chapter ■ 15: Debugging�� 235 What Is an Error, Anyway?�� 236 What Not To Do�� 237 An Approach to Debugging�� 238 Reproduce the Problem Reliably�� 238 Find the Source of the Error �� 239 Fix the Error (Just That One)! �� 245 Test the Fix �� 246 Look for More Errors �� 246

Source Code Control�� 246 The Collision Problem�� 247 Source Code Control Systems�� 248

One Last Thought on Coding and Debugging: Pair Programming�� 250 Conclusion�� 251 References�� 251 ■Chapter ■ 16: Unit Testing�� 253 The Problem with Testing�� 254 That Testing Mindset�� 254 When to Test?�� 255 xii

■ Contents

Testing in an Agile Development Environment�� 256 What to Test?�� 256 Code Coverage: Test Every Statement�� 257 Data Coverage: Bad Data Is Your Friend?�� 258

Characteristics of Tests�� 259 How to Write a Test�� 259 The Story�� 260 The Tasks�� 260 The Tests�� 260

JUnit: A Testing Framework�� 264 Testing Is Good�� 268 Conclusion�� 268 References�� 269 ■Chapter ■ 17: Code Reviews and Inspections�� 271 Walkthroughs, Reviews, and Inspections�� 272 Walkthroughs�� 273 Code Reviews�� 273 Code Inspections�� 274 Inspection Roles�� 275 Inspection Phases and Procedures�� 276

Reviews in Agile Projects�� 278 How to Do an Agile Peer Code Review�� 279

Summary of Review Methodologies�� 279 Defect Tracking Systems�� 280 Defect Tracking in Agile Projects�� 281 Conclusion�� 282 References�� 282

xiii

■ Contents

■Chapter ■ 18: Ethics and Professional Practice�� 283 Introduction to Ethics�� 283 Ethical Theory�� 284 Deontological Theories�� 284 Consequentialism (Teleological Theories)�� 287

Ethical Drivers�� 289 Legal Drivers�� 289 Professional Drivers�� 289

Ethical Discussion and Decision Making�� 291 Identifying and Describing the Problem�� 291 Analyzing the Problem�� 291

Case Studies�� 292 #1 Copying Software�� 292 #2 Who’s Computer Is It?�� 292 #3 How Much Testing Is Enough?�� 292 #4 How Much Should You Tell?�� 293

The Last Word on Ethics?�� 293 References�� 294 The ACM Code of Ethics and Professional Conduct�� 294 Preamble�� 294 Contents & Guidelines�� 295

The ACM/IEEE-CS Software Engineering Code of Ethics�� 300 PREAMBLE�� 300 PRINCIPLES�� 301

■Chapter ■ 19: Wrapping It all Up�� 305 What Have You Learned?�� 305 What to Do Next?�� 306 References�� 308 Index�� 311 xiv

About the Author John F. Dooley is the William and Marilyn Ingersoll Emeritus Professor of Computer Science at Knox College in Galesburg, Illinois. Before returning to teaching in 2001, Professor Dooley spent nearly 18 years in the software industry as a developer, designer, and manager working for companies such as Bell Telephone Laboratories, McDonnell Douglas, IBM, and Motorola, along with an obligatory stint as head of development at a software startup. He has more than two dozen professional journal and conference publications and four books to his credit, along with numerous presentations. He has been a reviewer for the Association for Computing Machinery Special Interest Group on Computer Science Education (SIGCSE) Technical Symposium for the last 36 years and reviews papers for the IEEE Transactions on Education, the ACM Innovation and Technology in Computer Science Education (ITiCSE) Conference, and other professional conferences. He has developed short courses in software development and created three separate software engineering courses at the advanced undergraduate level.

xv

About the Technical Reviewer Michael Thomas has worked in software development for more than 20 years as an individual contributor, team lead, program manager, and vice president of engineering. Michael has more than ten years experience working with mobile devices. His current focus is in the medical sector using mobile devices to accelerate information transfer between patients and healthcare providers.

xvii

Acknowledgments I’d like to thank Todd Green of Apress for encouraging me and making this book possible. The staff at Apress, especially Jill Balzano and Michael Thomas, have been very helpful and gracious. The book is much better for their reviews, comments, and edits. Thanks also to all my students in CS 292 over the last 12 years, who have put up with successive versions of the course notes that became this book, and to my CS department colleagues David Bunde and Jaime Spacco, who put up with me for all these years. And my thanks also go to Knox College for giving me the time and resources to finish both editions of this book. Finally, I owe everything to Diane, who hates that I work nights, but loves that I can work at home.

xix

Preface What’s this book all about? Well, it’s about how to develop software from a personal perspective. We’ll look at what it means for you to take a problem and produce a program to solve it from beginning to end. That said, this book focuses a lot on design. How do you design software? What things do you take into account? What makes a good design? What methods and processes are there to help you design software? Is designing small programs different from designing large ones? How can you tell a good design from a bad one? What general patterns can you use to help make your design more readable and understandable? It’s also about code construction. How do you write programs and make them work? “What?” you say. “I’ve already written eight gazillion programs! Of course I know how to write code!” Well, in this book, we’ll explore what you already do and investigate ways to improve on that. We’ll spend some time on coding standards, debugging, unit testing, modularity, and characteristics of good programs. We’ll also talk about reading code, what makes a program readable, and how to review code that others have written with an eye to making it better. Can good, readable code replace documentation? How much documentation do you really need? And it’s about software engineering, which is usually defined as “the application of engineering principles to the development of software.” What are engineering principles? Well, first, all engineering efforts follow a defined process. So we’ll be spending a bit of time talking about how you run a software development project and what phases there are to a project. We’ll talk a lot about agile methodologies, how they apply to small development teams and how their project management techniques work for small- to medium-sized projects. All engineering work has a basis in the application of science and mathematics to real-world problems. So does software development. As I’ve said already, we’ll be spending a lot of time examining how to design and implement programs that solve specific problems. By the way, there’s at least one other person (besides me) who thinks software development is not an engineering discipline. I’m referring to Alistair Cockburn, and you can read his paper, “The End of Software Engineering and the Start of Economic-Cooperative Gaming,” at http://alistair.cockburn.us/The+end+o f+software+engineering+and+the+start+of+economic-cooperative+gaming. Finally, this book is about professional practice, the ethics and the responsibilities of being a software developer, social issues, privacy, how to write secure and robust code, and the like. In short, those fuzzy other things that one needs in order to be a professional software developer. This book covers many of the topics described for the ACM/IEEE Computer Society Curriculum Guidelines for Undergraduate Degree Programs in Computer Science (known as CS2013).1 In particular, it covers topics in a number of the Knowledge Areas of the Guidelines, including Software Development Fundamentals, Software Engineering, Systems Fundamentals, Parallel and Distributed Computing, Programming Languages, and Social Issues and Professional Practice. It’s designed to be both a textbook

The Joint Task Force on Computing Education. 2013. “Computer Science Curricula 2013: Curriculum Guidelines for Undergraduate Degree Programs in Computer Science.” New York, NY: ACM/IEEE Computer Society. www.acm.org/ education/CS2013-final-report.pdf.

1

xxi

■ Preface

for a junior-level undergraduate course in software design and development and a manual for the working professional. Although the chapter order generally follows the standard software development sequence, one can read the chapters independently and out of order. I’m assuming that you already know how to program and that you’re conversant with at least one of these languages: Java, C, or C++. I’m also assuming you’re familiar with basic data structures, including lists, queues, stacks, maps, and trees, along with the algorithms to manipulate them. In this second edition, most of the chapters have been updated, some new examples have been added, and the book discusses modern software development processes and techniques. Much of the plan-driven process and project-management discussions from the first edition have been removed or shortened, and longer and new discussions of agile methodologies, including Scrum, Lean Software Development, and Kanban have taken their place. There are new chapters on parallel programming and parallel design patterns, and a new chapter on ethics and professional practice. I use this book in a junior-level course in software development. It’s grown out of the notes I’ve developed for that class over the past 12 years. I developed my own notes because I couldn’t find a book that covered all the topics I thought were necessary for a course in software development, as opposed to one in software engineering. Software engineering books tend to focus more on process and project management than on design and actual development. I wanted to focus on the design and writing of real code rather than on how to run a large project. Before beginning to teach, I spent nearly 18 years in the computer industry, working for large and small companies, writing software, and managing other people who wrote software. This book is my perspective on what it takes to be a software developer on a small- to medium-sized team and help develop great software. I hope that by the end of the book you’ll have a much better idea of what the design of good programs is like, what makes an effective and productive developer, and how to develop larger pieces of software. You’ll know a lot more about design issues. You’ll have thought about working in a team to deliver a product to a written schedule. You’ll begin to understand project management, know some metrics and how to review work products, and understand configuration management. I’ll not cover everything in software development—not by a long stretch—and we’ll only be giving a cursory look at the management side of software engineering, but you’ll be in a much better position to visualize, design, implement, and test software of many sizes, either by yourself or in a team.

xxii

CHAPTER 1

Introduction to Software Development “Not only are there no silver bullets now in view, the very nature of software makes it unlikely that there will be any—no inventions that will do for software productivity, reliability, and simplicity what electronics, transistors, and large-scale integration did for computer hardware. We cannot expect ever to see twofold gains every two years.” — Frederick J. Brooks, Jr.1 So, you’re asking yourself, why is this book called Software Development, Design and Coding? Why isn’t it called All About Programming or Software Engineering? After all, isn’t that what software development is? Well, no. Programming is a part of software development, but it’s certainly not all of it. Likewise, software development is a part of software engineering, but it’s not all of it. Here’s the definition of software development that we’ll use in this book: software development is the process of taking a set of requirements from a user (a problem statement), analyzing them, designing a solution to the problem, and then implementing that solution on a computer. Isn’t that programming, you ask? No. Programming is really the implementation part, or possibly the design and implementation part, of software development. Programming is central to software development, but it’s not the whole thing. Well, then, isn’t it software engineering? Again, no. Software engineering also involves a process and includes software development, but it also includes the entire management side of creating a computer program that people will use, including project management, configuration management, scheduling and estimation, baseline building and scheduling, managing people, and several other things. Software development is the fun part of software engineering. So, software development is a narrowing of the focus of software engineering to just that part concerned with the creation of the actual software. And it’s a broadening of the focus of programming to include analysis, design, and release issues.

Brooks, Frederick. “No Silver Bullet.” IEEE Computer (1987). 20(4): 10-19.

1

© John F. Dooley 2017 J. F. Dooley, Software Development, Design and Coding, https://doi.org/10.1007/978-1-4842-3153-1_1

1

Chapter 1 ■ Introduction to Software Development

What We’re Doing It turns out that, after 70 or so years of using computers, we’ve discovered that developing software is hard. Learning how to develop software correctly, efficiently, and beautifully is also hard. You’re not born knowing how to do it, and many people, even those who take programming courses and work in the industry for years, don’t do it particularly well. It’s a skill you need to pick up and practice—a lot. You don’t learn programming and development by reading books—not even this one. You learn it by doing it. That, of course, is the attraction: to work on interesting and difficult problems. The challenge is to work on something you’ve never done before, something you might not even know if you can solve. That’s what has you coming back to create new programs again and again. There are probably several ways to learn software development. But I think that all of them involve reading excellent designs, reading a lot of code, writing a lot of code, and thinking deeply about how you approach a problem and design a solution for it. Reading a lot of code, especially really beautiful and efficient code, gives you lots of good examples about how to think about problems and approach their solution in a particular style. Writing a lot of code lets you experiment with the styles and examples you’ve seen in your reading. Thinking deeply about problem solving lets you examine how you work and how you do design, and lets you extract from your labors those patterns that work for you; it makes your programming more intentional.

So, How to Develop Software? The first thing you should do is read this book. It certainly won’t tell you everything, but it will give you a good introduction into what software development is all about and what you need to do to write great code. It has its own perspective, but that’s a perspective based on 20 years writing code professionally and another 22 years trying to figure out how to teach others to do it. Despite the fact that software development is only part of software engineering, software development is the heart of every software project. After all, at the end of the day what you deliver to the user is working code. A team of developers working in concert usually creates that code. So, to start, maybe we should look at a software project from the outside and ask what does that team need to do to make that project a success? In order to do software development well, you need the following:

2

•

A small, well-integrated team: Small teams have fewer lines of communication than larger ones. It’s easier to get to know your teammates on a small team. You can get to know their strengths and weaknesses, who knows what, and who is the “go-to” person for particular problems or particular tools. Well-integrated teams have usually worked on several projects together. Keeping a team together across several projects is a major job of the team’s manager. Well-integrated teams are more productive, are better at holding to a schedule, and produce code with fewer defects at release. The key to keeping a team together is to give them interesting work to do and then leave them alone.

•

Good communication among team members: Constant communication among team members is critical to day-to-day progress and successful project completion. Teams that are co-located are better at communicating and communicate more than teams that are distributed geographically (even if they’re just on different floors or wings of a building). This is a major issue with larger companies that have software development sites scattered across the globe.

•

Good communication between the team and the customer: Communication with the customer is essential to controlling requirements and requirements churn during a project. On-site or close-by customers allow for constant interaction with the development team. Customers can give immediate feedback on new releases and be

Chapter 1 ■ Introduction to Software Development

involved in creating system and acceptance tests for the product. Agile development methodologies strongly encourage customers to be part of the development team and, even better, to be on site daily. See Chapter 2 for a quick introduction to a couple of agile methodologies. •

A process that everyone buys into: Every project, no matter how big or small, follows a process. Larger projects and larger teams tend to be more plan-driven and follow processes with more rules and documentation required. Larger projects require more coordination and tighter controls on communication and configuration management. Smaller projects and smaller teams will, these days, tend to follow more agile development processes, with more flexibility and less documentation required. This certainly doesn’t mean there is no process in an agile project; it just means you do what makes sense for the project you’re writing so that you can satisfy all the requirements, meet the schedule, and produce a quality product. See Chapter 2 for more details on process and software life cycles.

•

The ability to be flexible about that process: No project ever proceeds as you think it will on the first day. Requirements change, people come and go, tools don’t work out or get updated, and so on. This point is all about handling risk in your project. If you identify risks, plan to mitigate them, and then have a contingency plan to address the event where the risk actually occurs, you’ll be in much better shape. Chapter 4 talks about requirements and risk.

•

A plan that every one buys into: You wouldn’t write a sorting program without an algorithm to start with, so you shouldn’t launch a software development project without a plan. The project plan encapsulates what you’re going to do to implement your project. It talks about process, risks, resources, tools, requirements management, estimates, schedules, configuration management, and delivery. It doesn’t have to be long and it doesn’t need to contain all the minute details of the everyday life of the project, but everyone on the team needs to have input into it, they need to understand it, and they need to agree with it. Unless everyone buys into the plan, you’re doomed. See Chapter 3 for more details on project plans.

•

To know where you are at all times: It’s that communication thing again. Most projects have regular status meetings so that the developers can “sync up” on their current status and get a feel for the status of the entire project. This works very well for smaller teams (say, up to about 20 developers). Many small teams will have daily meetings to sync up at the beginning of each day. Different process models handle this “stand-up” meeting differently. Many plan-driven models don’t require these meetings, depending on the team managers to communicate with each other. Agile processes often require daily meetings to improve communications among team members and to create a sense of camaraderie within the team.

•

To be brave enough to say, “Hey, we’re behind!”: Nearly all software projects have schedules that are too optimistic at the start. It’s just the way we developers are. Software developers are an optimistic bunch, generally, and it shows in their estimates of work. “Sure, I can get that done in a week!” “I’ll have it to you by the end of the day.” “Tomorrow? Not a problem.” No, no, no, no, no. Just face it. At some point you’ll be behind. And the best thing to do about it is tell your manager right away. Sure, she might be angry—but she’ll be angrier when you end up a month behind and she didn’t know it. Fred Brooks’s famous answer to the question of how software projects get so far behind is “one day at a time.” The good news, though, is that the earlier you figure out you’re behind, the more options you have. These include lengthening the schedule (unlikely, but it does happen), moving some requirements to a future release, getting additional help, and so on. The important part is to keep your manager informed.

3

Chapter 1 ■ Introduction to Software Development

•

The right tools and the right practices for this project: One of the best things about software development is that every project is different. Even if you’re doing version 8.0 of an existing product, things change. One implication of this is that for every project, one needs to examine and pick the right set of development tools for this particular project. Picking tools that are inappropriate is like trying to hammer nails with a screwdriver; you might be able to do it eventually, but is sure isn’t easy or pretty or fun, and you can drive a lot more nails in a shorter period of time with a hammer. The three most important factors in choosing tools are the application type you are writing, the target platform, and the development platform. You usually can’t do anything about any of these three things, so once you know what they are, you can pick tools that improve your productivity. A fourth and nearly as important factor in tool choice is the composition and experience of the development team. If your team is composed of all experienced developers with facility on multiple platforms, tool choice is pretty easy. If, on the other hand, you have a bunch of fresh-outs and your target platform is new to all of you, you’ll need to be careful about tool choice and fold in time for training and practice with the new tools.

•

To realize that you don’t know everything you need to know at the beginning of the project: Software development projects just don’t work this way. You’ll always uncover new requirements. Other requirements will be discovered to be not nearly as important as the customer thought, and still others that were targeted for the next release are all of a sudden requirement number 1. Managing requirements churn during a project is one of the single most important skills a software developer can have. If you’re using new development tools—say, that new web development framework—you’ll uncover limitations you weren’t aware of and side-effects that cause you to have to learn, for example, three other tools to understand them—for example, that that web development tool is Ruby based, requires a specific relational database system to run, and needs a particular configuration of Apache to work correctly.

Conclusion Software development is the heart of every software project and is the heart of software engineering. Its objective is to deliver excellent, defect-free code to users on time and within budget—all in the face of constantly changing requirements. That makes development a particularly hard job to do. But finding a solution to a difficult problem and getting your code to work correctly is just about the coolest feeling in the world.

“[Programming is] the only job I can think of where I get to be both an engineer and an artist. There’s an incredible, rigorous, technical element to it, which I like because you have to do very precise thinking. On the other hand, it has a wildly creative side where the boundaries of imagination are the only real limitation. The marriage of those two elements is what makes programming unique. You get to be both an artist and a scientist. I like that. I love creating the magic trick at the center that is the real foundation for writing the program. Seeing that magic trick, that essence of your program, working correctly for the first time, is the most thrilling part of writing a program.” —Andy Hertzfeld (designer of the first Mac OS)2 2

Lammers, Susan. Programmers at Work. (Redmond, WA: Microsoft Press, 1986).

4

Chapter 1 ■ Introduction to Software Development

References Brooks, Frederick. “No Silver Bullet.” IEEE Computer (1987). 20(4): 10-19. Lammers, Susan. Programmers at Work. (Redmond, WA: Microsoft Press, 1986).

5

CHAPTER 2

Software Process Models If you don’t know where you’re going, any road will do. If you don’t know where you are, a map won’t help. —Watts Humphrey Every program has a life cycle. It doesn’t matter how large or small the program is, or how many people are working on the project—all programs go through the same steps: 1. Conception 2. Requirements gathering/exploration/modeling 3. Design 4. Coding and debugging 5. Testing 6. Release 7. Maintenance/software evolution 8. Retirement Your program may compress some of these steps, or combine two or more steps into a single piece of work, but all programs go through all steps of the life cycle. Although every program has a life cycle, many different process variations encompass these steps. Every development process, however, is a variation on two fundamental types. In the first type, the project team will generally do a complete life cycle—at least steps 2 through 7—before they go back and start on the next version of the product. In the second type, which is more prevalent now, the project team will generally do a partial life cycle—usually steps 3 through 5—and iterate through those steps several times before proceeding to the release step. These two general process types can be implemented using two classes of project management models. These are traditional plan-driven models,1 and the newer agile development models.2 In plan-driven models, the methodology tends to be stricter in terms of process steps and when releases happen. Plan-driven models have more clearly defined phases, and more requirements for sign-off on completion of a phase

Paulk, M. C. The Capability Maturity Model: Guidelines for Improving the Software Process. (Reading, MA: Addison-Wesley, 1995.) 2 Martin, R. C. Agile Software Development, Principles, Patterns, and Practices. (Upper Saddle River, NJ: Prentice Hall, 2003.) 1

© John F. Dooley 2017 J. F. Dooley, Software Development, Design and Coding, https://doi.org/10.1007/978-1-4842-3153-1_2

7

Chapter 2 ■ Software Process Models

before moving on to the next phase. Plan-driven models require more documentation at each phase and verification of completion of each work product. These tend to work well for large contracts for new software with well-defined deliverables. The agile models are inherently incremental and make the assumption that small, frequent releases produce a more robust product than larger, less frequent ones. Phases in agile models tend to blur together more than in plan-driven models, and there tends to be less documentation of work products required, the basic idea being that code is what is being produced, so developer efforts should focus there. See the Agile Manifesto web page at http://agilemanifesto.org to get a good feel for the agile development model and goals. This chapter takes a look at several software life cycle models, both plan driven and agile, and compares them. There is no one best process for developing software. Each project must decide on the model that works best for its particular application and base that decision on the project domain, the size of the project, the experience of the team, and the timeline of the project. But first we have to look at the four factors, or variables, that all software development projects have in common.

The Four Variables The four variables of software development projects are as follows: •

Cost is probably the most constrained; you can’t spend your way to quality or being on schedule, and as a developer you have very limited control over cost. Cost can influence the size of the team or, less often, the types of tools available to the team. For small companies and startups, cost also influences the environment where the developers will work.

•

Time is your delivery schedule and is unfortunately many times imposed on you from the outside. For example, most consumer products (be they hardware or software) will have a delivery date somewhere between August and October in order to hit the holiday buying season. You can’t move Christmas. If you’re late, the only way to fix your problem is to drop features or lessen quality, neither of which is pretty. Time is also where Brooks’s law gets invoked (adding programmers to a late project just makes it later).

•

Quality is the number and severity of defects you’re willing to release with. You can make short-term gains in delivery schedules by sacrificing quality, but the cost is enormous: it will take more time to fix the next release, and your credibility is pretty well shot.

•

Features (also called scope) are what the product actually does. This is what developers should always focus on. It’s the most important of the variables from the customer’s perspective and is also the one you as a developer have the most control over. Controlling scope allows you to provide managers and customers control over quality, time, and cost. If the developers don’t have control over the feature set for each release, then they are likely to blow the schedule. This is why developers should do the estimates for software work products.

A Model That’s not a Model At All: Code and Fix The first model of software development we’ll talk about isn’t really a model at all. But it is what most of us do when we’re working on small projects by ourselves, or maybe with a single partner. It’s the code and fix model. The code and fix model, shown in Figure 2-1, is often used in lieu of actual project management. In this model there are no formal requirements, no required documentation, and no quality assurance or formal testing, and release is haphazard at best. Don’t even think about effort estimates or schedules when using this model.

8

Chapter 2 ■ Software Process Models

Figure 2-1. The code and fix process model Code and fix says take a minimal amount of time to understand the problem and then start coding. Compile your code and try it out. If it doesn’t work, fix the first problem you see and try it again. Continue this cycle of type-compile-run-fix until the program does what you want with no fatal errors and then ship it. Every programmer knows this model. We’ve all used it way more than once, and it actually works in certain circumstances: for quick, disposable tasks. For example, it works well for proof-of-concept programs. There’s no maintenance involved, and the model works well for small, single-person programs. It is, however, a very dangerous model for any other kind of program. With no real mention of configuration management, little in the way of testing, no architectural planning, and probably little more than a desk check of the program for a code review, this model is good for quick and dirty prototypes and really nothing more. Software created using this model will be small, short on user interface niceties, and idiosyncratic. That said, code and fix is a terrific way to do quick and dirty prototypes and short, one-off programs. It’s useful to validate architectural decisions and to show a quick version of a user interface design. Use it to understand the larger problem you’re working on.

Cruising over the Waterfall The first and most traditional of the plan-driven process models is the waterfall model. Illustrated in Figure 2-2, it was created in 1970 by Winston Royce,3 and addresses all of the standard life cycle phases. It progresses nicely through requirements gathering and analysis, to architectural design, detailed design, coding, debugging, integration and system testing, release, and maintenance. It requires detailed documentation at each stage, along with reviews, archiving of the documents, sign-offs at each process phase, configuration management, and close management of the entire project. It’s an exemplar of the plandriven process.

Royce, W. W. Managing the Development of Large Software Systems. Proceedings of IEEE WESCON. (Piscataway, NJ, IEEE Press. 1970.)

3

9

Chapter 2 ■ Software Process Models

Figure 2-2. The waterfall process model It also doesn’t work. There are two fundamental and related problems with the waterfall model that hamper its acceptance and make it very difficult to implement. First, it generally requires that you finish phase N before you continue on to phase N+1. In the simplest example, this means you must nail down all your requirements before you start your architectural design, and finish your coding and debugging before you start anything but unit testing. In theory, this is great. You’ll have a complete set of requirements, you’ll understand exactly what the customer wants and everything the customer wants, so you can then confidently move on to designing the system. In practice, though, this never happens. I’ve never worked on a project where all the requirements were nailed down at the beginning of the work. I’ve never seen a project where big things didn’t change somewhere during development. So, finishing one phase before the other begins is problematic. The second problem with the waterfall is that, as stated, it has no provision for backing up. It is fundamentally based on an assembly-line mentality for developing software. The nice little diagram shows no way to go back and rework your design if you find a problem during implementation. This is similar to the first problem above. The implications are that you really have to nail down one phase and review everything in detail before you move on. In practice this is just not practical. The world doesn’t work this way. You never know everything you need to know at exactly the time you need to know it. This is why software is a wicked problem. Most organizations that implement the waterfall model modify it to have the ability to back up one or more phases so that missed requirements or bad design decisions can be fixed. This helps and generally makes the waterfall model usable, but the requirement to update all the involved documentation when you do back up makes even this version problematic. All this being said, the waterfall is a terrific theoretical model. It isolates the different phases of the life cycle and forces you to think about what you really do need to know before you move on. It’s also a good way to start thinking about very large projects; it gives managers a warm fuzzy because it lets them think they know what’s going on (they don’t, but that’s another story). It's also a good model for inexperienced teams working on a well-defined, new project because it leads them through the life cycle.

10

Chapter 2 ■ Software Process Models

Iterative Models The best practice is to iterate and deliver incrementally, treating each iteration as a closedend “mini-project,” including complete requirements, design, coding, integration, testing, and internal delivery. On the iteration deadline, deliver the (fully-tested, fully-integrated) system thus far to internal stakeholders. Solicit their feedback on that work, and fold that feedback into the plan for the next iteration. (From “How Agile Projects Succeed”4) Although the waterfall model is a great theoretical model, it fails to recognize that all the requirements aren’t typically known in advance, and that mistakes will be made in architectural design, detailed design, and coding. Iterative process models make this required change in process steps more explicit and create process models that build products a piece at a time. In most iterative process models, you’ll take the known requirements—a snapshot of the requirements at some time early in the process—and prioritize them, typically based on the customer’s ranking of what features are most important to deliver first. Notice also that this is the first time we’ve got the customer involved except at the beginning of the whole development cycle. You then pick the highest priority requirements and plan a series of iterations, where each iteration is a complete project. For each iteration, you’ll add a set of the next highest priority requirements (including some you or the customer may have discovered during the previous iteration) and repeat the project. By doing a complete project with a subset of the requirements every time at the end of each iteration, you end up with a complete, working, and robust product, albeit with fewer features than the final product will have. According to Tom DeMarco, these iterative processes follow one basic rule:

Your project, the whole project, has a binary deliverable. On the scheduled completion day, the project has either delivered a system that is accepted by the user, or it hasn’t. Everyone knows the result on that day. The object of building a project model is to divide the project into component pieces, each of which has this same characteristic: each activity must be defined by a deliverable with objective completion criteria. The deliverables are demonstrably done or not done.” 5 So, what happens if you estimate wrong? What if you decide to include too many new features in an iteration? What if there are unexpected delays? Well, if it looks as if you won’t make your iteration deadline, there are only two realistic alternatives: move the deadline or remove features. We’ll come back to this problem later when we talk about estimation and scheduling. The key to iterative development is “live a balanced life—learn some and think some and draw and paint and sing and dance and play and work every day some,”6 or in the software development world, analyze some and design some and code some and test some every day. We’ll revisit this idea when we talk about the agile development models later in this chapter.

4

www.adaptionsoft.com/on_time.html DeMarco, T. Controlling Software Projects: Management, Measurement and Estimation. (Upper Saddle River, NJ: Yourdon Press, 1983.) 6 Fulghum, Robert. All I Really Need to Know I Learned in Kindergarten. (New York, NY: Ivy Books. 1986.) 5

11

Chapter 2 ■ Software Process Models

Evolving the Iterative Model A traditional way of implementing the iterative model is known as evolutionary prototyping.7 In evolutionary prototyping, illustrated in Figure 2-3, one prioritizes requirements as they are received and produces a succession of increasingly feature-rich versions of the product. Each version is refined using customer feedback and the results of integration and system testing. This is an excellent model for an environment of changing or ambiguous requirements, or a poorly understood application domain. This is the model that evolved into the modern agile development processes.

Figure 2-3. Evolutionary prototyping process model Evolutionary prototyping recognizes that it’s very hard to plan the full project from the start and that feedback is a critical element of good analysis and design. It’s somewhat risky from a scheduling point of view, but when compared to any variation of the waterfall model, it has a very good track record. Evolutionary prototyping provides improved progress visibility for both the customer and project management. It also provides good customer and end user input to product requirements and does a good job of prioritizing those requirements. On the downside, evolutionary prototyping leads to the danger of unrealistic schedules, budget overruns, and overly optimistic progress expectations. These can happen because the limited number of requirements implemented in a prototype can give the impression of real progress for a small amount of work. On the flip side, putting too many requirements in a single prototype can result is schedule slippages because of overly optimistic estimation. This is a tricky balance to maintain. Because the design evolves over time as the requirements change, there is the possibility of a bad design, unless there’s the provision of re-designing—something that becomes harder and harder to do as the project progresses and your customer is more heavily invested in a particular version of the product. There is also the possibility of low maintainability, again because the design and code evolve as requirements change. This may lead to lots of re-work, a broken schedule, and increased difficulty in fixing bugs post-release. Evolutionary prototyping works best with tight, experienced teams who have worked on several projects together. This type of cohesive team is productive and dexterous, able to focus on each iteration and usually producing the coherent, extensible designs that a series of prototypes requires. This model is not generally recommended for inexperienced teams.

7

McConnell, S. Rapid Development: Taming Wild Software Schedules. (Redmond, WA: Microsoft Press, 1996.)

12

Chapter 2 ■ Software Process Models

Risk: The Problem with Plan-Driven Models Risk is the most basic problem in software. Risk manifests itself in many ways: schedule slips, project cancelation, increased defect rates, misunderstanding of the business problem, false feature richness (you’ve added features the customer really doesn’t want or need), and staff turnover. Managing risk is a very difficult and time-consuming management problem. Minimizing and handling risk are the key areas of risk management. Agile methodologies seek to minimize risk by controlling the four variables of software development. Agile methods recognize that to minimize risk, developers need to control as many of the variables as possible, but they especially need to control the scope of the project. Agile uses the metaphor of “learning to drive.” Learning to drive is not pointing the car in the right direction. It’s pointing the car, constantly paying attention, and making the constant minor corrections necessary to keep the car on the road. In programming, the only constant is change. If you pay attention and cope with change as it occurs, you can keep the cost of change manageable.

Agile Methodologies Starting in the mid 1990s, a group of process mavens began advocating a new model for software development. As opposed to the heavyweight plan-driven models mentioned earlier and espoused by groups like the Software Engineering Institute (SEI) at Carnegie Mellon University,8 this new process model was lightweight. It required less documentation and fewer process controls. It was targeted at small- to medium-sized software projects and smaller teams of developers. It was intended to allow these teams of developers to quickly adjust to changing requirements and customer demands, and it proposed to release completed software much more quickly than the plan-driven models. It was, in a word, agile.9 Agile development works from the proposition that the goal of any software development project is working code. And because the focus is on working software, then the development team should spend most of their time writing code, not writing documents. This gives these processes the name lightweight. Lightweight methodologies have several characteristics: they tend to emphasize writing tests before code, frequent product releases, significant customer involvement in development, common code ownership, and refactoring—rewriting code to make it simpler and easier to maintain. Lightweight methodologies also suffer from several myths. The two most pernicious are probably that lightweight processes are only good for very small projects, and that you don’t have any process discipline in a lightweight project. Both of these are incorrect. The truth is that lightweight methodologies have been successfully used in many small- and mediumsized projects—say, up to about 500,000 lines of code. They have also been used in very large projects. These types of projects can nearly always be organized as a set of smaller projects that hang together and provide services to the single large product at the end. Lightweight processes can be used on the smaller projects quite easily. Lightweight methodologies also require process discipline, especially at the beginning of a project when initial requirements and an iteration cycle are created, and in the test-driven-development used as the heart of the coding process. The rest of this chapter describes the agile values and principles, then looks at two lightweight/agile methodologies, eXtreme Programming (XP) and Scrum, and finally talks about an interesting variant: lean software development.

8 9

Paulk, M. C. (1995.) Cockburn, A. Agile Software Development. (Boston, MA: Addison-Wesley, 2002.)

13

Chapter 2 ■ Software Process Models

Agile Values and Principles In early 2001 a group of experienced and innovative developers met in Snowbird, Utah to talk about the state of the software development process. All of them were dissatisfied with traditional plan-driven models and had been experimenting with new lightweight development techniques. Out of this meeting came the Agile Manifesto.10 The original description proposed by the group included two parts: values (the manifesto itself ) and principles. The values are as follows: •

Individuals and interactions over processes and tools

•

Working software over comprehensive documentation

•

Customer collaboration over contract negotiation

•

Responding to change over following a plan

The idea behind the manifesto is that although the authors understood the value of the latter items in each value, they preferred to think and work on the former items. Those things—individuals, working software, collaboration, and responding to change—are the most important and valuable ideas in getting a software product out the door. The principles run as follows: 1. Our highest priority is to satisfy the customer through early and continuous delivery of valuable software. 2. Welcome changing requirements, even late in development. Agile processes harness change for the customer’s competitive advantage. 3. Deliver working software frequently, from a couple of weeks to a couple of months, with a preference to the shorter timescale. 4. Business people and developers must work together daily throughout the project. 5. Build projects around motivated individuals. Give them the environment and support they need, and trust them to get the job done. 6. The most efficient and effective method of conveying information to and within a development team is face-to-face conversation. 7. Working software is the primary way to measure progress. 8. Agile processes promote sustainable development. The sponsors, developers, and users should be able to maintain a constant pace indefinitely. 9. Continuous attention to technical excellence and good design enhances agility. 10. Simplicity—the art of maximizing the amount of work not done—is essential. 11. The best architectures, requirements, and designs emerge from self-organizing teams. 12. At regular intervals, the team reflects on how to become more effective and then tunes and adjusts its behavior accordingly.

10

www.agilealliance.org/agile101/

14

Chapter 2 ■ Software Process Models

eXtreme Programming (XP) Kent Beck and Ward Cunningham created XP around 1995. XP is a “lightweight, efficient, low-risk, flexible, predictable, scientific, and fun way to develop software.”11

XP Overview XP relies on the following four fundamental ideas: •

Heavy customer involvement: XP requires that a customer representative be part of the development team and be on site at all times. The customer representative works with the team to define the content of each iteration of the product. They also create all the acceptance tests for each interim release.

•

Continuous unit testing (also known as test-driven development, or TDD): XP calls for developers to write the unit tests for any new features before any of the code is written. In this way the tests will, of course, initially all fail, but it gives a developer a clear metric for success. When all the unit tests pass, you’ve finished implementing the feature.

•

Pair programming: XP requires that pairs of developers write all code. In a nutshell, pair programming requires two programmers—a driver and a navigator—who share a single computer. The driver is actually writing the code while the navigator watches, catching typos, making suggestions, thinking about design and testing, and so on. The pair switches places periodically (every 30 minutes or so, or when one of them thinks he has a better way of implementing a piece of code). Pair programming works on the “two heads are better than one” theory. Although a pair of programmers isn’t quite as productive as two individual programmers when it comes to number of lines of code written per unit of time, their code usually contains fewer defects, and they have a set of unit tests to show that it works. This makes them more productive overall. Pair programming also provides the team an opportunity to refactor existing code—to re-design it to make it as simple as possible while still meeting the customer’s requirements. Pair programming is not exclusive to XP, but XP was the first discipline to use it exclusively. In fact, XP uses pair programming so exclusively that no code written by just a single person is typically allowed into the product.

•

Short iteration cycles and frequent releases: XP typically uses release cycles in the range of just a few weeks or months, and each release is composed of several iterations, each on the order of three to five weeks. The combination of frequent releases and an on-site customer representative allows the XP team to get immediate feedback on new features and to uncover design and requirements issues early. XP also requires constant integration and building of the product. Whenever a programming pair finishes a feature or task and it passes all their unit tests, they immediately integrate and build the entire product. They then use all the unit tests as a regression test suite to make sure the new feature hasn’t broken anything already checked in. If it does break something, they fix it immediately. So, in an XP project, integrations and builds can happen several times a day. This process gives the team a good feel for where they are in the release cycle every day and gives the customer a completed build on which to run the acceptance tests.

This is a very short description of how XP works; for a much more eloquent and detailed explanation see the bible of XP: Beck, K. Extreme Programming Explained: Embrace Change. (Boston, MA: Addison-Wesley, 2000.)

11

15

Chapter 2 ■ Software Process Models

The Four Basic Activities XP describes four activities that are the bedrock of the discipline: •

Designing: Design while you code. “Designing is creating a structure that organizes the logic in the system. Good design organizes the logic so that a change in one part of the system doesn’t always require a change in another part of the system. Good design ensures that every piece of logic in the system has one and only one home. Good design puts the logic near the data it operates on. Good design allows the extension of the system with changes in only one place.”12

•

Coding: The code is where the knowledge of the system resides, so it’s your main activity. The fundamental difference between plan-driven models and agile models is this emphasis on the code. In a plan-driven model, the emphasis is on producing a set of work products that together represent the entire work of the project, with code being just one of the work products. In agile methodologies, the code is the sole deliverable and so the emphasis is placed squarely there; in addition, by structuring the code properly and keeping comments up to date, the code becomes documentation for the project.

•

Testing: The tests tell you when you’re done coding. Test-driven development is crucial to the idea of managing change. XP depends heavily on writing unit tests before writing the code that they test and on using an automated testing framework to run all the unit tests whenever changes are integrated.

•

Listening: To your partner and to the customer. In any given software development project, there are two types of knowledge. The customer has knowledge of the business application being written and what it is supposed to do. This is the domain knowledge of the project. The developers have knowledge about the target platform, the programming language(s), and the implementation issues. This is the technical knowledge of the project. The customer doesn’t know the technical side, and the developers don’t have the domain knowledge, so listening—on both sides—is a key activity in developing the product.

Implementing XP: The 12 Practices We (finally) get to the implementation of XP. Here are the rules that every XP team follows during their project. The rules may vary depending on the team and the project, but in order to call yourselves an XP team, you need to do some form of these things. The practices described here draw on everything previously described: the four values, the 12 principles, and the four activities. This is really XP: •

12

The planning game: Develop the scope of the next release by combining business priorities and technical estimates. The customer and the development team need to decide on the stories (read: features) that will be included in the next release, the priority of each story, and when the release needs to be done. The developers are responsible for breaking the stories up into a set of tasks and for estimating the duration of each task. The sum of the durations tells the team what they really think they can get done before the release delivery date. If necessary, stories are moved out of a release if the numbers don’t add up. Notice that estimation is the responsibility of the developers and not the customer or the manager. In XP only the developers do estimation.

Beck, (2000).

16

Chapter 2 ■ Software Process Models

13 14

•

Small releases: Put a simple system into production quickly and then release new versions on a very short cycle. Each release has to make sense from a business perspective, so release size will vary. It’s far better to plan releases in durations of a month or two rather than 6 or 12 months. The longer a release is, the harder it is to estimate.

•

Metaphor: “A simple shared story of how the whole system works.” The metaphor replaces your architecture. It needs to be a coherent explanation of the system that is decomposable into smaller bits—stories. Stories should always be expressed in the vocabulary of the metaphor, and the language of the metaphor should be common to both the customer and the developers.

•

Simple design: Keep the design as simple as you can each day and re-design often to keep it simple. According to Beck, a simple design (1) runs all the unit tests, (2) has no duplicated code, (3) expresses what each story means in the code, and (4) has the fewest number of classes and methods that make sense to implement the stories so far.13

•

Testing: Programmers constantly write unit tests. Tests must all pass before integration. Beck takes the hard line that “any program feature without an automated test simply doesn’t exist.”14 Although this works for most acceptance tests and should certainly work for all unit tests, this analogy breaks down in some instances, notably in testing the user interface in a GUI. Even this can be made to work automatically if your test framework can handle the events generated by a GUI interaction. Beyond this, having a good set of written instructions will normally fill the bill.

•

Refactoring: Restructure the system “without changing its behavior” to make it simpler—removing redundancy, eliminating unnecessary layers of code, or adding flexibility. The key to refactoring is to identify areas of code that can be made simpler and to do it while you’re there. Refactoring is closely related to collective ownership and simple design. Collective ownership gives you permission to change the code, and simple design imposes on you the responsibility to make the change when you see it needs to be made.

•

Pair programming: Two programmers at one machine must write all production code in an XP project. Any code written alone is thrown away. Pair programming is a dynamic process. You may change partners as often as you change tasks to implement. This has the effect of reinforcing collective ownership by spreading the knowledge of the entire system around the entire team. And it avoids the “beer truck problem,” where the person who knows everything gets hit by a beer truck and thus sets the project schedule back months.

•

Collective ownership: The team owns everything, implying that anyone can change anything at any time. In some places this is known as “ego-less programming.” Programmers need to buy into the idea that anyone can change their code and that collective ownership extends from code to the entire project; it’s a team project, not an individual one.

•

Continuous integration: Integrate and build every time a task is finished, possibly several times a day (as long as the tests all pass). This helps to isolate problems in the code base; if you’re integrating a single task change, then the most likely place to look for a problem is right there.

Beck, (2000) Beck, (2000)

17

Chapter 2 ■ Software Process Models

•

40-hour week: Work a regular 40-hour week. Never work a second week in a row with overtime. The XP philosophy has a lot in common with many of Tom DeMarco’s Peopleware arguments. People are less productive if they’re working 60 or 70 hours a week than if they’re working 40 hours. When you’re working excessive amounts of overtime, several things happen. Because you don’t have time to do chores and things related to your “life,” you do them during the workday. Constantly being under deadline pressure and never getting a sustained break also means you get tired and then make more mistakes, which somebody then needs to fix. But being in control of the project and working 40 hours a week (give or take a few) leaves you with time for a life, time to relax and recharge, and time to focus on your work during the workday—making you more productive, not less.

•

On-site customer: A customer is part of the team, is on-site, writes and executes functional tests, and helps clarify requirements. The customer’s ability to give immediate feedback to changes in the system also increases team confidence that they’re building the right system every day.

•

Coding standards: The team has them, follows them, and uses them to improve communication. Because of collective code ownership, the team must have coding standards and everyone must adhere to them. Without a sensible set of coding guidelines, it would take much, much longer to do refactoring and it would decrease the desire of developers to change code. Notice that I said sensible. Your coding standards should make your code easier to read and maintain: they shouldn’t constrict creativity.

Scrum The second agile methodology we’ll look at is Scrum. Scrum derives its name from rugby, where a Scrum is a means of restarting play after a rules infraction. The Scrum uses the 8 forwards on a rugby team (out of 15 players in the rugby union form of the game) to attempt to (re)gain control of the ball and move it forward towards the opposing goal line. The idea in the agile Scrum methodology is that a small team is unified around a single goal and gets together for sprints of development that move them towards that goal. Scrum is, in fact, older than XP, with the original process management idea coming from Takeuchi and Nonaka’s 1986 paper, “The New New Product Development Game.”15 The first use of the term Scrum is attributed to DeGrace and Stahl’s 1990 book Wicked Problems, Righteous Solutions.16 Scrum is a variation on the iterative development approach and incorporates many of the features of XP. Scrum is more of a management approach than XP and doesn’t define many of the detailed development practices (like pair programming or test-driven development) that XP does, although most Scrum projects will use these practices. Scrum uses teams of typically no more than ten developers. Just like other agile methodologies, Scrum emphasizes the efficacy of small teams and collective ownership.

Takeuchi, H. and I. Nonaka. “The New New Product Development Game.” Harvard Business Review 64(1): 137-146 (1986). 16 DeGrace, P. and L. H. Stahl. Wicked Problems, Righteous Solutions: A Catalogue of Modern Software Engineering Paradigms. (Englewood Cliffs, NJ: Yourdon Press, 1990.) 15

18

Chapter 2 ■ Software Process Models

Scrum Roles Scrum defines three roles in a development project. The first is the product owner, the person who generates the requirements for the product and prioritizes them. The requirements normally take the form of user stories—features that can be summarized by sentences like “As a , I want to , so that .” These user stories turn into one or more tasks that suggest how to create the feature. The product owner adds the user stories to the product backlog and prioritizes them. This points the development team towards the most valuable work. The product owner is also charged with making sure that the team understands the requirements behind the user stories. Once the team completes a user story, they have added value to the end product. Scrum projects are facilitated by a Scrum master whose job it is to manage the backlogs, run the daily Scrum meetings, coach the team, and protect the team from outside influences during the sprint. The Scrum master may or may not be a developer but they are an expert in the Scrum process and are the go-to person for questions on Scrum. The Scrum master is emphatically not the manager of the team. Scrum teams are teams of equals and arrive at decisions by consensus. Besides the product owner and the Scrum master, everyone else involved in the project is on the development team. The development team itself is self-organizing; the members of the Scrum team decide among themselves who will work on what user stories and tasks, assume collective ownership of the project, and decide on the development process they’ll use during the sprint. The entire team is dedicated to the goal of delivering a working product at the end of every sprint. This organization is reinforced every day at the Scrum meeting.

The Sprint Scrum is characterized by the sprint, an iteration of between one and four weeks. Sprints are time-boxed in that they are of a fixed duration and the output of a sprint is what work the team can accomplish during the sprint. The delivery date for the sprint does not move out. This means that sometimes a sprint can finish early, and sometimes a sprint will finish with less functionality than was proposed. A sprint always delivers a usable product.

Scrum Artifacts Scrum requirements are encapsulated in two backlogs. The product backlog is the prioritized list of all the requirements for the project; the product owner creates it. The product owner prioritizes the product backlog, and the development team breaks the high-priority user stories into tasks and estimates them. This list of tasks becomes the sprint backlog. The sprint backlog is the prioritized list of user stories for the current sprint. Once the sprint starts, only the development team may add tasks to the sprint backlog—these are usually bugs found during testing. No outside entity may add items to the sprint backlog, only to the product backlog. One important thing about the Scrum process is that in most Scrum teams, the sprint backlog is visual. It’s represented on a board using either Post-It notes or index cards, with one note or card per task; it may also be an online virtual board. For example, see Jira or Pivotal Tracker. This task board always has at least three columns: ToDo, InProgress, and Done. The task board provides a visual representation of where the team is during the sprint. At a glance, any team member can see what tasks are currently available to be picked up, which are actively being worked on, and which are finished and integrated into the product baseline. This visual task board is similar to the Kanban board that we’ll talk about later.

19

Chapter 2 ■ Software Process Models

Sprint Flow Before the first sprint starts, Scrum has an initial planning phase that creates the product list of the initial requirements, decides on an architecture for implementing the requirements, divides the user stories into prioritized groups for the sprints, and breaks the first set of user stories into tasks to be estimated and assigned. They stop when their estimates occupy all the time allowed for the sprint. Tasks in a sprint should not be longer than one day of effort. If a task is estimated to take more than one day of effort, it is successively divided into two or more tasks until each task’s effort is the appropriate length. This rule comes from the observation that humans are terrible at doing exact estimations of large tasks and that estimating task efforts in weeks or months is basically just a guess. So, breaking tasks down into smaller pieces gives the team a more reliable estimate for each. Sprints have a daily Scrum meeting, which is a stand-up meeting of 15–30 minutes duration (the shorter, the better) where the entire team discusses sprint progress. The daily Scrum meeting allows the team to share information and track sprint progress. By having daily Scrum meetings, any slip in the schedule or any problems in implementation are immediately obvious and can then be addressed by the team at once. “The Scrum master ensures that everyone makes progress, records the decisions made at the meeting and tracks action items, and keeps the Scrum meetings short and focused.”17 At the Scrum meeting, each team member answers the following three questions in turn: •

What tasks have you finished since the last Scrum meeting?

•

Is anything getting in the way of your finishing your tasks?

•

What tasks are you planning to do between now and the next Scrum meeting?

Discussions other than responses to these three questions are deferred to other meetings. This meeting type has several effects. It allows the entire team to visualize progress towards the sprint and project completion every day. It reinforces team spirit by sharing progress—everyone can feel good about tasks completed. And finally, the Scrum meeting verbalizes problems, which can then be solved by the entire team. Some Scrum teams have a meeting in the middle of the sprint called story time. In story time the team and the product owner take a look at the product backlog and begin the process of prioritizing the user stories in the backlog and breaking the high priority stories into tasks that will become part of the next sprint’s backlog. Story time is optional, but has the advantage of preparing for the planning meeting for the next sprint. At the end of the sprint two things happen. First, the current version of the product is released to the product owner, who may perform acceptance testing on it. This usually takes the form of a demo of the product near the end of the last day of the sprint. This meeting and demo is called the sprint review. After the sprint review, the team will wrap up the sprint with a sprint retrospective meeting. In the sprint retrospective, the team looks back on the just completed sprint, congratulates themselves on jobs well done, and looks for areas in which they can improve performance for the next sprint. These meetings typically don’t last long (an hour or so) but are a valuable idea that brings closure to and marks the end of the current sprint. At the start of the next sprint, another planning meeting is held where the Scrum master and the team re-prioritize the product backlog and create a backlog for the new sprint. With most Scrum teams, estimates of tasks become better as the project progresses, primarily because the team now has data on how they’ve done estimating on previous sprints. This effect in Scrum is called velocity; the productivity of the team can actually increase during the project as they gel as a team and get better at estimating tasks. This planning meeting is also where the organization can decide whether the project is finished—or whether to finish the project at all.

Rising, L. and N. S. Janoff. “The Scrum Software Development Process for Small Teams.” IEEE Software 17(4): 26-32 (2000). 17

20

Chapter 2 ■ Software Process Models

After the last scheduled development sprint, a final sprint may be done to bring further closure to the project. This sprint implements no new functionality, but prepares the final deliverable for product release. It fixes any remaining bugs, finishes documentation, and generally productizes the code. Any requirements left in the product backlog are transferred to the next release. A Scrum retrospective is held before the next sprint begins to ponder the previous sprint and see whether any process improvements can be made. Scrum is a project-management methodology and is typically silent on development processes. Despite this, Scrum teams typically use many of the practices described earlier in the XP practices section. Common code ownership, pair programming, small releases, simple design, test-driven development, continuous integration, and coding standards are all common practices in Scrum projects.

Lean Software Development Lean software development is not really an agile methodology. It’s more of a philosophy that most agile methodologies, like XP and Scrum, draw from and from which they seek guidance and inspiration. Lean software development comes from the just-in-time manufacturing processes (also known as the Toyota Production System, among other names) that were introduced in Japan in the 1970s and then made their way around the world in the 1980s and 1990s, encouraged by the publication in 1990 of The Machine That Changed The World by Womack, et. al.18 Just-in-time manufacturing evolved into first lean manufacturing and then into lean product management systems throughout the 1990s. The publication of Poppendieck & Poppendieck’s Lean Software Development: An Agile Toolkit19 in 2003 marked the movement of lean into the agile development community. Lean software development is a set of principles designed to improve productivity, quality, and customer satisfaction. Lean wants to eliminate from your process anything that doesn’t add value to the product. These non-value-adding parts of your process are called waste. Lean also emphasizes that the team should only be working on activities that add value to the product right now. The Poppendiecks transformed the lean principles that started at Toyota into seven key principles for software development: 1. Eliminate Waste 2. Build Quality In 3. Create Knowledge 4. Defer Commitment 5. Deliver Fast 6. Respect People 7. Optimize the Whole We’ll go through each of these principles briefly to illustrate how they apply to software development and how agile methodologies make use of them.

Womack, James P., Daniel T. Jones, and Daniel Roos. The Machine That Changed the World: The Story of Lean Production -- Toyota’s Secret Weapon in the Global Car Wars That Is Now Revolutionizing World Industry. (New York, NY: Simon & Schuster, 1990.) 19 Poppendieck, Mary, and Tom Poppendieck. Lean Software Development: An Agile Toolkit. (Upper Saddle River, NJ: Addison-Wesley Professional, 2003.) 18

21

Chapter 2 ■ Software Process Models

Principle 1: Eliminate Waste Lean software development wants you to eliminate anything that doesn’t add value to the product. Things that don’t add value are, as mentioned, waste. Obviously in any kind of production or development environment you want to eliminate waste. Waste costs you money and time. The question here, in a software development project, is: “What is waste?” Some things may be obviously wasteful: too many meetings (some will say meetings of any kind), too much documentation, unnecessary features, and unclear or rapidly changing requirements. But there are others, such as partially written code, code for “future features” that the customer may never use, defects in your code and other quality issues, excessive task switching (you as a developer are assigned several tasks and you have to keep switching between them), and too many features or tasks for a given iteration or release. All these things constitute waste, and in a lean environment the team’s job is to minimize waste in all forms. Only by doing that can the team increase productivity and deliver working code fast. The next question is: “How do we eliminate waste?” A general way to focus on waste in a project and work towards eliminating all forms of it is to consider the team’s development process and how it’s working. In Scrum this is the end of sprint retrospective. During the retrospective, the team looks back on the just-finished sprint and asks “What can we improve?” A lean team will turn that question into “What was wasteful and how can we change to be less wasteful next time?” Teams that make these little improvements in their process at the end of every sprint, focusing on one or two items to change each time, will learn more and more about what works and what doesn’t in their process and are on the way to continuous process improvement.

Principle 2: Build Quality In Quality issues are the bane of every software developer. Defects in code, testing your code more than once, logging defects, fixing defects, re-testing, all result in waste you’d like to eliminate. It turns out that agile processes are on top of removing this type of waste in order to build quality software. Nearly all agile teams will implement two techniques that improve code quality at the source: pair programming and test-driven development (TDD). Both of these techniques allow developers to write, test, and fix code quickly and before the code is integrated into the product code base, where defects become harder to find and fix. Integrating new features as soon as they’re done gives the testing team a new version of the product to test as quickly as possible and shortens the amount of time between code creation and testing. Another technique to improve quality in XP teams is constant feedback. In XP, because the customer is part of the team, they can evaluate new iterations of the product constantly, giving the developers instant feedback on what’s working and what’s not. There is one other relatively painless, but probably heretical thing you as a developer can do that will add to the quality of your code: don’t log defects in a defect tracking system. But wait! How are you going to document that there’s a defect to be fixed later? The answer is you don’t. You fix it—now. As soon as you find it. That builds quality in and eliminates waste at the same time.20

Principle 3: Create Knowledge It seems obvious that, as your team works through requirements, creates a design, and implements the code that will become a product, that you are creating knowledge. However, another way to describe this lean principle is to say, “The team must learn new things constantly.” Learning new things is what’s happening as you work through a project. You learn by working to understand requirements. You learn by beginning with an initial design and realizing that the design will change as the requirements do. You learn that the detailed

20

Poppendieck and Poppendieck (2003. pp. 25-26).

22

Chapter 2 ■ Software Process Models

design evolves and isn’t truly finished until you have code written. And you learn that by implementing new requirements and fixing defects that you can make the code simpler by refactoring as often as you can. Thus you’re creating knowledge that’s embodied in the code you produce and ship.

Principle 4: Defer Commitment This lean principle really dates back to the early 1970s and the advent of top-down structured design. What Defer Commitment means is put off decisions (particularly irreversible ones) as long as you can and only make them when you must. In top-down design you start with a general formulation of the problem solution and push decisions about implementation down as you make the design and the code more detailed. This gives you more flexibility at the upper levels and pushes the commitment to a particular design or piece of code down until you have no alternative but to write it. This is why you write libraries and APIs, so at any particular level of the code you can use something at a lower level without needing to know the implementation details. At that lower level you’ll hopefully know more about what needs to be done, and the code will write itself. This principle also means that you shouldn’t put decisions off too late. That has the possibility of delaying the rest of the team and making your design less flexible. Also try to make as few irreversible decisions as you can to give yourself as much flexibility at all levels of your code.

Principle 5: Deliver Fast Well, this seems obvious too. Particularly in the age of the Internet, mobile applications, and the Internet of Things, it seems that faster must be better. And that’s true. Companies that can bring quality products to market faster will have a competitive edge. “First to market” players will gain a larger market share initially and if they continue to release products quickly can maintain that edge over time. Additionally, if your team minimizes the amount of time between when the customer or product owner generates the requirements and when you deliver a product that meets those requirements, there is less time for the requirements—and the market—to change. How do you deliver fast? Well, first you should adhere to other lean principles, especially Eliminate Waste and Build Quality In. Both of these will improve productivity and allow you to deliver product iterations faster. But there is more. You should keep things simple. This means keep the requirements simple. Don’t add too many features and don’t spend time planning on future features. Don’t over-engineer the solution. Find a reasonable solution, a reasonable set of data structures, and reasonable algorithms to implement your solution. Remember that the perfect is the enemy of the good—and the fast. Finally, the best way to Deliver Fast is to have an experienced, well-integrated, cooperative, self-organizing, loyal-toeach-other team with the right skill set for your product. Nothing will help more than a good team.

Principle 6: Respect People Respecting people is all about building strong, productive teams. It’s based on the idea that the people doing the work should make the decisions. Process and product-creation decisions shouldn’t be imposed from above—they should be generated from the trenches. From the manager’s perspective, respecting your team means empowering the team to make their own decisions, including about task time estimates and decomposition, processes, tools, and design. This empowerment means that the manager must learn to listen to their team and take their ideas and concerns into consideration. It means that the manager must act as a shield for their team so that they can get the job done. Everyone on the team is enjoined to create an environment where everyone is able to speak their mind and disagreements are resolved with respect for each other. This creates a team with open communication and decision-making transparency.

23

Chapter 2 ■ Software Process Models

Principle 7: Optimize the Whole “A lean organization optimizes the whole value stream, from the time it receives an order to address a customer need until software is deployed and the need is addressed. If an organization focuses on optimizing something less than the entire value stream, we can just about guarantee that the overall value stream will suffer.”21 The main idea behind Optimize the Whole is to keep the entire product picture in sight as you develop. Agile organizations do this by having strong, multi-disciplinary teams that are co-located and that contain all the skills and product creation functions they need to deliver a product that meets the customer’s needs with little reference to another team.

Kanban The Kanban method is a practice derived from lean manufacturing (the Toyota Production System) and other change-management systems; it draws most of its original ideas from the just-in-time manufacturing processes. Just like lean software development, Kanban isn’t really a process. Rather it’s an objective and a set of principles and practices to meet that objective. Kanban uses three ideas to influence a development process: work-in-progress (WIP), flow, and lead time. Work-in-progress is the total number of tasks the team is currently working on, including all the states in which a task may find itself (in-progress, done, testing, review, and so on). Flow is the passage of tasks from one state to another on the way to completion. Lead time is the amount of time it takes a task to move from its initial state to its completed state.

The Kanban board, WIP, and Flow Work-in-progress and flow are illustrated visually in Kanban via the use of a Kanban board. A Kanban board will look familiar to most agile practitioners because it’s a variant on the Scrum task board. Figure 2-4 shows a generic task/Kanban board. These boards are often physical white boards that occupy one wall of a common space for the team. On this board the team will either write in tasks or use (in Scrum) index cards or post-it notes to identify tasks. This makes the tasks easy to move from column to column (state to state) on the board.

Figure 2-4. A generic task/Kanban board

21

Poppendieck & Poppendieck (2003. p. 39).

24

Chapter 2 ■ Software Process Models

With this type of board, the user has the option of changing the number and headings of the columns in order to make the board fit their process. When Scrum and Kanban are applied to software development, the board might start out looking more like Figure 2-5.

Figure 2-5. A task/Kanban board applied to a software development project If the team is using a Kanban board, they’ll add the maximum number of tasks that are allowed in each of the first four columns. This maximum work-in-progress is used to control the flow of work through the team. The Kanban board will then look like Figure 2-6.

Figure 2-6. A Kanban board illustrating Work-in-Progress maxima and Flow In Figure 2-6 the maximum number of Work-in-Progress tasks is ten. This means the team can’t work on more than ten tasks at a time, as denoted in each column. So, for example, say that every state on the Kanban board is maxed-out (there are five tasks in development, two in review, and three in testing) and a developer finishes a task in the Develop state. That task cannot move to the Review state until one of the two currently under review tasks is finished. We’ve just exposed a bottleneck in the Flow. In Kanban, what the developer would do is jump in to help either review one of the two tasks in that state, or help test a task in the Task state. No new tasks can be pulled into the Develop state until there is room for the finished task downstream. Let’s further say that at some time later on that there are only three tasks in Develop, one in Review, and three in Test, so the team isn’t working at maximum. If a developer is available, that developer will select a task from the ToDo state, pull it into Develop, and begin working. If a developer finishes a task in Develop, that task can then flow into Review, and the developer can then pull a task from the ToDo list into Develop and begin work.

25

Chapter 2 ■ Software Process Models

Thus, with Kanban the objective is to maximize Work-in-Progress within the constraint of the maximum number of tasks allowable at any time. There is no time-boxing as in Scrum; the goal is to move tasks through as quickly as possible—have the shortest lead time—while maximizing productivity and working up to the maximum number of tasks allowable. You can note that when I say maximizing Work-in-Progress, I don’t mean giving the team more work. In fact, the goal in keeping the maximum number of tasks in each state low is to reduce the team’s workload and allow them to focus on a smaller number of tasks (to avoid multi-tasking). The idea is that this will reduce stress and increase quality and productivity. Kanban also works by using a pull system. Instead of pushing items out to other teams, Kanban tells developers to pull items from their ToDo state as they have the capacity to handle them. You’ll notice that there are some things missing here, notably how things get on the ToDo list, and what Done means. That’s because Kanban isn’t really a project-management technique. In order to use Kanban, you need to have some process in place already and you need to manage it.

Lead Time Probably the biggest difference between using Kanban and Scrum is time-boxing. Scrum uses time-boxed sprints and measures the team’s productivity by using velocity (the number of task points finished per sprint) as its metric. This works well for Scrum and gives the team an idea of how well they’re performing at the end of every sprint. Over time the team’s average velocity is used as a predictor to decide how many tasks to include in each sprint. Because Kanban isn’t time-boxed, it uses a different productivity metric: lead time. Lead time is the amount of time it takes for the team to get one task from the initial ToDo (or Waiting, or Queued) state to the Done (or Integrated, or Released) state. Lead time can be measured in hours or days; it answers the question: “From this point, how long will it take this task to get to Done?” When a task enters the ToDo state, that date—called the entry date—is marked on the index card. As the task moves from state to state, the date the move happens is also marked on the card until finally the task is complete and it moves to the Done state. That final date is the done date. The difference between the done date and the entry date is the lead time for that task. Over time, as the team finishes more and more tasks, you can compute the average lead time for the team and answer the question just posed. Note that the lead time is realistic because it also includes time when the task is in some queue waiting to be worked on. Kanban is not so different from Scrum, and elements of Kanban are typically used in projects that implement Scrum. In Scrum, at the end of every sprint you have working software with a particular set of features available. In Kanban, at the end of every task completion you have working software with a new feature added.

Conclusion As can be seen from the methodologies described in this chapter, iteration is the key, whether you’re using an evolutionary, plan-driven process or an agile development one. Recognize that the best way to build a complex piece of software is incrementally. Learn that designing, writing, testing, and delivering incrementally better code is your first step to writing great software.

26

Chapter 2 ■ Software Process Models

References Anderson, David J. Kanban: Successful Evolutionary Change for Your Technology Business. (Sequin, WA: Blue Hole Press, 2010.) Beck, K. Extreme Programming Explained: Embrace Change. (Boston, MA: Addison-Wesley, 2000.) Cockburn, A. Agile Software Development. (Boston, MA: Addison-Wesley, 2002.) DeGrace, P. and L. H. Stahl. Wicked Problems, Righteous Solutions: A Catalogue of Modern Software Engineering Paradigms. (Englewood Cliffs, NJ: Yourdon Press, 1990.) DeMarco, T. Controlling Software Projects: Management, Measurement and Estimation. (Upper Saddle River, NJ: Yourdon Press, 1983.) Fulghum, Robert. All I Really Need to Know I Learned in Kindergarten. (New York, NY: Ivy Books. 1986.) Martin, R. C. Agile Software Development, Principles, Patterns, and Practices. (Upper Saddle River, NJ: Prentice Hall, 2003.) McConnell, S. Rapid Development: Taming Wild Software Schedules. (Redmond, WA: Microsoft Press, 1996.) Paulk, M. C. The Capability Maturity Model: Guidelines for Improving the Software Process. (Reading, MA: Addison-Wesley, 1995.) Poppendieck, Mary, and Tom Poppendieck. Lean Software Development: An Agile Toolkit. (Upper Saddle River, NJ: Addison-Wesley Professional, 2003.) Rising, L. and N. S. Janoff. “The Scrum Software Development Process for Small Teams.” IEEE Software 17(4): 26-32 (2000). Royce, W. W. Managing the Development of Large Software Systems. Proceedings of IEEE WESCON, (Piscataway, NJ, IEEE Press, 1970.) Takeuchi, H. and I. Nonaka. “The New New Product Development Game.” Harvard Business Review 64(1): 137-146 (1986). Waters, Kelly. Seven Key Principles of Lean Software Development. Retrieved on 12 June 2017. www.101ways. com/7-key-principles-of-lean-software-development-2/ (2010). Womack, James P., Daniel T. Jones, and Daniel Roos. The Machine That Changed the World: The Story of Lean Production -- Toyota’s Secret Weapon in the Global Car Wars That Is Now Revolutionizing World Industry. (New York, NY: Simon & Schuster, 1990.)

27

CHAPTER 3

Project Management Essentials Quality, features, schedule—pick two. Project management? Isn’t this a software development book? Yes, but working on a larger-than-one-person development project means working on a team; and working on a team means being managed. So learning something about project management from both sides is an essential part of learning software development. Project management is an involved and complicated set of tasks. We’ll restrict ourselves to several tasks that will impact you as a developer the most: •

Project planning

•

Estimation and scheduling

•

Resource management

•

Project oversight

•

Project reviews and presentations

•

The project retrospective

Traditional project managers usually take on a great deal of responsibility. They are responsible for managing scope, cost, estimation, schedule, quality, personnel, communication, risk, and more. However, in an agile project, the entire team is usually responsible for managing the project. If there is a separate project manager, that person is largely responsible for the following: •

Making sure the team has the resources it needs

•

Ensuring that the team adheres to the agile values and principles

•

Facilitating communications

•

Shielding the team from outside interruptions

In particular, the agile project manager doesn’t manage the day-to-day operations of the team. That’s up to the team itself. The objective is to make sure there is no delay in management decision-making.

© John F. Dooley 2017 J. F. Dooley, Software Development, Design and Coding, https://doi.org/10.1007/978-1-4842-3153-1_3

29

Chapter 3 ■ Project Management Essentials

Project Planning Project planning is forever. By that I mean that project planning continues throughout the entire duration of the project. “The Plan” is never really set in stone, because things in a typical software project are usually in constant flux. In those projects that are using a plan-driven process model, a project plan is an actual document, written by the project manager, that is approved and signed off on by the development team and by upper management. It is, in effect, a contract, albeit a rolling one, of what the team is going to do and how they’re going to do it. It says how the project will be managed, and in the most extreme plan-driven projects, even states how and when the document itself will be modified. What’s in the project plan? Generally a project plan consists of the following seven parts: •

Introduction and explanation of the project

•

Team organization

•

Risk analysis

•

Hardware, software, and human resource requirements

•

Task list and size and effort estimates

•

Project schedule

•

Project monitoring and reporting mechanisms, collectively known as project oversight

Not all of these are necessary for all projects or project methodologies. In particular, plan-driven projects will use all of them, whereas agile projects may use a few on a single page. A project plan is a great tool for setting down what you think you’re doing, an outline of how it will be done, and how you plan on executing the outline. The problem with a project plan is that it’s static. Once it’s written and signed off on, upper management thinks the project will run exactly as stated in the plan. But the reality of the project often thwarts the plan. As with many other parts of project management, agile methodologies do project planning differently. An agile project plan •

is feature-based (remember it’s built around getting production code running quickly).

•

is organized into iterations.

•

is multi-layered (because it knows that things will change and that the initial set of requirements are not complete or detailed enough).

•

is owned by the team, not the project manager.

Project Organization The project organization section of the plan contains the following three things: •

How you’re going to organize the team

•

What process model the project will be using

•

How will the project be run on a day-to-day basis

If you’re working with an experienced team, all this is already known to everyone, so your project organization section can be “We’ll do what we usually do.” However, this section is a necessity for brand-new projects and inexperienced teams, because the organization section gives you something to hang your hat on when you start the actual project work.

30

Chapter 3 ■ Project Management Essentials

Agile projects simplify the three items just mentioned. Nearly all agile teams are self-organizing. Because they’re small and because one of the agile principles is the idea of common code ownership, agile developers don’t segregate themselves based on what part of the code they’re working on or on a particular skill set. Agile developers share code, share testing, and share expertise. Like all software developers, those on an agile project are constantly learning new things and improving their skills. No matter what agile process model is being used—XP, Scrum, Crystal, feature-driven, and so on—all agile projects are iterative, use short development cycles, and produce running code with more features at the end of each iteration. Simple. Different agile projects will run differently on a day-to-day basis. But pretty much all of them include daily stand-up meetings, at least daily integrations, and some form of shared programming—for example, pair programming. Whether the project is using a time-boxed methodology or not, iterations are typically short, and developers spend quite a bit of time writing and executing unit tests.

Risk Analysis In the risk analysis section, you need to think about the bad things.1 What can possibly go wrong with this project? What’s the worst that could happen? What will we do if it does? Some risks to watch out for include the following:

1

•

Schedule slips: That task that you estimated would take three days has just taken three weeks. In a plan-driven project, this can be an issue if you don’t have regular status meetings. Waiting three weeks to tell your boss that you’re late is always worse than telling them that you’ll be late as soon as you know it. Don’t put off delivering bad news. In an agile project this is unlikely, because most agile projects have a daily status meeting (see the Scrum meeting section in Chapter 2). That way, schedule slips are noticed almost immediately, and corrective action can take place quickly.

•

Defect rate is excessive: Your testing is finding lots of bugs. What do you do—continue to add new features or stop to fix the bugs? Again, this can be a real issue in a project where integration builds happen according to a fixed schedule—say, once a week. In a project where integrations happen every day, or every time a new feature is added, you can keep up with defects more easily. In either case, if you’re experiencing a high defect rate, the best thing to do is to stop, take a look around, and find the root cause of the defects before adding more functionality. This can be very hard to do from a project-management standpoint, but you’ll thank yourself in the end.

•

Requirements misunderstood: What you’re doing isn’t what the customer wanted. This classic problem is the result of the fact that customers and developers live in two different worlds. The customer lives in the application domain where they understand from a user’s perspective what they want the product to do. The developer understands from a technical perspective how the product will work. Occasionally, these worlds intersect and that’s good; but lots of times they don’t and that’s where you get a misunderstanding of requirements. The best ways to avoid this situation are to have the customer on site as often as possible and to produce deliverable products as often as possible.

McConnell, S. Rapid Development: Taming Wild Software Schedules. (Redmond, WA: Microsoft Press, 1996.)

31

Chapter 3 ■ Project Management Essentials

•

Requirements churn: New features, altered features, deleted features . . . will the misery never end? Requirements churn is probably the largest single reason for missed delivery dates, high defect rates, and project failure. Churn happens when the customer (or your own marketing folks, or the development team itself ) continues to change requirements while development is underway. It leads to massive amounts of rework in the code, retesting of baselines, and delay after delay. Managing requirements is the single most important job of the project manager. In a plan-driven process, a change control board (CCB) examines each new requirement and decides whether to add it to the list of features to be implemented. There may be a member of the development team on the CCB, but that’s not required, so the danger here is that the CCB adds new features without understanding all the scheduling and effort ramifications. In agile processes, the development team keeps control of the prioritized requirements list (called the product backlog in Scrum), and only adjusts the list at set points in the project—after iterations in XP, and after each sprint in Scrum.

•

Turnover: Your most experienced developer decides to join a startup three weeks before product delivery. The best way to reduce turnover is to (1) give your developers interesting work, (2) have them work in a pleasant environment, and (3) give them control over their own schedules. Oddly enough, money is not one of the top motivators for software developers. This doesn’t mean they don’t want to get paid well, but it does mean that throwing more money at them in order to get them to work harder or to keep them from leaving doesn’t generally work. And if, despite your best efforts, your best developer does leave, you just have to move on. Trust me, it won’t be the end of the world. The best way to mitigate the effect of turnover is to spread the knowledge of the project around all the members of the development team. Principles like common code ownership and techniques like pair programming work to invest all the team members in the product and spread the knowledge of the code across the entire team. One of the best books on managing and keeping software developers is Peopleware by Tom DeMarco.2

Once you’ve got a list of the risks to your project, you need to address each one and talk about two things: avoidance and mitigation. For each risk, think about how you can avoid it. Build slack into your schedule, do constant code reviews, freeze requirements early, do frequent releases, require pair programming so you spread around the knowledge of the code, and the like. Then you need to think about what you’ll do if the worst-case scenario does happen; this is mitigation. Remove features from a release, stop work on new features and do a bug hunt, negotiate new features into a future release, and so on. If a risk becomes a reality, you’ll have to do something about it; it’s better to have planned what you’ll do beforehand. Once you address avoidance and mitigation, you’ll have a plan on how to handle your identifiable risks. This doesn’t completely let you off the hook, because you’re bound to miss risks; but the experience of addressing the risks you do come up with will enable you to better handle new ones that surprise you during the project. If your project is using an iterative process model, it’s a good idea to revisit your risks after every iteration and see which ones have changed, identify any new ones, and remove any that can no longer happen.

DeMarco, T. and T. Lister. Peopleware: Productive Projects and Teams, Second Edition. (New York, NY: Dorset House Publishing Company, 1999.)

2

32

Chapter 3 ■ Project Management Essentials

Resource Requirements This section is a piece of cake. How many people do you need for the project? Do they all need to start at once, or can their starting dates on the project be staggered as phases are initiated? How many computers do you need? What software will you be using for development? What development environment do you need? Is everyone trained in that environment? What support software and hardware do you need? Yes, you do need a configuration management system and a stand-alone build machine, no matter which process model you’re using. The platform you’re targeting and the application domain you’re working in answer many of these resource questions for you. That’s the easy part. Questions about team size, start dates, and phases of the project will likely not be able to be answered until you do a first cut at effort estimation and scheduling.

Task Estimates The first step toward a project schedule is seeing what you’ll be doing and figuring out how long each step will take. This is the classic chicken-egg problem. You can’t really do estimation until you have a fairly detailed decomposition of the features or user stories into tasks. But your manager always wants effort estimates and schedule data before you start doing the design. Resist this. Make design your top priority once you’ve got some idea of the requirements. If you select a small set of high priority requirements and then design a solution for that feature set, then you can do an effort estimation of that iteration. Don’t worry that the requirements might change—they will. You need a detailed decomposition of features into implementable tasks before you can do effort estimation. Don’t ever believe anyone who tells you, “That feature will take six months to do.” That is a wild-assed guess (WAG) and bears little to no relation to reality. You just can’t estimate something that big. The best you can do is to say, “I was once on a team that implemented a feature like that in six months.” And even that only helps a little. You’ve got to get your work broken down into tasks that are no more than about a week in duration. One or two days is a much better bet. Even better, never do estimation in any unit except person-hours. That way you’ll be more tempted to work with small increments of hours, and you’ll break your larger tasks down into smaller ones that you may actually know how to do. Once you have a believable list of tasks, you can start doing size and then effort estimation. Size always needs to come first, because you just can’t figure out how long something will take until you have an idea of how big it is. Size can be several things, depending on your work breakdown and your development model; functional modules, number of classes, number of methods, number of function points, number of object points, or that old standby, uncommented lines of code. Actually, no matter what you initially measure size in, you’ll end up with estimates in terms of KLOC—thousands of uncommented lines of code. There are several techniques for getting effort estimates: COCOMO II,3 function point analysis, and the Delphi method are just three. All, however, depend on being able to count things in your design. The estimation mantra is: size first, then effort and cost estimates, finally schedule. All other things being equal, the Delphi method is a quick and relatively efficient estimation technique. Here’s one way it can work: find three of your most senior developers. These are the folks who’ve got the most experience and who should therefore be able to give you a good guess. Then give them the task breakdown (assuming they weren’t already involved in doing the initial breakdown—the ideal situation). Then ask them to give you three numbers for each task: the shortest amount of time it should take, the longest amount of time it should take, and the “normal” amount of time it should take, all in person-hours. Once you have these numbers, add them all up, the shortest together, the longest together, and the “normal” together and take the mean. Those are your estimates for each task—the averages of the best guess by your best developers for each task. You then use those average values for each task as the official (for now) effort estimate and proceed to create a schedule. 3

Boehm, 2000.

33

Chapter 3 ■ Project Management Essentials

Finally, you should have the right people—the developers who will do the work—do all the estimates for the project. Managers should never do development estimates. Even if a manager has been a developer in the past, unless they’re deeply involved in the actual development work, they shouldn’t be in the business of doing development estimates.

Project Schedule Once you have estimates of the tasks in your first release or iteration and have people resource estimates, you can create a schedule. There are several things to take into account before you can look at that spiffy Gantt chart with the nice black diamond that marks the release date. Here’s one possible list: •

Get your developers to tell you the dependencies between tasks. There will be some tasks that can’t start before others finish. There may be tasks that can start once others are half-finished. There will be some that can all start together. You need to know because the task dependencies will push out your delivery date.

•

Figure out what your duty cycle is. Out of each eight-hour day, how many hours do your developers actually do development? You need to remember that reading mail, attending meetings, doing code reviews, taking breaks, going to the bathroom, and lunch all eat up time. You can’t assume that an eight-hour task will be done in a single day. Realistically, out of each eight-hour day, two to four hours are eaten up with other stuff, so your duty cycle can be as low as 50%—four hours a day. Duty cycles can also vary based on corporate culture, so you need to figure out what yours is before you start to schedule.

•

Take weekends, vacations, sick days, training, and slack into account when you’re making the schedule. If your senior developer has a task on the critical path of your project, you probably need to know that they’re taking that three-week vacation in May.

•

You can’t schedule a developer to work on two tasks at the same time. Most projectscheduling software won’t let you do this by default, but most of them will let you override that. Don’t. You’ll be tempted to do this so that your schedule doesn’t push out past whatever deadline your manager or marketing team wants, but resist the temptation. You’ll only have to change the schedule when you miss the date anyway.

Finally, use project-scheduling software to make your schedule. You don’t have to do this—just using a simple spreadsheet technique like the one proposed in Chapter 9 of Joel on Software by Joel Spolsky4 can work for small projects. But using real project-management software like Microsoft Project, Fast Track Scheduling, Basecamp, or Merlin for plan-driven projects and web applications like Jira, Pivotal Tracker, or Trello for agile projects provides lots of features that make keeping the schedule up to date much easier. The big thing that project-management software can do that your spreadsheet can’t is track dependencies. Joel doesn’t understand how Microsoft Project is useful in this; in fact, he says, “I’ve found that with software, the dependencies are so obvious that it’s just not worth the effort to formally keep track of them.”5 This might be true for small projects, but when your team gets to be 10 developers or larger and you’re working on 100 or more tasks, knowing something about the dependencies of your project can help manage who’s working on what, and when. This knowledge is even more critical for agile methodologies like Scrum. In Scrum projects—where you’re using short time-boxed sprints and you need to know the priorities of each user story in the product backlog and how they relate to each other and you must have good estimates for how long each task will take and you need to track the progress of a number of tasks in the sprint backlog every day— knowing dependencies can be the difference between making your sprint and not. 4 5

Spolsky, J. Joel on Software. (Berkeley, CA: Apress, 2004.) Spolsky, 2004.

34

Chapter 3 ■ Project Management Essentials

Joel is right that tools like Microsoft Project are overkill for many projects, so for those, you can use a spreadsheet approach that just lists the features and tasks you can see right now (see Table 3-1); but projectmanagement software sure is handy to have around when you need it. Table 3-1. Spolsky’s Painless Schedule (with the Velocity Addition)

1 Feature

2 Task

3 Priority

4 Orig Est

5 Curr Est

6 Elapsed

7 Remaining

8 Developer

9 Velocity

Spolsky’s painless schedule lists the following seven columns that should be in every schedule: •

Feature name

•

Tasks within the feature

•

The Priority of the task

•

The Original Estimate (in person-hours)

•

The Current Estimate (in person-hours)

•

The Elapsed Time worked on the task (in person-hours)

•

The Remaining Time on the task (also in person-hours)

Joel correctly emphasizes that tasks need to be fine-grained and small in terms of effort. Otherwise, as noted previously, your estimates will most likely be wildly off. He also suggests that each developer either have a separate spreadsheet or, as shown here, you add an eighth column with the developer assigned to the task. Having all the tasks on the same sheet makes it more crowded, but easier to see all the tasks at once. Though not exactly “painless,” this method of keeping a schedule is useful for smaller projects with a fairly limited number of tasks.

Velocity I suggest adding a ninth column to measure the velocity of each task. Velocity is a term from XP6 and is defined as the estimated effort of a task, divided by the actual effort. In our case, we’d use the Original Estimate of the task and the Elapsed Time. If you overestimate your task, your velocity will be greater than one (your task took less time than you originally thought); if you underestimate, it will be less than one (the task took you longer than you originally thought). Ideally, velocity should be 1.0, but that hardly ever happens. It’s worth noting that humans are really bad at estimating time. To get around this problem, some project-management tools have you create an estimate using magnitude instead. The simplest type lets you do small, medium, or large for estimates, whereas others (for example, Pivotal Tracker and Trello) let you give each task some number of points on a scale—say, 1, 2, 4, or 8 points for each task. This makes it easy to total up the number of points the team is trying to implement in each time-boxed iteration and over time gives you the average numbers of points a team can implement per iteration.

6

Beck, K. Extreme Programming Explained: Embrace Change. (Boston, MA: Addison-Wesley 2000.)

35

Chapter 3 ■ Project Management Essentials

The reason for using velocity is to give each developer and the project manager an idea of how accurate the developer’s estimates are and to help do a better job of estimating next time. Ideally, as a developer gains experience, their velocity will approach 1.0 on each task. Alternatively, if a developer’s velocity jumps around a lot (one task is 0.6, another is 1.8, a third is 1.2), then a crash course in estimation techniques might be appropriate. In my experience, a new developer’s velocity will start out gyrating wildly, with most values well under 1.0; the new developer is overly optimistic. But as time goes along velocities will settle into a range centered on 1.0, maybe from 0.85 to 1.15. As a developer gains a history, the project manager can then start to depend more on their estimates, and the schedules will be more accurate.

Project Oversight Project oversight is what happens once you’ve got a schedule and your project is underway. It will typically include a number of review meetings for things like the overall architecture, design, and code. Once your project begins, the work needs to be managed. How this happens depends on the process you’re using. But regardless of the process, you need to manage the schedule, manage the developers, manage the process itself, and above all, manage your manager. A manager’s technique is critical to keeping a project on schedule. Fear is not a motivator. Appealing to professional pride is, though. If your boss doesn’t support you, you’re doomed. Without creative, supportive management, you’re doomed. If your people aren’t happy, you don’t have a hope. Treat your developers as humans, not resources. Supporting your team and keeping them insulated from distractions is your number one job. Remember, projects are cooperative, social events.7

Status Reviews and Presentations Status reviews and presentations are an inescapable part of any project. The bigger the project, the more formal the review. Remember that reporting status doesn’t fix problems, and that generally upper management doesn’t like hearing about problems. Tough. When you give a project status report, just tell them where your project is and where it’s going during the period before the next status report. Don’t embellish and don’t make excuses; be honest about problems and where you are in the schedule. Just providing good news is usually bad for your reputation; something will go wrong at some point, so it’s best to report it and get it out of the way right away. You must communicate bad news about the project as soon as possible. That’s the best way to mitigate the problem and get others involved in helping to find a solution. When giving a presentation, be it a status review or a technical presentation, make sure you know your audience. Set your presentation to the level of the audience and keep the purpose of your presentation in front of you and them at all times. PowerPoint is ubiquitous in the industry, so learn to use it effectively. Keep your PowerPoint presentations short and to the point. Avoid cramming your slides with lots of bullet points. Don’t make your bullet points complete sentences, mostly because you’ll be tempted to read them. This is the kiss of death for two reasons: it takes too long and takes attention away from what you’re actually saying. And it insults the audience. Surely they know how to read? Your bullet points should be talking points that you can then expand upon. This lets your audience focus on you, the speaker, rather than the slides. When you’re constructing a PowerPoint presentation, use as few words as you can.

Cockburn, A. “The End of Software Engineering and The Start of Economic-Cooperative Gaming.” Computer Science and Information Systems 1(1): 1 - 32 (2004).

7

36

Chapter 3 ■ Project Management Essentials

Defects Inevitably, you’ll introduce defects (errors) into your program. Defects don’t just appear; developers put them there. As a developer, your aim is twofold: •

Introduce as few defects as possible into the code you write.

•

Find as many of them as you can before releasing the code.

■■Note By the way, I’m deliberately not using the word bug here, because it sounds both inoffensive and cute. Defects are neither. They are errors in your code that you put there. See Chapter 13 for a more detailed discussion on errors. Despite your best efforts, though, you will release code with defects in it. It’s just inevitable. For a program of any size, there are just too many possible paths through the program (called a combinatorial explosion) and too many different ways to introduce bad data for there not to be defects. Your objective is to release with as few defects as possible and to make those defects ones that don’t really impact the product or its performance. To make this a reality, most development organizations have a set of defect levels they use to characterize just how bad a defect really is. These defect levels are generally built into a defect tracking system that allows developers and project managers to see how many and how severe the current defects are. One set of levels looks like the following: 1. Fatal: Either this defect causes the product to crash, or a fundamental piece of functionality doesn’t work (for example, you can’t save files in your new word processor). 2. Severe: A major piece of functionality doesn’t work, and there’s no workaround for it that the user can perform (such as Cut and Paste don’t work at all). 3. Serious: A piece of functionality doesn’t work, but there is a workaround for it that the customer can perform (for example, the keyboard shortcuts for Cut and Paste don’t work, but the pull-down menus do). 4. Annoying: There’s a minor defect or error in the documentation that may annoy the user, but doesn’t affect how the program works (say, Paste is always spelled Patse). 5. New feature request: This isn’t a defect, but a request for the product to do something new (as in, the word processor should have a speech-to-text feature built in). Whenever you find a defect in a piece of code, you’ll file a defect report in the defect tracking system (to keep track of how many defects you’re finding, what types they are, and how severe they are) and you’ll characterize the defect by severity level. When the developers are fixing defects, they start at level 1 and work their way down. In nearly all organizations, no product can release with known level 1 or level 2 defects in it. Most organizations also try their best to remove all the level 3 defects as well.

37

Chapter 3 ■ Project Management Essentials

The Retrospective Most development teams will do a retrospective after every project. A retrospective, as the name implies, is an opportunity to reflect on the project just completed and answer a few questions. Typically, the questions will be like the following: •

What went right? Did our process work the way we anticipated? Did we meet our schedule? Did we implement all the features required by the customer?

•

What went wrong? Why did we have so many defects? Why did we need to work 60-hour weeks for the last month of the project?

•

What process issues came up? Did we follow our process? If not, what parts were problematic?

•

What do we need to fix for next time? Given the answers to the preceding questions, what do we need to fix in our process, work habits, or environment for the next project?

•

Who’s responsible for the fixes? Someone has to be responsible for the changes to our process—who is it? (Don’t make it a manager; the development team should own the process).

In a plan-driven project, the retrospective is typically held either after the product release or after each major iteration of the product. In an agile project, a retrospective is always held after every iteration. So, for example, in Scrum a retrospective is held after every sprint.

Conclusion So where do we end up? We’ve gone through the general parts of managing projects, and I’ve presented some alternative ways of doing project management. The most important ideas to consider are that the developers should own the process and management should be supportive and listen to the developers— particularly where schedules and estimates are concerned—and be the buffer between the developers and the world. If you can work in an organization where those things are true, be a happy camper, because you’ll be able to write great code.

References Beck, K. Extreme Programming Explained: Embrace Change. (Boston, MA: Addison-Wesley 2000.) Boehm, B., C. Abts, et. al. Software Cost Estimation with COCOMO II. (Englewood Cliffs, NJ: Prentice-Hall, 2000). Cockburn, A. “The End of Software Engineering and The Start of Economic-Cooperative Gaming.” Computer Science and Information Systems 1(1): 1 - 32 (2004). DeMarco, T. and T. Lister. Peopleware: Productive Projects and Teams, Second Edition. (New York, NY: Dorset House Publishing Company, 1999). McConnell, S. Rapid Development: Taming Wild Software Schedules. (Redmond, WA: Microsoft Press, 1996.) Spolsky, J. Joel on Software. (Berkeley, CA: Apress, 2004).

38

CHAPTER 4

Requirements The hardest single part of building a software system is deciding what to build. No other part of the conceptual work is as difficult in establishing the detailed technical requirements, including the interfaces to people, to machines, and to other software systems. No other part of the work so cripples the results if done wrong. No other part is more difficult to rectify later. Therefore, the most important function that the software builder performs for the client is the iterative extraction and refinement of the product requirements. —Fred Brooks1 Before you start coding –— yes, before you start coding—you need to know what it is you’re going to build. That’s what requirements are: a list of stuff you have to implement in order to create your terrific program. Most developers hate requirements. Really, all we’d like to do is sit down and start coding. All of us have that super-programmer mentality: “Just give me the problem and I can sit down and design and code it on the fly.” Not! If you want to be a productive developer, make fewer errors, and come up with a good, clean design, you need requirements. And the more detailed they are, the better. A good set of requirements tells you just what the program is supposed to do. It gives you the scaffolding around which you’ll hang your design. You’ll do requirements anyway—it’s one of those steps in a standard development life cycle that you can’t avoid—but if you don’t make room for it in your project, you won’t create a great program. Being intentional about requirements forces you to think about the details of the program and lets you listen to the users so you have a better idea of what they really want. So let’s talk about requirements.

What Types of Requirements Are We Talking About? We’re really talking about functional requirements, the list of features the user will see and be able to use when they fire up your program. These are the “black box” requirements that show the external behavior of your program. As far as the user is concerned these are the only requirements that matter. In a plan-driven process the output of this activity of identifying requirements is a functional specification of what the software system is supposed to do. For an agile process the output is a set of user stories that define the product backlog. During the course of uncovering requirements for your project you’ll usually see four different types: user requirements, domain requirements, non-functional requirements, and non-requirements.

Brooks, F. P. The Mythical Man-Month : Essays on Software Engineering, Silver Anniversary Edition. (Boston, MA: Addison-Wesley, 1995.)

1

© John F. Dooley 2017 J. F. Dooley, Software Development, Design and Coding, https://doi.org/10.1007/978-1-4842-3153-1_4

39

Chapter 4 ■ Requirements

User Requirements User requirements are nearly always expressed in natural language. They are the details of what the user expects to see as they use the program. They also include descriptions of screen layouts, dialog boxes, and menus. Any interaction element in the program should be described in the user requirements. For example: Logging into the system: When Gloria clicks the Login button on the main page, a Login dialog box appears in the middle of the screen. The Login dialog must contain two text boxes, labeled “Username” and “Password.” There must also be two buttons in the dialog box, labeled “Submit” and “Cancel.” If at any time Gloria clicks the Cancel button, the dialog box shall disappear and she will be taken back to the previous screen. In normal usage, she will click in the Username text box and type in her username, and then click in (or tab to) the Password text box and type in her password. The text typed in the Password text box must be hidden. Once Gloria is finished typing in her username and password she must click the Submit button. If she has entered a correct user name/password combination she will then be taken to the main menu page. If Gloria’s user name/password combination is incorrect, an “Invalid username or password, please try again” message shall appear in the dialog box, the text boxes shall be cleared, and she will be given the opportunity to login again. As seen in this section, you can express user requirements as scenarios, and as detailed screen-byscreen descriptions. Remember to use pictures as much as you can when you’re doing user requirements. If your program is web-based, you can create lots of quick and dirty HTML pages to show the user. If it’s not web-based, use a drawing program to create pictures of what the user will see, or just draw them by hand on paper.

Domain Requirements These are requirements that are imposed on you by the application domain of the program. If you’re writing a new version of an income tax program, you will be constrained by the latest IRS regulations. A general ledger program will have to abide by the latest edition of the Generally Accepted Accounting Principles (GAAP), and a smartphone will need to implement the latest Global System for Mobile communication (GSM) protocols. You don’t need to write down all these requirements, just refer to them. A set of detailed domain requirements gives the developers information they will need during the design of the program. Domain requirements are usually considered “middle layer” software because they are the heart of the application, below the user interface and above the operating system, networking, or database software. A lot of domain requirements will get implemented as separate classes and libraries with their own APIs. Users are concerned with domain requirements only insofar as they affect the user requirements.

Non-Functional Requirements Non-functional requirements are constraints on the services and functions of the program and also expectations about performance. They can include target platform specifications, timing constraints, performance requirements, memory usage requirements, file access privileges, security requirements, response times, minimum number of transactions per second, and so on. These are usually requirements that may not be visible to the user, but that do effect the user experience. An example of this type of requirement is that your web page must load and display within three seconds.

40

Chapter 4 ■ Requirements

Non-Requirements These are the things you’re not going to do. You will need to communicate this to the customer because after laying out what the program will do, the most important thing to do in the requirements phase is manage expectations. One of the worst phrases a customer can utter at that final demo before you release is, “But I thought it was going to do . . .” You need to tell all the stakeholders in a project what the program is going to do and also what it’s not going to do. In particular you need to let them know that there are requirements that won’t be implemented—at least not in the current release. “Only one countdown timer may run at a time.” “There will not be a defrost cycle that allows defrost modes to be selected by food type.” It’s likely that your customer won’t read this section, but at least you can point to it when they ask. Be careful, though, because the number of things that your program will not do is nearly infinite.

Requirements Gathering in a Plan-Driven Project A functional specification describes what the program will do entirely from the user’s perspective. It doesn’t care how the software is implemented. It talks about the features of the program and specifies screens, menus, dialogs, and the like. Think of it as a badly written user manual. A second kind of spec can be called a technical specification. The technical specification describes the internal implementation details of the program. That is, it talks about data structures, algorithms used, database models, choice of programming language, and so on. We’re not going to talk about technical specs, just functional specs. “Wait,” you say. “What about all those agile methodologies we talked about in Chapter 2? They don’t write functional specs. So there! I’m off the hook.” Well, in fact, agile methodologies do write functional specifications. They’re just in a different format from the 300-page single-spaced requirements document that some plan-driven methodologies require. XP requires that, together with the customer representative or product owner, you write user stories that lay out what the program will do. That’s a spec. We will discuss how agile methodologies do requirements later in this chapter. The important part and the idea behind this entire chapter is to write down what your program is supposed to do before you start coding.

But I Don’t Like Writing! A standard argument made by software developers is that they can’t write. Nonsense! Everyone can learn to write functional specs. But writing is work. You have to get in there and practice writing before you’ll be any good at it. If you’re still in school (be it undergrad or graduate school), take a course in writing, one where you’ve got to write essays or journal entries or stories or poetry every single week. You should also have to read other works critically; reading other people’s writing, whether good or bad, is a great way to learn how to write better. Functional requirements should always be written in a natural language. Why? Well, it’s the Sapir-Whorf linguistic relativity hypothesis, don’t you know?2 In a nutshell, language not only determines what you do say, it determines what you can say (and think). That is, the language you use determines what kinds of thoughts you are able to have; it tends to constrain your thinking processes, and thus what you can think about and how you express your thoughts. If the language doesn’t have room for certain kinds of thoughts, you are much less likely to think them. Natural languages are much more expressive and varied than programming languages, so you want to write requirements and do your designs in natural languages and save the programming languages for implementation later. Whether you believe the Sapir-Whorf hypothesis or not, it’s nearly always a good idea to develop your functional requirements in a natural language so you

2

http://en.wikipedia.org/wiki/Linguistic_relativity (retrieved September 15, 2009).

41

Chapter 4 ■ Requirements

don’t get bogged down in the syntactic and semantic details of a programming language before you need to. This doesn’t mean you can’t think about implementation while you’re doing the functional requirements (you will, trust me), but just shunt those thoughts over into a “technical note” sidebar of your specification or a completely separate document.3 You might also look at Kenneth Iverson’s Turing Award lecture, “Notation as a Tool of Thought,” for a similar discussion.4

Outline of a Functional Specification Every functional specification is different, just as every software development project is different. So take this outline with a grain of salt and just use the parts that apply to your project. Lots of the ideas here are from Spolsky.5 Every function specification should have the elements discussed in the following sections.

Overview This is your executive summary. A paragraph or at most two of what the program is supposed to do. “This program runs your microwave oven. It interfaces to a keypad and an LCD display that provides user input and output functionality. Its functions are limited to those that a standard microwave would have, with the addition of single buttons for pizza and coffee reheating. It also will run a time of day clock and a standalone countdown timer. It doesn’t control the light. It has a safety interlock that will prevent the microwave from starting if the door is open.”

Disclaimer You should always put in a statement right at the beginning that “This specification isn’t done yet. If you think something is missing or wrong, just sent me an email.” That helps keep all the marketing guys off your back and lets you file new feature requests in your mail trash bin. Lots of people will put a big, black DRAFT in the header or footer of the document. That can work as well, but folks tend to ignore it. Some people will use a big DRAFT watermark on their specs, so that every page has the word embedded behind the text. That doesn’t stop people from yelling at you either. At some point your disclaimer should change to something like “This specification is as complete as it will be for this release. If you think something is missing or wrong, just sent an email to the author and we’ll consider it for the next release.”

Author’s Name Somebody needs to be responsible for the functional specification. Not a committee, not the development team, one person. This is usually either the development manager or the project manager, depending on how your company sets up development projects. There are pros and cons to all the different organizational arrangements. If the development manager (the person to whom the developers report) is in charge of the functional spec, then that person is usually up to speed on all the technical aspects of the project. That’s good. On the other hand, if your boss writes the functional spec, it might be harder to tell them that there’s something wrong with the specification, or that you don’t agree with the design. Also, development managers were probably developers at one time and so they may not have the people skills (read: charm and schmoozing skills) necessary to talk to marketing, the customer, documentation, testing, and so on.

Spolsky, J., Joel on Software. (Berkeley, CA: Apress, 2004.) Iverson, K. E. “Notation as a Tool of Thought.” Communications of the ACM 23(8): 444–465 (1980). 5 Spolsky, 2004. 3 4

42

Chapter 4 ■ Requirements

If your company uses project managers that are in charge of the specification, design, and schedule, but don’t have developers directly reporting to them, then you run the risk of getting someone that isn’t as technically astute as a former developer. On the other hand, these folks can usually charm the socks off the other teams, so negotiations are a lot smoother. Project managers need to have some technical skills and to be very good at getting all the stakeholders to reach consensus on the contents of the functional specification.

Scenarios of Typical Usage These are the actual requirements. A great way to get customers to respond to your requirements list is to present several scenarios of typical usage of the program to them as part of the specification. This has a couple of advantages: •

First, if you write the scenarios as if they’re user stories, the customer is more likely to read them.

•

Second, customers are more likely to understand what you’re doing and come up with ideas for things you’ve missed or gotten wrong. This is always a good thing, because the more customer input you get early in the process, the more likely you’ll actually create something they want.

In many agile methodologies, including XP, user stories are often written like scenarios. In XP, the customer is part of the project team, so you get constant feedback on user stories and daily program builds. In Scrum, the customer isn’t required to be part of the project team, but they are strongly encouraged to keep in close contact with the team. Also in Scrum, shorter sprint lengths allow the customer to see working versions of the product more often. In the Unified Modeling Language (UML, see www.uml.org), there is an entire notation used to create use cases (another word for scenarios). But as already discussed, nothing beats natural language for describing usage scenarios. We’ll come back to use cases later, in Chapter 8. Once you’ve written a couple of scenarios, you’ll have a much better idea of how your program will flow, and what screens, dialog boxes, menus, and so on you’ll need. This lets you go through each one of those screens and flesh out the details of how they’re laid out, what buttons, text boxes, icons, graphics, and so on they’ll have, and what other screens they connect to. Use pictures! A picture of a screen or a dialog box is worth way more than a thousand words. It gives the reader something to react to and it gets them thinking about program flow and user interface issues.

Open Issues When you first write the functional specification, there will be one or two things you don’t know. That’s okay. Just put them in the “Open Issues” section. Then every time you meet with the customer, point to this section and try to get answers. Some of these questions will move to requirements sections and some will end up in the “Non-requirements” section, after you get those answers. By the end of the project, though, this section should be empty. If it’s not, well, you’ve got issues that will haunt you.

Design and New Feature Ideas If you’re like most developers, you’ll be trying to design and code the program in your head all the time you’re doing your requirements gathering and analysis. That’s just what developers do. The two types of notes developers and project managers typically create are technical notes containing design or coding ideas for developers, and marketing notes containing feature ideas for the marketing folks and the customer. So, to keep from forgetting the design and implementation ideas you have during the requirements phase, write

43

Chapter 4 ■ Requirements

a separate notebook. This notebook is just a separate document that contains a note for later. Ideally it’s a document that’s shared with the entire team. Finally, as your project proceeds through development, new requirements and features will surface. This always happens. But if you want to keep to a schedule and deliver a working product, you just can’t implement everything that will come up. If you want your requirements to be up to date, you need a place to put all the tasks you will do later. That’s what a “Backlog” document is for—all the requirements you’re going to consider for the next release of the product. This does a couple of good things for you. It tells the customer you haven’t forgotten these features, and that by moving them to the next release you are committed to delivering the current release as close to the published schedule as possible. And it tells the developers that you’re not out of control and that the project has a good shot at being done with high quality and on time. For more information on backlogs, take a look any of the Scrum agile methodology descriptions.6

One More Thing One more thing about the functional specification: don’t obsess. Chances are, you’ll do a good job of picking out requirements and writing them down in the functional spec, but it won’t be as detailed as you like and it won’t be complete. Don’t worry. The only time a functional specification is complete is when you ship the release. Don’t spend time trying to get every single detail correct; don’t spend time trying to tease every requirement out of your customer. It just won’t happen. Set a time limit, do your best, and let it go. You don’t want to have a bunch of developers sitting around twiddling their thumbs with nothing to do, waiting for the spec, do you?

Requirements Gathering in an Agile Project First things first. In an agile development project there is no functional specification. That’s because agile developers recognize from the beginning that the requirements will change and so they should embrace change and defer making decisions about requirements and design as long as possible. Also, because in an agile project the customer is an integral part of the team, the agile developers also know that they can get immediate feedback on feature implementations and they can get timely updates on requirements from the customer. This doesn’t necessarily make the process of gathering requirements any easier, but it gives everyone more confidence that the current set of requirements is the right set. For most agile methodologies the key idea in requirements gathering is the user story. The user story is just that—a description of some feature or scenario that the customer wants to execute in order to get some type of work done. The classic way to describe the contents of a user story is to say, “As a , I want to do , so that .” By expressing a user story this way, you get to the who, what, and why of the requirement.

The Three Cs A user story has three fundamental components, expressed by Ron Jeffries in 2001: the card, the conversation, and the confirmation.7

6 7

Schwaber, K. and M. Beedle. Agile software development with Scrum. (Upper Saddle River, NJ: Prentice Hall, 1980.) http://ronjeffries.com/xprog/articles/expcardconversationconfirmation/

44

Chapter 4 ■ Requirements

Card All user stories are written on cards. A card can be a Post-It note, an index card, or a larger piece of paper. But it’s nearly always a physical thing. Although the card contains the text of the story “As a I want to so that ,” it’s really an invitation to a collaborative conversation about what the story really means and what the user really wants. Note that the card isn’t very detailed. It usually just contains an outline of the story. It’s a placeholder for the real requirement that will be subsequently hashed out. Stakeholders can write on the card, put estimates on it, questions, and so forth.

Conversation The conversation about a user story takes place between all the important stakeholders in the project, the product owner or user, the development team, the testers, marketing folks, and maybe others. This is a substantive discussion about what the product owner really wants from the story. The conversation is ideally held in person, and the discussion will include more details about the story, possibly estimates of size, and an estimate of the relative priority of the story. The conversation may include breaking the original story into two or more smaller stories if the initial estimates indicate the effort to implement the story may be too large.

Confirmation The last component of a user story is confirmation. The user or product owner provides this information in the form of acceptance criteria for the story. The acceptance criteria are usually written on the back of the card as a short bullet list. These criteria will end up being acceptance tests that the product owner will use to confirm that the implementation of the story is acceptable to the user. The best way to create acceptance tests is for the product owner to generate examples of the story in use and then for the development team to automate the examples. That way, the product owner can execute the acceptance tests on the delivered feature and confirm whether the implementation works or not.

INVEST in Stories Note that the three components of a user story just mentioned don’t tell us all about the story. A lot of the details will come out of the conversation and the confirmation. They also don’t tell us anything about the quality of the story and what makes a good user story. Bill Wake, in a classic blog post, laid out the characteristics of a good user story using the acronym INVEST.8

Independent The idea here is that your user stories should be independent of each other. That way a single user story (which, depending on the agile process you’re using, may take several iterations to fully implement) can be scheduled and implemented separately from any other user story. Wake gives the example of a multilayered cake. If you take a slice out of the cake, you can eat (implement) just that one slice, independently of any other. This may not be completely possible—think of things like the radio software in a mobile phone. In order to fully test part of the user interface embodied in a user story, you may have to have all the radio software working first. We’ll see this same idea later on as loose coupling in object-oriented design.

8

http://xp123.com/articles/invest-in-good-stories-and-smart-tasks/

45

Chapter 4 ■ Requirements

Negotiable A good story leaves room for the parties involved to negotiate the details of its implementation; it provides the essence of the requirement. The story is not so specific that it reads like a contract. Rather it provides the developer with a goal and allows the owner and the developer room to create an interesting and workable implementation.

Valuable A good user story must be valuable to the customer. It must describe a feature or a service that the customer wants. Because user stories will be scheduled and implemented in a development iteration, they must add value to the product when the iteration is complete. In addition, if the team decides a user story is too large and must be split into two or more stories, each of them must provide value to the customer. This idea gives the development team guidance on how to split stories—based on value to the customer, not on technology.

Estimable User stories must be able to be estimated. Estimation is critical to the product owner so that they can assign a relative priority to the story. This characteristic also allows the team to focus on the size of the story. Stories that are too large can’t really be estimated and so should be candidates for splitting. Estimation is part of the negotiation in the conversation. It means to look at both the size of the story and the possible effort involved in implementing it. If a team is having trouble estimating the size or the effort of a particular story, it may be that the story is too large and needs to be split. Estimation is also the first step in decomposing the story into implementable tasks.

Small As implied above, good user stories should be small. Ideally, a user story should be implementable in a single sprint or iteration. That will allow the development team to decompose the story into a number of small tasks—ideally of eight person-hours or less worth of effort. It also speaks to the Value characteristic because if it’s small, the story will add value for the customer at the end of a single iteration. One way to manage story sizes is to not care about them if the story is low priority. As the story moves up in the product backlog, it will become more important, and at some point it will be important enough to do a detailed estimate of its effort. If large enough, (longer than a single iteration), you should consider splitting it into smaller stories.

Testable Good stories must be testable. This harkens back to the plan-driven idea of traceability. Once implemented, one should be able to trace a feature back through the implementation and design and into the original requirements. In agile, this practice is usually implemented using test-driven development (TDD) and acceptance criteria. The product owner writes the acceptance criteria, and the developers will implement the unit tests that will confirm that a task is correct. The acceptance tests the product owner will use confirm that the user story is correctly implemented. This is typically the definition of Done in an agile environment. If a user story is written and the product owner is unsure or unclear about how to write the acceptance criteria, this may mean that the story details are unclear and the story conversation should be restarted to clear up any confusion.

46

Chapter 4 ■ Requirements

The Testable characteristic is also a way for the development team to include non-functional requirements (performance, response time, usability, and so on) as things to be tested.

Product Backlog At this point, the total number of user stories generated by the product owner and agreed upon by the development team is added to a list of the total number of things that need to be done to create the product. This is known as the product backlog. Where a plan-driven process team will have a long, detailed document—the functional specification—an agile product team will have a stack of cards that ends up defining the product. This stack of cards is only preliminary, though. As the development process moves along, an agile product team will be adding, removing, and dividing user story cards constantly. The product owner will be the primary person doing this job, and it’s the product owner’s job to decide when the product is done and should be released. There may still be cards left in the stack when a product is released. Those cards are destined for the next product effort.

SMART Tasks Once the team agrees on a set of user stories and adds them to the product backlog, the developers must then plan for the next iteration or sprint. This planning includes taking each user story and breaking it down into a set of implementable tasks, each of which is easy to estimate with effort requiring a relatively short amount of developer time. In short, tasks are the work to be done in order to implement user stories. This work of decomposing stories into tasks is the work of the development team, not the product owner. Generally the entire team will do this exercise as part of planning the next iteration or sprint. The estimates for the tasks are added up until the amount of effort reaches the amount of time available to the team in the next iteration. In Scrum, tasks can be assigned point values based on their perceived difficulty. The more difficult the task, the more points are assigned. The team will add up the points for the highpriority tasks until the point value reaches the team’s average velocity. These tasks are then presented to the product owner, who either approves the list of work to be done or suggests changes by changing the priorities of stories in the product backlog. Eventually everyone agrees on the stories, and tasks for the iteration and work can begin. In order to perform this decomposition and effort estimation, the team must be able to identify tasks within stories and write them out on cards so they can be put on a task or kanban board. Each task must meet certain goals, characterized by the acronym SMART.

Specific Where user stories can be general and leave room for interpretation and negotiation, the tasks created from them need to be specific so they can be implemented. This will likely uncover hidden requirements in the stories. These requirements can be incorporated into the tasks or they may require the creation of a new user story that will then get put in the product backlog. In any event, tasks should be as specific as possible, including details about data structures and user interfaces if necessary.

Measurable The key idea here is that the team needs to know when all the requirements for the task have been completed. That is, when is the task done? Each team will have a different definition of done, but it should include things like “the feature works as listed on the task,” “all the unit tests pass,” and “the code has been reviewed and integrated.”

47

Chapter 4 ■ Requirements

Achievable The task must be something that a developer can do within the timeframe of the iteration or sprint. The developer must also have the skill set necessary to achieve the task. This doesn’t mean the developer can’t ask for help. This goal can also integrate well with the idea of pair programming, which will spread the required skill set across two developers.

Relevant This goal ties in with the Valuable story component discussed earlier. With respect to tasks, this means that the task must do something that makes progress towards the creation of the user story implementation. It should add value to the iteration for the customer.

Time-Boxed This goal means that the task, as estimated, can be finished within the iteration or sprint. If the task turns out to be harder than expected, the team is expected to divide it into two or more tasks and estimate them separately. The other goal implied here is that the total number of points assigned to the tasks included in the iteration is doable within the team’s average velocity.

Sprint/Iteration Backlog As the tasks are created from user stories and estimated, they’re added to the sprint/iteration backlog as things to do in the next sprint or iteration. Depending on the agile methodology being used, once an agreed-upon number of tasks are added to the backlog, the number of tasks in the backlog may or may not be changed while the sprint or iteration is underway. In Scrum, no more tasks may be added, except by the developers themselves, for the duration of the sprint. Any new work that’s discovered must be added to the product backlog instead.

Requirements Digging Most software engineering texts use the phrase requirements elicitation to talk about the process of getting your users to tell you what they want. Hunt and Thomas, in their book The Pragmatic Programmer, use the much more descriptive phrase requirements digging to emphasize the point that what you’re really doing is digging for all those requirements that your customer doesn’t know they want yet.9 Hunt and Thomas also make the terrific distinction between requirements, policies, and implementations as a way to illustrate the requirements digging process. For example, “The system must let the user choose a loan term” is a nice succinct requirement. It says that there’s something you have to do. It isn’t specific enough for implementation yet, but it tells the developer something concrete that must be built. “Loan terms must be between 6 months and 30 years” isn’t a requirement, although it kind of looks like one. This statement is an example of a business policy. When statements like this are presented to developers as requirements, they have a tendency to hard-code the statement in the program. Wrong, wrong, wrong. Policies like this can change, so you need to be very careful about putting business policies in your

Hunt, A. and D. Thomas. The Pragmatic Programmer: From Journeyman to Master. (Boston, MA: Addison-Wesley, 2000.)

9

48

Chapter 4 ■ Requirements

requirements. It’s almost always the case that you need to implement a more general version of the business policy than is stated. The real requirement is probably something like “Loan terms are of finite length but the length of the loan will vary by type of loan.” This tells you that you probably need to build a table-driven subsystem to handle this feature. That way, the loan term for a particular type of loan can be changed by making a single change in a data table, and the code doesn’t need to change at all. “The user must be able to select a loan term using a drop-down list box” isn’t a requirement either, although, again, it may look like one. This is only a requirement if the customer absolutely must have a drop-down menu to choose their loan term. Otherwise, this is an example of the implementation that the customer would like to see, and it may not be a requirement. As Hunt and Thomas state in their book, “It’s important to discover the underlying reason why users do a particular thing, rather than just the way they currently do it. At the end of the day, your development has to solve their business problem, not just meet their stated requirements. Documenting the reasons behind requirements will give your team invaluable information when making daily implementation decisions.”

Why Requirements Digging Is Hard There are several reasons why pulling requirements out of your customer is a really hard exercise. We’ll look at a few of these.

Problems of Scope Lots of times the boundaries of what your program is supposed to do are fuzzy. This can be because of several things. The program may be part of a larger system, and the integration of the parts is ill-defined. The customer may not have thought through exactly what they want the program to do, so they start throwing out all sorts of ideas, many of which may not even apply to the problem at hand. Finally, the customer may have dropped into implementation-land and be providing unnecessary levels of detail. It takes lots of patience, discipline, repeatedly saying the word no, and repeatedly asking, “Why does this need to be part of the program?” in order to overcome problems of scope. Scope is directly related to requirements creep, so beware.

Problems of Understanding Let’s face it: the customer and you as the developer speak different languages. Your customer is the domain expert and they speak the domain language (accounts receivable, accounts payable, reconciliation, general ledger, and so on). You speak the design and implementation language (class, object, method, use case, recursion, activation record, and the like). This is usually worse than an American in Paris; at least there, both sides can pantomime their needs and figure things out. With problems of domain understanding, the best you can usually do is order drinks together. There are usually two ways to overcome problems of understanding. The first is to have someone in the middle who has lived in both worlds and who can translate between the two. Some companies have folks called system engineers or technical marketers who fulfill this role. These folks have done development and have also worked the customer side of things, so they can speak both languages. Good system engineers are worth their weight in user stories. The second way to promote understanding is to have the customer as part of the development team. This is the approach taken by some agile methodologies, notably XP. When the customer is part of the development team, you get to talk to them every day, ask them questions, and teach them technical stuff. Both sides benefit. And because the on-site customer sees intermediate product builds as soon as they pop out of your build machine, you get immediate feedback. Win, win, and win.

49

Chapter 4 ■ Requirements

Problems of Volatility Things change. This is by far the hardest part of requirements gathering and analysis and the biggest reason why schedules slip. You can’t do anything about it. Get used to it. As Kent Beck says, “Embrace change.” What you can do is manage change. Create a backlog of new features that get added as they arrive. In the Scrum methodology, new requirements are always added to the product backlog—they’re not added to the current sprint backlog. This allows the current sprint to proceed normally, and the requirements are all reviewed at the end of the sprint. Another way to manage change is to push the decision onto the user. Give the user a choice: “If we implement this new feature it will add six weeks to the schedule. Do you still want it?” Alternatively: “If you want to keep to the original schedule, we can only implement and test one of A, B, or C. You pick the one you want most.” This is one of the things that the agile folks mean by courage10; sometimes you have to take a stand and choose what’s best for the project as a whole.

Non-Technical Problems From a developer’s perspective, non-technical problems with requirements are the worst ones you will see. In fact, these are problems developers should never see; their managers should shield them from nontechnical problems. Non-technical requirements problems are fundamentally political. Examples abound. One group of customers in an organization has a different view of the program requirements than another group. Or worse, one manager has a different view than another manager. The program being developed will reduce the influence of one department by automating a function where they used to be the sole source of expertise. The program will distribute data processing across several departments where it was once centralized in a single department. The list goes on and on. The best advice for non-technical problems is to run away—quickly. Let your vice-president deal with it; that’s why they’re paid the big bucks.

Analyzing the Requirements Once you’ve written down a set of requirements, you need to make sure that these are the right requirements for the program; you need to analyze them. Analysis has three basic parts. First, you must categorize the requirements and organize them into related areas. This will help the designers a lot. Second, you—or better yet, the customer—need to prioritize them. This is critical because you won’t be able to implement all the requirements in the first product release (trust me, you won’t). So, this prioritized list will be what you’ll use to set the contents of each interim release. Lastly, you need to examine each requirement in relation to all the others to make sure they fit into a coherent whole. Ask yourself a series of questions: 1. Is each requirement consistent with the overall project objective? If your program is supposed to sell your users books, it doesn’t also have to compute their golf handicap. 2. Is this requirement really necessary? Have you added something that can be removed without impairing the essential functionality of the program? If your first release is supposed to allow users to buy books, you probably don’t need to also allow them to buy sailboats.

10

Beck, K. Extreme Programming Explained: Embrace Change. (Boston, qwo21iwqswMA: Addison-Wesley, 2000.)

50

Chapter 4 ■ Requirements

3. Is this requirement testable? This is probably the most important question when you’re doing requirements analysis. If you can’t figure out how to test a requirement, then you can’t know that you’ve implemented it correctly or that you’re finished. All requirements must be testable, or else they’re not requirements. In most agile methodologies, the rule is to write the test first, then write the code. 4. Is this requirement doable in the technical environment you’ve got to work in? This question normally applies to those non-functional requirements mentioned previously. Are your requirements feasible given the particular target platform or set of hardware constraints you must work under for this project? For example, if your target platform is a Macintosh running macOS, a requirement that the DirectX graphics library be used is not doable because DirectX is a Windowsonly library. 5. Is this requirement unambiguous? Your requirements need to be as precise as possible (so that they are testable), because as sure as you’re sitting here reading this, someone will misinterpret an ambiguous requirement, and you’ll discover the error the day after you ship. Your requirements should never contain the words or or may.

Conclusion Once you’re done with your functional specification or set of user stories and the analysis of your requirements, you’re done with the requirements phase. Right? Well, of course not. As we’ve said before: requirements change. So relax, don’t obsess about the requirements. Do the best you can to get an initial list of clear, testable requirements and then move on to design. You’ll always come back here later.

References Beck, K. Extreme Programming Explained: Embrace Change. (Boston, MA: Addison-Wesley, 2000.) Brooks, F. P. The Mythical Man-Month : Essays on Software Engineering, Silver Anniversary Edition. (Boston, MA: Addison-Wesley, 1995.) Hunt, A. and D. Thomas. The Pragmatic Programmer: From Journeyman to Master. (Boston, MA: AddisonWesley, 2000.) Iverson, K. E. “Notation as a Tool of Thought.” Communications of the ACM 23(8): 444–465 (1980). Jeffries, Ron. “Essential XP: Cards, Conversation, and Confirmation.” http://ronjeffries.com/xprog/ articles/expcardconversationconfirmation/. Retrieved June 30, 2017. (August 30, 2001.) Schwaber, K. and M. Beedle. Agile Software Development with Scrum. (Upper Saddle River, NJ: Prentice Hall, 1980.) Spolsky, J., Joel on Software. (Berkeley, CA: Apress, 2004.) Wake, William., “INVEST in Good Stories and SMART Tasks.” Agile Advice. August 17. http://xp123.com/ articles/invest-in-good-stories-and-smart-tasks/. Retrieved June 26, 2017. (2003.) Wikipedia, Sapir-Whorf Linguistic Relativity Hypothesis, http://en.wikipedia.org/wiki/Linguistic_ relativity. Retrieved September 15, 2009.

51

CHAPTER 5

Software Architecture What do we mean by a software architecture? To me the term architecture conveys a notion of the core elements of the system, the pieces that are difficult to change. A foundation on which the rest must be built. —Martin Fowler1 Once you have an idea of what you’re going to build, then you can start thinking about how you’re going to build it. Of course, you’ve already been thinking about this from the very first requirement, but now you have permission to do it. Here we begin to delve into design. There are really two levels of software design. The level we normally think of when we’re writing programs is called detailed design. What operations do we need? What data structures? What algorithms are we using? How is the database going to be organized? What does the user interface look like? What are the calling sequences? These are all very detailed questions that need to be answered before you can really get on with the detailed work of coding (well, sort of—we’ll get to that later). But there’s another level of design. This kind of design is all about style. If you were building a house, this design level asks questions like ranch or multi-story? Tudor or Cape Cod? Which direction do the bedroom windows face? Forced-air or hot-water heat? Three bedrooms or four? Open concept floor plan or closed? These questions focus somewhat on details, but they’re much more about the style of the house and how you’ll be using it, rather than things like 12- or 14-gauge wire for the electrical system or the dimensions of the air conditioning ductwork. This emphasis on style is what software architecture is all about. As Fowler says in this chapter’s opening quote, you need the foundation before you can build the rest of the structure. Software architecture is a set of ideas that tells you which foundation is the right one for your program. The idea of software architecture began as a response to the increasing size and complexity of programs. As Garlan and Shaw put it in the seminal document on software architecture: “As the size and complexity of software systems increases, the design problem goes beyond the algorithms and data structures of the computation: designing and specifying the overall system structure emerges as a new kind of problem. . . . This is the software architecture level of design.”2 Actually, all programs of any size and complexity have an architecture. It’s just that for larger programs you need to be more intentional about your thinking about the architecture to make sure you have the right set of architectural patterns incorporated in your system design. You need to do this. It’s so much harder to change things at the architectural level once the program has been written, because architectural features are so fundamental to the structure of the program.

“Is Design Dead?” Retrieved from http://martinfowler.com/articles/designDead.html on July 3, 2017. Garlan, D. and M. Shaw. An Introduction to Software Architecture. Pittsburgh, PA: Carnegie Mellon University: 49. CMU/SEI-94-TR-21 (1994).

1 2

© John F. Dooley 2017 J. F. Dooley, Software Development, Design and Coding, https://doi.org/10.1007/978-1-4842-3153-1_5

53

Chapter 5 ■ Software Architecture

There are many different styles of software architecture, and in any given project you’ll probably use more than one. The architectural style used for a program depends on what it is you’re doing. As we’ll see, different types of programs in different domains will lead us to different architectural styles; we can also call these architectural patterns since they have many characteristics of the design patterns we’ll see shortly. First, let’s get some general vocabulary under our belts.

General Architectural Patterns Whenever a software architect starts thinking about an architecture for a program, they usually start by drawing pictures. Diagrams of the architecture allow people to see the structure and framework of the program much more easily than text allows. Software architectures are normally represented as black box graphs where graph nodes are computational structures and the graph edges are communication conduits between the structures. The conduits can represent data flow, object message passing, or procedure calls. Notations of this type vary, and there are several standard notations, the most popular being the Unified Modeling Language (UML). Visual descriptions of architectures are generally easier to understand. A particular architectural style is a pattern that can represent a set of similar structures. Let’s looks at several different common architectural styles.

The Main Program—Subroutine Architectural Pattern The most traditional and oldest architectural pattern is the main program—subroutine pattern. It descends from Niklaus Wirth’s 1971 paper “Program Development by Stepwise Refinement,”3 although Wirth was just the first to formally define the top-down problem decomposition methodology that naturally leads to the main program—subroutine pattern. The idea is simple. You start with a big problem and then try to decompose the problem into several smaller, semi-independent problems or pieces of the original problem. For example, nearly every problem that is amenable to solution by top-down decomposition can be divided into three parts immediately: input processing, computation of the solution, and output processing. Once you have a problem divided into several pieces, you look at each piece individually and continue dividing, ignoring all the other pieces as you go. Eventually, you’ll have a very small problem where the solution is obvious; now is the time to write code. So, you generally solve the problem from the top down, and write the code from the bottom up. There are many variations, however. To quote from the conclusion to Wirth’s paper: 1. Program construction consists of a sequence of refinement steps. In each step a given task is broken up into a number of subtasks. Each refinement in the description of a task may be accompanied by a refinement of the description of the data, which constitute the means of communication between the subtasks . . . 2. The degree of modularity obtained in this way will determine the ease or difficulty with which a program can be adapted to changes or extensions of the purpose . . . 3. During the process of stepwise refinement, a notation which is natural to the problem in hand should be used as long as possible . . . Each refinement implies a number of design decisions based upon a set of design criteria . . . 4. The detailed elaborations on the development of even a short program form a long story, indicating that careful programming is not a trivial subject. 3

Wirth, N. “Program Development by Stepwise Refinement.” Communications of the ACM 14(4): 221-227 (1971).

54

Chapter 5 ■ Software Architecture

Figure 5-1 gives an impression about how the main program—subroutine architecture works. We’ll discuss top-down decomposition of problems much more in Chapter 7.

Figure 5-1. A main program—subroutine architecture

Pipe-and-Filter Architecture In a pipe-and-filter style architecture, the computational components are called filters and they act as transducers that take input, transform it according to one or more algorithms, and then output the result to a communications conduit. The input and outputs conduits are called pipes. A typical pipe-and-filter architecture is linear, as shown in Figure 5-2.

Figure 5-2. The pipe-and-filter architecture The filters must be independent components. That’s one of the beauties of a pipe-and-filter architecture. You can join different filters in the set in different orders to get different results. The classic example of a pipeand-filter architectural style is the Unix shell, where there are a large number of small programs that typically do a single thing and that can be chained together using the Unix pipe mechanism. Here’s an example that shows how a pipe-and-filter can work. This problem is from Jon Bentley’s book Programming Pearls4:

The Problem: Given a dictionary of words in English, find all the anagrams in the dictionary. That is, find all the words that are permutations of each other. For example, pots, stop, and spot are anagrams of each other. So, what do we know? Well, first of all, all the anagrams have the same letters and the same number of letters in each word. That gives us the clue to the method you’ll use to find the anagrams. Got it yet? Don’t worry. I’ll wait. Yes! If you sort each word by individual letters, you’ll end up with a string of characters that has all the letters in the word in alphabetical order. We call this creating a sign for the word. If you then sort the resulting list, all the anagrams will end up together in the sorted list because their sorted letters will be identical. If you then keep track of which words you sorted, you can then simplify the list and create a new list with, say, each set of anagrams on the same line of the output file. This is exactly how Bentley does it. 4

Bentley, J. Programming Pearls, Second Edition. (Boston, MA: Addison-Wesley, 2000.)

55

Chapter 5 ■ Software Architecture

But how does this relate to a pipe-and-filter architecture, you ask? Good question. Let’s break down the solution again. 1. Create a sign for each word in the list by sorting the letters in each word; keep the sign and the word together. 2. Sort the resulting list by the signs; all the anagrams should now be together. 3. Squash the list by putting each set of anagrams on the same line, removing the signs as you do. See the pipe-and-filter now? In Unix-speak it looks like this: sign anagrams.txt sign is the filter we use to do step 1, with input file dictionary.txt. sign outputs a list of signs and their associated words, which is piped to the Unix sort utility (we didn’t need to write that one). Sort then sorts the list by the first field on each line (its default behavior), which happens to be the sign of each word. It then outputs the sorted list to the next pipe. Squash takes the sorted list from the incoming pipe and compresses it by putting all the words with the same sign on the same line, eliminating the signs as it does so. This final list is sent via one last pipe (this time a Unix I/O redirection) to the output file anagrams.txt. Note that this example has all the features of a standard pipe-and-filter architecture: independent computational components that perform a transformation on their input data and communication conduits that transmit the data from the output of one component to the input of the next. Note also that not all applications should use the pipe-and-filter architecture. For example, it won’t work so well for interactive applications or applications that respond to events or interrupts. That’s why we’re going to look at more architectural styles.

An Object-Oriented Architectural Pattern The advent of object-oriented analysis, design, and programming in the early 1980s (well, it really started in the 1960s, but no one was paying attention) brought with it a number of architectural and design patterns. We’ll just focus on one object-oriented architectural pattern here and save discussions of the rest to Chapter 11, which covers design patterns. The Model-View-Controller (MVC) architectural pattern is a way of breaking an application, or even just a piece of an application’s interface, into three parts: the model, the view, and the controller. MVC was originally developed to map the traditional input, processing, and output roles of many programs into the GUI realm: Input ➤ Processing ➤ Output Controller ➤ Model ➤ View The user input, the modeling of the external world, and the visual feedback to the user are separated and handled by model, view, and controller objects, as shown in Figure 5-3.

56

Chapter 5 ■ Software Architecture

Figure 5-3. The Model-View-Controller architecture •

The controller interprets mouse and keyboard inputs from the user and maps these user actions into commands that are sent to the model and/or view to effect the appropriate change. The controller handles input.

•

The model manages one or more data elements, responds to queries about its state, and responds to instructions to change state. The model knows what the application is supposed to do and is the main computational structure of the architecture—it models the problem you’re trying to solve. The model knows the rules.

•

The view or viewport manages a rectangular area of the display and is responsible for presenting data to the user through a combination of graphics and text. The view doesn’t know anything about what the program is actually doing; all it does is take instructions from the controller and data from the model and display them. It communicates back to the model and controller to report status. The view handles the output.

The flow of an MVC program typically looks like this: •

The user interacts with the user interface (for example, the user clicks a button) and the controller handles the input event from the user interface, often via a registered handler or callback. The user interface is displayed by the view but controlled by the controller. Oddly enough, the controller has no direct knowledge of the view as an object; it just sends messages when it needs something on the screen updated.

•

The controller accesses the model, possibly updating it in a way appropriate to the user’s action (for example, the controller causes the user’s shopping cart to be updated by the model). This usually causes a change in the model’s state as well as in its data.

•

A view uses the model to generate an appropriate user interface (for example, the view produces a screen listing the shopping cart contents). The view gets its own data from the model. The model has no direct knowledge of the view. It just responds to requests for data from whomever and to requests for transforming data from the controller.

•

The controller, as the user interface manager, waits for further user interactions, which begins the cycle anew.

The main idea here is separation of concerns—and code. The objective is to separate how your program works from what it is displaying and how it gets its input data. This is classic object-oriented programming: create objects that hide their data and hide how they manipulate their data and then just present a simple interface to the world to interact with other objects. We’ll see this again in Chapter 9.

57

Chapter 5 ■ Software Architecture

An MVC Example: Let’s Hunt! A classic example of a program that uses the MVC architectural pattern is the Nifty Assignment presented by Dr. David Matuszek at the 2004 SIGCSE Technical Symposium.5

The Problem The program is a simple simulation of a fox and a rabbit. The fox is trying to find the rabbit in a grid environment, and the rabbit is trying to get away. There are bushes that the rabbit can hide behind, and there are some restrictions on movement. Figure 5-4 shows a typical picture of the game in action.

Figure 5-4. A typical fox and rabbit hunt instance The fox is the large dot, the rabbit is the small dot, and the bushes are the fat crosses. The objective of the programming assignment is to make the rabbit smarter so it can escape from the fox. We don’t really care about this—we want to look at how the program is organized. Figure 5-5 shows the organization of the program. It’s a UML object diagram taken from the BlueJ IDE. The key parts of the program are the three classes: Model, View, and Controller.

Matuszek, David. “Rabbit Hunt,” SIGCSE 2004 Technical Symposium, Nifty Assignments Session. Retrieved August 17, 2009, http://nifty.stanford.edu/2004/RabbitHunt/ (2004).

5

58

Chapter 5 ■ Software Architecture

Figure 5-5. The fox and rabbit hunt class structure

Model The model represents the rules of the game. It does all the computation and all the work of deciding whose turn it is, what happens during each turn, and whether anyone has won. The model is strictly internal and has practically nothing to do with the other parts of the program. The model part of this program is actually composed of five classes: model (the “main” model class), animal, rabbit, fox, and bush. Rabbit and Fox are subclasses of Animal (as you can see from the solid arrows in the UML diagram). This is the part of the program that you really need to understand. The RabbitHunt class just creates model, view, and controller objects and turns control over to the controller object. The controller object starts the model object and then just waits for the user to click a button. When a button is clicked, a message is sent to the model object, which decides what to do. The model object •

places the fox, rabbit, and bushes in the field.

•

gives the rabbit and the fox each a chance to move (one moves, then the other—they don’t both move at the same time).

•

tells the view to display the result of these two moves.

•

determines which animal won.

View The view displays what is going on. It puts an image on the screen so the user can see what’s happening. The view is completely passive; it doesn’t affect the hunt in any way, it’s just a news reporter that gives you a (partial) picture of what is happening inside the model.

59

Chapter 5 ■ Software Architecture

Controller The controller is the part of the program that displays the controls (the five buttons and the speed controls at the bottom of the window). It knows as little as possible about the model and view; it basically tells the model when to go and when to stop. The advantages of breaking the program up into these separate parts are many. We can safely rewrite the GUI in the Controller object or the display in the view object without changing the model. We can make the fox and/or the rabbit smarter (or dumber!) without changing the GUI or the display. We can reuse the GUI for a different application with very little effort. The list just goes on. In short, MVC is your friend; use it wisely and often.

The Client-Server Architectural Pattern Moving back to a more traditional architecture, we once again go back in time. Back in the day, all programs ran on big iron, and your entire program ran on a single machine. If you were lucky enough to be using a time-shared operating system, several people could be using the same program—albeit usually different copies—simultaneously. Then came personal computers and networks. And someone had the bright idea of dividing the work up between that big iron and your puny desktop machine. Thus was born the client-server architecture. In a client-server architecture, your program is broken up into two different pieces that typically run on two separate computers. A server does most of the heavy lifting and computation; it provides services to its clients across a high-bandwidth network. Clients, on the other hand, mostly just handle user input, display output, and provide communication to the server. In short, the client program sends requests for services to the server program. The server program then evaluates the request, does whatever computation is necessary (including accessing a database, if needed), and responds to the client’s request with an answer. The most common example of a client-server architecture today is the World Wide Web. In the web model, your browser is the client. It presents a user interface to you, communicates with a web server, and renders the resulting web pages to your screen. The web server does a number of things. It serves web pages in HTML, but it also can serve as a database server, file server, and computational server— think about everything that Amazon.com does when you access that website in order to make a purchase. Clients and servers don’t have to be on different computers, though. Two examples of programs written using a client-server architecture where both sides can reside on the same computer are print spoolers and the X Windows graphical system. In a print spooler application, the program you’re running—a word processor, spreadsheet program, web browser—runs as a client that makes a request to a printing service that is implemented as a part of the computer’s operating system. This service is typically known as a print spooler because it keeps a spool of print jobs and controls which jobs get printed and the order of their printing. So, from your word processor, you’ll select Print from a menu, set certain attributes and often pick a printer, and then click OK in a dialog box. This sends a print request to the print spooler on your system. The print spooler adds your file to a queue of print jobs that it manages and then contacts the printer driver and makes requests for printing to occur. The difference here is that once you’ve clicked the OK button, your client program (the word processor) typically doesn’t have any more contact with the print spooler; the print service runs unattended. The X Window System (see www.x.org/wiki/) is a graphical windowing system used on all Unix- and Linux-based systems; it’s also available for Apple’s macOS and Microsoft Windows systems as an add-on windowing system. The X system uses a client-server architecture where the client programs and the server typically both reside on the same computer. The X system server receives requests from client programs, processes them for the hardware that’s attached to the current system, and provides an output service that displays the resulting data in bitmapped displays. Client program examples include xterm—a windowed terminal program that provides a command line interface to Unix, xclock—you guessed it—a clock, and xdm, the X Window display manager. The X system allows hierarchical and overlapping windows and provides the ability to configure menus, scroll bars, open and close buttons, background and foreground

60

Chapter 5 ■ Software Architecture

colors, and graphics. X can also manage a mouse and keyboards. These days the main use of the X system is as a springboard to build more sophisticated window managers, graphical environments, graphical widgets, and desktop management windowing systems like GNOME and KDE.

The Layered Approach The layered architectural approach suggests that programs can be structured as a series of layers, much like geologic strata, with a sequence of well-defined interfaces between the layers. This has the effect of isolating each layer from the ones above and below it so that one can change the internals of any layer without having to change any of the other layers in the program—that is, as long as your changes don’t involve any changes to the interface. In a layered approach, interfaces are sacred. Two classic examples of a layered approach to programming are operating systems (OSes) and communications protocols. An operating system’s architecture has several objectives, among them to centralize control of the limited hardware resources and to protect users from each other. A layered approach to the operating system architecture does both of these things. Figure 5-6 shows a pretty standard picture of an OS architecture.

Figure 5-6. A layered architecture In this layered model, user applications request OS services via a system call interface. This is normally the only way for applications to access the computer’s hardware. Most OS services must make requests through the kernel, and all hardware requests must go through device drivers that talk directly to the hardware devices. Each of these layers has a well-defined interface, so that, for example, a developer may add a new device driver for a new disk drive without changing any other part of the OS. This is a nice example of information hiding. The same type of interface happens in a communications protocol. The most famous of these layered protocols is the International Standards Organization (ISO) Open Systems Interconnection (OSI) sevenlayer model This model looks like Figure 5-7.

61

Chapter 5 ■ Software Architecture

Figure 5-7. The ISO-OSI layered architecture In this model, each layer contains functions or services that are logically similar and are grouped together. An interface is defined between each layer, and communication between layers is only allowed via the interfaces. A particular implementation need not contain all seven layers, and sometimes two or more layers are combined to make a smaller protocol stack. The OSI model defines both the seven-layer approach and all the interface protocols. The model can be downloaded as a PDF file from www.itu.int/ rec/T-REC-X.200/en. (The ITU or International Telecommunications Union is the new name for the ISO.) Examples of protocols that are implemented at each layer are shown in Table 5-1. Table 5-1. Layered Protocols Using the ISO-OSI Architecture

Layer

Protocol

7. Application

HTTP, FTP, telnet

6. Presentation

MIME, SSL

5. Session

Sockets

4. Transport

TCP, UDP

3. Network

IP, IPsec

2. Data Link

PPP, Ethernet, SLIP, 802.11

1. Physical

Conclusion The software architecture is the core of your application. It is the foundation on which you build the rest of the program. It drives the rest of your design. There are many different styles of software architecture, and in any given project you’ll probably use more than one. The architectural style used for a program depends on what you’re doing. That's the beauty of these styles—it may not always be true that form follows function, but for software, design follows architecture. These foundational patterns lead you down the path of design, shaping how your program will be constructed and lived in. Go out there and build a great program.

62

Chapter 5 ■ Software Architecture

References Bentley, J. Programming Pearls, Second Edition. (Boston, MA: Addison-Wesley, 2000.) Garlan, D. and M. Shaw (1994). An Introduction to Software Architecture. Pittsburgh, PA: Carnegie Mellon University: 49. CMU/SEI-94-TR-21 (1994). Kernighan, B. W. and R. Pike. The Practice of Programming. (Boston, MA: Addison-Wesley, 1999.) Matuszek, David. “Rabbit Hunt,” SIGCSE 2004 Technical Symposium, Nifty Assignments Session. Retrieved August 17, 2009, http://nifty.stanford.edu/2004/RabbitHunt/ (2004). McConnell, S. Code Complete 2. (Redmond, WA: Microsoft Press, 2004.) Wirth, N. “Program Development by Stepwise Refinement.” Communications of the ACM 14(4): 221-227 (1971).

63

CHAPTER 6

Design Principles There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies. And the other way is to make it so complicated that there are no obvious deficiencies. —C. A. R. Hoare One way to look at software problems is with a model that divides the problems into two different layers: •

“Wicked” problems fall in the upper layer: These are problems that typically come from domains outside of computer science (such as biology, business, meteorology, sociology, political science, and so on). These types of problems tend to be open ended, ill defined, and large in the sense that they require much work. For example, pretty much any kind of a web commerce application is a wicked problem. Horst W. J. Rittel and Melvin M. Webber, in a 1973 paper on social policy,1 gave a definition for and a set of characteristics used to recognize a wicked problem that we’ll look at later in this chapter.

•

“Tame” problems fall in the lower layer: These problems tend to cut across other problem domains and tend to be better defined and small. Sorting and searching are great examples of tame problems. Small and well defined don’t mean “easy,” however. Tame problems can be very complicated and difficult to solve. It’s just that they’re clearly defined and you know when you have a solution. These are the kinds of problems that provide computer scientists with foundations in terms of data structures and algorithms for the wicked problems we solve from other problem domains.

According to Rittel and Webber, a wicked problem is one for which the requirements are completely known only after the problem is solved, or for which the requirements and solution evolve over time. It turns out this describes most of the “interesting” problems in software development. Recently, Jeff Conklin has revised Rittel and Webber’s description of a wicked problem and provided a more succinct list of the characteristics of wicked problems.2 To paraphrase: 1. A wicked problem is not understood until after the creation of a solution: Another way of saying this is that the problem is defined and solved at the same time.3

Rittel, H. W. J. and M. M. Webber. “Dilemmas in a General Theory of Planning.” Policy Sciences 4(2): 155-169 (1973). Conklin, J. Dialogue Mapping: Building Shared Understanding of Wicked Problems. (New York, NY: John Wiley & Sons, 2005.) 3 DeGrace, P. and L. H. Stahl Wicked Problems, Righteous Solutions: A Catalogue of Modern Software Engineering Paradigms. (Englewood Cliffs, NJ: Yourdon Press, 1990.) 1 2

© John F. Dooley 2017 J. F. Dooley, Software Development, Design and Coding, https://doi.org/10.1007/978-1-4842-3153-1_6

65

Chapter 6 ■ Design Principles

2. Wicked problems have no stopping rule: You can create incremental solutions to the problem, but there’s nothing that tells you that you’ve found the correct and final solution. 3. Solutions to wicked problems are not right or wrong: They are better or worse, or good-enough or not-good-enough. 4. Every wicked problem is essentially novel and unique: Because of the “wickedness” of the problem, even if you have a similar problem next week, you basically have to start over again because the requirements will be different enough and the solution will still be elusive. 5. Every solution to a wicked problem is a one-shot operation: See the preceding bullet point. 6. Wicked problems have no given alternative solutions: There is no small, finite set of solutions to choose from. Wicked problems crop up all over the place. For example, creating a word processing program is a wicked problem. You may think you know what a word processor needs to do—insert text, cut and paste, handle paragraphs, print, and so forth. But this list of features is only one person’s list. As soon as you “finish” your word processor and release it, you’ll be inundated with new feature requests: spell checking, footnotes, multiple columns, support for different fonts, colors, styles, and the list goes on. The word processing program is essentially never done—at least not until you release the last version and end-of-life the product. Word processing is actually a pretty obvious wicked problem. Others might include problems where you don’t really know if you can solve the problem at the start. Expert systems require a user interface, an inference engine, a set of rules, and a database of domain information. For a particular domain, it’s not at all certain at the beginning that you can create the rules that the inference engine will use to reach conclusions and recommendations. So you have to iterate through different rule sets, send out the next version, and see how well it performs. Then you do it again, adding and modifying rules. You don’t really know whether the solution is correct until you’re done. Now that’s a wicked problem. Conklin, Rittel, and Webber say that traditional cognitive studies indicate that when faced with a large, complicated (wicked) problem, most people will follow a linear problem-solving approach, working top-down from the problem to the solution. This is equivalent to the traditional waterfall model described in Chapter 2.4 Figure 6-1 shows this linear approach.

Figure 6-1. Linear problem-solving approach

Conklin, J. Wicked Problems and Social Complexity. Retrieved from http://cognexus.org/wpf/wickedproblems.pdf on 8 September 2009. Paper last updated October 2008.

4

66

Chapter 6 ■ Design Principles

Instead of this linear, waterfall approach, real wicked problem solvers tend to use an approach that swings from requirements analysis to solution modeling and back until the problem solution is good enough. Conklin calls this an opportunity-driven or opportunistic approach because the designers are looking for any opportunity to make progress toward the solution.5 Instead of the traditional waterfall picture in Figure 6-1, the opportunity-driven approach looks like Figure 6-2.

Figure 6-2. The opportunity-driven development approach In this figure, the jagged line indicates the designer’s work moving from the problem to a solution prototype and back again, slowly evolving both the requirements understanding and the solution iteration and converging on an implementation that’s good enough to release. As an example, let’s take a quick look at a web application. Say a nonprofit organization keeps a list of activities for youth in your home county. The list is updated regularly and is distributed to libraries around the county. Currently, the list is kept on a spreadsheet and is distributed in hard copy in a three-ring binder. The nonprofit wants to put all its data online and make it accessible over the web. It also wants to be able to update the data via the same website. Simple, you say. It’s just a web application with an HTML front end, a database, and middleware code to update and query the database as the back end. Not a problem. Ah, but this is really a wicked problem in disguise. First of all, the customer has no idea what they want the web page(s) to look like. So whatever you give them the first time will not be precisely what they want; the problem won’t be understood completely until you are done. Secondly, as you develop prototypes, they will want more features—so the problem has no stopping rule. And finally, as time goes on, the nonprofit will want new features. So there is no “right” answer; there is only a “good enough” answer. Very wicked. Conklin also provides a list of characteristics of tame problems, ones for which you can easily and reliably find a solution. “A tame problem: 1. has a well-defined and stable problem statement; 2. has a definite stopping point, i.e., when the solution is reached; 3. has a solution which can be objectively evaluated as right or wrong; 4. belongs to a class of similar problems which are all solved in the same similar way; 5. has solutions which can be easily tried and abandoned; and 6. comes with a limited set of alternative solutions.”

5

Conklin, J. (2008)

67

Chapter 6 ■ Design Principles

A terrific example of a tame problem is sorting a list of data values: •

The problem is easily and clearly stated—sort this list into ascending order using this function to compare data elements.

•

Sorting has a definite stopping point: the list is sorted.

•

The result of a sort can be objectively evaluated (the list is either sorted correctly or it isn’t).

•

Sorting belongs to a class of similar problems that are all solved in the same way. Sorting integers is similar to sorting strings is similar to sorting database records using a key and so on.

•

Sorting has solutions that can easily be tried and abandoned.

•

Finally, sorting has a limited set of alternative solutions; sorting by comparison has a set of known algorithms and a theoretical lower bound.

What does this have to do with design principles, you ask? Well, realizing that most of the larger software problems we’ll encounter have a certain amount of “wickedness” built into them influences how we think about design issues, how we approach the design of a solution to a large, ill-formed problem, and gives us some insight into the design process. It also lets us abandon the waterfall model with a clear conscience and pushes us to look for unifying heuristics that we can apply to design problems. In this chapter we’ll discuss overall principles for design that I’ll then expand upon in the chapters ahead.

The Design Process Design is messy. Even if you completely understand the problem requirements (meaning it’s a tame problem), you typically have many alternatives to consider when you’re designing a software solution. You’ll also usually make lots of mistakes before you come up with a solution that works. As we saw in Figure 6-2, your design will change as you understand the problem better over time. This gives the appearance of messiness and disorganization, but really, you’re making progress. Design is about tradeoffs and priorities. Most software projects are time-limited, so you usually won’t be able to implement all the features that the customer wants. You have to figure out the subset that will give the customer the most high-priority features in the time you have available. So, you have to prioritize the requirements and trade off one subset for another. Design is heuristic. For the overwhelming majority of projects, there is no set of cut-and-dried rules that say, “First we design component X using technique Y. Then we design component Z using technique W.” Software just doesn’t work that way. Software design is done using a set of ever-changing heuristics (rules of thumb) that each designer acquires over the course of a career. Over time, good designers learn more heuristics and patterns (see Chapter 11) that allow them to quickly move through the easy bits of a design and get to the heart of the wickedness of the problem. The best thing you can do is to sit at the feet of a master designer and learn the heuristics. Designs evolve. Good designers recognize that for any problem, tame or wicked, the requirements will change over time. This will then cascade into changes in your design, so your design will evolve over time. This is particularly true across product releases and new feature additions. The trick here is to create a software architecture (see Chapter 5) that is amenable to change with limited effect on the downstream design and code.

68

Chapter 6 ■ Design Principles

Desirable Design Characteristics (Things Your Design Should Favor) Regardless of the size of your project or what process you use to do your design, there are a number of desirable characteristics that every software design should have. These are the principles you should adhere to as you consider your design. Your design doesn’t necessarily need to exhibit all of these characteristics, but having a majority of them will certainly make your software easier to write, understand, and use: •

Fitness of purpose: Your design must work, and it must work correctly in the sense that it must satisfy the requirements you’ve been given within the constraints of the platform on which your software will be running. Don’t add new requirements as you go—the customer will do that for you.

•

Separation of concerns: Related closely to modularity, this principle says you should separate out functional pieces of your design cleanly in order to facilitate ease of maintenance and simplicity. Modularity is good.

•

Simplicity: Keep your design as simple as possible. This will let others understand what you’re up to. If you find a place that can be simplified, do it! If simplifying your design means adding more modules or classes to your design, that’s okay. Simplicity also applies to interfaces between modules or classes. Simple interfaces allow others to see the data and control flow in your design. In agile methodologies, this idea of simplicity is kept in front of you all the time. Most agile techniques have a rule that says if you’re working on part of a program and you have an opportunity to simplify it (called refactoring in agile-speak), do it right then. Keep your design and your code as simple as possible at all times.

•

Ease of maintenance: A simple, understandable design is amenable to change. The first kind of change you’ll encounter is fixing errors. Errors occur at all phases of the development process: requirements, analysis, design, coding, and testing. The more coherent and easy to understand your design is, the easier it will be to isolate and fix errors.

•

Loose coupling: When you’re separating your design into modules—or in objectoriented design, into classes—the degree to which the classes depend on each other is called coupling. Tightly coupled modules may share data or procedures. This means that a change in one module is much more likely to lead to a required change in the other module. This increases the maintenance burden and makes the modules more likely to contain errors. Loosely coupled modules, on the other hand, are connected solely by their interfaces. Any data they both need must be passed between procedures or methods via an interface. Loosely coupled modules hide the details of how they perform operations from other modules, sharing only their interfaces. This lightens the maintenance burden because a change to how one class is implemented will not likely affect how another class operates as long as the interface is invariant. So, changes are isolated and errors are much less likely to propagate.

•

High cohesion: The complement of loose coupling is high cohesion. Cohesion within a module is the degree to which the module is self-contained with regard both to the data it holds and the operations that act on the data. A class that has high cohesion pretty much has all the data it needs defined within the class template, and all the operations that are allowed on the data are defined within the class as well. So, any object that’s instantiated from the class template is very independent and just communicates with other objects via its published interface.

69

Chapter 6 ■ Design Principles

•

Extensibility: An outgrowth of simplicity and loose coupling is the ability to add new features to the design easily. This is extensibility. One of the features of wicked software problems is that they’re never really finished. So, after every release of a product, the next thing that happens is the customer asks for new features. The easier it is to add new features, the cleaner your design is.

•

Portability: Though not high on the list, keeping in mind that your software may need to be ported to another platform (or two or three) is a desirable characteristic. There are a lot of issues involved with porting software, including operating system issues, hardware architecture, and user interface issues. This is particularly true for web applications.

Design Heuristics Speaking of heuristics, here’s a short list of good, time-tested heuristics. The list is clearly not exhaustive and it’s pretty idiosyncratic, but it’s a list you can use time and again. Think about these heuristics and try some of them during your next design exercise. We will come back to all of these heuristics in much more detail in later chapters:

6 7

•

Find real-world objects to model: Alan Davis6 and Richard Fairley7 call this intellectual distance. It’s how far your design is from a real-world object. The heuristic here is to try to find real-world objects that are close to things you want to model in your program. Keeping the real-world object in mind as you’re designing your program helps keep your design closer to the problem. Fairley’s advice is to minimize the intellectual distance between the real-world object and your model of it.

•

Abstraction is key: Whether you’re doing object-oriented design and you’re creating interfaces and abstract classes, or you’re doing a more traditional layered design, you want to use abstraction. Abstraction means being lazy. You put off what you need to do by pushing it higher in the design hierarchy (more abstraction) or pushing it further down (more details). Abstraction is a key element of managing the complexity of a large problem. By abstracting away the details you can see the kernel of the real problem.

•

Information hiding is your friend: Information hiding is the concept that you isolate information—both data and behavior—in your program so that you can isolate errors and isolate changes; you also only allow access to the information via a welldefined interface. A fundamental part of object-oriented design is encapsulation, a concept that derives from information hiding. You hide the details of a class away and only allow communication and modification of data via a public interface. This means that your implementation can change, but as long as the interface is consistent and constant, nothing else in your program need change. If you’re not doing object-oriented design, think about using libraries for hiding behavior and using separate data structures (structs in C and C++) for hiding state.

•

Keep your design modular: Breaking your design up into semi-independent pieces has many advantages. It keeps the design manageable in your head; you can just think about one part at a time and leave the others as black boxes. It takes advantage of information hiding and encapsulation. It isolates changes. It helps with extensibility and maintainability. Modularity is just a good thing. Do it.

Davis, A. M. 201 Principles of Software Development. (New York, NY: McGraw-Hill, 1995.) Fairley, R. E. Software Engineering Concepts. (New York, NY: McGraw-Hill, 1985.)

70

Chapter 6 ■ Design Principles

•

Identify the parts of your design that are likely to change: If you make the assumption that there will be changes in your requirements, then there will likely be changes in your design as well. If you identify those areas of your design that are likely to change, you can separate them, thus mitigating the impact of any changes you need to make. What things are likely to change? Well, it depends on your application, doesn’t it? Business rules can change (think tax rules or accounting practices), user interfaces can change, hardware can change, and so on. The point here is to anticipate the change and to divide up your design so that the necessary changes are contained.

•

Use loose coupling, interfaces, and abstract classes: Along with modularity, information hiding, and change, using loose coupling will make your design easier to understand and to change as time goes along. Loose coupling says that you should minimize the dependencies of one class (or module) on another. This is so that a change in one module won’t cause changes in other modules. If the implementation of a module is hidden and only the interface exposed, you can swap out implementations as long as you keep the interface constant. So you implement loose coupling by using well defined interfaces between modules, and in an objectoriented design, using abstract classes and interfaces to connect classes.

•

Use your knapsack full of common design patterns: Robert Glass8 describes great software designers as having “a large set of standard patterns” that they carry around with them and apply to their designs. This is what design experience is all about—doing design over and over again and learning from the experience. In Susan Lammer’s book Programmers at Work,9 Butler Lampson says, “Most of the time, a new program is a refinement, extension, generalization, or improvement of an existing program. It’s really unusual to do something that’s completely new. . . .” That’s what design patterns are: they’re descriptions of things you’ve already done that you can apply to a new problem.

•

Adhere to the Principle of One Right Place: In his book Programming on Purpose: Essays on Software Design, P. J. Plauger says, “My major concern here is the Principle of One Right Place—there should be One Right Place to look for any nontrivial piece of code, and One Right Place to make a likely maintenance change.10 ” Your design should adhere to the Principle of One Right Place; debugging and maintenance will be much easier.

•

Use diagrams as a design language: I’m a visual learner. For me, a picture really is worth a thousand or so words. As I design and code, I’m constantly drawing diagrams so I can visualize how my program is going to hang together, which classes or modules will be talking to each other, what data is dependent on what function, where do the return values go, what is the sequence of events. This type of visualization can settle the design in your head and can point out errors or possible complications in the design. Whiteboards or paper are cheap; enjoy!

Glass, R. L. Software Creativity 2.0. Atlanta, GA, developer* (2006). Lammers, S. Programmers At Work. (Redmond, WA: Microsoft Press, 1986.) 10 Plauger, P. J. Programming on Purpose: Essays on Software Design. (Englewood Cliffs, NJ: PTR Prentice Hall, 1993.) 8 9

71

Chapter 6 ■ Design Principles

Designers and Creativity Don’t think that design is cut and dried or that formal processes rules can be imposed to crank out software designs. It’s not like that at all. Although there are formal restrictions and constraints on your design that are imposed by the problem, the problem domain, and the target platform, the process of reaching the design itself need not be formal. It’s at bottom a creative activity. Bill Curtis, in a 1987 empirical study of software designers, came up with a process that seems to be what most of the designers followed11: 1. Understand the problem. 2. Decompose the problem into goals and objects. 3. Select and compose plans to solve the problem. 4. Implement the plans. 5. Reflect on the design product and process. Frankly, this is a pretty general list and doesn’t really tell us all we’d need for software design. Curtis, however, then went deeper into step 3 on his list, “select and compose plans,” and found that his designers used the following steps: 1. Build a mental model of a proposed solution. 2. Mentally execute the model to see if it solves the problem—make up input and simulate the model in your head. 3. If what you get isn’t correct, change the model to remove the errors and go back to step 2 to simulate again. 4. When your sample input produces the correct output, select some more input values and go back and do steps 2 and 3 again. 5. When you’ve done this enough times (you’ll know because you’re experienced) then you’ve got a good model and you can stop.12 This deeper technique makes the cognitive and the iterative aspects of design clear and obvious. We see that design is fundamentally a function of the mind, and is idiosyncratic and depends on things about the designer that are outside the process itself. John Nestor, in a report to the Software Engineering Institute, came up with a list of what are some common characteristics of great designers. Great designers •

have a large set of standard patterns.

•

have experienced failing projects.

•

have mastery of development tools.

•

have an impulse towards simplicity.

•

can anticipate change.

•

can view things from the user’s perspective.

•

can deal with complexity.13

Curtis, B., R. Guindon, et al. Empirical Studies of the Design Process: Papers for the Second Workshop on Empirical Studies of Programmers. Austin, TX, MCC (1987). 12 Glass, R. L. Software Creativity 2.0. Atlanta, GA, developer* (2006). 13 Glass, R. L. (2006) 11

72

Chapter 6 ■ Design Principles

Conclusion So at the end of the chapter, what have we learned about software design? •

Design is ad hoc, heuristic, and messy: It fundamentally uses a trial-and-error and heuristic process, and that process is the natural one to use for software design. There are a number of well-known heuristics that any good designer should employ.

•

Design depends on understanding of prior design problems and solutions: Designers need some knowledge of the problem domain. More importantly, they need knowledge of design and patterns of good designs. They need to have a knapsack of these design patterns that they can use to approach new problems. The solutions are tried and true. The problems are new but they contain elements of problems that have already been solved. The patterns are malleable templates that can be applied to those elements of the new problem that match the pattern’s requirements.

•

Design is iterative: Requirements change, and so must your design. Even if you have a stable set of requirements, your understanding of the requirements changes as you progress through the design activity, so you’ll go back and change the design to reflect this deeper, better understanding. The iterative process clarifies and simplifies your design at each step.

•

Design is a cognitive activity: You’re not writing code at this point, so you don’t need a machine. Your head and maybe a pencil and paper or a whiteboard are all you need to do design. As Dijkstra says, “We must not forget that it is not our business to make programs; it is our business to design classes of computations that will display a desired behavior.14 ”

•

Design is opportunistic. Glass sums up his discussion of design with this: “The unperturbed design process is opportunistic—that is, rather than proceed in an orderly process, good designers follow an erratic pattern dictated by their minds, pursuing opportunities rather than an orderly progression.15 ”

All these characteristics argue against a rigid, plan-driven design process and for a creative, flexible way of doing design. This brings us back to the first topic in this chapter: design is just wicked. And finally:

A designer can mull over complicated designs for months. Then suddenly the simple, elegant, beautiful solution occurs to him. When it happens to you, it feels as if God is talking! And maybe He is. —Leo Frankowski (in The Cross-Time Engineer)

14 15

Dijkstra, E. “The Humble Programmer.” CACM 15(10): 859-866. (1972) Glass, R. L. (2006)

73

Chapter 6 ■ Design Principles

References Conklin, J. Dialogue Mapping: Building Shared Understanding of Wicked Problems. (New York, NY: John Wiley & Sons, 2005.) Conklin, J. Wicked Problems and Social Complexity. Retrieved from http://cognexus.org/wpf/ wickedproblems.pdf on 8 September 2009. Paper last updated October 2008. Curtis, B., R. Guindon, et al. Empirical Studies of the Design Process: Papers for the Second Workshop on Empirical Studies of Programmers. Austin, TX, MCC (1987). Davis, A. M. 201 Principles of Software Development. (New York, NY: McGraw-Hill, 1995.) DeGrace, P. and L. H. Stahl. Wicked Problems, Righteous Solutions: A Catalogue of Modern Software Engineering Paradigms. (Englewood Cliffs, NJ: Yourdon Press, 1990.) Dijkstra, E. "The Humble Programmer." CACM 15(10): 859-866 (1972). Fairley, R. E. Software Engineering Concepts. (New York, NY: McGraw-Hill, 1985.) Glass, R. L. Software Creativity 2.0. Atlanta, GA, developer* (2006). Lammers, S. Programmers At Work. (Redmond, WA: Microsoft Press, 1986.) McConnell, S. Code Complete 2. (Redmond, WA: Microsoft Press, 2004.) Parnas, D. “On the Criteria to be Used in Decomposing Systems into Modules.” CACM 15(12): 1053-1058 (1972). Plauger, P. J. Programming on Purpose: Essays on Software Design. (Englewood Cliffs, NJ: PTR Prentice Hall, 1993.) Rittel, H. W. J. and M. M. Webber. “Dilemmas in a General Theory of Planning.” Policy Sciences 4(2): 155-169 (1973).

74

CHAPTER 7

Structured Design Invest in the abstraction, not the implementation. Abstractions can survive the barrage of changes from different implementations and new technologies.” —Andy Hunt and Dave Thomas1

Structured Programming Structured design has its genesis in Edsger Dijkstra’s famous 1968 letter to the Communications of the ACM, “Go To Statement Considered Harmful.” Dijkstra’s paper concludes like this:

The go to statement as it stands is just too primitive; it is too much an invitation to make a mess of one’s program. One can regard and appreciate the clauses considered (ed. if-thenelse, switch, while-do, and do-while) as bridling its use. I do not claim that the clauses mentioned are exhaustive in the sense that they will satisfy all needs, but whatever clauses are suggested (e.g. abortion clauses) they should satisfy the requirement that a programmer independent coordinate system can be maintained to describe the process in a helpful and manageable way.2 Programming languages created from this point onward, while not eliminating the goto statement (except for Java, which has none), certainly downplayed its use, and courses that taught programming encouraged students to avoid it. Instead, problem solving was taught in a top-down structured manner, where one begins with the problem statement and attempts to break the problem down into a set of solvable sub-problems. The process continues until each sub-problem is small enough to be either trivial or very easy to solve. This technique is called structured programming. Before the advent and acceptance of object-oriented programming in the mid 1980s, this was the standard approach to problem solving and programming. It’s still one of the best ways to approach a large class of problems.

Hunt, A. and D. Thomas. The Pragmatic Programmer: From Journeyman to Master. (Boston, MA: Addison-Wesley, 2000.) Dijkstra, E. “GoTo Statement Considered Harmful.” Communications of the ACM 11(3): 147-148 (1968).

1 2

© John F. Dooley 2017 J. F. Dooley, Software Development, Design and Coding, https://doi.org/10.1007/978-1-4842-3153-1_7

75

Chapter 7 ■ Structured Design

Stepwise Refinement Niklaus Wirth formalized the structured design technique in his 1971 paper, “Program Development by Stepwise Refinement.3 ” Stepwise refinement contends that designing programs consists of a set of refinement steps. In each step, a given task is broken up into a number of subtasks. Each refinement of a task must be accompanied by a refinement of the data description and the interface. The degree of modularity obtained will determine the ease or difficulty with which a program can be adapted to changes in requirements or environment. During refinement, you use a notation that’s natural to the problem space. Avoid using a programming language for description as long as possible. Each refinement implies a number of design decisions based on a set of design criteria. These criteria include efficiency of time and space, clarity, and regularity of structure (simplicity). Refinement can proceed in two ways: top-down or bottom-up. Top-down refinement is characterized by moving from a general description of the problem to detailed statements of what individual modules or routines do. The guiding principle behind stepwise refinement is that humans can concentrate on only a few things at a time—Miller’s famous 7 +/– 2 chunks of data rule.4 One works by •

Analyzing the problem and trying to identify the outlines of a solution and the pros and cons of each possibility.

•

Designing the top levels first.

•

Steering clear of language-specific details.

•

Pushing down the details until you get to the lower levels.

•

Formalizing each level.

•

Verifying each level.

•

Moving to the next lower level to make the next set of refinements (that is, repeat).

One continues to refine the solution until it seems as if it would be easier to code than to decompose; we'll see an example of this process later in this chapter. That is, you work until you become impatient at how obvious and easy the design becomes. The downside here is that you really have no good metric on “when to stop.” It just takes practice. If you can’t get started at the top, start at the bottom: •

Ask yourself, “What do I know that the system needs to do?” This usually involves lower level I/O operations, other low-level operations on data structures, and so on.

•

Identify as many low-level functions and components as you can from that question.

•

Identify common aspects of the low-level components and group them together.

•

Continue with the next level up, or go back to the top and try again to work down.

Bottom-up assessment usually results in early identification of utility routines, which can lead to a more compact design. It also helps promote reuse—because you’re reusing the lower-level routines. On the downside, bottom-up assessment is hard to use exclusively: you nearly always end up switching to a top-down approach at some point, and sometimes you find you just can’t put a larger piece together from the bottom-up. This isn’t really stepwise refinement, but it can help get you started. Most real step-wise refinements involve alternating between top-down and bottom-up design elements. Fortunately, top-down and bottom-up design methodologies can be very complementary. Wirth, N. “Program Development by Stepwise Refinement.” CACM 14(4): 221-227 (1971). Miller, G. A. “The magical number seven, plus or minus two: Some limits on our capacity for processing information.” Psychological Review 63: 81-97 (1956).

3 4

76

Chapter 7 ■ Structured Design

Example of Stepwise Refinement: The Eight-Queens Problem The eight queens problem is familiar to most students. The problem is to find a placement of eight queens on a standard 8 × 8 chessboard in such a way that no queen can be attacked by any other. One possible solution to the eight-queens problem is shown in Figure 7-1.

Figure 7-1. One solution to the eight-queens problem Remember that queens can move any number of spaces horizontally, vertically, or diagonally. It turns out that no one has yet found an analytical solution to this problem, and it’s likely one does not exist. So, how would you approach this problem? Go ahead, think about it. I’ll wait. ... ... Done? Okay. Let’s see one way to decompose this problem.

Proposed Solution 1 The first thing we need to do is to look at the problem and tease out the requirements and the outline of a solution. This will start us down the road of answering the question of what the top-level decomposition should be. First, you could think of solving the problem using brute force; just try all the possible arrangements of queens and pick the ones that work. With 8 queens and 64 possible squares there are

possible board configurations, where n is the number of squares on the board and k is the number of queens, which is only 4,294,967,296 (a bit over 4 billion configurations). These days, that’s not very many, so brute force might be the way to go. So, if we generate a set A of all the possible board combinations, we can create a test called q(x) that returns a true if the board configuration x is a solution, and returns false if x isn’t a solution. Then we can create a program that looks like the following: Generate the set A of all board configurations; while there are still untested configurations in A do

77

Chapter 7 ■ Structured Design

            x = the next configuration from A             if (q(x) == true) then print the solution x and stop             go back to the top and do it again. Notice that all the work is getting done in two steps: generating the set A and performing the test q(x). The generation of A only happens once, but performing the test q(x) happens once for every configuration in A until you find a solution. Although this decomposition will surely work, it’s not terribly efficient. Let’s just say that we’d rather reduce the number of combinations. Efficiency is a good thing, after all.

Proposed Solution 2 Again, we need to start at the top level. But this time we’ve done some analysis, so we have a clearer idea of what has to happen. We’ve eliminated brute force, but we see that we can think in terms of board configurations. In order to reduce the number of total possible configurations and then come up with a more efficient algorithm, we need to think about the problem. The first thing to notice is that you can never have more than one queen in a column; in fact, you must have exactly one queen per column. That reduces the number of possible combinations to 224 or just 16 million. Although this is good, it doesn’t really change the algorithm. Our proposed solution would now look like this: Generate the set B of restricted board configurations; while there are still untested configurations in B do             x = the next configuration from B             if (q(x) == true) then print the solution x and stop             go back to the top and do it again. This version requires generating the set B of board positions, with one queen in each column, and still requires visiting up to 16 million possible board positions. Generating B is now more complicated than generating A because we now have to test to see whether a proposed board position meets the one queen per column restriction. There must be a better way.

Proposed Solution 3 Of course there is. We just need to be more intelligent about generating board configurations and evaluate board positions while we’re generating them. Instead of generating a complete board configuration and then testing it, why not generate and test partial solutions? If we can stop as soon as we know we don’t have a valid solution, things should go faster. Also, if we can back up from a bad solution to the last good partial solution, we can eliminate bad configurations more quickly. Now we’re at the point where we can do that top-level design, formalize it, and move down to the next refinement level.

Refinement 1 Here’s the idea: 1. Put down a queen on the next row in the next available column. 2. Test the queen to see if she can attack any other queen. (That’s a variation on the q(x) test above.) 3. If she can attack, pick her back up, back up to the previous trial solution, and try again. 4. If she can’t attack, leave her alone and go back to the top and try the next queen.

78

Chapter 7 ■ Structured Design

With this method, we’re guaranteed that the trial solution at column j is correct. We then attempt to generate a new solution by adding a queen in column j + 1. If this fails, we just need to back up to our previous correct trial solution at column j and try again. Wirth calls this technique of creating and testing partial solutions a stepwise construction of trial solutions. And the backing-up technique is, of course, called backtracking. Here’s more formal pseudo-code to find a single solution: do {             while ((row < 8) && (col < 8))  {                 if (the current queen is safe) then                     advance: keep the queen on the board and                 else                     the queen is not safe, so move up to the             }             if (we've exhausted all the rows in this column)                 regress: retreat a column, move its queen up

advance to the next column next row. then a row, and start again.

    } while ((col < 8) && (col >= 0));         if (we've reached column 8) then             we have a solution, print it. This solution tests only a single queen at a time. One implication of this is that, at any given time, there are only j queens on the board and so only j queens need to be tested each time through the outer loop. (One only needs to test the queens up through column j.) That reduces the amount of work in the testing routine. This algorithm is our first formal view of the solution. Notice in the method described that we’re using pseudo-code rather than a real programming language. That’s because we want to push language details further down the refinement levels. Also, though we’ve got a general outline of the method, there are a lot of details still to be considered. These details have been pushed down in the hierarchy of control we’re creating, and we’ll get to them in the next refinement iteration. This is also a function of the stepwise refinement. Now that we have a description of the algorithm, we can also work to verify it. The final verification will, of course, be watching the program produce a correct solution, but we're not at that point yet. However, we can surely take a chessboard (or a piece of paper) and walk through this algorithm by hand to verify that we can generate a placement of queens on the board that is a solution to the problem. At this point we’ve got a more formal top-level description, we’ve done what verification we can, and we’re ready to expand those fuzzy steps we saw earlier.

Refinement 2 Now that we’ve got a first cut at the program, we need to examine each of the steps in the program and see what they’re made of. The steps we’re interested in are as follows: 1. Check to see if the current queen is safe. 2. Keep a safe queen on the board and advance to the next column. 3. Advance an unsafe queen up a row. 4. Retreat a column and reset the queen in that column. Checking to see if the current queen is safe means we need to check that there are no other queens on either of the diagonals (the up or down diagonals) or the row that the current queen is on. The row check is easy; one just checks all the other squares in the same row. To check the up and down diagonals, remember that if the current queen is at column j, we only need to check columns 1 through j – 1. If you think about it for a minute, you’ll see that the difference between the row and column indexes of all the squares on the

79

Chapter 7 ■ Structured Design

up diagonal (those that go from lower left to upper right) are a constant. Similarly, the sum of the row and column indexes of all the squares on the down diagonal (those that go from upper left to lower right) is also a constant. This makes it easier to figure out which cells are on the diagonal of a queen and how to check them. Now we’re ready to start considering data structures. Where did this come from, you ask? Well, stepwise refinement is mostly about describing the control flow of the program. But at some point you need to decide on exactly what the data will look like. For each problem you try to solve, this will happen at a different place in your refinement process. For this problem we’re at a place where in the next refinement we should be writing more detailed pseudo-code. That’s pretty much forcing us to think about data structures now, so we can do the pseudo-code. In particular, now we need to ask ourselves how we’re going to represent the board and the queens on the board. How are we going to represent the empty cells? We need a data structure that will allow us to efficiently represent queens and check whether they can be attacked. A first cut at this might be an 8 × 8 two-dimensional array where we place queens at the appropriate row and column intersections. Because we don’t need to do any computation on this matrix—all we need is to indicate the presence or absence of a queen—we can save space by making it a Boolean array. This data structure also allows us to quickly check the rows, and the up and down diagonals for queens that can attack the current queen. So, we should use a 2D Boolean array, right? Not so fast. This isn’t the only way to think about the data representation for queens. In fact, if we think about the data structure and the operations we need to perform during the safety check, we might be able to simplify things a bit. First of all, since we know that there can only be one queen in each column and one queen in each row, why not combine that information? Instead of a 2D array, why not just use a 1D Boolean array like the following: boolean column[8]; column[i] = true means that the ith column is still free. For the diagonals, we can use the property about the constant difference or sum of up and down diagonals to create two other arrays boolean up[-7..+7], down[0..14]; that will indicate which diagonal squares are free. With this arrangement, the test for a queen being safe looks like this: (column[i] and up[row-col] and down[row+col])5 All right. That seems simple enough. We’re finally done with this, right? Well, no. There’s yet another way to think about this: going back to using a 1D array, but this time using an integer array int board[8];

5

Dahl, O. J., E. Dijkstra, et al. (1972). Structured Programming. (London, UK: Academic Press, 1972.)

80

Chapter 7 ■ Structured Design

where each index into the array represents a column (0 through 7 in the eight queens case), and each value stored in the array represents the row on which a queen is deposited (also 0 through 7 in the eight queens case). Because we now have data on the exact location (rows and columns) of each queen, we don’t need separate arrays for the up and down diagonals. The test for safety is a bit more difficult, but still simple. This might be the time for some more code. At this point it seems appropriate to move from pseudo-code to a real language. You’ll have to make that move at some point in the refinement process. Just like deciding when to define your data structures, exactly when to insert language-specific features depends on the problem and what how detailed the refinement is at this point. A Java method to test for safety might look like this: public boolean isSafe (int[ ] board) {         boolean safe = true;         for (int i = 0; i < col; i++) {                 if ( ( ( board[i] + i) == (row + col) ) ||      // down diagonal test                         ( ( board[i] - i) == (row - col) ) ||   // up diagonal test                         ( board[i]  == row) )                   // row test                         safe = false;         }         return safe; } Remember that because we’re creating partial solutions by adding one queen to a column at a time, we only need to test the first col columns each time. Now that we have the safety procedure out of the way and we’ve decided on a simple data structure to represent the current board configuration, we can proceed to the remaining procedures in the decomposition: 1. Keep a safe queen on the board and advance to the next column. 2. Advance an unsafe queen up a row. 3. Retreat a column and reset the queen in that column. These are all simple enough to just write without further decomposition. That’s a key point of structured programming: keep doing the decompositions until a procedure becomes obvious, and then you can code. These three methods then look like the following when written in code: /* *  keep a safe queen on the board and advance to the next column *  the queen at (row, col) is safe, so we have a partial solution. *  advance to the next column */ public void advance (int[] board) {     board[col] = row;           // put the queen at (row, col) on the board     col++;                      // move to the next column     row = 0;                    // and start at the beginning of the column } For advance an unsafe queen up a row we don’t even need a method. The test in the main program for safety moves the queen up a row if the isSafe() method determines that the current (row, col) position is unsafe. The code for this is shown here:        if (isSafe(board))            advance(board);        else            row++;

81

Chapter 7 ■ Structured Design

Finally, we have the following:     /**      *  retreat a column and reset the queen in that column      *  we could not find a safe row in current col      *  so back up one col and move that queen      *  up a row so we can start looking again      */     public void retreat (int[] board) {         col--;         row = board[col] + 1;       } The complete Java program is shown in Listing 7-1. Listing 7-1. The Complete Non-Recursive Eight-Queens Program /* *  NQueens.java *  8-Queens Program *  A non-recursive version for a single solution */ import java.util.*; public class NQueens {         static int totalcount = 0;         static int row = 0;         static int col = 0;     /*      *  the queen at (row, col) is safe,      *  so we have a partial solution.      *  advance to the next column      */     public void advance (int[] board) {         board[col] = row;         col++;         row = 0;     }     /*      *  could not find a safe row in current col      *  so back up one col and move that queen      *  up a row      */     public void retreat (int[] board) {         col--;         row = board[col] + 1;       }

82

Chapter 7 ■ Structured Design

    /*      *   check to see if queen at (row, col)  can be      *   attacked      */     public boolean isSafe (int[] board) {         boolean safe = true;         totalcount++;         /*          *  check diagonals and row for attacks          *  since we're just checking partial solutions          *  only need to go up to current col          */         for (int i=0; i= 0));         /* If we've placed all N queens, we've got a solution */         if (col == N) {             for (int i = 0; i < N; i++) {                 System.out.print(board[i] + " ");             }         } else {             System.out.println("No solution. ");            }         System.out.println();         System.out.println("after trying " + totalcount +                          " board positions.");     } }

Modular Decomposition In 1972, David Parnas published a paper titled “On the Criteria to Be Used in Decomposing Systems into Modules” that proposed that one could design programs using a technique called modularity.6 Parnas’s paper was also one of the first papers to describe a decomposition based on information hiding, one of the key techniques in object-oriented programming. In his paper, Parnas highlighted the differences between a top-down decomposition of a problem based on the flow of control of a problem solution and a decomposition of the problem that used encapsulation and information hiding to isolate data definitions and their operations from each other. His paper is a clear precursor to object-oriented analysis and design (OOA&D), which we’ll look at in the next chapter. Although Parnas’s paper pre-dates the idea, he was really talking about a concept called separation of concerns. “In computer science, separation of concerns is the process of separating a computer program into distinct features that overlap in functionality as little as possible. A concern is any piece of interest or focus in a program. Typically, concerns are synonymous with features or behaviors. Progress towards separation of concerns is traditionally achieved through modularity of programming and encapsulation (or “transparency” of operation), with the help of information hiding.”7 Traditionally, separation of concerns was all about separating functionality of the program. Parnas added the idea of separating the data as well, so that individual modules would control data as well as the operations that acted on the data and the data would be visible only through well-defined interfaces.

Parnas, D. “On the Criteria to be Used in Decomposing Systems into Modules.” Communications of the ACM 15(12): 1053-1058 (1972). 7 Wikipedia. “Separation of Concerns” 2009. http://en.wikipedia.org/wiki/. Retrieved December 7, 2009. 6

84

Chapter 7 ■ Structured Design

There are three characteristics of modularity that are key to creating modular programs: •

Encapsulation

•

Loose coupling (how closely do modules relate to each other)

•

Information hiding

In a nutshell, encapsulation means to bundle a group of services defined by their data and behaviors together as a module, and keep them together. This group of services should be coherent and should clearly belong together. (Like a function, a module should do just one thing.) The module then presents an interface to the user, and that interface is ideally the only way to access the services and data in the module. An objective of encapsulating services and data is high cohesion. This means that your module should do one thing and all the functions inside the module should work towards making that one thing happen. The closer you are to this goal, the higher the cohesion in your module. This is a good thing. The complement of encapsulation is loose coupling. Loose coupling describes how strongly two modules are related to each other. This means we want to minimize the dependence any one module has on another. We separate modules to minimize interactions and make all interactions between modules through the module interface. The goal is to create modules with internal integrity (strong cohesion) and small, few, direct, visible, and flexible connections to other modules (loose coupling). Good coupling between modules is loose enough that one module can easily be called by others. Loose coupling falls into four broad categories that go from good to awful: •

Simple data coupling: Where non-structured data is passed via parameter lists. This is the best kind of coupling, because it lets the receiving module structure the data as it sees fit and allows the receiving module to decide what to do with the data.

•

Structured data coupling: Where structured data is passed via parameter lists. This is also a good kind of coupling, because the sending module keeps control of the data formats and the receiving module gets to do what it wants to with the data.

•

Control coupling: Where data from module A is passed to module B and the content of the data tells module B what to do. This is not a good form of coupling; A and B are too closely coupled in this case because module A is controlling how functions in module B will execute.

•

Global-data coupling: Where the two modules make use of the same global data. This is just awful. It violates a basic tenet of encapsulation by having the modules share data. This invites unwanted side-effects and ensures that at any given moment during the execution of the program neither module A nor module B will know precisely what is in the globally shared data. And what the heck are you doing using global variables anyway? Bad programmer!

Information hiding is often confused with encapsulation, but they’re not the same thing. Encapsulation describes a process of wrapping both data and behaviors into a single entity—in our case, a module. Data can be publicly visible from within a module, and thus not hidden. Information hiding, on the other hand, says that the data and behaviors in a module should be controlled and visible only to the operations that act on the data within the module, so it’s invisible to other, external modules. This is an important feature of modules (and later of objects as well) because it leaves control of data in the module that understands best how to manipulate the data and it protects the data from side-effects that can arise from other modules reaching in and tweaking the data.

85

Chapter 7 ■ Structured Design

Parnas was not just talking about hiding data in modules. His definition of information hiding was even more concerned with hiding design decisions in the module definition. “We propose . . . that one begins with a list of difficult design decisions or design decisions which are likely to change. Each module is then designed to hide such a decision from the others.8 ” Hiding information in this manner allows clients of a module to use the module successfully without needing to know any of the design decisions that went into constructing the module. It also allows developers to change the implementation of the module without affecting how the client uses the module.

Example: Keyword in Context Back in the day, when Unix was young and the world was new, the Unix documentation was divided into eight different sections and the entire manual started with a permuted index. The problem with Unix is not the command line interface, and it’s not the inverted tree file system structure. No, the problem with Unix is that the three guys who developed it, Kernighan, Ritchie, and Thompson, are the three laziest guys on the planet. How do I know? Where’s my proof? Well, the proof is in practically every Unix command: ls, cat, cp, mv, mkdir, ps, cc, as, ld, m4 . . . I could go on. Unix has to have the most cryptic command line set of any operating system on the planet. The cardinal rule for creating Unix command line tools was apparently, “Why use three characters when two will do?” So, finding anything in any of the eight sections of Unix documentation could have been a real trial. Enter the permuted index. Every Unix man page starts with a header line that contains the name of the command and a short description of what the command does. For example, the cat(1) man page begins like this: cat -- concatenate and print files What if I don’t know the name of a command, but I do know what it does? The permuted index solves this problem by making most of the words in the description (the articles were ignored) of the command part of the index itself. So that cat could be found under cat and also concatenate, print, and files. This is known as a Keyword in Context (KWIC) index. It works just dandy. Our problem is to take as input two files, the first of which contains words to ignore, the second of which contains lines of text to index, and create a KWIC index for them. For example, say that we’re ignoring the articles for, the, and, and so forth, and the second file looks like this: The Sun also Rises For Whom the Bell Tolls The Old Man and the Sea Our KWIC index would look like this:             The Sun        For Whom the             The Old                 The        The Sun also The Old Man and the                 The   For Whom the Bell                 For

8

Parnas, 1972.

86

ALSO Rises BELL Tolls MAN and the Sea OLD Man and the Sea RISES SEA SUN also Rises TOLLS WHOM the Bell Tolls

Chapter 7 ■ Structured Design

Note that each keyword is in all caps, each input line appears once for every index word in the line, and the keywords are sorted alphabetically. Each line of text has its keywords made visible by circularly shifting the words in the line. In the case of a tie (two lines of text have the same index word and so should appear together in the output), the lines of text should appear in the same order in which they appeared in the text input file. The question we have to answer is this: how do we create the KWIC index? A secondary question we’ll need to answer almost immediately is: how do we store the data?

Top-Down Decomposition We’ll start by designing the problem solution using a top-down decomposition. Top-down decompositions, as we’ve seen with the eight queens problem earlier in this chapter, are all about control flow. We want to figure out how to sequentially solve the problem, making progress with each step we take. It’s assumed that the data is stored separately from the routines and each subroutine in the control flow can access the data it needs. The alternative is to pass the data along to each subroutine as we call it; this can be cumbersome and time consuming because the data usually has to be copied each time you pass it to a routine. A first decomposition of this problem might look like the following: 1. Input the words to ignore and the text. 2. Create a data structure containing the circularly shifted lines of text, keeping track of which word in the line is the index word for this line. 3. Sort the circularly shifted lines of text by the index words. 4. Format the output lines. 5. Output the text. Note that these five steps can easily become five subroutines that are all called in sequence from a main program. The data structure used for the input text could be an array of characters for each line, a String for each line, or an array of Strings for the entire input file. One could also use a map data structure that uses each index word as the key and a String containing the input text line as the value of the map element. There are certainly other possible data structures to be used. Sorting can be done by any of the stable sorting algorithms, and which algorithm to use would depend on the data structure chosen and on the expected size of the input text. Your sort must be stable because of the requirement that identical index words sort their respective lines in the same order in which they appear in the input text file. Depending on the programming language you use and the data structure you choose, sorting might be done automatically for you. The data structure you choose will affect how the circular shifts are done and how the output routine does the work of formatting each output line. Now that we’ve got a feel for how a top-down decomposition might proceed, let’s move on and consider a modular decomposition.

Modular Decomposition of KWIC A modular decomposition of the KWIC problem can be based on information hiding in the sense that we will hide both data structures and design decisions. The modules we create won’t necessarily be the sequential list we’ve just seen, but will be modules that can cooperate with each other and are called when needed. One list of modules for KWIC looks like this: •

A Line module (for lines of input text)

•

A Keyword-Line pair module

•

A KWICIndex module to create the indexed list itself

87

Chapter 7 ■ Structured Design

•

A Circular Shift module

•

A module to format and print the output

•

A master control module—the main program

The Line module will use the Keyword-Line module to create a map data structure where each Line is a keyword and a list of lines that contain that keyword. The KWICIndex module will use the Line module to create the indexed list. The Circular Shift module will use the KWICIndex module (and recursively, the Line and Keyword-Line modules) and create the circularly shifted set of keyword-line pairs. Sorting will be taken care of internally in the KWICIndex module; ideally the index will be created as a sorted list, and any additions to the list will maintain the ordering of the index. The format and print module will organize the keyword-lines so that the keywords are printed in all caps and centered on the output line. Finally, the master control module will read the input, create the KWICIndex, and cause it to print correctly. The key of these modules is that one can describe the modules and their interactions without needing the details of how each module is implemented and how the data is stored. That’s hidden in the module description itself. Other designs are also possible. For example, it might be better to subsume the circular shift operations inside the Line module, allowing it to store the input lines and their shifts. Regardless, the next step in the design is to create the interface for each module and to coordinate the interfaces so that each module can communicate with every other module regardless of the internal implementation. Listing 7-2 shows an implementation of the KWIC index program written in Java that somewhat closely follows the earlier discussion. Listing 7-2. A Modular Version of the KWIC Solution /** * CLASS Line * Handle the storage of 3 key pieces of information. * the current line, the keyword, and the index of the * keyword in the line. * * Basically just like a struct in C. * */ public class Line implements Comparable {     public String line;     public String keyword;     public int indexOf;     public Line(String line, String keyword, int indexOf) {         this.keyword = keyword;         this.indexOf = indexOf;         // capitalize the keyword in the line         // grab the first part of the line         String first = line.substring(0, indexOf);         // capitalize the entire keyword         String middle = keyword.toUpperCase();         // grab the rest of the line after the keyword         String last = line.substring(indexOf + keyword.length());         // put it all back together         this.line = first + middle + last;     }

88

Chapter 7 ■ Structured Design

    /**      * We want to sort lines based on keyword alone.      * This will do a lexicographical comparison of the keywords      * Remember that keyword is a String      */     @Override     public int compareTo(Line other) {         return this.keyword.compareToIgnoreCase(other.keyword);     } } import java.util.Scanner; import java.util.*; /** * CLASS KwicIndex * A KwicIndex object contains a collection of Lines * and the words we are ignoring as keywords. * * We use a HashSet for the words to ignore because * we only ever want one of each of these words. * * We use a PriorityQueue for the lines because we * want to store them sorted by keywords and the PQ * does that for us automatically. * */ public class KwicIndex {     public HashSet wordsToIgnore;     public PriorityQueue lines;     /**      * Constructor that initializes the lists and      * reads all the words to ignore      */     public KwicIndex(Scanner ignore) {         this.wordsToIgnore = new HashSet();         this.lines = new PriorityQueue();         while (ignore.hasNext()) {             this.wordsToIgnore.add(ignore.next());         }     }     /**      * Create an entry in the index for the given line.      * @param str; a string to examine      * @return      */     public void add(String str) {         Scanner scan = new Scanner(str);

89

Chapter 7 ■ Structured Design

        int offset = 0;         int words = -1;         while (scan.hasNext()) {             // grab the next word             String temp = scan.next();             words++;             /** if this word is not to be ignored create a new line              *  with the line shifted with the new word removed              *  then add it to the list of lines              */             if (!wordsToIgnore.contains(temp.toLowerCase())) {                 Line version = new Line(str, temp, offset + words);                 this.lines.add(version);             }             offset += temp.length();         }     }     /**      * return the index so we can print it      */     public PriorityQueue getLines() {         return lines;     } } import java.util.*; /** * CLASS Print * Print the resulting KWIC index * */ public class Print {     public PriorityQueue lines;     public Print(PriorityQueue lines) {         this.lines = lines;     }     /**      * Print to System.out the contents of the index      * lines formatting adjusted so      * keywords are in the same column      */     public void printIndex() {         // make a new PriorityQueue         PriorityQueue newLines = new PriorityQueue();

90

Chapter 7 ■ Structured Design

        // lets figure out the length of the longest line         int longest = 0;         for (Line l : lines) {             if (l.indexOf > longest) {                 longest = l.indexOf;             }         }         /**          * do the printing          */         while (!lines.isEmpty()) {             /** grab the line with smallest keyword */             Line l = lines.poll();             /** save the line */             newLines.add(l);             /**              * figure out the whitespace              * Here we figure out how far over to print              * the keyword based on putting the longest line              * right in the middle              */             String retval = "";             for (int i = 0; i < (longest - l.indexOf); i++) {                 retval += " ";             }             /**              * construct the line              */             retval += l.line;             // output             System.out.println(retval);         }         /** Save the lines from all that polling */         this.lines = newLines;     } } import java.io.File; import java.io.FileNotFoundException; import java.util.Scanner; /** * CLASS Main * Manage the KWIC indexing system. * * @author jfdooley

91

Chapter 7 ■ Structured Design

* */ public class Main {     public static void main(String[] args) {         /**          * declare the Scanners to read the files          */         Scanner scan = null;         Scanner ignore = null;         /**          * usage and file opening          *          * if we have the correct number of input args          * we try to open the input files          */         if (args.length == 2) {             try {                 scan = new Scanner(new File(args[0]));                 ignore = new Scanner(new File(args[1]));             } catch (FileNotFoundException ex) {                 System.out.println(ex.getMessage());                 System.exit(1);             }             /**              * wrong number of input args. Give user a usage              * message and leave              */         } else {             System.out.println("Usage: java Main ");             System.exit(1);         }         /**          * first we create an KwicIndex object & add          * the words to ignore to it          */         KwicIndex index = new KwicIndex(ignore);         /**          * Now we add all the lines to the index          *  the add() method does the work of the circular shift          *  and adding the shifted lines to the priority queue          */         while (scan.hasNextLine()) {             index.add(scan.nextLine());         }         /**          * Finally we print the index we just created          */

92

Chapter 7 ■ Structured Design

        Print prt = new Print(index.getLines());         prt.printIndex();     } } We create four Java classes: •

Line: Creates the data structure for the lines that are input from a text file.

•

KwicIndex: Takes the words to ignore and the input lines and creates a sorted permuted index. Lines are shifted and added to the index based on the keyword.

•

Print: Takes the KwicIndex object and prints the permuted index in the right order and shifted accordingly.

•

Main: Checks that the command line arguments are correct, creates the initial KwicIndex object, and calls the methods to add new lines and to do the printing.

Giving this program a file called input.txt with the following input Descent of Man The Ascent of Man The Old Man and The Sea A Portrait of the Artist As a Young Man A Man is a Man but Bubblesort is a dog this is dumb produces the following output:                                     A Portrait of the Artist As a Young Man                            A Man is A Man but Bubblesort is a dog    A Man is a Man but Bubblesort is A dog         A Portrait of the Artist As A Young Man                                     A Man is a Man but Bubblesort is a dog                   A Portrait of the ARTIST As a Young Man            A Portrait of the Artist AS a Young Man                                 The ASCENT of Man                  A Man is a Man but BUBBLESORT is a dog                                     DESCENT of Man      something i do not know how to DO                         something i DO not know how to do A Man is a Man but Bubblesort is a DOG                             this is DUMB             something i do not know HOW to do                           something I do not know how to do                               A Man IS a Man but Bubblesort is a dog       A Man is a Man but Bubblesort IS a dog                                this IS dumb                  something i do not KNOW how to do                                   A MAN is a Man but Bubblesort is a dog                          Descent of MAN                          A Man is a MAN but Bubblesort is a dog                             The Old MAN and The Sea                       The Ascent of MAN

93

Chapter 7 ■ Structured Design

A Portrait of the Artist As a Young MAN                      something i do NOT know how to do                                 The OLD Man and The Sea                                   A PORTRAIT of the Artist As a Young Man                 The Old Man and The SEA                                     SOMETHING i do not know how to do                                     THIS is dumb         something i do not know how TO do       A Portrait of the Artist As a YOUNG Man We’ll continue this discussion on modular decomposition in much more detail in Chapter 8, in a discussion of object-oriented design.

Conclusion Structured design describes a set of classic design methodologies. These design ideas work for a large class of problems. The original structured design idea, stepwise refinement, has you decompose the problem from the top down, focusing on the control flow of the solution. It also relates closely to some of the architectures mentioned in Chapter 5, particularly the main program-subroutine and pipe-and-filter architectures. Modular decomposition is the immediate precursor to the modern object-oriented methodologies and introduced the ideas of encapsulation and information hiding. These ideas are the fundamentals of your design toolbox.

References Dahl, O. J., E. Dijkstra, et al. (1972). Structured Programming. (London, UK: Academic Press, 1972.) Dijkstra, E. “GoTo Statement Considered Harmful.” Communications of the ACM 11(3): 147-148 (1968). Hunt, A. and D. Thomas. The Pragmatic Programmer: From Journeyman to Master. (Boston, MA: AddisonWesley, 2000.) McConnell, S. Code Complete 2. (Redmond, WA: Microsoft Press, 2004.) Miller, G. A. “The magical number seven, plus or minus two: Some limits on our capacity for processing information.” Psychological Review 63: 81-97 (1956). Parnas, D. “On the Criteria to be Used in Decomposing Systems into Modules.” Communications of the ACM 15(12): 1053–1058 (1972). Wikipedia. “Separation of Concerns” (2009). http://en.wikipedia.org/wiki/ Separation_of_concerns. Retrieved December 7, 2009. Wirth, N. “Program Development by Stepwise Refinement.” CACM 14(4): 221-227 (1971).

94

CHAPTER 8

Object-Oriented Overview Object-oriented programming is an exceptionally bad idea, which could only have originated in California. —Edsger Dijkstra The object has three properties, which makes it a simple, yet powerful model building block. It has state so it can model memory. It has behavior, so that it can model dynamic processes. And it is encapsulated, so that it can hide complexity. —Trygve Reenskaug, Working With Objects Well, yes, we’ve all learned about the object-oriented programming (OOP) paradigm before. But it never hurts to go over some basic definitions so we’re all on the same page for our discussion about objectoriented analysis and design. First of all, objects are things. They have an identity (a name), a state (a set of attributes that describes the current data stored inside the object), and a defined set of behaviors that operate on that state. A stack is an object, as is an automobile, a bank account, a window, or a button in a graphical user interface. In an object-oriented program, a set of cooperating objects pass messages among themselves. The messages make requests of the destination objects to invoke methods that either perform operations on their data (thus changing the state of the object), or to report on the current state of the object. Eventually work gets done. Objects use encapsulation and information hiding (remember, they’re different) to isolate data and operations from other objects in the program. Shared data areas are (usually) eliminated. Objects are members of classes that define attribute types and operations. Classes are templates for objects. Classes can also be thought of as factories that generate objects. So an Automobile class will generate instances of autos, a Stack class will create a new stack object, and a Queue class will create a new queue. Classes may inherit attributes and behaviors from other classes. Classes may be arranged in a class hierarchy where one class (a super class or base class) is a generalization of one or more other classes (sub-classes). A sub-class inherits the attributes and operations from its super class and may add new methods or attributes of its own. In this sense a sub-class is more specific and detailed than its super class; hence, we say that a sub-class extends a super class. For example, a BankAccount object may include the customer’s name, address, balance, and a unique BankAccount ID number; it will also allow deposits and withdrawals and the current balance can be queried. A CheckingAccount is a more specific version of a BankAccount; it has all the attributes and operations of a BankAccount, but it adds data and behaviors that are specific to checking accounts, like check numbers and a per-check charge. In Java this feature is called inheritance, whereas in UML it’s called generalization.1

1

Fowler, M. UML Distilled. (Boston, MA: Addison-Wesley, 2000.)

© John F. Dooley 2017 J. F. Dooley, Software Development, Design and Coding, https://doi.org/10.1007/978-1-4842-3153-1_8

95

Chapter 8 ■ Object-Oriented Overview

There are a number of advantages to inheritance. It’s an abstraction mechanism that may be used to classify entities. It’s a reuse mechanism at both the design and the programming level. The inheritance graph is a source of organizational knowledge about domains and systems. And, of course, there are problems with inheritance as well. It makes object classes that aren’t self-contained; sub-classes can’t be understood without reference to their super classes. Inheritance introduces complexity, and this is undesirable, especially in critical systems. Inheritance also usually allows overloading of operators (methods in Java), which can be good (polymorphism) or bad (screening useful methods in the superclass). OOP has a number of advantages, among them easier maintenance, because objects can be understood as stand-alone entities. Objects are also appropriate as reusable components. But for some problems, there may be no mapping from real-world objects to system objects, meaning that OOP is not appropriate for all problems.

An Object-Oriented Analysis and Design Process Object-oriented analysis (OOA), design (OOD), and programming (OOP) are related but distinct. OOA is concerned with developing an object model of the application domain. So, for example, you take the problem statement, generate a set of features and (possibly) use cases,2 tease out the objects and some of the methods within those objects that you’ll need to satisfy the use case, and you put together an architecture of how the solution will hang together. That’s object-oriented analysis. OOD is concerned with developing an object-oriented system model to satisfy requirements. You take the objects generated from your OOA, figure out whether to use inheritance, aggregation, composition, abstract classes, interfaces, and so on, in order to create a coherent and efficient model, draw the class diagrams, flesh out the details of what each attribute is and what each method does, and describe the interfaces. That’s the design. Some people like object-oriented analysis, design, and programming3 and some people don’t.4 OOA allows you to take a problem model and re-cast it in terms of objects and classes, and OOD allows you to take your analyzed requirements and connect the dots between the objects you’ve proposed and fill in the details with respect to object attributes and methods. But how do you really do all this? Well, here is a proposed process from that starts to fill in some of the details.5 We’ll figure out the rest as we go along: 1. Write (or receive) the problem statement: Use this to generate an initial set of features. 2. Create the feature list: The feature list is the set of program features that you derive from the problem statement; it contains your initial set of requirements. The feature list may be a set of user stories. To help generate the feature list, you can put together a set of scenarios, narrative descriptions of how the user will walk through using the program to accomplish a task. A scenario should be technology agnostic and should be explicitly from the user’s perspective. It’s not how the program works; it’s about what the user wants to accomplish and how the user gets the task done. It can also talk about what the user knows. User stories are very brief and high level. Scenarios are longer and provide more detail. A user story might generate several scenarios.

Cockburn, A. Writing Effective Use Cases. (Boston, MA: Addison-Wesley, 2000.) Beck, K., and B. Boehm. “Agility through Discipline: A Debate.” IEEE Computer 36 (6):44-46 (2003). 4 Graham, Paul. “Why Arc isn’t Especially Object Oriented,” retrieved from www.paulgraham.com/noop.htmlon October 12, 2009. 5 McLaughlin, Brett D., et. al. Head First Object-Oriented Analysis & Design. (O’Reilly Media, Inc. Sebastopol, CA: 2007.) 2 3

96

Chapter 8 ■ Object-Oriented Overview

3. Write up use cases6: This helps refine the features, dig out new requirements, and expose problems with the features you just created. Use cases are more specific descriptions of how a user accomplishes a task using the program; they describe how the user interacts with the system. Use cases “. . . capture the goal of an action, the trigger event that starts a process, and then describe each step of the process including inputs, outputs, errors and exceptions. Use cases are often written in the form of an actor or user performing an action followed by the expected system response and alternative outcomes.7 ” Each scenario or user story might create several use cases. 4. Break the problem into subsystems or modules or whatever you want to call them as long as they’re smaller, self-contained bits usually related to functionality. 5. Map your features, subsystems, and use cases to domain objects; create abstractions. 6. Identify the program’s objects, methods, and algorithms. 7. Implement this iteration. 8. Test the iteration. 9. If you’ve not finished the feature list and you still have time and/or money left, go back to step 4 and do another iteration, otherwise . . . 10. Do final acceptance testing and release. Note that this process leaves out a lot of details, like the length of an iteration. How many features end up in an iteration? How and when do we add new features to the feature list? How exactly do we identify objects and operations? How do we abstract objects into classes? Where do we fix bugs that are found in testing? Do we do reviews of code and other project work products? Leaving out steps here is okay. We’re mostly concerned with the analysis and design elements of the process. We’ll discuss ideas on the rest of the process in this chapter, and some of the answers are also in Chapter 3 on project management. How do the preceding process steps fit into the software development life cycle? Well, I’m glad you asked. Recall that the basic development life cycle has four steps: 1. Requirements gathering and analysis 2. Design 3. Implementation and testing 4. Release, maintenance, and evolution

N.B. In some of the literature on requirements gathering, the definitions of scenario and use case used here are reversed. That is, the use case is a general description of accomplishing a goal, and the scenario is the list of explicit steps used to accomplish the task. I prefer the definitions given here. A lively discussion of the differences between user stories and use cases can be found at http://wiki.c2.com/?UserStoryAndUseCaseComparison. 7 Schaeffer, Nadine. “User Stories, Scenarios, and Use Cases,” retrieved from http://cloudforestdesign. com/2011/04/25/introduction-user-stories-user-personas-use-cases-whats-the-difference/ on July 10, 2017. 6

97

Chapter 8 ■ Object-Oriented Overview

We can easily assign the previous ten steps into four buckets, as follows:

Requirements Gathering and Analysis 1. Problem statement 2. Feature list creation 3. Use case generation

Design 1. Break up the problem. 2. Map features and use cases to domain objects. 3. Identify objects, methods, and algorithms.

Implementation and Testing 1. Implement this iteration. 2. Test the iteration. 3. If you’ve not finished with the feature list or out of time, go back to step 4, otherwise . . .

Release/Maintenance/Evolution 1. Do final acceptance testing and release. Once again we can ignore the details of each process step for now. These details really depend on the process methodology you choose for your development project. The description of the process just given uses an iterative methodology and can easily be fitted into an agile process or a more traditional staged release process. Note also that you’ll need to revisit the requirements whenever you get to step 4, because you’re likely to have uncovered or generated new requirements during each iteration. And whenever your customer sees a new iteration, they’ll ask for more stuff (yes, they will—trust me). This means you’ll be updating the feature list (and re-prioritizing) at the beginning of each new iteration.

Doing the Process Let’s continue by working through an extended example, seeing where the problem statement leads us and how we can tease out requirements and begin our object-oriented analysis.

The Problem Statement Burt, the proud owner of Birds by Burt, has created the ultimate in bird feeders. Burt’s Bird Buffet and Bath (B4), is an integrated bird feeder and birdbath. It comes in 12 different colors (including camo) and 1, 3, and 5 lb. capacities. It will hold up to one gallon of water in the attached bird bath and has a built-in hanger

98

Chapter 8 ■ Object-Oriented Overview

so you can hang it from a tree branch or from a pole. The B4 is just flying off the shelves. Alice and Bob are desperate for a B4, but they’d like a few changes. Alice is a techno-nerd and a fanatic songbird watcher. She knows that her favorite songbirds only feed during the day, so she wants a custom B4 that allows the feeding doors to open automatically at sunrise and close automatically at sunset. Burt, ever the accommodating owner, has agreed and the hardware division of Birds by Burt is hard at work designing the B4++ for Alice. Your job is to write the software to make the hardware work.

The Feature List The first thing we need to do is figure out what the B4++ will actually do. This version seems simple enough. We can almost immediately write down three requirements: •

The feeding doors must all open and close simultaneously.

•

The feeding doors should open automatically at sunrise.

•

The feeding doors should close automatically at sunset.

That doesn’t seem so bad. The requirements are simple, and no user interaction is required. The next step is to create a use case so we can see just what the bird feeder is really going to do.

Use Cases A use case is a description of what a program does in a particular situation. It’s the detailed set of steps that the program executes when a user asks for something. Use cases always have an actor—some outside agent that gets the ball rolling, and a goal—what the use case is supposed to have done by the end. The use case describes what it takes to get from some initial state to the goal from the user’s perspective.8 Here’s a quick example of a use case for the B4++: 1. The sensor detects sunlight at a 40% brightness level. 2. The feeding doors open. 3. Birds arrive, eat, and drink. 4. Birds leave. 5. The sensor detects a decrease in sunlight to a 25% brightness level. 6. The feeding doors close. Given the simplicity of the B4++, that’s about all we can expect out of a use case. In fact, steps 3 and 4 aren’t technically part of the use case, because they aren’t part of the program—but they’re good to have so we can get a more complete picture of how the B4++ is operating. Use cases are very useful in requirements analysis because they give you an idea, in English, of what the program needs to do in a particular situation, and because they often help you uncover new requirements. Note that in the use case we don’t talk about how a program does something, we only concentrate on what the program has to do to reach the goal. This can also include the inputs, outputs, and errors that occur. And it can include alternative lists of steps for different situations—for example, if the user makes an error, create two alternative use cases, one for how to treat the error and one for when the user doesn’t make the error. Most times there will be several use cases for every program you write. We’ve only got one because this version of the B4++ is so simple.

8

Cockburn, 2000.

99

Chapter 8 ■ Object-Oriented Overview

Decompose the Problem Now that we’ve got our use case, we can probably just decompose the problem and identify the objects in the program. This problem is quite simple. If you look at the use case and pick out the nouns, you see that we can identify several objects. Each object has certain characteristics and contributes to reaching the goal of getting the birds fed. (Yes, birds is a noun in the use case, but they are the actors in this little play, so for the purposes of describing the objects we ignore them—they’re not part of the program.) The other two nouns of interest are sensor and doors. These are the critical pieces of the B4++ because the use case indicates that they’re the parts that accomplish the goal of opening and closing the feeding doors at sunrise and sunset. It’s logical that they’re objects in our design. Here are the objects I came up with for this first version of the B4++ and a short description: •

BirdFeeder: The top-level object. The bird feeder has one or more feeding doors at which the birds will gather, and a sensor to detect sunrise and sunset. The BirdFeeder class needs to control the querying of the light sensor and the opening and closing of the feeding doors.

•

Sensor: There will be a hardware light sensor that detects different light levels. We’ll need to ask it about light levels.

•

FeedingDoor: There will be several feeding doors on the bird feeder. They have to open and close.

That’s probably about it for classes at this point. Now, what do they all do? To describe classes and their components, we can use another UML feature: class diagrams.

Class Diagrams A class diagram allows you to describe the attributes and the methods of a class. A set of class diagrams will describe all the objects in a program and the relationships between them. We draw arrows of different types between class diagrams to describe the relationships. Class diagrams give you a visual description of the object model you’ve created for your program. We saw a set of class diagrams for the Fox and Rabbit program we described in Chapter 5. Class diagrams have three sections: •

Name: The name of the class

•

Attributes: The instance data fields and their types used by the class

•

Methods: The set of methods used by the class and their visibility.

An example of a class diagram for our BirdFeeder class is shown in Figure 8-1.

100

Chapter 8 ■ Object-Oriented Overview

Figure 8-1. The BirdFeeder class The diagram shows that the BirdFeeder class has a single integer attribute, lightLevel, and a single method, operate(). By themselves, class diagrams aren’t terribly interesting, but when you put several of them together and show the relationships between them, you can get some interesting information about your program. What else do we need in the way of class diagrams? In our program, the BirdFeeder class uses the FeedingDoor and Sensor classes, but they don’t know (or care) about each other. In fact, although BirdFeeder knows about FeedingDoor and Sensor and uses them, they don’t know they’re being used. Ah, the beauty of object-oriented programming! This relationship can be expressed in the class diagram of all three classes, shown in Figure 8-2.

Figure 8-2. BirdFeeder uses FeedingDoor and Sensor In UML, the dotted line with the open arrow at the end indicates that one class (in our case, BirdFeeder) is associated with another class (in our case, either FeedingDoor or Sensor) by using it.

Code Anyone? Now that we’ve got the class diagrams and know the attributes, the methods, and the association between the classes, it’s time to flesh out our program with some code. In the BirdFeeder object, the operate() method needs to check the light levels and open or close the feeding doors depending on the current light level reported by the Sensor object—and does nothing if the current light level is above or below the threshold values.

101

Chapter 8 ■ Object-Oriented Overview

In the Sensor object, the getLevel() method just reports back the current level from the hardware sensor. In the FeedingDoor object, the open() method checks to see if the doors are closed. If they are, it opens them and sets a Boolean to indicate that they’re open. The close() method does the reverse. Listing 8-1 shows the code for each of the classes described. Listing 8-1. Code for B4++ Bird Feeder /** * class BirdFeeder * * @author John F. Dooley * @version 1.0 */ import java.util.ArrayList; import java.util.Iterator; public class BirdFeeder {     /* instance variables */     private static final int ON_THRESHOLD = 40;     private static final int OFF_THRESHOLD = 25;     private int lightLevel;     private Sensor s1;     private ArrayList doors = null;     /*      * Default Constructor for objects of class BirdFeeder      */     public BirdFeeder()     {         doors = new ArrayList();         /* initialize lightLevel */         lightLevel = 0;         s1 = new Sensor();         /* by default we have a feeder with just one door */         doors.add(new FeedingDoor());     }     /*      * The operate() method operates the birdfeeder.      * It gets the current lightLevel from the Sensor and      * checks to see if we should open or close the doors      */     public void operate()     {         lightLevel = s1.getLevel();         if (lightLevel > ON_THRESHOLD) {             Iterator door_iter = doors.iterator();

102

Chapter 8 ■ Object-Oriented Overview

            while (door_iter.hasNext()) {                 FeedingDoor a = (FeedingDoor) door_iter.next();                 a.open();                 System.out.println("The door has opened.");             }         } else if (lightLevel < OFF_THRESHOLD) {              Iterator door_iter = doors.iterator();              while (door_iter.hasNext()) {                 FeedingDoor a = (FeedingDoor) door_iter.next();                 a.close();                 System.out.println("The door has closed.");             }         }     } } /** * class FeedingDoor * * @author John Dooley * @version 1.0 */ public class FeedingDoor {     /* instance variables */     private boolean doorOpen;     /*      * Default constructor for objects of class FeedingDoors      */     public FeedingDoor()     {         /* initialize instance variables */         doorOpen = false;     }     /*      * open the feeding doors      * if they are already open, do nothing      */     public void open( )     {         /** if the door is closed, open it */         if (doorOpen == false) {             doorOpen = true;         }     }     /*      * close the doors      * if they are already closed, do nothing      */

103

Chapter 8 ■ Object-Oriented Overview

    public void close( )     {         /* if the door is open, close it */         if (doorOpen == true) {             doorOpen = false;         }     }     /*      * report whether the doors are open or not      */     public boolean isOpen()     {         return doorOpen;     } } /** * class Sensor * * @author John Dooley * @version 1.0 */ public class Sensor {     /* instance variables */     private int lightLevel;     /*      * Default constructor for objects of class Sensor      */     public Sensor()     {         /** initialize instance variable */         lightLevel = 0;     }     /**      * getLevel - return a light level      *      * @return the value of the light level      * that is returned by the hardware sensor      */     public int getLevel( )     {         /* till we get a hardware light sensor, we just fake it */         lightLevel = (int) (Math.random() * 100);         return lightLevel;     } }

104

Chapter 8 ■ Object-Oriented Overview

Finally, we have a BirdFeederTester class that operates the B4++: /** * The class that tests the BirdFeeder, Sensor, and * FeedingDoor classes. * * @version 0.1 */ public class BirdFeederTester {     private BirdFeeder feeder;     /*      * Constructor for objects of class BirdFeederTest      */     public BirdFeederTester()     {         this.feeder = new BirdFeeder();     }     public static void main(String [] args)     {         BirdFeederTester bfTest = new BirdFeederTester();         for (int i = 0; i < 10; i++) {             System.out.println("Testing the bird feeder");             bfTest.feeder.operate();             try {                 Thread.currentThread().sleep(2000);             } catch (InterruptedException e) {                 System.out.println("Sleep interrupted" + e.getMessage());                 System.exit(1);             }         }     } } When Alice and Bob take delivery of the B4++, they’re thrilled. The doors automatically open and close, the birds arrive and eat their fill. Birdsong fills the air. What else could they possibly want?

Conclusion Object-oriented design is a methodology that works for a very wide range of problems. Solutions to many problems in the real world are easily characterized as groups of cooperating objects. This single simple idea promotes simplicity of design, reuse of both designs and code, and the ideas of encapsulation and information hiding that Parnas advocated in his paper on modular decomposition. It’s not the right way to solve some problems, including problems like communications protocol implementations, but it opens up a world of new and better solutions for many others. It also closes the “intellectual distance” between the real-world description of a problem and the resulting code.

105

Chapter 8 ■ Object-Oriented Overview

References Beck, K., and B. Boehm. “Agility through Discipline: A Debate.” IEEE Computer 36 (6):44-46 (2003). Cockburn, A. Writing Effective Use Cases. (Boston, MA: Addison-Wesley, 2000.) Fowler, M. UML Distilled. (Boston, MA: Addison-Wesley, 2000.) Graham, Paul. “Why Arc Isn’t Especially Object Oriented,” retrieved from www.paulgraham.com/noop.html on October 12, 2009. McLaughlin, Brett D., et. al. Head First Object-Oriented Analysis & Design. (O’Reilly Media, Inc. Sebastopol, CA: 2007.) Wirfs-Brock, R. and A. McKean. Object Design: Roles Responsibilities, and Collaborations. (Boston, MA: Addison-Wesley, 2003.)

106

CHAPTER 9

Object-Oriented Analysis and Design When doing analysis you are trying to understand the problem. To my mind this is not just listing requirements in use cases. . . . Analysis also involves looking behind the surface requirements to come up with a mental model of what is going on in the problem. . . . Some kind of conceptual model is a necessary part of software development, and even the most uncontrolled hacker does it. —Martin Fowler1 Object-oriented design is, in its simplest form, based on a seemingly elementary idea. Computing systems perform certain actions on certain objects; to obtain flexible and reusable systems, it is better to base the structure of software on the objects than on the actions. Once you have said this, you have not really provided a definition, but rather posed a set of problems: What precisely is an object? How do you find and describe the objects? How should programs manipulate objects? What are the possible relations between objects? How does one explore the commonalities that may exist between various kinds of objects? How do these ideas relate to classical software engineering concerns such as correct- ness, ease of use, efficiency? Answers to these issues rely on an impressive array of techniques for efficiently producing reusable, extendible and reliable software: inheritance, both in its linear (single) and multiple forms; dynamic binding and polymorphism; a new view of types and type checking; genericity; information hiding; use of assertions; programming by contract; safe exception handling. —Bertrand Meyer2

Martin, Robert, “Single Responsibility Principle.” www.butunclebob.com/ArticleS.UncleBob.PrinciplesOfOod, retrieved on December 10, 2009. 2 Meyer, Bertrand. Object-Oriented Software Construction. (Upper Saddle River , NJ: Prentice Hall, 1988.) 1

© John F. Dooley 2017 J. F. Dooley, Software Development, Design and Coding, https://doi.org/10.1007/978-1-4842-3153-1_9

107

Chapter 9 ■ Object-Oriented Analysis and Design

When defining object-oriented analysis and design, it’s best to keep in mind your objectives. In both of these process phases we’re producing a work product that’s closer to the code that is your end goal. In analysis, you’re refining the feature list you’ve created and producing a model of what the customer wants. In design you’re taking that model and creating the classes that will end up being code. In analysis you want to end up with a description of what the program is supposed to do, its essential features. This end product takes the form of a conceptual model of the problem domain and its solution. The model is made up of a number of things, including use cases, user stories, scenarios, preliminary class diagrams, user interface storyboards, and possibly some class interface descriptions. In design you want to end up with a description of how the program will implement the conceptual model and do what the customer wants. This end product takes the form of an object model of the solution. This model is made up of groups of related class diagrams—their associations and descriptions of how they interact with each other. This includes the programming interface for each class. From here you should be able to get to the code pretty quickly.

Analysis What is object-oriented analysis? Well, it depends on whom you talk to. For our purposes, we’ll define objectoriented analysis as a method of studying the nature of a problem and determining its essential features and their relations to each other.3 Your objective is to end up with a conceptual model of the problem solution that you can then use to create an object model—your design. This model doesn’t take into account any implementation details or any constraints on the target system. It looks at the domain that the problem is in and tries to create a set of features, objects and relations that describe a solution in that domain. What makes a feature essential? Typically, a feature is essential if it’s a feature the customer has said they must have, if it’s a non-functional requirement that the program won’t run without, or if it’s a core program element that other parts of the program depend on. The conceptual model describes what the solution will do and will typically include use cases,4 user stories,5 user scenarios, and UML sequence diagrams.6 It can also include a description of the user interface and a preliminary set of UML class diagrams (but that, of course, is shading over into design). How do you create this conceptual model? As with all the other methodologies we’ve talked about, the correct answer is: it depends. It depends on understanding the problem domain, on understanding the feature list you’ve already come up with, and on understanding how the customer reacts to each of the program iterations they’ll see. As we’ll see, change is constant. The key part of object-oriented analysis is the creation of use cases. With use cases you create a detailed walkthrough of a scenario from the user’s perspective, and that walkthrough gives you an understanding of what the program is supposed to do from the outside. A program of any size will normally have several use cases associated with it. In fact, a single use case may have alternative paths through the scenario. When using an agile methodology, you’ll normally start with user stories that the product owner will create and then a set of scenarios that flesh out the user stories. The scenarios are then used to generate use cases. More on this later. Once you get a few use cases created, how do you get to the class diagrams? There are several methods suggested, but we’ll just go over one now and save the rest for later. The first method we’ll look at is called textual analysis. With textual analysis, you take your uses cases and examine the text for clues about classes in your programs. Remember that the object-oriented paradigm is all about objects and the behavior of those objects, so those are the two things to pluck out of your use cases.

McLaughlin, Brett D., et. al. Head First Object-Oriented Analysis & Design. (Sebastopol, CA: O’Reilly Media, Inc., 2007.) Cockburn, A. (2000). Writing Effective Use Cases. (Boston, MA: Addison-Wesley, 2000.) 5 Beck, K. Extreme Programming Explained: Embrace Change. (Boston, MA: Addison-Wesley, 2000.) 6 Fowler, M.UML Distilled. (Boston, MA: Addison-Wesley, 2000.) 3 4

108

Chapter 9 ■ Object-Oriented Analysis and Design

In textual analysis, you pluck potential objects out of the text by picking out the nouns in your use case. Because nouns are things and objects are (usually) things, the nouns stand a good chance of being objects in your program. In terms of behavior, you look at the verbs in the use case. Verbs provide you with action words that describe changes in state or actions that report state. This usually isn’t the end, but it gives you your first cut at method names and parameter lists for the methods.

An Analytical Example Let’s go back to Burt’s Bird Buffet and Bath, and the B4++. When last we left the B4++ it automatically opened the feeding doors at sunrise and closed them at sunset. The B4++ was a hit, and Alice and Bob were thrilled with its performance. Once again the B4 models were flying off the shelves. Then one day Burt gets a call from Alice. It seems she has an issue. Although the B4++ works just fine, Alice has noticed that she’s getting unwanted birds at her bird feeder. Recall that Alice is a songbird fanatic and she’s thrilled when cardinals, painted buntings, scarlet tanagers, American goldfinches, and tufted titmice show up at the feeder. But she’s not so thrilled when grackles, blue jays, and starlings drive away the songbirds and have their own feast. So Alice wants to be able to close the B4++ feeding doors when the bad birds show up and open them again when the songbirds come back. And you’re just the guy to do it. The first obvious question you ask Alice is, “How do you want to open and close the feeding doors?” “Well,” she says, “how about a remote control? That way I can stay inside the house and just open and close the doors when the birds arrive.” And so the game begins again. Lets assume that we’re an agile team and we’ll do our updated design using agile techniques. The first thing we’ll need is a new user story. Recall that a user story usually takes the form: “As a , I want to , in order to .” In this case we might say, “As Alice, the owner, I want to open and close the bird feeder doors with a remote control, in order to keep the predator birds away from the feeder.” From this user story we can generate a scenario that fleshes out what Alice wants to do, which might look like this: “Alice is sitting at her kitchen table having her morning coffee. The B4++ doors opened when the sun came out this morning and the feeder has attracted several songbirds. As Alice is watching the songbirds, a couple of blue jays arrive, chase the songbirds off, and begin feasting on the birdseed. Alice reaches for the remote control and presses the button. The bird feeder doors close smoothly and the blue jays fly off. Alice presses the remote control button again and the doors open. After a while the songbirds return and Alice can finish her coffee.” Just like last time, we can take this now fleshed-out problem statement and try to put together a use case. Our previous use case looked like this: 1. The sensor detects sunlight at a 40% brightness level. 2. The feeding doors open. 3. The birds arrive, eat, and drink. 4. The birds leave. 5. The sensor detects a decrease in sunlight to a 25% brightness level. 6. The feeding doors close. The first thing we need to decide is whether our new problem is an alternate path in this use case, or whether we need an entirely new use case. Let’s try a new use case. Why? Well, using the remote control doesn’t really fit into the sensor use case, does it? The remote can be activated at any time and requires a user interaction, neither of which fits with our sensor. So let’s see what we can come up with for a remote control use case: 1. Alice hears or sees birds at the bird feeder. 2. Alice determines that they are not songbirds.

109

Chapter 9 ■ Object-Oriented Analysis and Design

3. Alice presses the remote control button. 4. The feeding doors close. 5. The birds give up and fly away. 6. Alice presses the remote control button. 7. The feeding doors open again. Does this cover all the situations? Are there any we’ve missed? There are two things to think of. First, in step #1 we have “Alice hears or sees birds.” The question is should the or matter to us? In this case the answer is no, because Alice is the one deciding and she’s the actor in this use case. We can’t control the actor; we can only respond to something the actor wants to do and make available options for the actor to exercise. In our case, our program will need to wait for the signal from the remote control and then do the right thing. (Not to get ahead of ourselves, but look at our program now as an event-driven system, and the program has to wait, or listen, for an event before it does something). Secondly, what are the steps in the use case that will help us identify new objects? This is where our textual analysis comes in. In our previous version of this application, we’ve already got BirdFeeder, Sensor, and FeedingDoor objects. These are identified in the use case easily. What’s new now? The only new object here is the remote control. What does the remote control do? How many buttons does it have? What does the program do when a remote control button is pressed? In our example, the remote control seems relatively simple. Opening and closing the feeding doors is a toggle operation. The doors open if they are closed, and close if they are open. Those are the only options. So the remote really just needs a single button to implement the toggle function. At the end of the day we’ve got a new use case and a new class for the B4++ program (see Figure 9-1).

Figure 9-1. The new RemoteControl class And that seems to be all the analysis we need for this version of the program. This exercise provides us with a couple of guidelines we can use for analysis:

110

•

Make simple classes that work together by sending and responding to messages: In our example, the simple classes FeedingDoor and Sensor encapsulate knowledge about the current state of the BirdFeeder and allow us to control the bird feeder with simple messages. This simplicity allows us to easily add a new way of controlling the bird feeder with the RemoteControl class.

•

Classes should have one responsibility: Not only are the FeedingDoor and Sensor simple and easy to control, but they each only do one thing. That makes them easier to change later and easier to reuse.

Chapter 9 ■ Object-Oriented Analysis and Design

Design Now what about design? Assuming you’ve got a conceptual model from your analysis in the form of a few use cases and possibly a few class diagrams, your more detailed design should follow from this. In objectoriented design, the next steps are to firm up the class designs, decide on the methods your classes will contain, determine the relationships between the classes, and figure out how each of the methods will do what it’s supposed to do. In our current example, we’ve decided on four classes: BirdFeeder, FeedingDoor, Sensor, and RemoteControl. The first three classes we’ve already developed, so the question here is do we need to change any of these classes in order to integrate the RemoteControl class into the program? Figure 9-2 shows what we’ve got right now.

Figure 9-2. How to integrate the RemoteControl class? Thinking about it, it seems that nothing in FeedingDoor or Sensor should have to change. Why? Well, it’s because the BirdFeeder class uses those two classes, and they don’t need to use or inherit anything from any other class; they are pretty self-sufficient. If you recall, it’s the operate() method in the BirdFeeder class that does all the hard work. It has to check the light level from the Sensor and, if appropriate, send a signal to the doors to open or close. So it seems that maybe the RemoteControl class will work the same way. The question for our design is: does the BirdFeeder class also use the RemoteControl class, or does the RemoteControl class stand alone and just wait for an “event” to happen? Let’s take a look at the code for the operate() method again: public void operate() {       lightLevel = s1.getLevel();       if (lightLevel > ON_THRESHOLD) {           Iterator door_iter = doors.iterator();           while (door_iter.hasNext()) {               FeedingDoor a = (FeedingDoor) door_iter.next();               a.open();           }

111

Chapter 9 ■ Object-Oriented Analysis and Design

      } else if (lightLevel < OFF_THRESHOLD) {            Iterator door_iter = doors.iterator();            while (door_iter.hasNext()) {               FeedingDoor a = (FeedingDoor) door_iter.next();               a.close();           }       }   } In this method, we check the light level from the Sensor object, and if it’s above a certain level (the sun has risen), then we ask the doors to open. It’s the doors themselves that check to see if they are already open or not. Regardless, when the open() method returns, each door is open. The same thing happens with the close() method. Regardless of how they start out, when each invocation of close() returns its door is closed. It seems as if this is just the behavior we want from the RemoteControl object, except that instead of a light threshold, it responds to a button press. So, the pseudo-code for pressButton() will look like this: pressButton()         while (there are still doors left to process) do                 if (the door is open) then                         door.close()                 else                         door.open()                 end-if         end-while end-method. And from here you can just write the code.

Change in the Right Direction A key element of the last two sections is that object-oriented analysis and design are all about change. Analysis is about understanding behavior and anticipation of change, while design is about implementing the model and managing change. In a typical process methodology, analysis and design are iterative. As you begin to create a new program, you uncover new requirements; as the user begins to use your prototypes, they discover new ideas, things that don’t work for them, and new features they hadn’t mentioned previously. All these things require you to go back and re-think what you already know about the problem and what you have designed. In order to avoid what’s known as analysis paralysis, you need to manage this neverending flow of new ideas and requirements.

Recognizing Change There are a number of techniques that can be used to see and deal with change. The first we’ll look at is recognizing what might change in your design. Let’s look at the B4++ again. Right now, our B4++ will open and close the bird feeder’s doors at sunrise and sunset in response to the light levels returned by the sensor. It will also open and close the feeding doors in response to a button push from the remote control. What might change here?

112

Chapter 9 ■ Object-Oriented Analysis and Design

Well, the hardware might change. If the sensor changes, that might affect the Sensor class or it might cause your boss to re-think how the Sensor class should work. You might also get new hardware. This is just like the remote control addition we made above. And just like the remote control example, new hardware can result in the appearance of new use cases or changes to existing use cases. These changes can consequently ripple down through your class hierarchy. The requirements might change. Most likely new requirements would crop up. A requirement change can lead to alternate paths through use cases. This implies that behavior will change requirements, which then leads to design changes. Design change happens because requirements change. By thinking about what things can change in your program and design, you can begin to anticipate change. Anticipating change can lead you to be careful about encapsulation, inheritance, dependencies of one class on another, and so on.

Songbirds Forever While we’re talking about change, let's look at B4++ again. It’s several weeks now since Alice and Bob received delivery of their new and improved B4++ with remote control. Alice loves it. She can watch the birds out her kitchen window, and when the grackles swoop in she just hits the remote control button and the doors shut. The grackles leave disappointed, she hits the button again, and the doors open. The new version works like a charm and does everything they had asked for. There’s just one little thing . . . Alice has discovered that sometimes she has to run errands, or go to the bathroom, or watch her favorite nature show on the Discovery Channel. When she does this, she can’t close the door with the remote, and the grackles can come and feed to their hearts’ content, chasing away all the songbirds. So, Alice would like yet another small, insignificant change to the B4++, one that’s hardly worth mentioning, really. She wants the B4++ to detect the pesky birds and close the doors automatically. How to do this?

A New Requirement The new requirement is that “The B4++ must be able to detect the unwanted birds and close the doors automatically.” Is this a complete requirement? It doesn’t seem to be because it begs the obvious question: when do the doors open again? It seems we have at least a couple of things to decide: 1. How does the bird feeder detect the birds? 2. How do we distinguish between the unwanted birds and the songbirds? 3. When does the bird feeder open the doors again after they’ve been closed? Luckily for us, our sensor supplier, SensorsRUs, has just come out with a programmable audio sensor that will let us identify birdsong. If we integrate that hardware into the B4++, that takes care of item #1 above. It also turns out that the pesky birds have way different songs from the songbirds we want to attract, so that the audio sensor can be programmed via firmware to distinguish between the different bird species. That takes care of issue #2. What about issue #3—getting the closed doors open again? It seems as if there are two ways you can get the B4++ to open the doors again. We can have a timer that keeps the doors shut for a specific amount of time and then opens them again. This has the advantage of simplicity, but it’s also a pretty simplistic bit of programming. Simplistic in the sense that the timer program just implements a countdown timer with no information about the context in which it operates. It could easily open the door while there are still a bunch of unwanted birds around. Another way we could implement the bird identifier is to have it only open the door when it hears one of our songbirds. If you reason that the songbirds won’t be around if the pesky birds are still there, then the only time you’ll hear songbirds singing is if there are no pesky birds around. If that’s the case, then it’s safe to open the feeding doors.

113

Chapter 9 ■ Object-Oriented Analysis and Design

Let’s do a use case. Because opening and closing the feeding doors with the song identifier is a lot like using the remote control, let’s start with the RemoteControl use case and add to it: 1. Alice hears or sees birds at the bird feeder. 1.1 The songbird identifier hears birdsong. 2. Alice determines that they are not songbirds. 2.1 The songbird identifier recognizes the song as from an unwanted bird. 3. Alice presses the remote control button. 3.1 The songbird identifier sends a message to the feeding doors to close. 4. The feeding doors close. 5. The birds give up and fly away. 5.1 The songbird identifier hears birdsong. 5.2 The songbird identifier recognizes the song as from a songbird. 6. Alice presses the remote control button. 6.1 The songbird identifier sends a message to the feeding doors to open. 7. The feeding doors open again. What we’ve created here is an alternate path in the use case. This use case looks pretty awkward now, because the sub-cases look like they flow from the upper cases when, in fact, one or the other of them should be done. We can rewrite the use case to look like Table 9-1. Table 9-1. The Song Identifier Use Case and Its Alternate

Main Path

Alternate Path

1. Alice hears or sees birds at the bird feeder.

1.1 The songbird identifier hears birdsong.

2. Alice determines that they are not songbirds.

2.1 The songbird identifier recognizes the song as from an unwanted bird.

3. Alice presses the remote control button.

3.1 The songbird identifier sends a message to the feeding doors to close.

4. The feeding doors close. 5. The birds give up and fly away.

5.1 The songbird identifier hears birdsong. 5.2 The songbird identifier recognizes the song as from a songbird.

6. Alice presses the remote control button. 7. The feeding doors open again.

114

6.1 The songbird identifier sends a message to the feeding doors to open.

Chapter 9 ■ Object-Oriented Analysis and Design

These two paths aren’t exactly the same. For instance, in the main path, Alice sees the birds give up and fly away before she presses the remote control button. In the alternate path, the bird song identifier must wait until it hears birdsong before it can consider opening the feeding doors again. So, we could easily make these two different use cases. It depends on you. Use cases are there to illustrate different scenarios in the use of the program, so you can represent them in any way you want. If you want to break this use case up into two different ones, feel free. Just be consistent. You’re still managing change.

Separating Analysis and Design As mentioned, it’s difficult to separate analysis and design. The temptation for every programmer, particularly beginning programmers, is to start writing code now. That temptation bleeds over into doing analysis, design, and coding all at once and thinking about all three phases together. This is usually a bad idea unless your program is only about 10 lines long. It’s nearly always better to abstract out requirements and architectural ideas from your low-level design and coding. Chapters 5 and 6 talked about this separation. Separating object-oriented analysis and design is a particularly difficult task. In analysis, we’re trying to understand the problem and the problem domain from an object-oriented point of view. That means we start thinking about objects and their interactions with each other very early in the process. Even our scenarios and use cases are littered with loaded object words. Analysis and design are nearly inseparable— when you’re “doing analysis” you can’t help but “think design” as well. What should you do when you really want to start thinking about design? Your design must produce, at minimum, the classes in your system, their public interfaces, and their relationships to other classes, especially base or super classes. If your design methodology produces more than that, ask yourself if all the pieces produced by that methodology have value over the lifetime of the program. If they do not, maintaining them will cost you. Members of development teams tend not to maintain anything that doesn’t contribute to their productivity; this is a fact of life that many design methods don’t account for. All software design problems can be simplified by introducing an extra level of conceptual indirection. This one idea is the basis of abstraction, the primary feature of object-oriented programming. That’s why in UML, what we call inheritance in Java is called generalization. The idea is to identify common features in two or more classes and abstract those features out into a higher-level, more general class that the lower level classes then inherit from. When designing, make your classes as atomic as possible; that is, give each class a single, clear purpose. This is the Single Responsibility Principle that is discussed more about in the next chapter on design principles.7 If your classes or your system design grows too complicated, break complex classes into simpler ones. The most obvious indicator of this is sheer size: if a class is big, chances are it’s doing too much and should be broken up. You also need to look for and separate things that change from things that stay the same. That is, search for the elements in a program that you might want to change without forcing a redesign, and then encapsulate those elements in classes. All these guidelines are key to managing the changes in your design. In the end, you want a clean, understandable design that’s easy to maintain.

7

Martin, 2009.

115

Chapter 9 ■ Object-Oriented Analysis and Design

Shaping the Design Your goal is to invent and arrange objects in a pleasing fashion. Your application will be divided into neighborhoods where clusters of objects work toward a common goal. Your design will be shaped by the number and quality of abstractions and by how well they complement one another. Composition, form, and focus are everything. —Rebecca Wirfs-Brock and Alan McKean8 Identifying objects (or object classes) is a difficult part of object-oriented design. There is no “magic formula” for object identification. It relies on the skill, experience, and domain knowledge of system designers (that would be you). Object identification is an iterative process. You’re not likely to get it right the first time. You begin finding objects by looking for real-world analogues in your requirements. That gets you started, but it’s only the first step. Other objects hide in the abstraction layers of your domain. Where to find these hidden objects? You can look to your own knowledge of the application domain. You can look for operations that crop up in your requirements and in your architectural concepts of the system. You can even look to your own past experience designing other systems. Here are some steps to finding candidate objects in your system: 1. Write a set of use cases: These will describe how the application will work for a number of different scenarios. Remember, each use case must have a goal. Alternate paths through a use case may indicate new requirements that require a new use case. 2. Identify the actors: Identify the actors in each use case, the operations they need to perform, and the other things they need to use in performing their actions. 3. Name and describe each candidate object: Base the identification on tangible things in the application domain (like nouns). Use a behavioral approach and identify objects based on what participates in what behavior (use verbs). Objects can manifest themselves in a number of ways. They can be •

External entities that produce or consume information

•

Things that are part of the information domain (reports, displays, and the like)

•

Occurrences or events that occur within the system

•

Internal producers (objects that make something)

•

Internal consumers (objects that consume what producers make)

•

Places (remote systems, databases, and so on)

•

Structures (windows, frames)

•

People or characteristics of people (person, student, teacher, and so on)

•

Things that are owned or used by other objects (like bank accounts or automobile parts).

•

Things that are lists of other objects (like parts lists, any kind of collection, and so on)

Wirfs-Brock, R. and A. McKean. Object Design: Roles Responsibilities, and Collaborations. (Boston, MA, Addison-Wesley, 2003).

8

116

Chapter 9 ■ Object-Oriented Analysis and Design

4. Organize the candidate objects into groups: Each group represents a cluster of objects that work together to solve a common problem in your application. Each object will have several characteristics: •

Required information: The object has information that must be remembered so the system can function.

•

Needed services: The object must provide services relating to the system goals.

•

Common attributes: The attributes defined for the object must be common to all instances of the object.

•

Common operations: The operations defined for the object must be common to all instances of the object.

5. Make sure the groups you’ve created represent good abstractions for objects and work in the application: Good abstractions will help make your application easier to re-work when you inevitably need to change some feature or relationship in the application.

Abstraction Let's change tack here and talk about a different example. Alice and Bob have just moved to a new city and need to transfer their old Second City Bank and Trust bank accounts to First Galactic Bank. Alice and Bob are typically middle class and have several bank accounts they need to transfer: a checking account, a passbook savings account, and an investment account. Nobody actually opens a generic “bank account.” Instead they open different types of accounts, and each type has different characteristics. You can write checks on a checking account, but you can’t write checks on a passbook savings account. You can earn interest on a savings account, but you normally don’t earn interest on a checking account; you pay a monthly service fee instead. But all different types of bank accounts have some things in common. All of them use your personal information (name, social security number, address, city, state, ZIP), and they all allow you to deposit money and withdraw money. So, when putting together a program that handles “bank accounts,” you may realize that there will be common attributes and behaviors among several classes. Let’s look at some classes for a bank account example. Because we know that checking accounts, savings accounts, and investment accounts are all different, let’s first create three different classes and see what we’ve got (see Figure 9-3).

Figure 9-3. Bank accounts with a lot in common

117

Chapter 9 ■ Object-Oriented Analysis and Design

Notice that all three classes have a lot in common. One of the things we always try to do, no matter what design or coding techniques we’re using, is to avoid duplication of design and code. This is what abstraction is all about! If we abstract out all the common elements of these three classes, we can create a new (super) class BankAccount that incorporates all of them. The CheckingAcct, SavingsAcct, and InvestmentAcct classes can then inherit from BankAccount. BankAccount is shown in Figure 9-4.

Figure 9-4. A cleaner BankAccount class But wait! Is the BankAccount class one that we would want to instantiate? If you look, you’ll see that each of the other classes is much more specific than the BankAccount class is. So, there isn’t enough information in the BankAccount class for us to use. This means we’ll always be inheriting from it, but never instantiating it. It’s a perfect abstract class. (Note the little bit of UML in Figure 9-5—class diagrams of abstract classes put the class name in italics.)

Figure 9-5. BankAccount as an abstract class

118

Chapter 9 ■ Object-Oriented Analysis and Design

Abstract classes are templates for actual concrete classes. They encapsulate shared behavior and define the protocol for all subclasses. The abstract class defines behavior and sets a common state, and then concrete subclasses inherit and implement that behavior. You can’t instantiate an abstract class; a new concrete class must be created that extends the abstract class. As a guideline, whenever you find common behavior in two or more places, you should look to abstract that behavior into a class and then reuse that behavior in the common concrete classes. Here’s what we end up with after abstracting out all the personal data and common behavior into the BankAccount abstract class. Notice one more little bit of UML in Figure 9-6: the new UML arrow types, open arrow ends. These open arrows indicate inheritance; so the CheckingAcct class inherits attributes and methods from the BankAccount abstract class. UML calls it generalization because the super class generalizes the subclasses. That’s why the arrows point up to the super class.

Figure 9-6. The concrete account classes inherit from BankAccount

Conclusion In object-oriented analysis and design, it’s best to keep your objectives in mind. In analysis, you’re refining the feature list you’ve created and producing a model of what the customer wants. You want to end up with a description of what the program is supposed to do—its essential features. This creates a conceptual model of the problem domain and its solution. The model is made up of a number of things including user stories, scenarios, use cases, preliminary class diagrams, user interface storyboards, and possibly some class interface descriptions.

119

Chapter 9 ■ Object-Oriented Analysis and Design

In design, you’re taking that conceptual model and creating the classes that will end up being code. You want to end up with a description of how the program will implement the conceptual model and do what the customer wants. This is an object model of the solution. This model is made up of groups of related class diagrams, their associations, and descriptions of how they interact with each other. This includes the programming interface for each class. This design is an abstraction of the class details and code you’ll create later. From here you should be able to get to code pretty quickly.

References Beck, K. Extreme Programming Explained: Embrace Change. (Boston, MA: Addison-Wesley, 2000.) Cockburn, A. (2000). Writing Effective Use Cases. (Boston, MA: Addison-Wesley, 2000.) Fowler, M.UML Distilled. (Boston, MA: Addison-Wesley, 2000.) McLaughlin, Brett D., et. al. Head First Object-Oriented Analysis & Design. (Sebastopol, CA: O’Reilly Media, Inc., 2007.) Meyer, Bertrand. Object-Oriented Software Construction. (Upper Saddle River , NJ: Prentice Hall, 1988.) Martin, Robert, “Single Responsibility Principle." www.butunclebob.com/ArticleS.UncleBob. PrinciplesOfOod. Retrieved on December 10, 2009. Wirfs-Brock, R. and A. McKean. Object Design: Roles Responsibilities, and Collaborations. (Boston, MA, Addison-Wesley, 2003.)

120

CHAPTER 10

Object-Oriented Design Principles Devotion to the facts will always give the pleasures of recognition; adherence to the rules of design, the pleasures of order and certainty. —Kenneth Clark How can I qualify my faith in the inviolability of the design principles? Their virtue is demonstrated. They work. —Edgar Whitney Now that we’ve spent some time looking at object-oriented analysis and design, let’s recapitulate some of what we’ve already seen and add some more prose. First, let’s talk about some common design characteristics. First, designs have a purpose. They describe how something will work in a context, using the requirements (lists of features, user stories, and use cases) to define the context. Second, designs must have enough information in them that someone can implement them. You need enough details in the design so that someone can come after you and implement the program correctly. Next, there are different styles of design, just like there are different types of house architectures. The type of design you want depends on what it is you’re being required to build. It depends on the context (see, we’re back to context): if you’re an architect, you’ll design a different kind of house at the seashore than you will in the mountains. Finally, designs can be expressed at different levels of detail. When building a house, the framing carpenter needs one level of detail, the electrician and plumber another, and the finish carpenter yet another. There are a number of rules of thumb about object-oriented design that have evolved over the last few decades. These design principles act as guidelines for you the designer to abide by so that your design ends up being a good one: easy to implement, easy to maintain, and that does just what your customer wants. We’ve looked at several of them already in previous chapters, and here I’ve pulled out nine fundamental design principles of object-oriented design that are likely to be the most useful to you as you become that designer extraordinaire. I’ll list them in the next section and then explain them and give examples in the rest of the chapter.

© John F. Dooley 2017 J. F. Dooley, Software Development, Design and Coding, https://doi.org/10.1007/978-1-4842-3153-1_10

121

Chapter 10 ■ Object-Oriented Design Principles

List of Fundamental Object-Oriented Design Principles Here are the nine fundamental principles: 1. Encapsulate things in your design that are likely to change. 2. Code to an interface rather than to an implementation. 3. The Open-Closed Principle (OCP): Classes should be open for extension and closed for modification. 4. The Don’t Repeat Yourself Principle (DRY): Avoid duplicate code. Whenever you find common behavior in two or more places, look to abstract that behavior into a class and then reuse that behavior in the common concrete classes. Satisfy one requirement in one place in your code. 5. The Single Responsibility Principle (SRP): Every object in your system should have a single responsibility, and all the object’s services should be focused on carrying out that responsibility. Another way of saying this is that a cohesive class does one thing well and doesn’t try to do anything else. This implies that higher cohesion is better. It also means that each class in your program should have only one reason to change. 6. The Liskov Substitution Principle (LSP): Subtypes must be substitutable for their base types. (In other words, inheritance should be well designed and well behaved.) 7. The Dependency Inversion Principle (DIP): Don’t depend on concrete classes; depend on abstractions. 8. The Interface Segregation Principle (ISP): Clients shouldn’t have to depend on interfaces they don’t use. 9. The Principle of Least Knowledge (PLK) (also known as the Law of Demeter): Talk only to your immediate friends. This also relates to the idea of loose coupling. Objects that interact should be loosely coupled with well-defined interfaces. As you probably notice, there’s some overlap here, and one or more of the design principles may depend on others. That’s okay. It’s the fundamentals that count. Let’s go through these one at a time.

Encapsulate Things in Your Design That Are Likely to Change This first principle means to protect your classes from unnecessary change by separating the features and methods of a class that remain relatively constant throughout the program from those that will change. By separating the two types of features, we isolate the parts that will change a lot into a separate class (or classes) that we can depend on changing, and we increase our flexibility and ease of change. We also leave the stable parts of our design alone, so that we just need to implement them once and test them once. (Well, we hope.) This protects the stable parts of the design from any unnecessary changes. Let’s create a very simple class called Violinist. Figure 10-1 is a class diagram for the Violinist class.

122

Chapter 10 ■ Object-Oriented Design Principles

Figure 10-1. A Violinist Notice that the setUpMusic() and tuneInstrument() methods are pretty stable. But what about the play() method? It turns out that there are several different types of playing styles for violins: classical, bluegrass, and Celtic, just to name three. That means the play() method will vary, depending on the playing style. Because we have a behavior that will change depending on the playing style, maybe we should abstract that behavior out and encapsulate it in another class? If we do that, we get something like Figure 10-2.

Figure 10-2. Violinist and playing styles Notice that we’re using association between the Violinist class and the ViolinStyle abstract class. This allows Violinist to use the concrete classes that inherit the abstract method from the abstract ViolinStyle class. We’ve abstracted out and encapsulated the play() method—which will vary—in a separate class so that we can isolate any changes we want to make to the playing style from the other stable behaviors in Violinist.

Code to an Interface Rather Than to an Implementation The normal response to this design principle is: “Huh? What does that mean?” Well, here’s the idea. This principle, like many of the principles in this chapter, has to do with inheritance and how you use it in your program. Say you have a program that will model different types of geometric shapes in two dimensions. We’ll have a class Point that will represent a single point in 2D, and we’ll have an interface named Shape that will abstract out a couple of things that all shapes have in common: areas and perimeters. (Okay, circles and ellipses call it circumference—bear with me.) Figure 10-3 shows what we’ve got.

123

Chapter 10 ■ Object-Oriented Design Principles

Figure 10-3. A simple Point class and the common Shape interface If we want to create concrete classes of some different shapes, we’ll implement the Shape interface. This means the concrete classes must implement each of the abstract methods in the Shape interface. See Figure 10-4.

Figure 10-4. Rectangle, Circle, and Triangle all implement Shape Now we’ve got a number of classes that represent different geometric shapes. How do we use them? Say we’re writing an application that will manipulate a geometric shape. We can do this in two different ways. First, we can write a separate application for each geometric shape, as shown in Figure 10-5.

Figure 10-5. Using the geometric objects

124

Chapter 10 ■ Object-Oriented Design Principles

What’s wrong with these apps? Well, we’ve got three different applications doing the same thing. If we want to add another shape—say, a rhombus—we’d have to write two new classes, the Rhombus class, which implements the Shape interface, and a new RhombusApp class. Yuk! This is inefficient. We’ve coded to the implementation of the geometric shape rather than coding to the interface itself. How do we fix this? The thing to realize is that the interface is the top of a class hierarchy of all the classes that implement the interface. As such, it’s a class type, and we can use it to help us implement polymorphism in our program. In this case, because we have some number of geometric shapes that implement the Shape interface, we can create an array of Shapes that we can fill up with different types of shapes and then iterate through. In Java, we’ll use the List collection type to hold our shapes: import java.util.*; /** * ShapeTest - test the Shape interface implementations. * * @author fred * @version 1.0 */ public class ShapeTest {     public static void main(String [] args) {         List figures = new ArrayList();         figures.add(new Rectangle(10, 20));         figures.add(new Circle(10));         Point p1 = new Point(0.0, 0.0);         Point p2 = new Point(5.0, 1.0);         Point p3 = new Point(2.0, 8.0);         figures.add(new Triangle(p1, p2, p3));         Iterator iter = figures.iterator();         while (iter.hasNext()) {             Shape nxt =  iter.next();             System.out.printf("area = %8.4f perimeter = %8.4f\n",                 nxt.computeArea(), nxt.computePerimeter());         }     } } When you code to the interface, your program becomes easier to extend and modify. Your program will work with all the interface’s sub-classes seamlessly. As an aside, the principles just given let you know that you should be constantly reviewing your design. Changing your design will force your code to change because of the need to refactor. Your design is iterative. Pride kills good design; don’t be afraid to revisit your design decisions.

125

Chapter 10 ■ Object-Oriented Design Principles

The Open-Closed Principle The Open-Closed Principle (OCP) says that classes should be open for extension and closed for modification.1 What this means is to find the behavior in a class that doesn’t vary and abstract that behavior up into a super/base class. That locks the base code away from modification, but all sub-classes will inherit that behavior. You are encapsulating the behavior that varies in the sub-classes (those classes that extend the base class) and closing the base class from modification. The bottom line here is that in your well-designed code, you add new features not by modifying existing code (it’s closed for modification), but by adding new code (it’s open for extension). The BankAccount class example that we did in the previous chapter is a classic example of the OpenClosed Principle at work. In that example, we abstracted all the personal information into the abstract BankAccount class along with any methods to access and modify the data, closed it from modification, and then extended that class into the different types of bank accounts. In this situation, it’s very easy to add new types of bank accounts just by extending the BankAccount class again. We avoid duplication of code and we preserve the integrity of the BankAccount properties. See Figure 10-6.

Figure 10-6. The classic BankAccount example for OCP For example, in the BankAccount class, we define the withdraw() method that allows a customer to withdraw funds from an account. But the way in which withdrawals occur can differ in each of the extended account classes. While the withdraw() method is closed for modification in the BankAccount class, it can be overridden in the sub-classes to implement the specific rules for that type of account and thus extend the power of the method. It’s closed for modification but open for extension.

1

Larman, C. “Protected Variation: The Importance of Being Closed.” IEEE Software 18(3): 89-91 (2001).

126

Chapter 10 ■ Object-Oriented Design Principles

The Open-Closed Principle doesn’t have to be limited to inheritance either. If you have several private methods in a class, those methods are closed for modification, but if you then create one or more public methods that use the private methods, you’ve opened up the possibility of extending those private methods by adding functionality in the public methods.

The Don’t Repeat Yourself Principle This principle says to avoid duplicate code by abstracting out things that are common and placing those things in a single location.2 Don’t repeat yourself (DRY) is the motherhood and apple pie design principle. It’s been handed down ever since developers started thinking about better ways to write programs. Go back and look at Chapters 6 and 7 for discussion of this. With DRY, you have each piece of information and each behavior in a single place in the design. Ideally you have one requirement in one place. This means you should create your design so that there is one logical place where a requirement is implemented. Then if you have to change the requirement you have only one place to look to make the change. You also remove duplicate code and replace it with method calls. If you’re duplicating code, you’re duplicating behavior. DRY doesn’t have to apply just to your code either. It’s always a good idea to comb your feature list and requirements for duplications. Rewriting requirements to avoid duplicating features in the code will make your code much easier to maintain. Consider the final version of the B4++ bird feeder discussed in Chapter 9. The last thing we worked on was adding a song identifier to the feeder so that the feeding doors would open and close automatically. But let’s look at the two use cases we ended up with (see Table 10-1). Table 10-1. The Song Identifier Use Case and Its Alternate

Main Path

Alternate Path

1. Alice hears or sees birds at the bird feeder.

1.1 The songbird identifier hears birdsong.

2. Alice determines that they are not songbirds.

2.1 The songbird identifier recognizes the song as from an unwanted bird.

3. Alice presses the remote control button.

3.1 The songbird identifier sends a message to the feeding doors to close.

4. The feeding doors close. 5. The birds give up and fly away.

5.1 The songbird identifier hears birdsong. 5.2 The songbird identifier recognizes the song as from a songbird.

6. Alice presses the remote control button.

6.1 The songbird identifier sends a message to the feeding doors to open.

7. The feeding doors open again. Notice that we’re opening and closing the feeding doors in two different places—via the remote control and via the song identifier. But if you think about it, regardless of where we request the doors to be open or closed, they always open and close in the same way. So, this is a classic opportunity to abstract out the open and close door behavior and put them in a single place—say, the FeedingDoor class. DRY at work! Hunt, A. and D. Thomas. The Pragmatic Programmer: From Journeyman to Master. (Boston, MA: Addison-Wesley, 2000.)

2

127

Chapter 10 ■ Object-Oriented Design Principles

The Single Responsibility Principle This principle says that a class should have one, and only one, reason to change.3 Here’s an example of the overlap between these design principles that was mentioned above: SRP, the first principle about encapsulation, and DRY all say similar but slightly different things. Encapsulation is about abstracting behavior and putting things in your design that are likely to change in the same place. DRY is about avoiding duplicating code by putting identical behaviors in the same place. SRP is about designing your classes so that each does just one thing, and does it very well. Every object should have a single responsibility, and all the object’s services are targeted towards carrying out that responsibility. Each class should have only one reason to change. Put simply, this means beware of having your class try to do too many things. As an example, let's say we’re writing the embedded code for a mobile phone. After months (really) of discussions with the marketing folks, our first cut at a MobilePhone class looks like Figure 10-7.

Figure 10-7. A very busy MobilePhone class This class seems to incorporate a lot of what we would want a mobile phone to do, but it violates the SRP in several different ways. This class is not trying to do a single thing—it’s trying to do way too many things: make and receive phone calls, create, send, and receive text messages, create, send, and receive pictures, browse the Internet. The class doesn’t have a single responsibility. It has many. But we don’t want a single class to be impacted by these completely different forces. We don’t want to modify the MobilePhone class every time the picture format is changed, or every time we want to add a new picture-editing feature, or every time the browser changes. Rather, we want to separate these functions out into different classes so that they can change independently of each other. So how do we recognize the things that should move out of this class, and how do we recognize the things that should stay? Have a look at Figure 10-8.

McLaughlin, Brett D., et. al., Head First Object-Oriented Analysis & Design. (Sebastopol, CA: O’Reilly Media, Inc., 2007.)

3

128

Chapter 10 ■ Object-Oriented Design Principles

Figure 10-8. Mobile phone classes each with a single responsibility In this example, we ask the question “What does the mobile phone do (to itself )?” as opposed to “What services are offered by the mobile phone?” By asking questions like this, we can start to separate out the responsibilities of the objects in the design. In this case, we can see that the phone itself can get its own phone number, initialize itself, and connect itself to the mobile phone network. The services offered, on the other hand, are really independent of the actual mobile phone, and so can be separated out into PhoneCall, TextMsg, and Picture classes. So we divide up the initial one class into four separate classes, each with a single responsibility. This way we can change any of the four classes without affecting the others. We then add a controller class that runs the whole phone. That way we can add new features as well. We’ve simplified the design (although we’ve got more classes), and made it easier to extend and modify. Is that a great principle, or what?

The Liskov Substitution Principle The Liskov Substitution Principle (LSP), named after Turing Award–winner Dr. Barbara Liskov of MIT, tells us that all sub-classes must be substitutable for their base class.4 This principle says that inheritance should be well designed and well behaved. In any case, a user should be able to instantiate an object as a sub-class and use all the base class functionality invisibly. To illustrate the LSP, most books give an example that violates it and say, “Don’t do that,” which is just what we’ll do. One of the best and canonical examples of violating the LSP is the Rectangle/Square example. The example is all over the Internet. Robert Martin gives a great variation on this example in his book Agile Software Development, Principles, Patterns, and Practices,5 and we’ll follow his version of the example. Here it is in Java.

Wintour, Damien. “The Liskov Substitution Principle.” 1988. Downloaded on September 14, 2010 from www.necessaryandsufficient.net/2008/09/design-guidelines-part3-the-liskov-substitution-principle/. 5 Martin, R. C. Agile Software Development: Principles, Patterns, and Practices. (Upper Saddle River, NJ: Prentice Hall, 2003.) 4

129

Chapter 10 ■ Object-Oriented Design Principles

Say you have a class Rectangle that represents the geometric shape of a rectangle: /** * class Rectangle */ public class Rectangle {     private double width;     private double height;     /**      * Constructor for objects of class Rectangle      */     public Rectangle(double width, double height) {         this.width = width;         this.height = height;     }     public void setWidth(double width) {         this.width = width;     }     public void setHeight(double height) {         this.height = height;     }     public double getHeight() {         return this.height;     }     public double getWidth() {         return this.width;     } } And, of course, one of your users wants to have the ability to manipulate squares as well as rectangles. You already know that squares are just a special case of rectangles. In other words, a Square is a Rectangle. This problem therefore seems to require using inheritance. So, you create a Square class that inherits from Rectangle: /** * class Square */ public class Square extends Rectangle {     /**      * Constructor for objects of class Square      */     public Square(double side) {         super(side, side);     }

130

Chapter 10 ■ Object-Oriented Design Principles

    public void setSide(double side) {         super.setWidth(side);         super.setHeight(side);     }     public double getSide() {         return super.getWidth();     } } This seems to be okay. Notice that because the width and height of a Square are the same, we couldn’t run the risk of changing them individually, so setSide() uses setWidth() and setHeight() to set both for the sides of a Square. No big deal, right? Well, if we have a function like void myFunc(Rectangle r, double newWidth) {         r.setWidth(newWidth); } and we pass myFunc() a Rectangle object, it works just fine, changing the width of the Rectangle. But what if we pass myFunc() a Square object? It turns out that in Java the same thing happens as before, but that’s wrong. It violates the integrity of the Square object by just changing its width without changing its height as well. So we’ve violated the LSP here, and the Square can’t substitute for a Rectangle without changing the behavior of the Square. The LSP says that the sub-class (Square) should be able to substitute for the super class (Rectangle), but it doesn’t in this case. We can get around this. We can override the Rectangle class’ setWidth() and setHeight() methods in Square like this: public void setWidth(double w) {     super.setWidth(w);     super.setHeight(w); } public void setHeight(double h) {     super.setWidth(h);     super.setHeight(h); } These will both work and we’ll get the right answers and preserve the invariants of the Square object, but what’s the point in that? If we have to override a bunch of methods we’ve inherited, then what’s the point of using inheritance to begin with? That’s what the LSP is all about: getting the behavior of derived classes right and thus getting inheritance right. If we think of the base class as being a contract that we adhere to (remember the Open-Closed Principle?), then the LSP is saying that you must adhere to the contract even for derived classes. Oh, by the way, this works in Java because Java public methods are all virtual methods and are thus able to be overridden. If we had defined setWidth() and setHeight() in Rectangle with a final keyword or if they had been private, then we couldn’t have overridden them. In fact, private versions of those methods would not have been inherited to begin with. In this example, although a square is mathematically a specialized type of rectangle and one where the invariants related to rectangles still hold, that mathematical definition just doesn't work in Java. In this case, you don’t want to have Square be a sub-class of Rectangle; inheritance doesn’t work for you in this case because you think about rectangles having two different kinds of sides—length and width—and squares having only one kind of side. So, if a Square class inherits from a Rectangle class, the image of what a Square is versus what a Rectangle is gets in the way of the code. Inheritance is just the wrong thing to use here.

131

Chapter 10 ■ Object-Oriented Design Principles

How can you tell when you’re likely to be violating the Liskov Substitution Principle? Indications that you’re violating LSP include the following: •

A sub-class doesn’t keep all the external observable behavior of its super class.

•

A sub-class modifies, rather than extends, the external observable behavior of its super class.

•

A sub-class throws exceptions in an effort to hide certain behavior defined in its super class.

•

A sub-class that overrides a virtual method defined in its super class using an empty implementation in order to hide certain behavior defined in its super class.

•

Method overriding in derived classes is the biggest cause of LSP violations.6

Sometimes inheritance just isn’t the right thing to do. Luckily, you’ve got options. It turns out there are other ways to share the behavior and attributes of other classes. The three most common are delegation, composition, and aggregation. Delegation is what every manager should do: give away work and let someone else do it. If you want to use the behaviors in another class but you don’t want to change that behavior, consider using delegation instead of inheritance. Delegation says to give responsibility for the behavior to another class; this creates an association between the classes. Association in this sense means that the classes are related to each other, usually through an attribute or a set of related methods. Delegation has a great side benefit: it shields your objects from any implementation changes in other objects in your program; you’re not using inheritance, so encapsulation protects you.7 Let's show a bit of how delegation works with an example. When last we left Alice and Bob and their B4++, Alice was tired of using the remote to open and close the feeding doors to keep away the non-song birds. So, they’d requested yet another new feature: an automatic song identifier. With the song identifier, the B4++ itself would recognize songbird songs and open the doors, and keep them closed for all other birds. We can think of this in a couple of ways. The BirdFeeder class, because of the Single Responsibility Principle, shouldn’t do the identification of bird songs, but it should know what songs are allowed. We’ll need a new class, SongIdentifier, that will do the song identification. We’ll also need a Song object that contains a birdsong. Figure 10-9 shows what we’ve got so far.

Figure 10-9. A first cut at the song identifier feature

6 7

Wintour, 1998. Mclaughlin, 2007.

132

Chapter 10 ■ Object-Oriented Design Principles

The BirdFeeder knows about birdsong and keeps a list of the allowed songs for the feeder. The SongIdentifier has the single job of identifying a given song. There are two ways that this can happen. The first is that the SongIdentifier class can do the work itself in the identify() method. That would mean that SongIdentifier would need an equals() method in order to do the comparison between two songs (the allowed song from the door, and the song that the new B4++ hardware just sent to us). The second way of identifying songs is for the Song class to do it itself, using its own equals() method. Which should we choose? Well, if we do all the identification in the SongIdentifier class, that means that any time anything changes in a Song, we’ll have to change both the Song class and the SongIdentifier class. This doesn’t sound optimal. But! If we delegate the song comparison work to the Song class, then the SongIdentifier’s identify() method could just take a Song as an input parameter and call that method, and we’ve isolated any Song changes to just the Song class. Figure 10-10 shows the revised class diagrams.

Figure 10-10. Simplifying SongIdentifier and Song And our corresponding code might look like this: public class SongIdentifier {     private BirdFeeder feeder;     private FeedingDoor door;     public SongIdentifier(BirdFeeder feeder) {         this.door = feeder.getDoor();     }     public void identify(Song song) {         List songs = feeder.getSongs();         Iterator song_iter = songs.iterator();         while (song_iter.hasNext()) {            Song nxtSong = song_iter.next();            if (nxtSong.equals(song)) {                door.open();                return;            }         }         door.close();     } } public class Song {     private File song;

133

Chapter 10 ■ Object-Oriented Design Principles

    public Song(File song) {         this.song = song;     }     public File getSong() {         return this.song;     }     public boolean equals(Object newSong) {         if (newSong instanceof Song) {             Song newSong2 = (Song) newSong;             if (this.song.equals(newSong2.song)) {                 return true;             }         }         return false;    } } In this implementation, if we change anything with regards to a Song, then the only changes we make will be in the Song class, and SongIdentifier is insulated from those changes. The behavior of the Song class doesn’t change, although how it implements that behavior might. SongIdentifier doesn’t care how the behavior is implemented as long as it’s always the same behavior. BirdFeeder has delegated the work of handling birdsong to the SongIdentifier class, and SongIdentifier has delegated the work of comparing songs to the Song class, all without using inheritance. Delegation allows you to give away the responsibility for a behavior to another class and not have to worry about changing the behavior in your class. You can count on the behavior in the delegated class not changing. But sometimes you will want to use an entire set of behaviors simultaneously, and delegation doesn’t work for that. Instead, if you want to have your program use that set of behaviors, you need to use composition. We use composition to assemble behaviors from other classes. Say you’re putting together a space-based role playing game (RPG) called Space Rangers. One of the things you’ll model in your game is the spaceships themselves. Spaceships will have lots of different characteristics. For example, there are different types of ships: shuttles, traders, fighters, freighters, capital ships, and so on. Each ship will also have different characteristics, weapons, shields, cargo capacity, number of crew, and so on. But what will all the ships have in common? Well, if you want to create a generic Ship class, it will be hard to gather all these things together in a single Ship super class so you can create sub-classes for things like Shuttle, Fighter, Freighter, and the like. They’re all just too different. This seems to imply that inheritance isn’t the way to go here. So back to our question: what do all the ships have in common? We can say that all the Ships in Space Rangers have just two things in common: a ship type and a set of properties that relate to that ship type. This gets us to our first class diagram, shown in Figure 10-11.

134

Chapter 10 ■ Object-Oriented Design Principles

Figure 10-11. What do all spaceships have in common? This allows us to store the spaceship type and a map of the various properties for an instance of a ship. It means we can then develop the properties independently from the ships, and then different ships can share similar properties. For example, all ships can have weapons, but they can have different ones with different characteristics. This leads us to develop a weapons interface that we can then use to implement particular classes. We get to use these weapons in our SpaceShip by using composition. Remember that composition allows us to use an entire family of behaviors that we can be guaranteed won’t change. See Figure 10-12.

Figure 10-12. Using composition to allow the SpaceShip to use Weapons Remember that the open triangle in the UML diagram means inheritance (or in the case of an interface, it means implements). The closed diamond in UML means composition. So, in this design we can add several weapons to our properties map, and each weapon can have different characteristics, but all of them exhibit the same behavior. Note that in composition the component objects (Weapons) become part of a larger object (SpaceShip), and when the larger object goes away (you get blown up), so do the components. The object that’s composed of other behaviors owns those behaviors. When that object is destroyed, so are all its behaviors. The behaviors in a composition don’t exist outside of the composition itself. When your SpaceShip is blown up, so are all your Weapons.

135

Chapter 10 ■ Object-Oriented Design Principles

Of course, sometimes you want to put together a set of objects and behaviors in such a way that when one of them is removed, the others continue in existence. That’s what aggregation is all about. If the behaviors need to persist, then you must aggregate. Aggregation is when one class is used as a part of another class, but still exists outside of that other class. If the object does make sense existing on its own, then use aggregation— otherwise use composition. For example, a library is an example of aggregation. Each book makes sense on its own, but the aggregation of them all is a library. The key is to show an instance where it makes sense to use a component outside a composition, implying that it should have a separate existence. In Space Rangers, we can have Pilot objects in addition to SpaceShip objects. A Pilot can also carry weapons. Different ones, of course—Pilots probably don’t carry Cannon objects with them. Say a Pilot is carrying around a HandBlaster, so in object-oriented speak he’s using the behaviors of the HandBlaster. If a mad SpaceCow accidentally crushes the Pilot, is the weapon destroyed along with the Pilot? Probably not, hence the need for a mechanism where the HandBlaster can be used by a Pilot but has an existence outside of the Pilot class. Ta-da! Aggregation! We’ve seen three different mechanisms that allow objects to use the behaviors of other objects, none of which requires inheritance. As it’s said in Object-Oriented Analysis and Design, “If you favor delegation, composition, and aggregation over inheritance, your software will usually be more flexible and easier to maintain, extend, and reuse.8 ”

The Dependency Inversion Principle Robert C. Martin introduced the Dependency Inversion Principle (DIP) in his C++ Report and later in his classic book Agile Software Development.9 In his book, Martin defined the DIP as follows: 1. High-level modules should not depend on low-level modules. Both should depend on abstraction. 2. Abstractions should not depend on details. Details should depend on abstractions. The simple version of this is: don’t depend on concrete classes, depend on abstractions. Martin’s contention is that object-oriented design is the inverse of traditional structured design. In structured design, as we saw in Chapter 7, one either works from the top down, pushing details and design decisions as low in the hierarchy of software layers as possible, or one works from the bottom up, designing low-level details first and later putting together a set of low-level functions into a single higher-level abstraction. In both cases, the higher-level software depends on decisions made at the lower levels, including interface and behavioral decisions. Martin contends that for object-oriented design, this is backward. The Dependency Inversion Principle implies that higher-level (more abstract) design levels should create an interface that lower (more concrete) levels should code to. This will mean that as long as the lower level—concrete—classes code to the interface of the upper level abstraction, the upper level classes are safe. As Martin puts it, “The modules that contain the high-level business rules should take precedence over, and be independent of, the modules that contain the implementation details. High-level modules simply should not depend on low-level modules in any way.” Here’s a simple example. Traditionally, in structured design we write many programs with this general format: 1. Get input data from somewhere. 2. Process the data. 3. Write the data to somewhere else.

8 9

McLaughlin, 2007. Martin, 2003.

136

Chapter 10 ■ Object-Oriented Design Principles

In this example, the Processor uses the Collector to get data. It then packages the data and uses the Writer to write the data to, say, a database. If we draw this out, we get something that looks like Figure 10-13.

Figure 10-13. A traditional input-process-output model One problem with this implementation is that the Processor must create and use the Writer whose interface and parameter types it must know in order to write correctly. This means that the Processor must be written to a concrete implementation of a Writer and so must be rewritten if we want to change what kind of Writer we want. Say the first implementation writes to a File; if we then want to write to a printer or a database, we need to change Processor every time. That’s not very reusable. The Dependency Inversion Principle says that the Processor should be coded to an interface (we abstract Writer), and then the interface is implemented in separate concrete classes for each type of Writer destination. The resulting design looks like Figure 10-14.

Figure 10-14. Using an interface to allow different writer implementations In this way, different writers can be added, and as long as they adhere to the interface, Processor never needs to change. Note that the DIP is closely related to Principle #2: Code to an Interface.

137

Chapter 10 ■ Object-Oriented Design Principles

The Interface Segregation Principle This principle tells us that clients shouldn’t have to depend on interfaces they don’t use. In particular, they shouldn’t have to depend on methods they don’t use.10 We’ve talked a lot about interfaces in this chapter. Coding to interfaces, using interfaces to abstract out common details, and so on. We use interfaces to make our code more flexible and maintainable. So overall, interfaces are a great thing, right? Well, there are a few things you must beware of. One of the greatest temptations with respect to interfaces is to make them bigger. If an interface is good, then a bigger interface must be better, right? After all, you can then use the interface in many more objects and the user just has to not implement certain methods that they don’t need. However, by doing that you are ruining the cohesion of your interface. By “generalizing” an interface too much, you’re moving away from that single lightning bolt of a set of methods that are all closely related to each other to a jumble of methods that say hello to each other in passing. Remember: cohesion is good. Your applications should be cohesive, and the classes and interfaces they depend on should also be cohesive. You make your interfaces less cohesive and begin to violate the Interface Segregation Principle when you start adding new methods to your interface because one of the sub-classes that implements the interface needs it—and others do not. So what’s the answer here? How do we keep our interfaces cohesive and still make them useful for a range of objects? The answer is to make more interfaces. The Interface Segregation Principle implies that instead of adding new methods that are only appropriate to one or a few implementation classes, you make a new interface. You divide your bloated interface into two or more smaller, more cohesive interfaces. That way, new classes can just implement the interfaces they need and not implement ones that they don’t.

The Principle of Least Knowledge Also known as the Law of Demeter, the Principle of Least Knowledge (PLK) says to talk only to your immediate friends.11 The complement to strong cohesion in an application is loose coupling. That’s what the Principle of Least Knowledge is all about. It says that classes should collaborate indirectly with as few other classes as possible.12 Here’s an example. You’ve got a computer system in your car—we all do these days. Say you’re writing an application that graphs temperature data in the car. There are a number of sensors that provide temperature data and that are part of a family of sensors in the car’s engine. Your program should select a sensor and gather and plot its temperature data. (This example is derived from one found in Hunt).13 Part of your program might look like this: public void plotTemperature(Sensor theSensor) {         double temp = theSensor.getSensorData().getOilData().getTemp();         ... } This will likely work—once. But now you’ve coupled your temperature-plotting method to the Sensor, SensorData, and OilSensor classes. Which means that a change to any one of them could affect your plotTemperature() method and cause you to have to refactor your code. Not good. Martin, 2003. Martin, 2003. 12 Lieberherr, K., I. Holland, et al. Object-Oriented Programming: An Objective Sense of Style. OOPSLA ’88, Association for Computing Machinery, 1988. 13 Hunt, 2000. 10 11

138

Chapter 10 ■ Object-Oriented Design Principles

This is what the PLK urges you to avoid. Instead of linking your method to a hierarchy and having to traverse the hierarchy to get the service you’re looking for, just ask for the data directly: public void plotTemperature(double theData) {         ... } ... plotTemperature(aSensor.getTemp()); Yup, we had to add a method to the Sensor class to get the temperature for us, but that’s a small price to pay for cleaning up the mess (and the possible errors) earlier. Now your class is just collaborating directly with one class and letting that class take care of the others. Of course, your Sensor class will do the same thing with SensorData, and so on. This leads us to a corollary to the PLK: keep dependencies to a minimum. This is the crux of loose coupling. By interacting with only a few other classes, you make your class more flexible and less likely to contain errors.

Class Design Guidelines Finally, I’d like to present a list of 23 class design guidelines. These guidelines are somewhat more specific than the general design guidelines described earlier, but they’re handy to have around. These 23 class design guidelines are taken from Davis14 and McConnell15: 1. Present a consistent level of abstraction in the class interface. 2. Be sure you understand what abstraction the class is implementing. 3. Move unrelated information to a different class (ISP). 4. Beware of erosion of the class’s interface when you’re making changes (ISP). 5. Don’t add public members that are inconsistent with the interface abstraction. 6. Minimize accessibility of classes and members (OCP). 7. Don’t expose member data in public. 8. Avoid putting private implementation details into the class’s interface. 9. Avoid putting methods into the public interface. 10. Watch for coupling that’s too tight (PLK). 11. Try to implement “has a” relations through containment within a class (SRP). 12. Implement “is a” relations through inheritance (LSP). 13. Only inherit if the derived class is a more specific version of the base class. 14. Be sure to inherit only what you want to inherit (LSP). 15. Move common interfaces, data, and operations as high in the inheritance hierarchy as possible (DRY).

14 15

Davis, A. M. 201 Principles of Software Development. (New York, NY: McGraw-Hill, Inc., 1995.) McConnell, Steve, Code Complete, 2nd Edition. (Redmond, WA: Microsoft Press, 2004.)

139

Chapter 10 ■ Object-Oriented Design Principles

16. Be suspicious of classes of which there is only one instance. 17. Be suspicious of base classes that only have a single derived class. 18. Avoid deep inheritance trees (LSP). 19. Keep the number of methods in a class as small as possible. 20. Minimize indirect method calls to other classes (PLK). 21. Initialize all member data in all constructors, if possible. 22. Eliminate data-only classes. 23. Eliminate operation-only classes.

Conclusion This chapter examined a number of rules of thumb about object-oriented design that have evolved over the last few decades. These design principles act as guidelines for you, the designer, to abide by so that your design ends up being a good one, easy to implement, easy to maintain, and that does just what your customer wants. Importantly, these design principles give guidance when you’re feeling your way from features to design. They talk about ways to examine and implement the important object-oriented principles of inheritance, encapsulation, polymorphism, and abstraction. They also reinforce basic design principles like cohesion and coupling.

References Davis, A. M. 201 Principles of Software Development. (New York, NY: McGraw-Hill, Inc., 1995.) Hunt, A. and D. Thomas. The Pragmatic Programmer: From Journeyman to Master. (Boston, MA: Addison-Wesley, 2000.) Larman, C. “Protected Variation: The Importance of Being Closed.” IEEE Software 18(3): 89-91 (2001). Lieberherr, K., I. Holland, et al. Object-Oriented Programming: An Objective Sense of Style. OOPSLA ’88, Association for Computing Machinery, 1988. Martin, R. C. Agile Software Development: Principles, Patterns, and Practices. (Upper Saddle River, NJ: Prentice Hall, 2003.) McConnell, Steve, Code Complete, 2nd Edition. (Redmond, WA: Microsoft Press, 2004.) McLaughlin, Brett D., et. al., Head First Object-Oriented Analysis & Design. (Sebastopol, CA: O’Reilly Media, Inc., 2007.) Wintour, Damien. “The Liskov Substitution Principle.” 1988. Downloaded on September 14, 2010 from www.necessaryandsufficient.net/2008/09/design-guidelines-part3-the-liskov-substitutionprinciple/.

140

CHAPTER 11

Design Patterns Each pattern describes a problem which occurs over and over again in our environment, and then describes the core of the solution to that problem, in such a way that you can use this solution a million times over, without ever doing it the same way twice. —Christopher Alexander1 Do you reinvent the wheel each time you write code? Do you have to relearn how to iterate through an array every time you write a program? Do you have to reinvent how to fix a dangling else in every ifstatement you write? Do you need to relearn insertion sort or binary search every time you want to use them? Of course not! Over the time you’ve spent writing programs, you’ve learned a set of idioms that you employ whenever you’re writing code. For example, if you need to iterate through all the elements of an array in Java, you’re likely to do the following: for (int i = 0; i < myArray.length; i++) {          System.out.printf(" %d ", myArray[i]); } or for (int nextElement: myArray) {         System.out.printf(" %d ", nextElement); } And the code just flows out of your fingertips as you type. These code patterns are sets of rules and templates for code that you accumulate as you gain more experience writing programs.

Alexander, C., S. Ishikawa, et al. A Pattern Language: Towns, Buildings, Construction. (Oxford, UK: Oxford University Press, 1977.)

1

© John F. Dooley 2017 J. F. Dooley, Software Development, Design and Coding, https://doi.org/10.1007/978-1-4842-3153-1_11

141

Chapter 11 ■ Design Patterns

Design patterns are the same thing—but for your design. If you take the time to learn a core group of design patterns, it will make your code more uniform and readable and improve its overall quality over time. The famous architect Christopher Alexander, in his book A Pattern Language, defined patterns for design in architecture. The same ideas carry over into software design. If you go back and read the Alexander quote at the top of this chapter, you’ll see the following three key elements in Alexander’s definition of design pattern: •

Recurring: The problem that evokes the design pattern must be a common one.

•

Core solution: The pattern provides a template for the solution; it tries to extract out the essence of the solution.

•

Reuse: The pattern must be easily reusable when the same problem appears in different domains.

In fact, you’ve already seen at least one design pattern so far in this book: the Model-View-Controller pattern (MVC) discussed in Chapter 5 is one of the earliest published examples of a software design pattern2. The MVC design pattern is used with programs that use graphical user interfaces. It divides the program into three parts: the Model that contains the processing rules for the program, the View that presents the data and the interface to the user, and the Controller that mediates communication between the Model and the View. In a typical object-oriented implementation, each of these abstractions becomes a separate object. The Gang of Four (Gamma, Helm, Johnson, and Vlissides), in their seminal book on design patterns, Design Patterns: Elements of Reusable Object-Oriented Software,3 define a design pattern as something that “names, abstracts, and identifies the key aspects of a common design structure that makes it useful for creating a reusable object-oriented design.” In other words, a design pattern is a named abstraction from a concrete example that represents a recurring solution to a particular, but common, problem—recurring, core solution, reuse. But why do we need design patterns in the first place? Why can’t we just get along with the objectoriented design principles we studied in Chapter 10 and with our old favorites abstraction, inheritance, polymorphism, and encapsulation? Well, it turns out that design is hard, that’s why. Design for reuse is even harder. Design is also much more of an art than a science or an engineering discipline. Experienced software designers rarely start from first principles; they look for similarities in the current problem to problems they’ve solved in the past. And they bring to the design table the set of design idioms they’ve learned over time. Design patterns provide a shared vocabulary that makes this expert knowledge available to everyone.

Design Patterns and the Gang of Four In their book, the Gang of Four describe design patterns as having four essential features: •

The Pattern Name: “. . . a handle we can use to describe a design problem, its solution, and consequences in a word or two. Naming a pattern immediately increases our design vocabulary.”

•

The Problem: Describes when to use the pattern. “It explains the problem and its context.”

Krasner, G. E. and S. T. Pope. “A cookbook for using the Model-View-Controller user interface paradigm in Smalltalk-80.” Journal of Object-Oriented Programming 1(3): 26-49 (1988). 3 Gamma, E., Helm, R., Johnson, R., Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software. (Boston, MA: Addison-Wesley, 1995.) 2

142

Chapter 11 ■ Design Patterns

•

The Solution: “. . . describes the elements that make up the design, their relationships, responsibilities, and collaborations . . . the pattern provides an abstract description of a design problem and how a general arrangement of elements solves it.”

•

The Consequences: The results and tradeoffs of applying the pattern to a problem. These include time and space tradeoffs, but also flexibility, extensibility, and portability, among others.4

Design patterns are classified using two criteria: scope and purpose. Scope deals with the relationships between classes and objects. Static relationships between classes are fixed at compile time, whereas, dynamic relationships apply to objects and these relationships can change at run-time. Purpose, of course, deals with what the pattern does with respect to classes and objects. Patterns can deal with object creation, composition of classes or objects, or the ways in which objects interact and distribute responsibilities in the program. The Gang of Four describe 23 different design patterns in their book, dividing them into three different classes of patterns: creational, structural, and behavioral. •

Creational design patterns: Dealing with when and how objects are created, these patterns typically create objects for you, relieving you of the need to instantiate those objects directly.

•

Structural design patterns: These describe how objects are composed into larger groups.

•

Behavioral design patterns: These generally talk about how responsibilities are distributed in the design and how communication happens between objects.

The list is not meant to be complete, and over the years since the publication of the Gang of Four’s Design Patterns book, many more patterns have been added to this original list by developers everywhere. A recent Google search for the phrase “design patterns” yielded more than 10.6 million hits, so lots of object-oriented developers have jumped on the design patterns bandwagon.

The Classic Design Patterns The 23 (classic) design patterns described by the Gang of Four are as follows (in the remainder of this chapter we’ll go over the design patterns that are in italics): •

4

Creational Patterns •

Abstract Factory

•

Builder

•

Factory Method

•

Prototype

•

Singleton

Gamma et. al, 1995.

143

Chapter 11 ■ Design Patterns

•

•

Structural Patterns •

Adapter

•

Bridge

•

Composite

•

Decorator

•

Façade

•

Flyweight

•

Proxy

Behavioral Patterns •

Chain of Responsibility

•

Command

•

Interpreter

•

Iterator

•

Mediator

•

Memento

•

Observer

•

State

•

Strategy

•

Template Method

•

Visitor

Patterns We Can Use The patterns in this section are a representative sample of the classic design patterns, and are those that you’ll find the most useful right away.

Creational Patterns Creational patterns all have to do with creating objects. If we think about class definitions as templates for producing objects, then these patterns are all about how to create those templates. The two patterns we’ll look at next, Singleton and Factory Method, show us two different ways of thinking about creating objects.

144

Chapter 11 ■ Design Patterns

The Singleton Pattern Singleton5 is almost certainly the easiest of the design patterns to understand and to code. The idea is simple: you are writing a program and you have a need for one—and only one—instance of a class. And you need to enforce that “and only one” requirement. Examples of programs that would use a Singleton pattern are things like print spoolers, window managers, device drivers, and the like. What are the implications of the “one, and only one” requirement? Well, first, it means your program can only say new Singleton() once, right? But what’s to stop other objects in your program (or objects in the program that you didn’t write) from issuing another new Singleton()? The answer is—nothing! As long as your class can be instantiated once, other objects should be able to instantiate it again and again. What we need to do is to create a class that can be instantiated, once and only once, and which doesn’t use new to do the instantiation. You heard me right: we need a class that can be instantiated without using new. Go ahead, think about it. Here’s what we’ll do. The method that gets called when an object is instantiated is the constructor. In Java you can say new Singleton() because the Singleton() constructor is public—it’s visible from outside the class definition. If we want to keep the constructor so we can make instances of Singleton objects, but we don’t want anyone to be able to use new to do that, we must make the constructor private. “But wait!” you cry. “If the constructor is private then we can’t instantiate the object at all!” There is a way around this problem. If the constructor is private, then it can only be accessed from inside the class definition, so it’s entirely possible to instantiate the object from within the class definition itself! But this then recurses to the problem of how do we get to the constructor from outside the class definition? Well, in Java is there a way to access a method inside a class without having to have an instantiation of the class? (Think the Math class.) Aha! Class methods! If you create a public method that is static (a class method), then that method is visible outside the class definition without having the object actually instantiated. So, if you create a class with a private constructor and then use a static method to create an instance of the class, you can control how many instances of the class you create. Here’s the code: public class Singleton {         // this is the reference to the instance that will hang around         private static Singleton uniqueInstance;         // the private constructor – can't be accessed from outside         private Singleton() {                 // do stuff here to initialize the instance         }         // here's the static method we'll use to create the instance         public static Singleton getInstance() {                 if (uniqueInstance == null) {                         uniqueInstance = new Singleton();                 }                 // if uniqueInstance is not null, then it already exists                 return uniqueInstance;         }         // Other methods – after all Singleton is a real class }

5

Gamma et. al, 1995.

145

Chapter 11 ■ Design Patterns

And in order to use the Singleton class, we’d do something like this: public class SingletonTest {         public static void main(String [] args) {                 Singleton mySingle;                 mySingle = Singleton.getInstance();                 // and we do other stuff here         } } When we instantiate the Singleton instance by calling the getInstance() method, it will test to see if we’ve done this before. If not, it creates the instance using the private constructor in the Singleton class. If the instance already exists (the uniqueInstance variable is not null), then we just return the reference to the object. This version of the Singleton pattern isn’t without its problems; for one thing, it can fail if you’re writing a multi-threaded Java program. The solution just mentioned isn’t “thread safe.” It’s possible that, in between the test for the existing of a Singleton instance and the actual creation of an instance, your program could be swapped out while another thread executes. When it swaps back in, it could erroneously create another instance of the Singleton. There are relatively easy solutions to this. The simplest way to make your Singleton pattern thread-safe is to make the getInstance() method a synchronized method. That way it will execute to completion and not be swapped out. Here’s a version of the getInstance() method that is thread-safe: public synchronized static Singleton getInstance() {         if (uniqueInstance == null) {                 uniqueInstance = new Singleton();         }         return uniqueInstance; } Notice that the only difference is the inclusion of the synchronized keyword in the method signature. I’ll give an example of how to use the Singleton pattern later in this chapter.

The Factory Method Pattern Say you’ve got a small company that’s expanding across multiple cities in several states. Whenever you sell an item, you need to compute the sales tax for the locale your store is in. Every state and city has a different sales tax rate, and so you’ll need to keep track of the locations and only use the correct rate in any particular location. As you add new stores in new locations you don’t want to have to continue to change lots of your code in order to compute the sales tax in the new locations. And, of course, you have a program to model your operations and compute the sales tax on each sale. The question is, how do you write the program in such a way that you don’t have to change it every day or change a lot of it every time you open a new store? That’s where the Factory Method design pattern comes in. The Factory Method pattern6 creates objects for you. The Factory Method pattern says that you just define an interface or abstract class for creating an object but let the sub-classes themselves decide which class to instantiate. In other words, sub-classes are responsible for creating the instance of the class. You use the Factory Method pattern when you need to create several types of objects that are usually related to each other—they usually have the same abstract parent class so they are in the same class hierarchy—but that are different.

6

Gamma, et. al, 1995.

146

Chapter 11 ■ Design Patterns

In the Factory method, you create an object without exposing the creation logic to the client and refer to the newly created object using a common interface. The Factory Method pattern allows the sub-classes to choose the type of objects to create, and they do that at run-time. Factory Method promotes loose coupling by eliminating the need to bind application-specific classes into the client code. That means the client code interacts solely with the resultant interface or abstract class, so that it will work with any classes that implement that interface or that extend that abstract class. When do you use the Factory pattern? 1. When you don’t know ahead of time what class object you will need 2. When all the potential classes are in the same sub-class hierarchy 3. To centralize class selection code 4. When you don’t want the user to have to know every sub-class 5. To encapsulate object creation You ask the factory to create an object of type X and it creates one; you ask it to create an object of type Y and it creates one. This forces the creation of concrete classes to be relegated to sub-classes of an interface that knows how to create concrete classes and keeps your other classes closed for modification. All without changing X or Y or the store. In our example, we can create sales tax calculation objects, using a SalesTaxFactory class to generate the objects that will compute sales tax for different locations. The Factory method pattern allows you to define an interface for creating a family of objects, and it allows sub-classes to decide which members of the family to instantiate. It defers instantiation down into the sub-classes. In our example, we’ll have several classes: •

SalesTax: An interface that defines our sales tax objects

•

BostonTax: A concrete class that inherits from Tax

•

ChicagoTax: A concrete class that inherits from Tax

•

StLouisTax: A concrete class that inherits from Tax

•

SalesTaxFactory: Our concrete implementation that makes different Tax objects

•

SalesTaxDriver: A driver class that lets us sell items and compute sales tax

The Factory Method pattern depends on defining an interface for the objects we need and then allowing sub-classes that implement that interface to actually implement the objects. We can either use a Java interface or an abstract class to define our SalesTax interface. We’ll use an abstract class. Our SalesTax abstract class will look like the following: abstract class SalesTax {     protected double rate;     abstract void getRate();     public void calculateTax(double amount) {         System.out.printf("$%6.2f\n", amount * (1.0 +rate));     } }

147

Chapter 11 ■ Design Patterns

And our SalesTax sub-classes end up as concrete classes that override one or more of the methods from the SalesTax abstract class: public class BostonTax extends SalesTax {     public void getRate() {         rate = 0.0875;    } } public class ChicagoTax extends SalesTax {        public void getRate() {            rate = 0.075;       } } public class StLouisTax extends SalesTax {      public void getRate() {          rate = 0.05;     } } Finally, our concrete SalesTaxFactory, which will actually make a concrete SalesTax object object looks like this: public class SalesTaxFactory  {    /**     * use the makeTaxObject() method to get object of type SalesTax     */      public SalesTax makeTaxObject(String location) {         if(location == null) {             return null;         }  else if(location.equalsIgnoreCase("boston")) {             return new BostonTax();         }  else if(location.equalsIgnoreCase("chicago")) {             return new ChicagoTax();         }  else if(location.equalsIgnoreCase("stlouis"))  {             return new StLouisTax();         }         return null;    } } In order to test our factory, we create a driver. This is the client code that uses the Factory to create the correct types of objects: /** *  Test the Factory Method pattern. *  We use the SalesTaxFactory to get the object of concrete classes */

148

Chapter 11 ■ Design Patterns

import java.io.*; import java.util.Scanner;   public class SalesTaxDriver {       public static void main(String args[])throws IOException {          Scanner stdin = new Scanner(System.in);          SalesTaxFactory salesTaxFactory = new SalesTaxFactory();          //get an object of type SalesTax and call its getTax()method.          System.out.print("Enter the location (boston/chicago/stlouis): ");          String location= stdin.nextLine();          System.out.print("Enter the dollar amount: ");          double amount = stdin.nextDouble();          SalesTax cityTax = salesTaxFactory.makeTaxObject(location);          System.out.printf("Bill amount for %s of  $%6.2f is: ", location, amount);          cityTax.getRate();          cityTax.calculateTax(amount);         }     } Figure 11-1 shows what the entire program looks like in UML.

Figure 11-1. SalesTaxFactory example What should we notice about this Factory method pattern example? The following tells us some things about how the Factory method pattern works in this case: •

The factory method makeTaxobject() encapsulates the creation of the SalesTax object. Our driver just tells the factory which location to use.

•

The SalesTax interface provides the interface for the sub-classes to create the actual objects.

149

Chapter 11 ■ Design Patterns

•

The SalesTaxFactory concrete class actually creates the objects by implementing the makeTaxObject() method.

•

This leaves the SalesTax classes alone and makes it easier for the SalesTaxDriver to create new objects.

•

Notice also that the SalesTaxDriver class only deals with SalesTax objects. It doesn’t have to know anything about particular sales tax rates. The concrete SalesTax objects implement the methods from the SalesTax abstract class, and the SalesTaxDriver just uses them regardless of which type of SalesTax object you’ve created.

•

It also means that you can change the implementation of a particular type of SalesTax object without changing either the interface or the SalesTaxDriver.

There’s another variation of the Factory Method pattern. If you think about it, in our example we only ever need a single factory. If that’s the case, we could use the Singleton pattern to generate the factory and then just use it. This would change our SalesTaxFactory and SalesTaxDriver classes. They would end up looking like this: public class SingletonTaxFactory  {         /**          * We'll just create one SalesTaxFactory using the Singleton pattern          * To do that we need to make the constructor private and create a          * variable to hold the reference to the SalesTaxFactory object.          */          // this is the instance that will hang around         private static SingletonTaxFactory uniqueInstance;         // the private constructor – can't be accessed from outside         private SingletonTaxFactory() {                 // do stuff here to initialize the instance         }         // here's the static method we'll use to create the instance         public static SingletonTaxFactory getInstance() {                 if (uniqueInstance == null) {                         uniqueInstance = new SingletonTaxFactory();                 }                 return uniqueInstance;         }    /**     * use getTax method to get object of type Tax     */      public SalesTax getTax(String location) {           if(location == null) {               return null;           }           if(location.equalsIgnoreCase("boston")) {               return new BostonTax();           }    else if(location.equalsIgnoreCase("chicago")) {               return new ChicagoTax();

150

Chapter 11 ■ Design Patterns

          }    else if(location.equalsIgnoreCase("stlouis"))  {               return new StLouisTax();           }           return null;      } } And the client code then becomes: import java.io.*; import java.util.Scanner;   public class SingletonTaxDriver {       public static void main(String args[])throws IOException {          Scanner stdin = new Scanner(System.in);          /* get the single SalesTaxFactory that we need */          SingletonTaxFactory salesTaxFactory =                                             SingletonTaxFactory.getInstance();          System.out.print("Enter the location (boston/chicago/stlouis): ");          String location= stdin.nextLine();          System.out.print("Enter the dollar amount: ");          double amount = stdin.nextDouble();          SalesTax cityTax = salesTaxFactory.getTax(location);          System.out.printf("Bill amount for %s of  $%6.2f is: ", location, amount);          cityTax.getRate();          cityTax.calculateTax(amount);       }     }

Structural Patterns Structural patterns help you put objects together so you can use them more easily. They’re all about grouping objects together and providing ways for objects to coordinate to get work done. Remember, composition, aggregation, delegation, and inheritance are all about structure and coordination. The first Structural pattern we’ll look at here—the Adapter—is all about getting classes to work together.

The Adapter Pattern Here’s the problem. You’ve got a client program Foo that wants to access another class or library or package, Bar. The problem is, Foo is expecting a particular interface, and that interface is different from the public interface that Bar presents to the world. What are you to do?

151

Chapter 11 ■ Design Patterns

Well, you could rewrite Foo to change the interface it expects to conform to the interface that Bar is presenting. But if Foo is large, or if it’s being used by other classes, that may not be a viable possibility. Or you could rewrite Bar so it presents the interface that Foo is expecting. But maybe Bar is a commercial package and you don’t have the source code? That’s where the Adapter design pattern comes in7. You use the Adapter pattern to create an intermediate class that wraps the Bar interface inside a set of methods that presents the interface that Foo is looking for. Here’s the idea: the Adapter can interface with Foo on one side and with Bar on the other. So the interface to Bar doesn’t have to change, and Foo users gets the interface it expects. Everyone is happy! By the way, the Adapter design pattern is also called the Wrapper pattern because it wraps an interface8. Figure 11-2 illustrates.

Figure 11-2. The Adapter lets Foo use Bar There are two ways to implement adapters: class adapters, where the adapter will inherit from the target class, and object adapters that use delegation to create the adapter. Note the difference: a class adapter will sub-class an existing class and implement a target interface. An object adapter will sub-class a target class and delegate to an existing class. Figure 11-3 shows the UML for a generic class adapter.

Figure 11-3. A class adapter example

7 8

Gamma et. al, 1995. Gamma et. al, 1995.

152

Chapter 11 ■ Design Patterns

Note that the Adapter class inherits from the Adaptee class and implements the same Target interface that the Client class uses. Here’s the code for this example: public class Client {     public static void main(String [] args) {         Target myTarget = new Adapter();         System.out.println(myTarget.sampleMethod(12));     } } public interface Target {    int sampleMethod(int y); } public class Adapter extends Adaptee implements Target {     public int sampleMethod(int y) {         return myMethod(y);     } } public class Adaptee {    public Adaptee() {    }     public int myMethod(int y) {         return y * y;     } } The object adapter, on the other hand, still implements the Target interface but uses composition with the Adaptee class in order to accomplish the wrapping. It looks like this: public class Adapter implements Target {     Adaptee myAdaptee = new Adaptee();     public int sampleMethod(int y) {         return myAdaptee.myMethod(y);     } } In both cases, the Client doesn’t have to change! That’s the beauty of Adapter. You can change which Adaptee you’re using by changing the Adapter and not the Client.

153

Chapter 11 ■ Design Patterns

The Façade Pattern For a second example, we try to simplify interfaces. Say you have a set of classes that constitute a subsystem. They could be individual classes that make up a more complex system or part of a large class library. Let’s also say that each of those classes in the subsystem has a different interface. Finally, let’s say that you want to write a client program that uses some or all of those classes to get some work done. This would normally mean that to write the client program, you’d need to learn all the interfaces of all the classes in the subsystem in order to communicate with the subsystem and get your work done. Pictorially, this will look like Figure 11-4.

Figure 11-4. A client using several interfaces Clearly, this will make your client program complicated and very hard to maintain. This problem is what the Façade design pattern is here to fix. The idea behind Façade is to provide a single, simple, unified interface that makes it easier for your client to interact with the subsystem classes. When you use a Façade, you learn a single interface and use it to interact with all the subsystem classes. Pictorially, this looks like Figure 11-5.

Figure 11-5. A client using a Façade interface

154

Chapter 11 ■ Design Patterns

In addition to providing you with a single interface to use, the Façade pattern also hides any interface changes or additions from the client. It’s a classic example of the Principle of Least Knowledge from Chapter 10. Note that the Façade pattern may look a lot like the Adapter pattern from earlier. But don’t be fooled. What the Adapter pattern does is wrap a target interface and allow a client to use the interface it expects. The Façade simplifies one or more interfaces and presents that simplified interface for the client to use. Here’s an example for the Façade design pattern. Say you’re creating an online store and you’re putting together a simple program to compute the total amount a customer would have to pay for an item they want to order. In this program, you would need to look up the item, compute the payment, compute the sales tax, compute the delivery charge, and then total all that up and send it to the user. That leaves you with classes for SalesTax, Delivery, Payment, and Inventory. If we want to simplify the interface using Façade, we can create a new class, Order, that will hide the several interfaces and produce a simpler interface for a client program to use. In UML, this would look like Figure 11-6.

Figure 11-6. Façade example And the code for this simple example would look like the following: /** * Facade Design Pattern Example */ /** Check the Inventory for the item */ public class Inventory {     public String checkInventory(String itemID) {         /* code in here to check the database */         return "Inventory checked";     } } /** compute the payment for an item */ public class Payment {     public String computePayment(String itemID, String currency) {         return "Payment computed successfully";     } }

155

Chapter 11 ■ Design Patterns

/** compute the sales tax for an item */ public class SalesTax {     public String computeTax(String itemID, double rate) {         return "Tax computed";     } } /** compute the delivery charge for an item */ public class Delivery {     public String computeDelivery(String itemID, String location) {         return "Delivery amount computed";     } } /** * Here's the Facade */ public class Order {     private Payment pymt = new Payment();     private Inventory inventory = new Inventory();     private SalesTax salestax = new SalesTax();     private Delivery deliver = new Delivery();     /**      *  This is the new interface for buying an item      *  it incorporates all the different steps into a single      *  method call      */     public void placeOrder(String itemID, String currency, String location, double rate) {         String step1 = inventory.checkInventory(itemID);         String step2 = pymt.computePayment(itemID, currency);         String step3 = salestax.computeTax(itemID, rate);         String step4 = deliver.computeDelivery(itemID, location);         System.out.printf("%s\n",         System.out.printf("%s\n",         System.out.printf("%s\n",         System.out.printf("%s\n",     }

step1); step2); step3); step4);

  /** add more methods here for performing other actions */ } /** * Here's the client code. * Note how the Facade makes ordering something simple * by using it's interface */

156

Chapter 11 ■ Design Patterns

public class Client {        public static void main(String args[]){          Order order = new Order();          order.placeOrder("OR123456", "USD", "Chicago", 0.075);          System.out.println("Order processing completed");        } }

Behavioral Patterns Where creational patterns are all about how to create new objects, and structural patterns are all about getting objects to communicate and cooperate, behavioral patterns are all about getting objects to do things. They examine how responsibilities are distributed in the design and how communication happens between objects. The three patterns we’ll look at here all describe how to assign behavioral responsibilities to classes. The Iterator pattern is about how to traverse a collection of objects. The Observer pattern tells us how to manage push and pull state changes. The Strategy pattern lets us select different behaviors behind a single interface.

The Iterator Pattern If you’ve programmed in Java, you’ve seen iterators. I’ll get to that, but let’s start at the beginning. If you have a collection of elements, you can organize them in many different ways. They can be arrays, linked lists, queues, hash tables, sets, and so on. Each of these collections will have its own unique set of operations, but there’s usually one operation that you might want to perform on all of them: traverse the entire collection from beginning to end, one element at a time. Oh, and you want to traverse the elements in such a way that you don’t need to know the internal structure of the collection. And you may want to be able to traverse the collection backwards, and you may want to have several traversals going on at the same time. That’s what the Iterator pattern is for9. It creates an object that allows you to traverse a collection, one element at a time. Because of the requirement that you don’t need to know about the internal structure of the collection, an Iterator object doesn’t care about sorting order; it just returns each object as it’s stored in the collection, one at a time from first to last. The simplest iterator needs just two methods: •

hasNext(): Returns a true if there is an element to be retrieved—that is, if we’ve not reached the end of the collection yet—and false if there are no elements left.

•

getNextElement(): Returns the next element in the collection.

In the Iterator pattern, we have an Iterator interface that’s implemented to make a concrete Iterator object that’s used by a concrete Collections object. A client class then creates the Collection object and gets the Iterator from there. Figure 11-7 shows the UML version of this from Gamma et. al.

9

Gamma et. al, 1995.

157

Chapter 11 ■ Design Patterns

Figure 11-7. An example of using the Iterator pattern You can see that the client class uses the Collection and the Iterator interfaces, and the Concrete_ Iterator is part of and uses the Concrete_Collection. Note that the Collection_Interface will contain an abstract method to create an iterator for the Collection. This method is implemented in the Concrete_ Collection class, and when the client calls the method, a Concrete_Iterator is created and passed to the client to use. Starting in version 1.2, Java contained the Java Collections Framework (JCF) that included a number of new classes and interfaces to allow you to create collections of objects, including an Iterator interface. All these new types contained iterators. Java even included (just for collections of type List) an expanded Iterator called a ListIterator. With the ListIterator, you can go backwards through the list. Here’s an example of typical Iterator code in Java using both the Iterator and the ListIterator implementations: /** * Iterate through elements Java ArrayList using an Iterator * We then use ListIterator to go backwards through the same * ArrayList */ import java.util.ArrayList; import java.util.Iterator; import java.util.ListIterator; public class ArrayListIterator {     public static void main(String[] args) {         //create an ArrayList object         ArrayList arrayList = new ArrayList();         //Add elements to Arraylist         arrayList.add(1);         arrayList.add(3);         arrayList.add(5);         arrayList.add(7);         arrayList.add(11);         arrayList.add(13);         arrayList.add(17);         //get an Iterator object for ArrayList         Iterator iter = arrayList.iterator();

158

Chapter 11 ■ Design Patterns

        System.out.println("Iterating through ArrayList elements");         while(iter.hasNext()) {             System.out.println(iter.next());         }         ListIterator list_iter = arrayList.listIterator(arrayList.size());         System.out.println("Iterating through ArrayList backwards");         while(list_iter.hasPrevious()) {             System.out.println(list_iter.previous());         }     } } Note that when we create the ListIterator object, we pass it the number of elements in the ArrayList. This is to set the cursor that the ListIterator object uses to point to just past the last element in the ArrayList so it can then look backwards using the hasPrevious() method. In both the Iterator and ListIterator implementations in Java, the cursor always points between two elements so that the hasNext() and hasPrevious() method calls make sense; for example, when you say iter.hasNext(), you’re asking the iterator if there is a next element in the collection. Figure 11-8 is the abstraction of what the cursors look like.

Figure 11-8. Cursors in the Iterator abstraction Finally, some iterators will allow you to insert and delete elements in the collection while the iterator is running. These are called robust iterators. The Java ListIterator interface (not the Iterator) allows both insertion (via the add() method) and deletion (via the remove() method) in an iterator with restrictions. The add() method only adds to the position immediately before the one that would be the next element retrieved by a next() or immediately after the next element that would be returned by a previous() method call. The remove() method can only be called between successive next() or previous() method calls; it can’t be called twice in a row, and never immediately after an add() method call.

The Observer Pattern I love NPR’s Talk of the Nation: Science Friday radio show (http://sciencefriday.com). But I hardly get to listen to it when it’s broadcast because it’s on from 2:00–4:00 PM EST on Fridays and, because I work for a living, I can’t listen to it then. But I subscribe to the podcast and so every Saturday morning I get a new podcast of SciFri so I can listen to it on my iPod while I mow the lawn. If I ever get tired of SciFri, I can just unsubscribe, and I won’t get any new podcasts. That, ladies and gentlemen, is the Observer pattern. According to the Gang of Four, the Observer Pattern “. . . defines a one-to-many dependency between objects so that when one object changes state, all of its dependents are notified and updated automatically10.” So in my SciFri example, NPR is the “publisher” of the SciFri podcast, and all of us who “subscribe”

10

Gamma et. al, 1995.

159

Chapter 11 ■ Design Patterns

(or register) to the podcast are the observers. We wait for the SciFri state to change (a new podcast gets created) and then the publisher updates us automatically. How the updates happen differentiate between two different types of Observer—push and pull. In a push Observer, the Publisher (also known as the Subject in object-oriented speak) changes state and then pushes the new state out to all the Observers. In a pull Observer, the Subject changes state, but doesn’t provide a full update until the Observers ask for it—they pull the update from the Subject. In a variation of the pull model, the Subject may provide a minimal update to all the Observers notifying them that the state has changed, but the Observers still need to ask for the details of the new state. With the Observer pattern, we need a Subject interface so that the Subject and the Observer and the Client all can tell what the state interface they’re using is. We also need an Observer interface that just tells us how to update an Observer. Our publisher will then implement the Subject interface and the different “listeners” will implement the Observer interface. Figure 11-9 is a UML diagram of this.

Figure 11-9. The canonical Observer Pattern The client class is missing, but it will use both the ConcreteSubject and ConcreteObserver classes. Here’s a simple implementation of a push model version of all of these. Remember, it’s a push model because the ConcreteSubject object is notifying all the Observers whether they request it or not. First, we create the Subject interface that tells us how to register, remove, and notify the Observers: public interface Subject {     public void addObserver(Observer obs);     public void removeObserver(Observer obs);     public void notifyAllObservers(); } Next, we write the implementation of the Subject interface. This class is the real publisher, so it also needs the attributes that form the state of the Subject. In this simple version we use an ArrayList to hold all the Observers: import java.util.ArrayList; public class ConcreteSubject implements Subject {     private ArrayList observerList;         // these two variables are our state     private int subj_id;     private String msg;

160

Chapter 11 ■ Design Patterns

    public ConcreteSubject() {         observerList = new ArrayList();         this.subj_id = 0;         this.msg = "Hello";     }     public void addObserver(Observer obs) {         observerList.add(obs);     }     public void removeObserver(Observer obs) {         observerList.remove(obs);     }     public void notifyAllObservers() {         for (Observer obs: observerList) {             obs.update(this.subj_id, this.msg);            }     }     public void setState(int foo, String bar) {         this.subj_id = subj_id;         this.msg = msg;         notifyAllObservers();     } } Next, the Observer interface tells us how to update our Observers: public interface Observer {     public void update(int obs_id, String msg); } And then we write the implementation of the Observer interface: public class ConcreteObserver implements Observer {     private int obs_id;     private String msg;     Subject subj;     /**      * Constructor for objects of class ConcreteObserver      */     public ConcreteObserver(Subject subj) {         this.subj = subj;         subj.addObserver(this);     }    public void update(int obs_id, String msg) {         this.obs_id = obs_id;         this.msg = msg;         show();     }

161

Chapter 11 ■ Design Patterns

    private void show() {         System.out.printf("Id = %d Msg = %s\n", this.obs_id, this.msg);     } } And finally, the driver program that creates the publisher and each of the observers and puts them all together. public class ObserverDriver {     public static void main(String [] args) {         ConcreteSubject subj = new ConcreteSubject();         ConcreteObserver obj = new ConcreteObserver(subj);         subj.setState(12, "Monday");         subj.setState(17, "Tuesday");     } } And the output of executing the driver (which all comes from the show() method in the ConcreteObserver object will look like this: Id = 12 Msg = Monday Id = 17 Msg = Tuesday In many ways, the Observer design pattern works like the Java events interface. In Java you create a class that registers as a “listener” (our Observer) for a particular event type. You also create a method that is the actual observer and which will respond when the event occurs. When an event of that type occurs, the Java events object (our Subject) notifies your observer by making a call to the method you wrote, passing the data from the event to the observer method—Java events use the push model of the Observer pattern. For example, if you create a Button object in a Java program, you use the addActionListener() method of the Button object to register to observe ActionEvents. When an ActionEvent occurs, all the ActionListeners are notified by having a method named actionPerformed() called. This means that your Button object must implement the actionPerformed() method to handle the event.

The Strategy Pattern Sometimes you have an application where you have several ways of doing a single operation or you have several different behaviors, each with a different interface. One of the ways to implement something like this is using a switch statement, like so: switch (selectBehavior) {         case Behavior1:                 Algorithm1.act(foo);                 break;         case Behavior2:                 Algorithm2.act(foo, bar);                 break;

162

Chapter 11 ■ Design Patterns

        case Behavior3:                 Algorithm3.act(1, 2, 3);                 break; } The problem with this type of construct it that if you add another behavior, you need to change this code and potentially all the other code that has to select different behaviors. This is not good. The Strategy design pattern gets you around this. It says that if you have several behaviors (algorithms) you need to select from dynamically, you should make sure they all adhere to the same interface—a Strategy interface—and then that they’re selected dynamically via a driver, called the Context, that is told which to call by the client software. The Strategy pattern embodies two of our fundamental object-oriented design principles: encapsulate the idea that varies and code to an interface, not an implementation. This is illustrated in Figure 11-10.

Figure 11-10. A typical Strategy pattern layout Here are some examples of when you might use the Strategy pattern: •

Capture video using different compression algorithms

•

Compute taxes for different types of entities (people, corporations, nonprofits)

•

Plot data in different formats (line graphs, pie charts, bar graphs)

•

Compress audio files using different formats

In each of these examples you can think of having the application program telling a driver—the Context—which of the strategies to use and then asking the Context to perform the operation. As an example, let’s say you’re a newly minted CPA and you’re trying to write your own software to compute your customers’ tax bills. (Why a CPA would write their own tax program, I have no idea—work with me on this.) Initially, you’ve divided your customers into individuals who only file personal income taxes, corporations that file corporate income taxes, and nonprofit organizations that file hardly any taxes at all. Now, all these groups have to compute taxes, so the behavior of a class to compute taxes should be the same for all, but they’ll compute taxes in different ways. What we need is a Strategy setup that will use the same interface—to encapsulate what varies in our application, and to code the concrete classes to an interface—and allow our client class to select which type of tax customer to use. Figure 11-11 is a diagram of what our program will look like.

163

Chapter 11 ■ Design Patterns

Figure 11-11. Using the Strategy pattern to select a tax behavior We create a TaxStrategy interface that all the concrete TaxStrategy classes will implement: public interface TaxStrategy {   public double computeTax(double income); } Because the only thing that varies here is how the tax is computed, our TaxStrategy interface just includes the computeTax() method: Then we create each of the concrete TaxStrategy classes, each of which implements the tax computation for that particular type of customer: public class PersonalTaxStrategy implements TaxStrategy {     private final double RATE = 0.25;     public double computeTax(double income) {         if (income = amount)         {             System.out.println(name + " is about to withdraw " + amount);             try {                 System.out.println(name + " is going to sleep");                 Thread.sleep(500);             } catch (InterruptedException ex) {                 ex.printStackTrace();             }             System.out.println(name + " woke up");             account.withdraw(amount);             System.out.printf("%s completes the withdrawal\n", name);             System.out.printf("New balance is $%d\n", account.getBalance());         } else {             System.out.println("Sorry, not enough for "                         + Thread.currentThread().getName());         }     } } /** *  inner class to represent a simple bank account */ class BankAccount {     private int balance = 100;     public int getBalance () {         return balance;     }     public void withdraw(int amount) {         balance = balance - amount;     } } /** * the driver to run the experiment */ class FGMain {     public static void main(String[] args) {         FredAndGladys theJob = new FredAndGladys();

181

Chapter 12 ■ Parallel Programming

        Thread one = new Thread(theJob);         Thread two = new Thread(theJob);         one.setName("Fred");         two.setName("Gladys");         one.start();         two.start();     } } When we compile and execute this program, the first part of what we get is shown next. All things being equal, Fred and Gladys can alternate making withdrawals, but there’s a race condition. (Note that the balance is first reported as $80 and then as $90 because both threads are holding copies of the balance variable in their caches before they write it back to memory. Gladys writes first (after Fred has withdrawn) and we see an $80. Then Fred writes and we see a $90. Gladys then gets in before Fred again and the amount is correct: Gladys is about to withdraw 10 Gladys is going to sleep Fred is about to withdraw 10 Fred is going to sleep Fred woke up Fred completes the withdrawal Gladys woke up Gladys completes the withdrawal New balance is $80 Gladys is about to withdraw 10 Gladys is going to sleep New balance is $90 Fred is about to withdraw 10 Fred is going to sleep Fred woke up Fred completes the withdrawal New balance is $70 Fred is about to withdraw 10 Fred is going to sleep Gladys woke up Gladys completes the withdrawal New balance is $60 This all happens because it’s the JVM (or the operating system) that controls when threads get to use the CPU again. Luckily, Java has a way to fix this. There’s a keyword, synchronized, that you can use in a method signature and that creates a critical section so that only one thread at a time is allowed to execute in that method. Any other thread that attempts to enter the synchronized method is blocked. If we synchronize the makeWithdrawal(int amount) method, then when we run it we get the following: Fred Fred Fred Fred

182

is about to withdraw 10 is going to sleep woke up completes the withdrawal

Chapter 12 ■ Parallel Programming

New balance is $90 Gladys is about to withdraw 10 Gladys is going to sleep Gladys woke up Gladys completes the withdrawal New balance is $80 Fred is about to withdraw 10 Fred is going to sleep Fred woke up Fred completes the withdrawal New balance is $70 Fred is about to withdraw 10 Fred is going to sleep Fred woke up Fred completes the withdrawal New balance is $60 Which now has everyone in sync7. Another way to try avoiding race conditions (available since Java 5) is to use the volatile keyword on the variable balance in the BankAccount class. If this is done, then no cache copies of balance are kept, and the values are correctly updated most of the time. Because the variable is always stored in main memory (and not in the cache) and is always written back to main memory, then multiple threads could be writing to a shared volatile variable and still have the correct value stored in main memory. But if a thread needs to first read the value of a shared volatile variable and then update that value with a new one, using a volatile variable is no longer good enough to guarantee that the variable’s value remains synchronized. Because there’s a gap in time between reading the current value of the variable from main memory and then writing the new value, there’s still a race condition where Fred and Gladys might both read the current value of balance and generate (the same) new value and write it. The volatile variable is now out of sync. There’s no way to fix this other than being careful about when and how a shared volatile variable is used in your program. One should also be careful about the use of the synchronized keyword, because it adds more overhead and thus impacts performance. And although synchronized methods provide mutual exclusion and thread-safe code, they don’t protect against deadlock. Deadlock occurs when two processes or threads are contending for the same resources, the resources can be locked, and the processes refuse to give them up. As an example, the following sequence of events leads to deadlock:

7

•

Thread a enters synchronized method foo (and gets the key, locking out any other thread).

•

Thread a goes to sleep.

•

Thread b enters synchronized method bar (and gets the key, locking out any other thread).

•

Thread b tries to enter foo, can’t get the key, and waits.

•

Thread a wakes up, tries to enter bar, can’t get the key and waits.

•

Neither can proceed until they acquire the other key.

https://docs.oracle.com/javase/tutorial/essential/concurrency/syncmeth.html

183

Chapter 12 ■ Parallel Programming

This is known (from Dijkstra) as a deadly embrace. How to fix a deadlock? Well, you shouldn’t depend on Java to do it because Java can’t detect a deadlock. The best methods involve mitigation; work carefully to make sure that this situation doesn’t happen. For more on concurrency and parallel programming in Java, see the online Java Tutorials8.

OpenMP9 OpenMP stands for Open Multi-Processing. It’s a very popular open source application programming interface (API) that enables simple creation of parallel programs. There are OpenMP implementations for practically all hardware architectures and bindings for C, C++, and Fortran. OpenMP assumes a shared memory model where threads share variables, so there are race condition possibilities like in Java already discussed. It’s not really a programming language. Rather it consists of compiler directives (#pragma’s in C & C++), library routines (which gives you an API), and environment variables. OpenMP uses a fork-join parallelism model. In this model there is one master thread. This thread is charged with dynamically creating N parallel threads (called a team of threads) for a parallel region of the code. The number of threads created depends on OpenMP’s assessment of how much parallelism is needed to execute the parallel region. When all those threads finish, program execution goes back to one master thread again until another parallel region is encountered. Figure 12-2 illustrates what fork-join parallelism looks like.

Figure 12-2. Fork-join parallelism The programmer indicates to the compiler where the parallel regions should be using the compiler directive #pragma omp parallel. This directive creates an SPMD (single program multiple data) program where each thread executes the same code. The threads are created dynamically as needed. The maximum number of threads can be controlled with the OMP_NUM_THREADS environment variable or with the omp_set_ num_threads(N) OpenMP library function. The programmer can figure out which thread they currently are using omp_get_thread_num() library function, which returns the current thread number. OpenMP will add parallelism dynamically until the demands of the program are met (or until it reaches the maximum number of threads allowed).

8 9

https://docs.oracle.com/javase/tutorial/essential/concurrency/index.html The name OpenMP is the property of the OpenMP Architecture Review Board. See www.openmp.org.

184

Chapter 12 ■ Parallel Programming

The master thread will create threads with the OpenMP parallel compiler directive. This directive tells the compiler that the next code statement block is to be considered a parallel region to be executed by each processor. For example, #pragma omp parallel num_threads(8) will instruct OpenMP to create up to seven threads of the parallel region that immediately follows the #pragma. Seven threads are created instead of eight because the master thread is also used in the parallel computation. In C or C++, this might look like the following: long list[1000]; #pragma omp parallel num_threads(8) {         int threadID = omp_get_thread_num();         foo(list, threadID); } Note that because OpenMP uses a shared memory model, we need to guarantee synchronization and that the program can also still encounter race conditions. Recall that two common techniques for enforcing synchronization are barriers and mutual exclusion. OpenMP has a compiler directive to set a barrier, #pragma omp barrier. It can also create critical sections (and thus enforce mutual exclusion) by using another directive, #pragma omp critical. Only one thread at a time can enter a critical section. OpenMP also allows a very fine-grained form of synchronization by allowing the programmer to create an atomic operation. Using the #pragma omp atomic compiler directive, the programmer can apply mutual exclusion to a single statement that must update a memory location. Allowable operations in an atomic region include: x op= expression, x++, ++x, x--, and --x. If the #pragma omp parallel compiler directive creates an SPMD program and forces the program to execute the code in the parallel region, how do we get a loop to divide up the work into the threads (known as worksharing) so that we can execute the entire program faster? OpenMP assures this will happen with another directive, #pragma omp for. This directive has the effect of splitting up the work in the loop among all the threads in the team. To illustrate how you would use all these OpenMP directives and library functions, let’s take a look at a fairly common example in parallel computing. If you took calculus in school, you remember that one way to look at the integral is as the area under a curve. In numerical analysis, there’s a technique called the trapezoid rule that allows you to approximate the definite integral of a function by measuring and summing up a number of trapezoids (or rectangles) under the curve drawn by your function. It turns out that the function f(x) = 1.0 / (1.0 + x2) using values of x from 0.0 to 1.0 is an approximation to π / 4. So we can write a program that uses the trapezoid rule to compute π / 4 and then just multiply to get a value for π. Figure 12-3 shows what the function looks like.

Figure 12-3. Function to compute p / 4

185

Chapter 12 ■ Parallel Programming

And here’s a serial version of this program in C: /* *  Serial program to estimate the area under a curve f(x) *  It really computes pi/4 and approximates the trapezoidal rule *  using rectangles instead of trapezoids. */ #include #include /*   * Here's the function we'll be computing   */ double f(double x) {     return 1.0 / (1.0 + x * x); } int main (int argc, char **argv) {     int steps = 1000000000;     /* number of rectangles – 1 billion */     double width = 0.0;         /* width of each rectangle */     double x, pi4, sum = 0.0;     /* get the width of each rectangle */     width = 1.0 / (double) steps;     /* loop to compute the area under f(x) */     for (int i = 0; i 25000 then     begin         statement1;         statement2     end else         statement3; C/C++/Java: if (income > 25000) {         statement1;         statement2; } else         statement3; There are a number of tools available that will help with coding and testing issues. Some links to popular tools are https://www.owasp.org/index.php/Source_Code_Analysis_Tools, https://en.wikipedia.org/wiki/List_of_ tools_for_static_code_analysis, https://www.jetbrains.com/resharper/, and http://www.softwaretestinghelp.com/tools/top-40-static-code-analysis-tools/

6

215

Chapter 14 ■ Code Construction

Note in both examples that a single statement is considered a block and doesn’t require the block delimiter elements. Note also in Pascal the semicolon is the statement separator symbol, so is required between statements, but because else and end are not the end of a statement, you don’t use a semi-colon right before else or end (confused? most people are); in C, C++, and Java, the semicolon is the statement terminator symbol, and must be at the end of every statement. This is easier to remember and write; you just pretty much put a semicolon everywhere except after curly braces. Simplicity is good. Finally, when we format a block, we can try to emulate the built-in block boundary in languages that don’t have it by requiring that every block use the block delimiter lexical elements. C/C++/Java: if (income > 25000) {         statement1;         statement2; } else {         statement3; } In this example, we want to pretend that the left and right curly braces are part of the control structure syntax, and so we use them to delimit the block, no matter how large it is. To emphasize that the block delimiter is part of the control structure, we put it on the same line as the beginning of the control statement. We can then line up the closing block boundary element with the beginning of the control structure. This isn’t a perfect emulation of the built-in block element language feature, but it comes pretty close and has the advantage that you’re less likely to run into problems with erroneous indentation, like the following: C/C++/Java: if (income > 25000)         statement1;         statement2;         statement3; In this example, the erroneous indentation for statement2 and statement3 can lead the reader to believe that they are part of the if statement. The compiler is under no such illusions. Overall, using an emulating block-boundaries style works very well, is readable, and clearly illustrates the logical structure of your program. It’s also a great idea to put block boundaries around every block, including just single statement blocks. That lets you eliminate the possibility of the erroneous indentation error from earlier. So if you say if (income > 25000) {         statement1; } it’s then clear that in if (income > 25000) {         statement1; }         statement2;         statement3;

216

Chapter 14 ■ Code Construction

statement2 and statement3 are not part of the block, regardless of their indentation. It also means that you can now safely add extra statements to the block without worrying about whether they’re in the block or not: if (income > 25000) {         statement1;         statement2;         statement3;         statement4;         statement5; }

White Space White space is your friend. You wouldn’t write a book without any spaces between words, or line breaks between paragraphs, or chapter divisions, would you? Then why would you write code with no white space? White space allows you to logically separate parts of the program and to line up block separators and other lexical elements. It also lets your eyes rest between parts of the program. Resting your eyes is a good thing. The following are some suggestions on the use of white space: •

Use blank lines to separate groups (just like paragraphs).

•

Within a block, align all the statements to the same tab stop (the default tab width is normally four spaces).

•

Use indentation to show the logical structure of each control structure and block.

•

Use spaces around operators.

•

In fact, use spaces around array references and function/method arguments as well.

•

Do not use double indentation with begin-end block boundaries.

Block and Statement Style Guidelines As mentioned previously, the “emulating block boundaries” style works well for most block-structured languages: •

Use more parentheses than you think you’ll need: I especially use parentheses around all my arithmetic expressions—mostly just to make sure I haven’t screwed up the precedence rules. fx = ((a + b) * (c + d)) / e;

•

Format single statement blocks consistently: Use the emulating block-boundaries technique: if (average > MIN_AVG) {     avg = MIN_AVG; }

217

Chapter 14 ■ Code Construction

•

For complicated conditional expressions, put separate conditions on separate lines: if (('0'