Disaster Recovery and Business Continuity IT Planning, Implementation, Management and Testing of Solutions and Services Workbook 1921523255, 9781921523250

A professional technical roadmap to IT Disaster Recovery & Business Continuity planning, implementation, management


English Pages 194 Year 2008


Table of contents :
Title & Copyright......Page 2
Write a Review......Page 3
Table of Contents......Page 4
1 INTRODUCTION ROADMAP......Page 6
2 DISASTER RECOVERY......Page 10
3 SUPPORTING DOCUMENTS......Page 44
3.1 Objectives and Goals......Page 46
3.2 Policies, Objectives & Scope......Page 50
3.3 Business Justification Document......Page 56
3.4 Business Impact Analysis......Page 62
3.5 Example Business Impact Assessment......Page 64
3.6 Risk Assessment Template......Page 72
3.7 Environmental Architectures & Standards......Page 80
3.8 Reciprocal Arrangements......Page 86
3.9 Business Continuity Strategy......Page 106
3.10 Management of Risk (MOR) Framework......Page 118
3.11 Risk Assessment Questionnaire......Page 122
3.12 Typical Contents of a Recovery Plan......Page 130
3.13 Communication Plan......Page 136
3.14 Example E-mail Text......Page 142
3.15 Emergency Response Template......Page 146
3.16 Salvage Plan Template......Page 156
3.17 Vital Records Template......Page 162
3.18 Roles and Responsibilities......Page 168
3.19 Process Manager......Page 170
3.20 Reports, KPIs and other Metrics......Page 174
3.21 Business and IT Flyers......Page 180
4 IMPLEMENTATION PLAN......Page 184
5 FURTHER READING......Page 194


Notice of Rights: Copyright © The Art of Service. All rights reserved. No part of this book may be reproduced or transmitted in any form by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher.

Notice of Liability: The information in this book is distributed on an “As Is” basis without warranty. While every precaution has been taken in the preparation of the book, neither the author nor the publisher shall have any liability to any person or entity with respect to any loss or damage caused or alleged to be caused directly or indirectly by the instructions contained in this book or by the products described in it.

Trademarks: Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trademark claim, the designations appear as requested by the owner of the trademark. All other product names and services identified throughout this book are used in editorial fashion only and for the benefit of such companies with no intention of infringement of the trademark. No such use, or the use of any trade name, is intended to convey endorsement or other affiliation with this book. ITIL® is a Registered Community Trade Mark of OGC (Office of Government Commerce, London, UK), and is Registered in the U.S. Patent and Trademark Office.


Disaster Recovery Workbook

Table of Contents

1 INTRODUCTION ROADMAP .......... 5
2 DISASTER RECOVERY .......... 9
3 SUPPORTING DOCUMENTS .......... 43
3.1 OBJECTIVES AND GOALS .......... 45
3.2 POLICIES, OBJECTIVES & SCOPE .......... 49
3.3 BUSINESS JUSTIFICATION DOCUMENT .......... 55
3.4 BUSINESS IMPACT ANALYSIS .......... 61
3.5 EXAMPLE BUSINESS IMPACT ASSESSMENT .......... 63
3.6 RISK ASSESSMENT TEMPLATE .......... 71
3.7 ENVIRONMENTAL ARCHITECTURES & STANDARDS .......... 79
3.8 RECIPROCAL ARRANGEMENTS .......... 85
3.9 BUSINESS CONTINUITY STRATEGY .......... 105
3.10 MANAGEMENT OF RISK (MOR) FRAMEWORK .......... 117
3.11 RISK ASSESSMENT QUESTIONNAIRE .......... 121
3.12 TYPICAL CONTENTS OF A RECOVERY PLAN .......... 129
3.13 COMMUNICATION PLAN .......... 135
3.14 EXAMPLE E-MAIL TEXT .......... 141
3.15 EMERGENCY RESPONSE TEMPLATE .......... 145
3.16 SALVAGE PLAN TEMPLATE .......... 155
3.17 VITAL RECORDS TEMPLATE .......... 161
3.18 ROLES AND RESPONSIBILITIES .......... 167
3.19 PROCESS MANAGER .......... 169
3.20 REPORTS, KPIS AND OTHER METRICS .......... 173
3.21 BUSINESS AND IT FLYERS .......... 179
4 IMPLEMENTATION PLAN .......... 183
5 FURTHER READING .......... 193


1

INTRODUCTION ROADMAP

Many organizations are looking to implement IT Service Asset & Continuity Management (ITSCM) as a way to improve the structure and quality of the business and recover from disaster.

This document describes the contents of the Disaster Recovery Workbook. The information found within the book is based on the ITIL Version 3 framework, specifically the Service Design phase, which incorporates the updated ITIL Version 3 IT Service Asset & Continuity Management process.

The workbook is designed to answer many of the questions that the IT Service Asset & Continuity Management process raises, and it provides you with useful guides, templates and essential but simple assessments. The supporting documents and assessments will help you identify the areas within your organization that require the most activity in terms of change and improvement.

The presentations can be used to educate, as the basis for management presentations, or when making business cases for disaster recovery. The additional information and bonus resources will enable you to improve your organization's methodology knowledge base.

The workbook serves as a starting point. It will give you a clear path to travel. It is designed to be a valuable source of information and activities.

The Disaster Recovery Workbook:
• Flows logically
• Is scalable
• Provides presentations, templates and documents
• Saves you time


Step 1

Start by reviewing the PowerPoint presentation:

• Disaster Recovery

This presentation will give you a good knowledge and understanding of all the terms, activities and concepts required within the IT Service Asset & Continuity Management process and how they will enable recovery from disaster. It can also be used as the basis for management presentations or when making a formal business case for IT Service Asset & Continuity Management implementation. Make sure you pay close attention to the notes pages, as well as the slides, as references to further documents and resources are highlighted there.


Step 2

If you did not look at the supporting documents and resources when prompted during the PowerPoint presentation, do this now. Below is an itemized list of the supporting documents and resources for easy reference. You can use these documents and resources within your own organization or as a template to help you prepare your own bespoke documentation.

• Objectives and Goals
• Policies, Objectives and Scope
• Business Justification Document
• Business Impact Analysis
• Example Business Impact Assessment
• Risk Assessment Template
• Environmental Architectures and Standards
• Reciprocal Arrangements
• Business Continuity Strategy
• MOR Framework
• Risk Assessment Questionnaire
• Typical Contents of a Recovery Plan
• Communication Plan
• Example E-mail Text
• Emergency Response Template
• Salvage Plan Template
• Vital Records Template
• Roles and Responsibilities
• Process Manager
• Reports, KPIs and other Metrics
• Business and IT Flyers

The supporting documents and resources found within the book will help you fill the gaps you identify by giving you a focused, practical and user-friendly approach to IT Service Asset & Continuity Management.


Step 2 continued...

Alternatively, continue by working through the IT Service Asset & Continuity Management Implementation Plan with the focus on your organization. This will help you ascertain the IT Service Asset & Continuity Management maturity of your organization. You will be able to identify gaps and areas needing attention and/or improvement. The supporting documents and bonus resources found within the workbook will help you fill these gaps by giving you a focused, practical and user-friendly approach to disaster recovery.


2

DISASTER RECOVERY


Information on Objectives and Goals can be found on page 45. Further information on Policies, Objectives and Scope can be found on page 49.


The ITSCM process includes:

• Agreement of the scope of the ITSCM process and the policies adopted.
• Business Impact Analysis (BIA) to quantify the impact loss of IT service would have on the business.
• Risk Analysis: the risk identification and risk assessment to identify potential threats to continuity and the likelihood of the threats becoming a reality. This also includes taking measures to manage the identified threats where this can be cost justified.
• Production of the overall ITSCM strategy. This can be produced following the two steps identified above, and is likely to include elements of risk reduction as well as a selection of appropriate and comprehensive recovery options.
• Production of an ITSCM plan, which again must be integrated with the overall BCM plans.
• Testing of the plans.
• Ongoing operation and maintenance of the plans.
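The Risk Analysis step above asks whether a measure against a threat "can be cost justified". One common way to reason about this, offered here only as a minimal sketch, is to compare the expected yearly loss avoided with the yearly cost of the measure. The class names, fields and figures below are hypothetical illustrations and are not taken from the workbook; real estimates would come from your own BIA and risk assessment.

```python
from dataclasses import dataclass


@dataclass
class Threat:
    """A threat to continuity identified during risk analysis (hypothetical model)."""
    name: str
    likelihood_per_year: float  # estimated occurrences per year (an assumption)
    impact_cost: float          # estimated loss per occurrence, e.g. from the BIA

    def annual_exposure(self) -> float:
        """Expected yearly loss if nothing is done about the threat."""
        return self.likelihood_per_year * self.impact_cost


@dataclass
class Countermeasure:
    """A risk reduction measure (hypothetical model)."""
    name: str
    annual_cost: float
    likelihood_reduction: float  # fraction between 0 and 1


def is_cost_justified(threat: Threat, measure: Countermeasure) -> bool:
    """True when the expected loss avoided per year exceeds the measure's yearly cost."""
    avoided_loss = threat.annual_exposure() * measure.likelihood_reduction
    return avoided_loss > measure.annual_cost


# Illustrative figures only.
power_failure = Threat("Data centre power failure", 0.5, 200_000)
ups = Countermeasure("UPS and generator", 30_000, 0.8)
flood = Threat("Flood", 0.01, 500_000)
barrier = Countermeasure("Flood barrier", 20_000, 0.9)

print(is_cost_justified(power_failure, ups))  # True
print(is_cost_justified(flood, barrier))      # False
```

A simple expected-loss comparison like this ignores non-financial impacts (legal, reputational), so it is best treated as one input to the ITSCM strategy rather than a decision rule.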


ITSCM should be driven by business risk as identified by Business Continuity Planning, and should ensure that the recovery arrangements for IT services are aligned to identified business impacts, risks and needs. More information can be found on page 55 in the Business Justification Document.


Disaster: NOT part of daily operational activities, and requires a separate system. (Not necessarily a flood, fire etc.; it may be a blackout or power problem that puts the SLAs in danger of being breached.)

BCM: Business Continuity Management. Strategies and actions to take place to continue Business Processes in the case of a disaster. It is essential that the ITSCM strategy is integrated into, and is a subset of, the BCM strategy.

BIA: Business Impact Analysis. Quantifies the impact loss of IT service would have on the business. More information can be found on page 61 in the Business Impact Analysis. An Example Business Impact Assessment is also available on page 63.

Risk Assessment: Evaluate Assets, Threats and Vulnerabilities. A Risk Assessment Template can be found on page 71.

Scope: The scope of IT Service Asset & Continuity Management considers all identified critical business processes and the IT service(s) that underpin them. This may include hardware, software, essential services and utilities, critical paper records, courier services, voice services and physical location areas, e.g. offices, data centres etc.


Counter Measures: Measures to prevent or recover from disaster.

Manual Workaround: Using a non-IT based solution to overcome IT service disruption.

Gradual Recovery: also known as Cold Standby (>72hrs).

Intermediate Recovery: also known as Warm Standby (24-72hrs).

Fast Recovery: also known as Hot Standby (…

… these are indicators for you to create some specific text. Watch also for highlighted text, which provides further guidance and instructions.
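The standby categories defined above (gradual, intermediate and fast recovery) can be read as a mapping from how quickly a service must be restored to a recovery option. The following is a minimal sketch under stated assumptions: the function name is hypothetical, and the "under 24 hours" bound for fast recovery is an assumption inferred from the 24-72 hour range given for intermediate recovery, not a figure stated in the text.

```python
def recovery_option(required_recovery_hours: float) -> str:
    """Suggest a recovery option for a required recovery time in hours.

    Thresholds follow the definitions in the text for gradual (>72hrs) and
    intermediate (24-72hrs) recovery; the <24hrs bound for fast recovery is
    an assumption.
    """
    if required_recovery_hours < 0:
        raise ValueError("required recovery time cannot be negative")
    if required_recovery_hours < 24:
        return "Fast Recovery (Hot Standby)"
    if required_recovery_hours <= 72:
        return "Intermediate Recovery (Warm Standby)"
    return "Gradual Recovery (Cold Standby)"


print(recovery_option(4))    # Fast Recovery (Hot Standby)
print(recovery_option(48))   # Intermediate Recovery (Warm Standby)
print(recovery_option(120))  # Gradual Recovery (Cold Standby)
```

In practice the choice also depends on cost and on whether a manual workaround is available, so the thresholds are a starting point for discussion rather than fixed rules.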


3.1

Objectives and Goals

IT Services
Detailed Objectives/Goals
Process: IT Service Asset & Continuity Management

Status:

Version: 0.1

Release Date:


Detailed Objectives/Goals for IT Service Asset & Continuity Management

The document is not to be considered an extensive statement, as its topics have to be generic enough to suit any reader for any organization. However, the reader will certainly be reminded of the key topics that have to be considered.

The detailed objectives for IT Service Asset & Continuity Management should include the following salient points. Against each objective, use the Notes column to record: Met/Exceeded/Shortfall ☺, dates/names/role titles.

Objective

• To provide assurance to the business that in the event of disaster, IT can recover the necessary services within agreed business time scales to support the continuity of the business.
• IT Service Asset & Continuity Management will provide a cost effective and sustained level of recoverability that is aligned with the needs and objectives of the business.
• Minimize the adverse effects on the IT Infrastructure and the Business by designing for recovery in the event of a disaster. Once developed, an IT Service Asset & Continuity Management process can be used to plan for recovery for the business before prolonged loss of service can cause significant harm to the IT services being delivered.
• To establish efficient assessment guidelines that cover the business, technical and financial aspects of IT Service Asset & Continuity Management and the supporting infrastructure. Generally this will involve different people, so the challenge is designing a process that minimizes the time taken.
• To develop a variety of activities to cater for the required levels of recoverability. For example, there is a wide range of potential impacts that loss of service may have on the environment. If we can categorize and target these areas, then we can pre-build models for dealing with them when a disaster occurs.
• To establish ground rules that distinguish between Continuity and Availability.
• Develop working relationships with all other process areas. The IT Service Asset & Continuity Management process should be considered a proactive one, requiring input from other process areas. Obvious links include Security Management (Confidentiality, Integrity and Availability), Service Level Management (to help gather requirements), Availability Management (planning for availability) and Network Management tools (to identify potential threats or loss of service to the IT Infrastructure).
• Develop a sound IT Service Asset & Continuity Management process and look for continuous improvement.

Use these objectives to generate discussion about others that may be more appropriate to list than those provided. Refer also to the Communication Plan on page 135 for ideas on how to communicate the benefits of IT Service Asset & Continuity Management.


3.2

Policies, Objectives & Scope

IT Services
Policies, Objectives & Scope
Process: IT Service Asset & Continuity Management

Status: In draft / Under Review / Sent for Approval / Approved / Rejected

Version:

IT Service Continuity Date:


Policies, Objectives and Scope for IT Service Asset & Continuity Management

The document is not to be considered an extensive statement, as its topics have to be generic enough to suit any reader for any organization. However, the reader will certainly be reminded of the key topics that have to be considered.

Policy Statement

A course of action, guiding principle, or procedure considered expedient, prudent, or advantageous.

Use this text box to answer the “SENSE OF URGENCY” question regarding this process. Why is effort being put into this process? Not simply because someone thinks it’s a good idea. That won’t do. The reason has to be based in business benefits. You must be able to concisely document the reason behind starting or improving this process. Is it because of legal requirements or competitive advantage? Perhaps the business has suffered major problems or user satisfaction ratings are at the point where outsourcing is being considered.

The relationship between ITSCM and Security is another aspect to build into the Policy statement. The basic premise of Security and IT Service Asset & Continuity Management is the continual identification and management of RISK.

A policy statement any bigger than this text box may be too lengthy to read, may lose the intended audience with detail, and may not be clearly focused on answering the WHY question for this process.


Objectives Statement

Something worked toward or striven for; a goal.

Use this text box to answer the “WHERE ARE WE GOING” question regarding this process. What will be the end result of this process and how will we know when we have reached the end result? Will we know because we will establish a few key metrics or measurements, or will it be a more subjective decision, based on instinct?

A generic sample statement on the “objective” for IT Service Asset & Continuity Management is:

The IT Service Asset & Continuity Management objective is described as a process for controlling and coordinating the IT Service Continuity of IT Services and systems in such a way as to support the requirements of the organization. The service must be provided after and based on an on-going analysis of organizational risks, costs and associated benefits (sometimes referred to as ROI, Return on Investment). IT Service Asset & Continuity Management will provide a structured and repeatable process to support this requirement without affecting the normal operating levels of service.

Note the keywords in the statement. For the statement on IT Service Asset & Continuity Management they are “controlling and coordinating” and “without affecting levels of service”. These are definite areas that we can set metrics for and therefore measure progress.

An objective statement any bigger than this text box may be too lengthy to read, may lose the intended audience with detail, and may not be clearly focused on answering the WHERE question for this process.

The above Objective Statement was; Prepared by: On:

And accepted by: On:

Refer to Reports, KPIs and other Metrics on page 173 for metrics and KPIs for IT Service Asset & Continuity Management.


Scope Statement

The area covered by a given activity or subject.

Use this text box to answer the “WHAT” question regarding this process. What are the boundaries for this process? What does the information flow look like into this process and from this process to other processes and functional areas?

A generic sample statement on the “scope” for IT Service Asset & Continuity Management is:

The IT Service Asset & Continuity Management process will be responsible for creating a usable level of IT service provision following an unplanned outage, until such time that standard levels of service can be restored. IT Service Asset & Continuity Management will be responsible for the establishment and on-going management of an environment that can be used during such times (the nature of the environment being dependent on risk, cost and benefit variables).

A scope statement any bigger than this text box may be too lengthy to read, may lose the intended audience with detail, and may not be clearly focused on answering the WHAT question for this process.

The above Scope Statement was; Prepared by: On:

And accepted by: On:


3.3

Business Justification Document

IT Services
Business Justification
Process: IT Service Asset & Continuity Management

Status: In draft / Under Review / Sent for Approval / Approved / Rejected

Version:

Release Date:


Business Justification Document for IT Service Asset & Continuity Management The document is not to be considered an extensive statement as its topics have to be generic enough to suit any reader for any organization. However, the reader will certainly be reminded of the key topics that have to be considered. This document serves as a reference for HOW TO APPROACH THE TASK OF SEEKING FUNDS for the implementation of the IT Service Asset & Continuity Management process. This document provides a basis for completion within your own organization. This document was; Prepared by: On:

And accepted by: On:


IT Service Asset & Continuity Management Business Justification

A strong enough business case will ensure progress and funds are made available for any IT initiative. This may sound like a bold statement but it is true. As IT professionals we have (for too long) assumed that we miss out on funds while other functional areas (e.g. Human Resources and other shared services) seem to get all that they want. However, the problem is not with them, it’s with US. We are typically poor salespeople when it comes to putting our case forward. We try to impress with technical descriptions, rather than talking in a language that a business person understands. For example:

We say: We have to increase IT security controls, with the implementation of a new firewall.
We should say: Two weeks ago our biggest competitor lost information that is now rumored to be available on the internet.

We say: The network bandwidth is our biggest bottleneck and we have to go to a switched local environment.
We should say: The e-mail you send to the other national managers will take 4 to 6 hours to be delivered. It used to be 2 to 3 minutes, but we are now using our computers for many more tasks.

We say: Changes to the environment are scheduled for a period of time when we expect there to be minimal business impact.
We should say: We are making the changes on Sunday afternoon. There will be fewer people working then.

Doesn’t that sound familiar? To help reinforce this point even further, consider the situation of buying a new fridge. What if the technically savvy sales person wants to explain “the intricacies of the tubing structure used to super cool the high pressure gases, which flow in an anti-clockwise direction in the Southern hemisphere”. Wouldn’t you say “too much information, who cares – does it make things cold?” Well IT managers need to stop trying to tell business managers about the tubing structure and just tell them what they are interested in.


So let’s now look at some benefits of the process. Remember that the comments here are generic, as they have to apply to any organization.

Benefits (use the Notes/Comments/Relevance column to record how each applies to your organization)

• Through a properly controlled and structured IT Service Asset & Continuity Management process we will be able to more effectively help in recovery of IT services in the event of a disaster and provide assurance of IT Services in line with the business requirements. This is achieved through the nature of the process by understanding such things as Business Impact Analysis, Risk Assessment, Business Continuity and the true needs of the business.
• A reduction in the amount of unavailability from a disaster will allow IT to spend more time on aligning the IT Services with the needs of the Business.
• A heightened visibility and increased communication related to Availability of Services for both business and IT support staff because of reduced downtime in the event of a disaster. The reader should be able to draw upon experience regarding the overall negative impact on the business when IT departments have been concerned with high levels of unavailability for services that are critical to the business.
• Organizations, and therefore IT environments, are becoming increasingly complex and continually facing new challenges. The ability to meet these challenges is dependent on the speed and flexibility of the organization. The ability to cope with more changes at the business level will be directly impacted by how well IT Departments can reduce the amount of time in loss of service due to bad IT Service Asset & Continuity Management planning. (Reader, here you can describe a missed opportunity due to bad IT Service Asset & Continuity Management, or a process dragged down by bureaucracy.)
• Noticeable increases in the potential productivity of end users and key personnel through reduced interruption times and higher levels of availability. The goal statement of IT Service Asset & Continuity Management is to ensure Business Continuity in the event of an IT Disaster. By the very nature of this statement we can expect to start seeing a reduction in loss of time due to service availability issues and bad planning. Whether end users and staff take advantage of this reduced down-time is not an issue for IT professionals to monitor. Knowing that we have made more working time available is what we need to publish, NOT productivity rates.
• An ITIL IT Service Asset & Continuity Management process will guide you towards understanding the financial implications of all those necessary requirements needed in the IT infrastructure. This has real benefits, as it may prevent an organization from spending money on areas of the IT Infrastructure where there really isn’t a need for building immediate recovery services for the business.
• IT Service Asset & Continuity Management aids in improving the security aspects of the organization with respect to IT. IT Service Asset & Continuity Management will work in conjunction with Security Management to implement those security requirements described in the Security Policy. Correct management of Security Requirements will help in maintaining the right levels of availability needed by the business.
• The ITSCM Manager will ensure that any potential impact of the loss of service has been fully assessed prior to starting an IT Service Asset & Continuity Management process.
• With a sound IT Service Asset & Continuity Management process we can expect an overall improvement in the recovery of IT Services, as better planning can occur under a structured, repeatable process.
• Any ITIL process has the potential to increase the credibility of the IT group, as it offers a higher quality of service, combined with an overall professionalism that can be lacking in ad-hoc activities.


3.4

Business Impact Analysis

A valuable source of input when trying to ascertain the business needs, impacts and risks is the Business Impact Analysis (BIA). The BIA is an essential element of the overall business continuity process and will dictate the strategy for risk reduction and disaster recovery. Its normal purpose is to identify the effect a disaster would have on the business. It will show which parts of the organization will be most affected by a major incident and what effect it will have on the company as a whole. It therefore enables the recognition of the business functions most critical to the company’s survival, and of where this criticality differs depending on the time of the day, week, month or year. Additionally, experience has shown that the results from the BIA can be an extremely useful input for a number of other areas as well, and will give a far greater understanding of the service than would otherwise be the case.

The BIA could be divided into two areas:

• One by business management, which has to investigate the impact of the loss (or partial loss) of a business process or a business function. This includes the knowledge of manual workarounds and their costs.
• A second role located in Service Management is essential to break down the effects of service loss to the business. This element of the BIA shows the impact of service disruption to the business. The services can be managed and influenced by Service Management. Other aspects also covered in ‘Business BIA’ cannot be influenced by Service Management.

A BIA should be conducted to help define the business continuity strategy and to enable a greater understanding about the function and importance of the service as part of the design phase of a new or changed service. This will enable the organization to define:

• Which are the critical services, what constitutes a major incident on these services, and the subsequent impact and disruption caused to the business (important in deciding when and how to implement changes)
• Acceptable levels and times of service outage (again important in the consideration of change and implementation schedules)
• Critical business and service periods (important periods to avoid)
• The cost of loss of service (important for Financial Management)
• The potential security implications of a loss of service (important considerations in the management of risk).
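The items a BIA defines (critical services, acceptable outage times, critical periods and the cost of loss of service) can be captured as one record per service. The sketch below is illustrative only: the `ServiceImpact` class, its field names, the ranking rule and all figures are assumptions made for the example, not a format prescribed by the workbook.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ServiceImpact:
    """One BIA entry for an IT service (hypothetical structure)."""
    service: str
    cost_per_hour: float      # estimated cost of loss of service per hour
    max_outage_hours: float   # acceptable outage time agreed with the business
    critical_periods: List[str] = field(default_factory=list)

    def outage_cost(self, hours: float) -> float:
        """Estimated cost of an outage lasting the given number of hours."""
        return self.cost_per_hour * hours

    def exceeds_tolerance(self, hours: float) -> bool:
        """True when an outage of this length is longer than the business accepts."""
        return hours > self.max_outage_hours


def rank_by_criticality(records: List[ServiceImpact]) -> List[ServiceImpact]:
    """Order services by shortest tolerable outage, then by highest hourly cost."""
    return sorted(records, key=lambda r: (r.max_outage_hours, -r.cost_per_hour))


# Illustrative figures only.
bia = [
    ServiceImpact("Payroll", 500.0, 48.0, ["month end"]),
    ServiceImpact("Order entry", 4000.0, 4.0, ["weekdays 09:00-17:00"]),
    ServiceImpact("E-mail", 1000.0, 8.0),
]

ranked = [r.service for r in rank_by_criticality(bia)]
print(ranked)                 # ['Order entry', 'E-mail', 'Payroll']
print(bia[1].outage_cost(6))  # 24000.0
```

Ranking by tolerable outage first is only one possible choice; an organization might instead weight cost, security implications or critical periods more heavily.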


3.5 Example Business Impact Assessment

IT Services
Example Business Impact Assessment
Process: IT Service Asset & Continuity Management

Status: In draft / Under Review / Sent for Approval / Approved / Rejected
Version:
Release Date:

Document Control

Author
Prepared by

Document Source
This document is located on the LAN under the path: I:/IT Services/Service Delivery/Functional Specifications/

Document Approval
This document has been approved for use by the following:
♦ IT Services Manager
♦ IT Service Delivery Manager
♦ IT Service Continuity Process Manager
♦ Customer representative or Service Level Manager

Amendment History
Issue | Date | Amendments | Completed By

Distribution List
When this procedure is updated the following copyholders must be advised through email that an updated copy is available on the intranet site:

Business Unit | Stakeholders
IT            |

Introduction

Purpose
The Business Impact Analysis identifies, through analysis, the basic critical IT requirements needed to support the business. The purpose of this document is to provide an overview of the findings of this analysis.

Scope
This document describes the following:
• Summary of each service provided by IT Services, including:
  - Summary of the Continuity Strategy for each applicable service
  - Detailed list of Continuity Strategy for each applicable service

Note: It is assumed for each service described in this document that the supporting back-end technology is already in place and operational.

Audience
This document is relevant to all staff in the organization.

Ownership
IT Services has ownership of this document.

Related Documentation
Include in this section any related Service Level Agreement reference numbers and other associated documentation:
• IT Service Asset & Continuity Management Policies, Guidelines and Scope Document
• Business Continuity Strategy Template
• Risk Assessment
• Reciprocal Arrangements
• Relevant SLA and procedural documents
• IT Services Catalogue
• Relevant Technical Specification documentation
• Relevant User Guides and Procedures


Executive Overview
Describe the purpose, scope and organization of the Continuity Management Strategy document.

Scope
Not all IT Services may initially be included within the Business Impact Analysis. Use this section to outline what will be included and the timetable for other services to be included. Scope for the BIA may be determined by the business, therefore covering only a select few of the IT Services provided by the IT department that are seen as critical to the support of the business processes.


IT Service Definition

This section is where you will document the Service Descriptions for the services or applications used by the business. This information should be cross-referenced to your Service Catalogue and/or related Service Level Agreements. List here all the services on which the BIA is required.

IT Service | Owner    | Business Process    | Business Owners | SLA #/Service Catalogue Reference | BIA Score (from details below) | Notes/Comments
Service A  | J. Ned   | Billing             | T. Smith        | SLA001                            |                                |
Email      | A. Boon  | Communication       | R. Jones        | SLA234                            |                                |
SAP        | C. Jones | Invoice and Payroll | P. Boon         | SLA123                            |                                |
Service B  | L. Smith | Marketing           | R. Reagan       | SLA009                            |                                |
Service C  | R. Smith | Manufacturing       | R. Smith        | SLA007                            |                                |

BIA Score = Form + escalation + resource + time.

Note: a high score here indicates a service area that has a potentially high business impact if lost. The score can also be used as a guide for the types of service recovery (intermediate, immediate, etc.) that would be acceptable to the business. This score is a starting point for recovery options and considerations.


Service A
(This section needs to be duplicated for each service listed in the table above)

Form of Loss
In this table, describe how a disruption to this service will be seen within the business. For example, will the loss of service invoke a contract that has set costs associated with it? If we lose this service, can we expect to lose customers, clients or market share? Define each form that the loss of the service may take and give each a score for magnitude.

Form                     | Description                                         | Magnitude Score (1 is negligible, 10 is severe)
Reputation               | Our industry is reputation sensitive                | 9
Third party              | Contract is invoked to provide 24 hour recovery     | 7
Frustration              | Not high, as end users have other tasks to perform  | 4
Etc.                     | Etc.                                                | Etc.
Total Form of Loss Score |                                                     | 56

Other triggers to identify forms of loss:
• Breach of contract
• Breach of law or industry imposed standards
• Safety issues
• Confidence drop in skills of Service Providers

Escalation
Use this section to specify for this Service/application the speed at which the loss of this service is likely to degrade overall performance. That is, provide a score of 1 (low) to 10 (highest) that indicates how the service loss will grow in severity.

Escalation Score (1 is slow/barely noticeable, 10 is rapid pace of overall deterioration): 9


Resources Factor
Use this section to specify for this Service/application the combination of the complexity of facilities and the level of skills required in the people that will permit this service to stay operating in the event of a failure. That is, provide a score of 1 (low resource requirement) to 10 (maximum resources and skills required).

Resources Score (1 is minimal skills and resources required to maintain service, 10 is expert level of skills and extensive resources): 3

Time Considerations
There are two time considerations to factor in to your decision on a score in this area. The first relates to how quickly or slowly the business requires just bare levels of service restored. The second relates to how quickly or slowly the entire service and all associated systems should be fully operational. That is, provide a score of 1 (slow return to service is acceptable) to 10 (where any real delay in service restoration could have a dire impact upon the business).

Time Factor Score (1 is non-urgent, 10 is business critical): 5

Conclusion (not part of the repetitive process)
This template has given you a concise and simple way to look at the impact that the loss of particular IT Services will have on an organization. We must however remember that the impact of loss will change over time. A BIA should be performed on a regular basis (to coincide with the Service Level Management reviews of the Service Catalog or Service Level Agreements).
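The scoring arithmetic used in this template (BIA Score = Form + escalation + resource + time) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `ServiceBia`, its field names and the range check are hypothetical and not part of the workbook. The example sums only the three forms of loss that are actually listed for Service A (9 + 7 + 4 = 20), whereas the template's total of 56 also includes the unlisted "Etc." rows, so the resulting score here is 20 + 9 + 3 + 5 = 37.

```python
from dataclasses import dataclass, field


def _check_score(label: str, value: int) -> None:
    """Reject scores outside the template's 1 to 10 scale."""
    if not 1 <= value <= 10:
        raise ValueError(f"{label} must be between 1 and 10, got {value}")


@dataclass
class ServiceBia:
    """BIA scores for one IT service (hypothetical structure, not from the workbook)."""
    name: str
    form_of_loss: dict[str, int] = field(default_factory=dict)  # form -> magnitude
    escalation: int = 1   # 1 = slow/barely noticeable, 10 = rapid deterioration
    resources: int = 1    # 1 = minimal skills/resources, 10 = expert and extensive
    time_factor: int = 1  # 1 = non-urgent, 10 = business critical

    def __post_init__(self) -> None:
        for form, magnitude in self.form_of_loss.items():
            _check_score(f"magnitude of '{form}'", magnitude)
        _check_score("escalation", self.escalation)
        _check_score("resources", self.resources)
        _check_score("time_factor", self.time_factor)

    def form_total(self) -> int:
        """Total Form of Loss Score: the sum of all magnitude scores."""
        return sum(self.form_of_loss.values())

    def bia_score(self) -> int:
        """BIA Score = Form + escalation + resource + time."""
        return self.form_total() + self.escalation + self.resources + self.time_factor


# Only the three forms of loss listed in the Service A example are used here.
service_a = ServiceBia(
    name="Service A",
    form_of_loss={"Reputation": 9, "Third party": 7, "Frustration": 4},
    escalation=9,
    resources=3,
    time_factor=5,
)
print(service_a.name, service_a.bia_score())
```

Computing the score for every service in the IT Service Definition table and sorting the results in descending order would give a first ranking of recovery priorities, which is how the template suggests the score be used.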


Appendices
Include any applicable appendices that are needed. E.g.
• Mission statement and/or business objectives, which drove this BIA
• Relevant details of people who provided input

Terminology
Make sure that all terminology is captured and documented correctly. E.g.

CMDB  | Configuration Management Data Base
ITSCM | Information Technology Services Continuity Management
SLA   | Service Level Agreement
UC    | Underpinning Contract


3.6 Risk Assessment Template

IT Services
Risk Assessment Template
Process: IT Service Asset & Continuity Management

Status:
Version: 0.1
Release Date:

Document Control

Author
Prepared by

Document Source
This document is located on the LAN under the path: I:/IT Services/Service Delivery/Risk Assessment/

Document Approval
This document has been approved for use by the following:
♦ IT Services Manager
♦ IT Service Delivery Manager
♦ IT Service Continuity Process Manager
♦ Customer representative or Service Level Manager

Amendment History
Issue | Date | Amendments | Completed By

Distribution List
When this procedure is updated the following copyholders must be advised through email that an updated copy is available on the intranet site:

Business Unit | Stakeholders
IT            |

Introduction

Purpose
The purpose of this document is to provide a risk assessment template.

Scope
This document describes the following:
• Summary of services and their risks
• A risk template

Note: It is assumed for each service described in this document that the supporting back-end technology is already in place and operational.

Audience
This document is relevant to all staff in the organization.

Ownership
IT Services has ownership of this document.

Related Documentation
Include in this section any related Service Level Agreement reference numbers and other associated documentation:
• IT Service Asset & Continuity Management Policies, Guidelines and Scope Document
• Business Continuity Strategy Template
• Risk Assessment
• Reciprocal Arrangements
• Relevant SLA and procedural documents
• IT Services Catalogue
• Relevant Technical Specification documentation
• Relevant User Guides and Procedures


Executive Overview
Describe the purpose, scope and organization of the document.

Scope
Not all IT Services may initially be included within the Risk Analysis. Use this section to outline what will be included and the timetable for other services to be included. Scope for the assessment may be determined by the business, therefore covering only a select few of the IT Services provided by the IT department that are seen as critical to the support of the business processes.


IT Service Definition

This section is where you will document the Service Descriptions for the services or applications used by the business. This information should be cross-referenced to your Service Catalogue and/or related Service Level Agreements. List here all the services on which the BIA is required.

IT Service | Owner    | Business Process    | Business Owners | SLA #/Service Catalogue Reference | BIA Score (from details below) | Notes/Comments
Service A  | J. Ned   | Billing             | T. Smith        | SLA001                            |                                |
Email      | A. Boon  | Communication       | R. Jones        | SLA234                            |                                |
SAP        | C. Jones | Invoice and Payroll | P. Boon         | SLA123                            |                                |
Service B  | L. Smith | Marketing           | R. Reagan       | SLA009                            |                                |
Service C  | R. Smith | Manufacturing       | R. Smith        | SLA007                            |                                |

BIA Score = Form + escalation + resource + time.

Note: a high score here indicates a service area that has a potentially high business impact if lost. The score can also be used as a guide for the types of service recovery (intermediate, immediate, etc.) that would be acceptable to the business. This score is a starting point for recovery options and considerations.


Service A
(This section needs to be duplicated for each service listed in the table above)

Service Description
Include a description of the service being assessed for risks that may cause a disruption to the business services.

Threats
In the table below, capture all the threats for the components that make up the service.

Components | Threats | Probability

Vulnerabilities
In the table below, capture all the likely vulnerabilities for the components that make up the service.

Components | Threats | Probability


Risk Summary

RISK SUMMARY
Business Process: Logistics
Process Owner: Rob Thomas
IT Service: Logistics Systems
IT Owner:
Completed By: Troy Jones, Network Manager
Completed Date: 03 / 24 / 2002
Duration: 15 working days
SERVICE RISK SCORE: 1

Component     | Priority | Magnitude | Threat Probability | Vulnerability Probability | Risk Value | Action | Comments
Server        | 1        | 1         | 1                  | 1                         | 1          |        |
Router        | 1        | 1         | 4                  | 1                         | 2          |        |
Backup Device | 4        | 4         | 4                  | 4                         | 4          |        |

Priority: This lists the priority in which the components of the IT Infrastructure are assessed in the event of a disaster.
Magnitude: This indicates the criticality of the component to the IT Service.
Threat Probability: Indicates the likelihood of a threat materializing and affecting the component.
Vulnerability Probability: This indicates the likelihood of any vulnerability for the component being exploited either deliberately or accidentally.
Risk Value: The risk value is an arbitrary figure derived from the Threat and Vulnerability Probabilities.
Service Risk Score: Indicates the danger level of a risk actually taking place with regards to this service. A number 1 indicates an extremely high level of probability.
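The template describes the Risk Value only as an arbitrary figure derived from the Threat and Vulnerability Probabilities and gives no formula. The sketch below is therefore an assumption rather than the workbook's method: it uses the rounded geometric mean of the two probabilities, chosen because it happens to match the three example components (Server, Router, Backup Device). The class and field names are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class ComponentRisk:
    """One row of the risk summary table (hypothetical structure)."""
    component: str
    priority: int
    magnitude: int
    threat_probability: int
    vulnerability_probability: int

    def risk_value(self) -> int:
        # Assumption: rounded geometric mean of the two probabilities.
        # The workbook does not define the derivation; any agreed formula
        # could be substituted here.
        product = self.threat_probability * self.vulnerability_probability
        return round(math.sqrt(product))


# Values taken from the example risk summary for the Logistics Systems service.
rows = [
    ComponentRisk("Server", 1, 1, 1, 1),
    ComponentRisk("Router", 1, 1, 4, 1),
    ComponentRisk("Backup Device", 4, 4, 4, 4),
]

risk_values = {row.component: row.risk_value() for row in rows}
print(risk_values)
```

Whatever derivation an organization adopts, recording it next to the table keeps the Risk Value column reproducible when the assessment is repeated.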


Appendices
Include any applicable appendices that are needed. E.g.
• Mission statement and/or business objectives, which drove this BIA
• Relevant details of people who provided input

Terminology
Make sure that all terminology is captured and documented correctly. E.g.

CMDB  | Configuration Management Data Base
ITSCM | Information Technology Services Continuity Management
SLA   | Service Level Agreement
UC    | Underpinning Contract


3.7 Environmental Architectures & Standards

This document contains details of environmental architectures and standards. Every organization should produce an environmental policy for equipment location, with minimum agreed standards for particular concentrations of equipment. Additionally, minimum standards should be agreed for the protection of buildings containing equipment and equipment room shells. The following tables cover the major aspects that need to be considered, with example characteristics.

Building / Site

Access: Secure perimeters, secure entrances, audit trail
Building and site protection: Security fencing, video camera, movement and intruder detectors, window and door alarms, lightning protectors, good working environment (standard)
Entry: Multiple controlled points of entry
External Environment: Minimize external risks
Services: Where possible and justifiable, alternate routes and suppliers for all essential services, including network services

Major Equipment Room

Access: Secure controlled entry, combination lock, swipe card, video camera (if business critical and unattended)
Location: First floor wherever possible, with no water, gas, chemical or fire hazards within the vicinity, above, below or adjacent
Visibility: No signage, no external windows
Shell: External shell: waterproof, airtight, soundproofed, fire-resistant (0.5 hours to 4 hours depending on criticality)
Equipment Delivery: Adequate provision should be made for the delivery and positioning of large delicate equipment
Internal Floor: Sealed
Separate Plant Room: Uninterruptible Power Supply (UPS). Electrical supply and switching, air-handling units, dual units and rooms if business critical
External: Generator for major data centres and business-critical systems


Major Data Centres

Access: Secure and controlled entry, combination lock, swipe card, video camera (if business critical and unattended)
Temperature: Strict control, 22° (±3°). Provide for up to 55W/m². 6° variation throughout the room and a maximum of 6° per hour
Humidity Control: Strict control, 50% (±10%)
Air Quality: Positive pressure, filtered intake, low gaseous pollution (e.g. sulphur dioxide ≤ 0.14 ppm), dust levels for particles > 1 micron less than 5 x 10^6 particles/m³. Auto shut-down on smoke or fire detection
Power: Power Distribution Unit (PDU), with three-phase supply to non-switched boxes, one per piece of equipment, with appropriately rated circuit-breakers for each supply. Alternatively, approved power distribution strips can be used. Balanced three-phase loadings. UPS (online or line interactive with Simple Network Management Protocol [SNMP] management) to ensure voltage supplied is within ±5% of rating with minimal impulse, sags, surges and over/under voltage conditions
False Floors: Antistatic, liftable floor tiles 600 x 600mm on pedestals, with alternate pedestals screwed to the solid floor. Minimum of 600mm clearance to solid floor. Floor loadings of up to 5kN/m² with a recommended minimum of 3m between false floor and ceiling
Internal Walls: From false floor to ceiling, fire-resistant, but with air flow above and below floor level
Fire detection/prevention: HSSD or VESDA multi-level alarm with auto FM200 (or alternative halon replacement) release on ‘double-knock’ detection
Environmental Protections: For smoke, temperature, power, humidity, water and intruder with automated alarm capability. Local alarm panels with repeater panels and also remote alarm capability
Lighting: Normal levels of ceiling lighting with emergency lighting on power failure
Power Safety: Clean earth should be provided on the PDU and for all equipment, with clearly marked remote power-off buttons on each exit. Dirty power outlets, clearly marked, should also be supplied
Fire Extinguishers: Sufficient electrical fire extinguishers with adequate signage and procedures
Vibration: Vibrations should be minimal within the complete area
Electromagnetic Interference: Minimal interference should be present (1.5V/m ambient field strength)
Installations: All equipment should be provided and installed by qualified suppliers and installers to appropriate electrical and health and safety standards
Network Connections: The equipment space should be flood-wired with adequate capacity for reasonable growth. All cables should be positioned and secured to appropriate cable trays
Disaster Recovery: Fully tested recovery plans should be developed for all major data centres, including the use of stand-by sites and equipment
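The temperature and humidity standards for major data centres (22° ±3° and 50% ±10%) lend themselves to an automated check against sensor readings. The sketch below is a minimal illustration, not part of the workbook: the names `Band`, `MAJOR_DATA_CENTRE` and `out_of_band` are hypothetical, the reading keys are invented for the example, and degrees are assumed to be Celsius because the source does not state a unit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Band:
    """A target value with a symmetric tolerance, e.g. 22 (+/- 3)."""
    target: float
    tolerance: float

    def contains(self, value: float) -> bool:
        return abs(value - self.target) <= self.tolerance


# Example characteristics for major data centres, taken from the table above.
# Assumption: temperatures are in degrees Celsius.
MAJOR_DATA_CENTRE = {
    "temperature_c": Band(target=22.0, tolerance=3.0),
    "humidity_pct": Band(target=50.0, tolerance=10.0),
}


def out_of_band(readings: dict[str, float], limits: dict[str, Band]) -> list[str]:
    """Return the names of readings that fall outside their agreed band."""
    return [
        name
        for name, value in readings.items()
        if name in limits and not limits[name].contains(value)
    ]


# Hypothetical sensor readings for one equipment room.
alerts = out_of_band({"temperature_c": 26.0, "humidity_pct": 55.0}, MAJOR_DATA_CENTRE)
print(alerts)
```

A second limits dictionary with the wider ±5° temperature tolerance could represent regional data centres in the same way.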

Regional Data Centres and Major Equipment Centres

Access: Secure controlled entry, combination lock, swipe card, video camera (if business critical and unattended)
Temperature: Temperature control, 22° (±5°), preferable
Humidity Control: Strict control: 50% (±10%), preferable
Air Quality: Positive pressure, filtered intake, low gaseous pollution (e.g. sulphur dioxide ≤ 0.14 ppm), dust levels for particles > 1 micron less than 5 x 10^6 particles/m³. Auto shut-down on smoke or fire detection
Power: PDU with three-phase supply to non-switched boxes, one per piece of equipment, with appropriately rated circuit-breakers for each supply. Alternatively, approved power distribution strips can be used. Balanced three-phase loadings. Room UPS to ensure voltage supplied is within ±5% of rating with minimal impulse, sags, surges and over/under voltage conditions
False Floors: Antistatic, liftable floor tiles 600 x 600mm on pedestals, with alternate pedestals screwed to the solid floor. Minimum of 600mm clearance to solid floor. Floor loadings of up to 5kN/m² with a recommended minimum of 3m between false floor and ceiling
Internal Walls: From false floor to ceiling, fire-resistant, but with air flow above and below floor level
Fire Detection/Prevention: Generally fire detection but not suppression, although HSSD or VESDA multi-level alarm with auto FM200 (or alternative halon replacement) release on ‘double-knock’ detection may be included if business-critical systems are contained
Environmental Detectors: For smoke, temperature, power, humidity, water and intruder with automated alarm capability
Lighting: Normal levels of ceiling lighting with emergency lighting on power failure
Power Safety: Clean earth should be provided on the PDU and for all equipment, with clearly marked remote power-off buttons on each exit. Dirty power outlets, clearly marked, should also be supplied
Fire Extinguishers: Sufficient electrical fire extinguishers with adequate signage and procedures
Vibration: Vibrations should be minimal within the complete area
Electromagnetic Interference: Minimal interference should be present (1.5V/m ambient field strength)
Installations: All equipment should be provided and installed by qualified suppliers and installers to appropriate electrical and health and safety standards
Network Connections: The equipment space should be flood-wired with adequate capacity for reasonable growth. All cables should be positioned and secured to appropriate cable trays
Disaster Recovery: Fully tested recovery plans should be developed for all regional data centres, including the use of stand-by sites and equipment where appropriate

Server or Network Equipment Rooms

Access: Secure controlled entry, by combination lock, swipe card or lock and key. In some cases equipment may be contained in open offices in locked racks or cabinets
Temperature: Normal office environment, but if in closed/locked rooms adequate ventilation should be provided
Humidity Control: Normal office environment
Air Quality: Normal office environment
Power: Clean power supply with UPS-supplied power to the complete rack
False Floors: Recommended minimum of 3m between floor and ceiling with all cables secured in multi-compartment trunking
Internal Walls: Wherever possible all walls should be fire-resistant
Fire Detection / Prevention: Normal office smoke/fire detection systems, unless major concentrations of equipment
Environmental Detectors: For smoke, power, intruder with audible alarm capability
Lighting: Normal levels of ceiling lighting with emergency lighting on power failure
Power Safety: Clean earth should be provided for all equipment, with clearly marked power-off buttons
Fire Extinguishers: Sufficient electrical fire extinguishers with adequate signage and procedures
Vibration: Vibrations should be minimal within the complete area
Electromagnetic Interference: Minimal interference should be present (1.5V/m ambient field strength)
Installations: All equipment should be provided and installed by qualified suppliers and installers to appropriate electrical and health and safety standards
Network Connections: The equipment space should be flood-wired with adequate capacity for reasonable growth. All cables should be positioned and secured to appropriate cable trays
Disaster Recovery: Fully tested recovery plans should be developed where appropriate

Office Environments

Access: All offices should have the appropriate secure access depending on the business, the information and the equipment contained within them
Lighting, temperature, humidity and air quality: A normal clean, comfortable and tidy office environment, conforming to the organization’s health, safety and environmental requirements
Power: Clean power supply for all computer equipment, with UPS facilities if appropriate
False Floors: Preferred if possible, but all cables should be contained within appropriate trunking
Fire Detection / Prevention and Extinguishers: Normal office smoke/fire detection systems and intruder alerting systems, unless there are major concentrations of equipment. Sufficient fire extinguishers of the appropriate type, with adequate signage and procedures
Network Connections: The office space should preferably be flood-wired with adequate capacity for reasonable growth. All cables should be positioned and secured to appropriate cable trays. All network equipment should be secured in secure cupboards or cabinets
Disaster Recovery: Fully tested recovery plans should be developed where appropriate


3.8 Reciprocal Arrangements

IT Services
Reciprocal Arrangements
Process: IT Service Asset & Continuity Management

Status:
Version: 0.1
Release Date:

Document Control

Author
Prepared by

Document Source
This document is located on the LAN under the path: I:/IT Services/Service Delivery/ITSCM/

Document Approval
This document has been approved for use by the following:
♦ IT Services Manager
♦ IT Service Delivery Manager
♦ National IT Help Desk Manager
♦ ITSCM Manager

Amendment History
Issue | Date | Amendments | Completed By

Distribution List
When this procedure is updated the following copyholders must be advised through email that an updated copy is available on the intranet site:

Business Unit | Stakeholders
IT            |

Introduction

Purpose
The purpose of this document is to provide relevant Business Units with the existing Reciprocal Arrangements for their IT Services.

Scope
This document describes the following:
• Details of Reciprocal Arrangements between (Client 1) and (Client 2)

Note: It is assumed for each service described in this document that the supporting back-end technology is already in place and operational.

Audience
This document is relevant to all staff in the organization.

Ownership
IT Services has ownership of this document.

Related Documentation
Include in this section any related Service Level Agreement reference numbers and other associated documentation:
• IT Service Asset & Continuity Management Policies, Guidelines and Scope Document
• Business Impact Analysis Template
• Risk Assessment
• Relevant SLA and procedural documents
• IT Services Catalogue
• Relevant Technical Specification documentation
• Relevant User Guides and Procedures


1. Executive Overview

Describe the purpose, scope and organization of the document.

2. Scope

As not all IT Services may initially be included within the Continuity Management Strategy document, it is important to set the scope for what will be included. Scope for the Business Continuity Strategy may be determined by the business, therefore covering only a select few of the IT Services provided by the IT department that are seen as critical to the support of the business processes.

3. Overview

General principles
Include in this section any general principles for the reciprocal arrangement. Below are some examples of these principles.

• [...] and PC facilities for (Specify number of staff members) staff.
• Periodic testing and checking of the plan.
• Access to facilities in (Specify location) including (Specify client 2).

It is understood that:

• Neither firm should make a profit or a loss from this arrangement.
• Both parties will agree to confidentiality of data, clients, and business practices.
• Neither party will seek compensation from the other should any problems or difficulties arise from the service provided.
• This plan will be shown to (Client 1's) supplier and, although their approval of such will not be sought, their comments, subject to the agreement of (Client 1) and (Client 2), will be incorporated into the plan. Refer to Appendix F for (Specify supplier) agreement to the Plan.
• All items to be used in these plans will be maintained and kept in good working order.
• Termination can only occur with (Specify length of time) written notice unless otherwise mutually agreed.
• The agreement will run for (Specify duration of agreement) at a time, to be renewable if both parties agree.
• The insurers of each company will be made aware of these plans.
• If a service is provided for more than (Specify a number of days) elapsed days (including weekends), then the host will be compensated by the client by payment of agreed fees.
• The client will advise all relevant parties of these temporary arrangements (i.e., business clients, etc.) including the new address, phone number(s) and fax number(s) and will also advise reversion when the service terminates.
• Any data tapes, letter-headed stationery, or other items at the reciprocal party’s office will be stored in a secure, lockable place.
• The plan will be capable of being implemented within (Specify Time) of a requirement arising within normal office hours. All effort will be made to ensure rapid assistance out of normal office hours.

Definition of a disaster
In this section provide an agreed-upon definition of a disaster. This is integral to the success of the arrangement. Be very specific and provide all necessary details.

Period of service
Capture the service period for the arrangement in the event of a disaster. This will include the maximum duration from the time of the disaster. The service will usually be provided free of charge by the host for a specified time, which is determined here. Periods beyond the specified time may be charged at a nominal fee. Also include the renewal of the contract, probably every year; the service period should never extend beyond the contract expiration date.


4. Prerequisites

• [...] system data
• Any necessary documentation and appropriate systems
• Sufficient free space will be set aside on computer systems to handle any loaded data.
• Insurers for both companies will be made fully aware of these arrangements.
• Stationery concerns, i.e. adequate printing facilities etc.
• A list of main staff contacts will be distributed, including contact numbers and locations.
• A current signed agreement to this plan is in force.
• The host will only provide services if its own office is not subject to disruption at the same time as the client's. This is intended purely to cover both parties in the instance where one or more events disrupt both offices simultaneously.

This is very important. Well established prerequisites allow both parties to understand their upfront responsibilities. Remember that most reciprocal arrangements will be a two-way street: Company A will use Company B's facilities in the event of a disaster, and Company B will use Company A's facilities in the event of a disaster. As such, it is important not to use this list in an aggressive manner.


5. Alignment

Specify the kind of system. For example: the Logistics system has many options available within it, and the [...] programs will be used from a common source. Provision of the [...] will be a minimum of, but not limited to, the following, which should be maintained:

• Transaction processing
• Consignment processing
• Risk Processing
• Reporting
• Document Archiving

If required and agreed at the time of need, provision may be made for [...] facilities.

Include in this section how administration of the system will be performed, and what limitations or caveats will be in place for the administration of the system. Include responsibilities pertaining to data backup, restoration and storage. Keep in mind that, in the event of an emergency, Change and Release procedures are imperative to ensure a structured recovery operation without causing more issues. Include things like the following:

[...] will be aligned so that no more than seven days elapse between installation of like releases at each site. This requires advance warning to the companies of planned releases. At least (Specify time period, for example: two weeks) written notice will be provided. If one party accepts a pre-release program/fix they should notify the other in order that alignment can be maintained, if such is required. If misalignment does occur, it is agreed that the oldest release will be upgraded to the newer release, whether this relates to the client or the host.

Specific data and applications
It is important to ensure that alignment of specific data and applications is maintained. To do this, release numbering and release processes become integral to the arrangement. It is important that certain elements be consistent across sites. List them here. Also list products that do not have to be replicated:

• NT Server will be used
• Microsoft software will be used

Also include how access to the facilities will be made available. This will be both physical and logical access.

Backup facilities
In this section, list the backup facilities that need to be kept in line with the other sites or systems.
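The release alignment rules in this section (no more than seven days between installation of like releases at each site, and the oldest release upgraded if misalignment occurs) can be checked mechanically. The following is a minimal sketch under stated assumptions: the function names are hypothetical, and releases are assumed to be identified by comparable version tuples such as (2, 4, 0), which the template itself does not prescribe.

```python
from datetime import date
from typing import Optional

# From the example wording: no more than seven days may elapse between
# installation of like releases at each site.
MAX_ALIGNMENT_DAYS = 7


def releases_aligned(host_installed: date, client_installed: date,
                     max_days: int = MAX_ALIGNMENT_DAYS) -> bool:
    """True if the same release was installed at both sites within the window."""
    return abs((host_installed - client_installed).days) <= max_days


def site_to_upgrade(host_release: tuple, client_release: tuple) -> Optional[str]:
    """Return which site runs the older release and must upgrade, or None.

    Assumption: releases are numbered as tuples that compare in release order.
    """
    if host_release == client_release:
        return None
    return "host" if host_release < client_release else "client"


# Hypothetical dates and release numbers, for illustration only.
within_window = releases_aligned(date(2008, 3, 3), date(2008, 3, 10))
outside_window = releases_aligned(date(2008, 3, 3), date(2008, 3, 11))
older_site = site_to_upgrade((2, 3, 0), (2, 4, 0))
print(within_window, outside_window, older_site)
```

Running such a check whenever either party plans a release would support the written-notice requirement by flagging misalignment before it exceeds the agreed window.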


6. Provisions . In addition, the service will be provided if the client's office or surrounding area is closed by the authorities. >> The following sections list out those necessary provisions required in the event of a disaster. Office space Include in this section how access to office space will be arranged. Include a table of names for staff that will require access. Make sure you inform security. Establish if any passes are required for access or elevators. Directions to the location of the hosting client's office should also be listed here. The following points need to be considered: • Access hours • Access on weekends • Allowable office space • Access to pertinent areas within the hosts environment – this may exclude certain server rooms or floors in the building • Etc.

Work space

How will the work space be set up to cater for the client? Below are some example words that can be used. >

Meeting space

List any requirements for meeting space.

Storage space

Provide details about applicable storage space for the client. This may include such things as:
• Store room
• Lockable cabinet
• Etc.

Safe

Provide details about the facilities for storing cash, cheque books, or other valuable items that will be made available to the client.

Office equipment >

Telephone

The number of phones and the reimbursement plan should be included here.

Fax

Include details about fax facilities.


E-mail

Include details about e-mail facilities.

Mail, courier, and messenger services

Include details about mail, courier and messenger services and reimbursement plans.

Stationery, photocopying, and other facilities

Include details about general office services.

Computer equipment >

PC

Specify any details regarding the provision or supply of PC equipment, including setup, storage, leasing schedules etc.

Printer

Include printer information. This will be the type of printer, the number of printers, and any stationery.

Backups (initial data load)

Include any backup information. This will include:
• Procedures
• Technical equipment
• People
• Roles and responsibilities

Backups (within service provision) >

Specify platform from which data should be backed up

Include platform information and responsibilities pertaining to the host and client.

Specialist requirements

Non-standard items

Record any exceptions regarding client and host requirements in this section. This could include things like nil access to specific equipment or services.

Slips, cover notes, and other documents

List any requirements covering documentation. Include how they are stored and retrieved in the event of a disaster.

Restrictions List any restrictions that may be applicable when the arrangement is in place and being used. >


7. Termination Procedure

Of hosting service

This will normally occur when the client has restored adequate facilities in their own environment. List the reasons for termination and also the roles and responsibilities. Include any necessary clean up.

Of the agreement >

8. Responsibilities

Responsibilities for the plan rest with the following:
Client 1:
Client 2:

The Directors concerned are:
Client 1:
Client 2:

9. Testing the Plan >


10. APPENDIX A

AGREEMENT TO DISASTER RECOVERY PLAN BETWEEN CLIENT 1 AND CLIENT 2

Client 1
Name: ______________________________
Signed: ______________________________
Title: ______________________________
Dated: ______________________________

Client 2
Name: ______________________________
Signed: ______________________________
Title: ______________________________
Dated: ______________________________


11. APPENDIX B

Disaster Recovery Plan SERVICE CONTACTS

Client 1
Name    Title    Phone Number    Locations / Dept

Client 2
Name    Title    Phone Number    Locations / Dept


12. APPENDIX C Disaster Recovery Plan

STAFF TO BE RESIDENT

Client 1
Name    Title    Phone Number    Locations / Dept

Client 2
Name    Title    Phone Number    Locations / Dept


13. APPENDIX D

Disaster Recovery Plan

STAFF NEEDING TO VISIT OTHER SITE

Client 1
Name    Title    Phone Number    Locations / Dept

Client 2
Name    Title    Phone Number    Locations / Dept


14. APPENDIX E

Disaster Recovery Plan

Allocation of resources at Client 2

Item    Description    Comments
Desks
Phones
Fax
Laptops
PCs
Servers
Printers
LAN
WAN
Applications
Licenses
Logons


15. APPENDIX G

Client 2 Limited - Items stored Off-Site At CLIENT 1 under supervision of:

The list below is provided as an example. Specify the items to be stored and the quantity of each according to your individual circumstances.

Item    Qty    Location
Stationery
PCs
Laptops
Backup Devices
Desks
Chairs
etc.


3.9 Business Continuity Strategy

IT Services Business Continuity Strategy
Process: IT Service Asset & Continuity Management

Status:
Version: 0.1
Release Date:


Document Control

Author
Prepared by

Document Source
This document is located on the LAN under the path: I:/IT Services/Service Delivery/Functional Specifications/

Document Approval
This document has been approved for use by the following:
♦ , IT Services Manager
♦ , IT Service Delivery Manager
♦ , National IT Help Desk Manager

Amendment History
Issue    Date    Amendments    Completed By

Distribution List
When this procedure is updated, the following copyholders must be advised by e-mail that an updated copy is available on the intranet site:

Business Unit    Stakeholders
IT


Introduction

Purpose
The purpose of this document is to provide relevant Business Units with the Business Continuity Strategies for the range of services provided by IT Services to the community.

Scope
This document describes the following:
• Summary of each service provided by IT Services, including:
  o Summary of the Continuity Strategy for each applicable service
  o Detailed list of Continuity Strategy for each applicable service
Note: It is assumed for each service described in this document that the supporting back-end technology is already in place and operational.

Audience
This document is relevant to all staff in

Ownership
IT Services has ownership of this document.

Related Documentation
Include in this section any related Service Level Agreement reference numbers and other associated documentation:
• IT Service Asset & Continuity Management Policies, Guidelines and Scope Document
• Business Impact Analysis Template
• Risk Assessment
• Reciprocal Arrangements
• Relevant SLA and procedural documents
• IT Services Catalogue
• Relevant Technical Specification documentation
• Relevant User Guides and Procedures


Executive Overview

Describe the purpose, scope and organization of the Continuity Management Strategy document.

Scope

As not all IT Services may initially be included within the Continuity Management Strategy document, it is important to set the scope for what will be included. Scope for the Business Continuity Strategy may be determined by the business, and may therefore cover only a select few of the IT Services provided by the IT department that are seen as critical to the support of the business processes.


IT Service Continuity Strategy Summary

This section provides a summary of all the IT Services covered within the Business Continuity Strategy. For each IT Service it lists the Recovery Options, the Owner of the IT Service, the Affected Business Processes, the Threat to Business Operations, the Service Level Agreements, and the associated procedures.

IT Service

Owner

Service A

J. Ned

Email

A. Boon C. Jones L. Smith R. Smith

SAP Service B Service C

Work Around Yes

Recovery Options Gradual Intermediate

Yes

Backup Tapes Backup CD Backup Tapes Rebuild

Yes

Rebuild

No No

Reciprocal Arrangement Reciprocal Arrangement No

Immediate

Business Process

Threat to Business

Business Owners

No

Billing

High

T. Smith

Communication Invoice and Payroll Marketing

Low

R. Jones

Very High

P. Boon

No

Replicated Server Replicated Service No

Medium

No

No

Manufacturing

R. Reagan R. Smith

High

Service Level Agreements
SLA #    Response Time    Recovery Time
SLA 001    4 Hours    8 Hours
SLA 234    2 Hours    4 Hours
SLA 123    30 mins    2 Hours
SLA 009    1 Hour    3 Hours
SLA 007    30 mins    2 Hours


Applicable Procedures IncMgt101 ComRec23 1 N/A N/A N/A
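The response and recovery times in the summary table lend themselves to an automated check after an incident or a test. The sketch below is only an illustration: the targets are copied from the example SLA rows above, while the function name `sla_breaches` and the idea of measuring elapsed time from the start of the disruption are assumptions.

```python
from datetime import timedelta

# (response target, recovery target) per SLA number, copied from the example table.
SLA_TARGETS = {
    "SLA 001": (timedelta(hours=4), timedelta(hours=8)),
    "SLA 234": (timedelta(hours=2), timedelta(hours=4)),
    "SLA 123": (timedelta(minutes=30), timedelta(hours=2)),
    "SLA 009": (timedelta(hours=1), timedelta(hours=3)),
    "SLA 007": (timedelta(minutes=30), timedelta(hours=2)),
}


def sla_breaches(sla: str, responded_after: timedelta,
                 recovered_after: timedelta) -> list[str]:
    """Return which targets ("response", "recovery") were missed, if any."""
    response_target, recovery_target = SLA_TARGETS[sla]
    breaches = []
    if responded_after > response_target:
        breaches.append("response")
    if recovered_after > recovery_target:
        breaches.append("recovery")
    return breaches
```

For example, a service under SLA 123 that was responded to after 20 minutes but recovered after 3 hours would be reported as missing only its recovery target.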


Service A

Please note: some of the sub-headings here may not be applicable to the IT Service. For example, in some instances where you have an Immediate Recovery Option for a Service, it may not be worthwhile to have spent money on Gradual Recovery Options, and vice versa.

Description

Provide a description of the IT Service. Include all relevant SLA and ownership details.

Service

Owner

Service Level Agreements (SLA #, Response Times, Recovery Times)    Procedures    Business Process

Business Impact

Risk Summary

In this section provide a brief description of any known major risks to the IT Service. Risks are determined by understanding the assets that are involved in the service, the threats to those assets and any identified vulnerabilities. The risk summary table below provides a summary of risks to the IT Service.

Assets/Service Threats

Vulnerabilities

Risk Level

Definitions: A threat is how likely it is that a particular service will be disrupted. Vulnerability assesses what the impact will be upon the organization if the threat manifests. The risk level is then the combination of the threat and the vulnerability. It can be reached through a quantitative analysis or simply through a subjective assessment.
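As one illustration of combining the two ratings into a risk level, the sketch below multiplies a three-point threat rating by a three-point vulnerability rating and maps the product back to a level. The three-point scale, the thresholds and the function name `risk_level` are assumptions made for this example; an organization would substitute its own scale.

```python
LEVELS = ["Low", "Medium", "High"]  # assumed three-point rating scale


def risk_level(threat: str, vulnerability: str) -> str:
    """Combine a threat rating and a vulnerability rating into a risk level."""
    # Convert each rating to 1..3 and multiply, giving a score from 1 to 9.
    score = (LEVELS.index(threat) + 1) * (LEVELS.index(vulnerability) + 1)
    if score >= 6:   # Medium x High or High x High
        return "High"
    if score >= 3:   # Low x High or Medium x Medium
        return "Medium"
    return "Low"     # Low x Low or Low x Medium
```

A lookup of this kind keeps the ratings in the risk summary table consistent from one service to the next, even when the individual threat and vulnerability ratings are themselves subjective.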

After completing the above table, summarize the overall risk to the service and the impact on the business.

Manual Work Around

As it is not always possible to provide an immediate IT solution to every disaster, it is imperative to capture any manual work around options that may be available.


A manual work around can be seen as an effective interim measure until the IT Service has been restored. The manual workarounds will be for both IT departments and the Business. List all Manual Work Around options in this section. IT Procedure

Owner

Business Procedure

Business Owner

Reciprocal Arrangements

In some situations, organizations will rely on like-minded businesses to provide services in the event that they experience some sort of loss of service. This is called a Reciprocal Arrangement. Reciprocal Arrangements may be made with several organizations for the one service.

Reciprocal Arrangement - Contract or Agreement

Agreed Services (Underpinning Contract or Service Level Agreement)    Response Time    Recovery Time    Durations

Business Contact

IT Service Contact

Business Contact

IT Service Contact

Reciprocal Arrangement - Contract or Agreement

Agreed Services (Underpinning Contract or Service Level Agreement)    Response Time    Recovery Time    Durations


Gradual Recovery

This recovery option is used for services where immediate restoration of business processes is not needed and the business can function without the service for up to 24 to 72 hours, as defined in a service agreement.

Agreements
Agreements will include those with the business for Gradual Recovery. They will also include any additional accommodation and services plans.

Technology
This section will provide details about the computer systems and network plans, as well as any telecommunications plans.

Security
In the event of a disaster, there may be some impact on the security of the IT Department and the business as a whole. In this section include any security issues, appropriate security plans, and security to be tested and revisited after recovery.

Finance
Include in this section any required finance for the recovery options. This information will be used in the budgeting process for subsequent years.

Personnel
List all responsible personnel for the recovery of this service.

Summary
Provide a summary for the Gradual Recovery of Service >


Intermediate Recovery

This recovery option is used for services that are important enough to the business that a 4 to 24 hour restoration period is required.

Agreements
Agreements will include those with the business for Intermediate Recovery. They will also include any additional accommodation and services plans.

Technology
This section will provide details about the computer systems and network plans, as well as any telecommunications plans.

Security
In the event of a disaster, there may be some impact on the security of the IT Department and the business as a whole. In this section include any security issues, appropriate security plans, and security to be tested and revisited after recovery.

Finance
Include in this section any required finance for the recovery options. This information will be used in the budgeting process for subsequent years.

Personnel
List all responsible personnel for the recovery of this service.

Summary
Provide a summary for the Intermediate Recovery of Service >


Immediate Recovery

This recovery option is used for services where immediate restoration of business processes is needed and the business will suffer severe consequences if restoration is not achieved within 2 to 4 hours.

Agreements
Agreements will include those with the business for Immediate Recovery. They will also include any additional accommodation and services plans.

Technology
This section will provide details about the computer systems and network plans, as well as any telecommunications plans.

Security
In the event of a disaster, there may be some impact on the security of the IT Department and the business as a whole. In this section include any security issues, appropriate security plans, and security to be tested and revisited after recovery.

Finance
Include in this section any required finance for the recovery options. This information will be used in the budgeting process for subsequent years.

Personnel
List all responsible personnel for the recovery of this service.

Summary
Provide a summary for the Immediate Recovery of Service >
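The three recovery options are distinguished by how quickly the business needs the service restored: up to 4 hours for Immediate, 4 to 24 hours for Intermediate, and 24 to 72 hours for Gradual. A minimal sketch of that classification follows. The function name `recovery_option` is hypothetical, and because the bands in the text meet at 4 and 24 hours, the sketch assumes a boundary value falls into the faster option and that anything beyond 24 hours is treated as Gradual.

```python
def recovery_option(required_restoration_hours: float) -> str:
    """Pick a recovery option from the restoration time the business requires."""
    if required_restoration_hours <= 0:
        raise ValueError("restoration time must be positive")
    if required_restoration_hours <= 4:
        return "Immediate"      # severe consequences if not restored in 2 to 4 hours
    if required_restoration_hours <= 24:
        return "Intermediate"   # 4 to 24 hour restoration period
    return "Gradual"            # business can function for 24 to 72 hours


for hours in (3, 12, 48):
    print(hours, recovery_option(hours))
```

In practice the required restoration time would come from the service agreement for each IT Service rather than being chosen by the IT department.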


Appendices Include any applicable appendixes that are needed. E.g. Logical Schematic of the IT environment. Contact details


Terminology

Make sure that all terminology is captured and documented correctly. E.g.

CMDB    Configuration Management Data Base
ITSCM    Information Technology Services Continuity Management
SLA    Service Level Agreement
UC    Underpinning Contract


3.10 Management of Risk (MOR) Framework A standard methodology, such as the Management of Risk (M_o_R), should be used to assess and manage risks within an organization. The M_o_R framework is illustrated below in Figure 1.

[Figure 1: The M_o_R framework. Elements shown: Management of Risk Principles; the M_o_R Approach documents (Risk Management Policy, Risk Management Process Guide, Risk Management Plan, Risk Register, Issue Log); the process steps Identify, Assess, Plan and Implement, together with Communicate; and Embed and review.]


The M_o_R approach is based around the above framework, which consists of the following:

M_o_R principles: these principles are essential for the development of good risk management practice and are derived from corporate governance principles.

M_o_R approach: an organization’s approach to these principles needs to be agreed and defined within the following living documents:
• Risk Management Policy
• Process Guide
• Plans
• Risk Registers
• Issue Logs

M_o_R processes: the following four main steps describe the inputs, outputs and activities that ensure that risks are controlled:
• Identify: identify the threats and opportunities within an activity that could impact the ability to reach its objective.
• Assess: understand the net effect of the identified threats and opportunities associated with an activity when aggregated together.
• Plan: prepare a specific management response that will reduce the threats and maximize the opportunities.
• Implement: carry out the planned risk management actions, monitor their effectiveness and take corrective action where responses do not match expectations.

Embedding and reviewing M_o_R: having put the principles, approach and processes in place, they need to be continually reviewed and improved to ensure they remain effective.

Communication: having the appropriate communication activities in place to ensure that everyone is kept up to date with changes in threats, opportunities and any other aspects of risk management.


The M_o_R method requires the evaluation of risks and the development of a risk profile, see example shown in Figure 2.

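[Figure 2: Example risk profile. Vertical axis: Severity / Impact (least severe to most severe). Horizontal axis: Likelihood of occurrence (least likely to most likely). Risks plotted: Fire/explosion, Chemical leak, Loss of PBX/ACD, Server failure, Storm damage, Major network failure, Theft, Power failure, Corrupt database, Coffee spill on PC. A boundary marks the region of acceptable risk.]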

Figure 2 shows an example risk profile, containing many risks that are outside the defined level of ‘acceptable risk’. Following the Risk Analysis, it is possible to determine appropriate risk responses or risk reduction measures (ITSCM mechanisms) to manage the risks, i.e. reduce the risk to an acceptable level or mitigate the risk. Wherever possible, appropriate risk responses should be implemented to reduce the impact or the likelihood, or both, of these risks manifesting themselves.
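A risk profile of this kind can be approximated numerically by scoring each risk for likelihood and severity and comparing the product with an acceptance threshold. The sketch below is illustrative only: the 1 to 5 scales, the threshold of 6 and the scores given to each risk are hypothetical and are not read from Figure 2.

```python
from dataclasses import dataclass, replace

ACCEPTABLE_SCORE = 6  # hypothetical boundary of 'acceptable risk'


@dataclass(frozen=True)
class Risk:
    name: str
    likelihood: int  # 1 (least likely) to 5 (most likely), assumed scale
    severity: int    # 1 (least severe) to 5 (most severe), assumed scale

    @property
    def score(self) -> int:
        return self.likelihood * self.severity


def outside_acceptable(risks: list[Risk]) -> list[Risk]:
    """Return the risks that still need a risk response."""
    return [r for r in risks if r.score > ACCEPTABLE_SCORE]


# Illustrative scores only.
profile = [
    Risk("Power failure", likelihood=4, severity=3),
    Risk("Coffee spill on PC", likelihood=5, severity=1),
    Risk("Fire/explosion", likelihood=1, severity=5),
]

needs_response = outside_acceptable(profile)                 # Power failure only
# A risk reduction measure that lowers the impact moves the risk inside the boundary.
mitigated = replace(profile[0], severity=1)
still_outside = outside_acceptable([mitigated])              # empty list
```

Reducing either factor lowers the score, which mirrors the point that a risk response may act on the impact, the likelihood, or both.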


3.11 Risk Assessment Questionnaire

IT Services Risk Assessment Questionnaire Process: IT Service Asset & Continuity Management

Status:

In draft Under Review Sent for Approval Approved Rejected

Version:

Release Date:


Instructions for Completing the Risk Assessment Questionnaire Please answer the following information security program questions as of the examination date pre-determined by the ACME. The majority of the questions require only a “Yes” or “No” response; however, you are encouraged to expand or clarify any response as needed directly below each question, or at the end of this document under the heading “Clarifying or Additional Comments”. For any question deemed non-applicable to your institution or if the answer is “None”, please respond accordingly (“NA” or “None”). Please do not leave responses blank. At the bottom of this document is a signature block, which must be signed by an executive officer attesting to the accuracy and completeness of all provided information. I hereby certify that the following statements are true and correct to the best of my knowledge and belief. Officer’s Name and Title Institution’s Name and Location

Officer’s Signature

Date Signed

As of Date

This is an official document. Any false information contained in it may be grounds for prosecution and may be punishable by fine or imprisonment.


PART 1 – RISK ASSESSMENT

An IT risk assessment is a multi-step process of identifying and quantifying threats to information assets in an effort to determine cost effective risk management solutions. To help us assess your risk management practices and the actions taken as a result of your risk assessment, please answer the following questions:

a. Name and title of individual(s) responsible for managing the IT risk assessment process:
b. Names and titles of individuals, committees, departments or others participating in the risk assessment process. If third-party assistance was utilized during this process, please provide the name and address of the firm providing the assistance and a brief description of the services provided:
c. Completion date of your most recent risk assessment:
d. Is your risk assessment process governed by a formal framework/policy (Y/N)?
e. Does the scope of your risk assessment include an analysis of internal and external threats to confidential customer and consumer information as described in …... of the ACME’s Rules and Regulations (Y/N)?
f. Do you have procedures for maintaining asset inventories (Y/N)?
g. Do risk assessment findings clearly identify the assets requiring risk reduction strategies (Y/N)?
h. Do written information security policies and procedures reflect risk reduction strategies identified in “g” above (Y/N)?
i. Is your risk assessment program formally approved by the Board of Directors at least annually (Y/N)? If yes, please provide the date that the risk assessment program was last approved by the Board of Directors:
j. Are risk assessment findings presented to the Board of Directors for review and acceptance (Y/N)? If yes, please provide the date that the risk assessment findings were last approved by the Board of Directors:


PART 2 – OPERATIONS SECURITY AND RISK MANAGEMENT

To help us assess how you manage risk through your information security program, please answer the following questions for your environment. If any of the following questions are not applicable to your environment, simply answer “N/A.”

a. Please provide the name and title of your formally designated IT Security Officer:
b. Please provide the name and title of personnel in charge of operations:
c. Do you maintain topologies, diagrams, or schematics depicting your physical and logical operating environment(s) (Y/N)?
d. Does your information security program contain written policies, procedures, and guidelines for securing, maintaining, and monitoring the following systems or platforms:
   1. Core banking system (Y/N)?
   2. Imaging (Y/N)?
   3. Fed Line and/or wire transfer (Y/N)?
   4. Local area networking (Y/N)?
   5. Wide-area networking (Y/N)?
   6. Wireless networking – LAN or WAN (Y/N)?
   7. Virtual private networking (Y/N)?
   8. Voice over IP telephony (Y/N)?
   9. Instant messaging (Y/N)?
   10. Portable devices such as PDAs, laptops, cell phones, etc. (Y/N)?
   11. Routers (Y/N)?
   12. Modems or modem pools (Y/N)?
   13. Security devices such as firewall(s) and proxy devices (Y/N)?
   14. Other remote access connectivity such as GoToMyPC, PcAnyWhere, etc. (Y/N)?
   15. Other – please list:
e. Do you have formal logging/monitoring requirements for 1-15 above (Y/N)?
f. Do you have formal configuration, change management, and patch management procedures for all applicable platforms identified in “d.” above (Y/N)?
g. Do you have an antivirus management program to protect systems from malicious content (Y/N)?
h. Do you have an anti-spyware management program to protect end-user systems (Y/N)?


i. Do you have a formal intrusion detection program, other than basic logging, for monitoring host and/or network activity (Y/N)?
j. Has vulnerability testing been performed on internal systems (Y/N)? If yes, please provide date performed and by whom:
k. Has penetration testing of your public or Internet-facing connection(s) been performed (Y/N)? If yes, please provide date performed and by whom:
l. Do you have an incident response plan defining responsibilities and duties for containing damage and minimizing risks to the institution (Y/N)? If yes, does the plan include customer notification procedures (Y/N)?
m. Do you have a physical security program defining and restricting access to information assets (Y/N)?
n. Do you have a vendor management program (Y/N)?
o. Are all of your service providers located within the United States (Y/N)?
p. Do you have an employee acceptable use policy (Y/N)? If yes, please provide how often employees must attest to the policy contents:
q. Do you have an employee security awareness training program (Y/N)? If yes, please indicate the last date training was provided:
r. Are you planning to deploy new technology within the next 12 months (Y/N)? If you answered “Yes”, were the risks associated with this new technology reviewed during your most recent risk assessment (Y/N)?
s. Have you deployed new technology since the last ACME examination that was not included in your last risk assessment (Y/N)?
t. Is security incorporated into your overall strategic planning process (Y/N)?
u. Do you have policies/procedures for the proper disposal of information assets (Y/N)?


PART 3 – AUDIT/INDEPENDENT REVIEW PROGRAM

To help us assess how you monitor operations and compliance with your written information security program, please answer the following questions:

a. Please provide the name and title of your IT auditor or employee performing internal IT audit functions. Include who this person reports to, and a brief description of their education and experience conducting IT audits.
b. Do you have a written IT audit/independent review program (Y/N)?
c. Please provide the following information regarding your most recent IT audit/independent review:
   1. Audit Date:
   2. Firm name (if external):
   3. Was an audit report produced (Y/N)?
   4. Date audit report was reviewed and approved by the Board:
   5. Audit scope and objectives:
d. Does audit coverage include a comparison of actual system configurations to documented/baseline configuration standards (Y/N)?
e. Does audit coverage include assessing compliance with the
f. Does audit coverage include assessing users and system services access rights (Y/N)?
g. Is audit involved in your risk assessment process (Y/N)?
h. Briefly describe any security incidents (internal or external) affecting the bank or bank customers occurring since the last ACME IT examination.
i. Briefly describe any known conflicts or concentrations of duties.


PART 4 - DISASTER RECOVERY AND BUSINESS CONTINUITY

To help us assess your preparedness for responding to and recovering from an unexpected event, please answer the following:

a. Do you have an organization-wide disaster recovery and business continuity program (Y/N)? If yes, please provide the name of your coordinator:
b. Are disaster recovery and business continuity plans based upon a business impact analysis (Y/N)? If yes, do the plans identify recovery and processing priorities (Y/N)?
c. Is disaster recovery and business continuity included in your risk assessment (Y/N)?
d. Do you have formal agreements for an alternate processing site and equipment should the need arise to relocate operations (Y/N)?
e. Do business continuity plans address procedures and priorities for returning to permanent and normal operations (Y/N)?
f. Do you maintain offsite backups of critical information (Y/N)? If “Yes,” is the process formally documented and audited (Y/N)?
g. Do you have procedures for testing backup media at an offsite location (Y/N)?
h. Have disaster recovery/business continuity plans been tested (Y/N)? If “Yes”, please identify the system(s) tested, the corresponding test date, and the date reported to the Board:


Any Clarifying or Additional Comments


3.12 Typical Contents of a Recovery Plan

DOCUMENT CONTROL

This document must be maintained to ensure that the systems, infrastructure and facilities included appropriately support business recovery requirements.

Document distribution
Copy    Issued to    Date    Position
1.
2.
3.
4.

Document Revision
This document will be reviewed every X months.
Current Revision: dd/mm/yyyy
Next Revision: dd/mm/yyyy

Revision Date    Version No    Summary of Changes

Document Approval
This document must be approved by the following personnel:

Name    Title    Signature


SUPPORTING INFORMATION

Introduction
This document details the instructions and procedures that must be followed to recover or continue the operation of systems, infrastructure, services or facilities, so as to maintain Service Continuity at the level defined or agreed with the business.

Recovery Strategy
The systems, infrastructure, services or facilities will be recovered to alternative systems, infrastructure, services or facilities. It will take approximately X hours to recover the systems, infrastructure, services or facilities. The system will be recovered to the last known point of stability/data integrity, which is point in day/timing.

Invocation
The following personnel are authorized to invoke this plan:
1.
2.

Interfaces and dependencies on other plans
Details of the inter-relationships and references with all other continuity and recovery plans and how the interfaces are activated.

General Guidance
All requests for information from the media or other sources should be referred to the Company procedure. When notifying personnel of a potential or actual disaster, follow the defined operational escalation procedures, and in particular:
• Be calm and avoid lengthy conversation
• Advise them of the need to refer information requests to the escalation point
• Advise them of expectations and actions (avoid giving them details of the Incident unless absolutely necessary)
• If the call is answered by somebody else
  o Ask if the contact is available elsewhere
  o If they cannot be contacted, leave a message to contact you on a given number
  o Do not provide details of the incident
  o Always document call time details, responses and actions

All activities and contact/escalation should be clearly and accurately recorded. To facilitate this, actions should be in a checklist format and there should be space to record the date and time the activity was started and completed, and who carried out the activity.

Dependencies
System, infrastructure, service, facility or interface dependencies should be documented (in priority order) so that related recovery plans or procedures that will need to be invoked in conjunction with this recovery plan can be identified and actioned. The person responsible for invocation should ensure recovery activities are coordinated with these other plans.

System

Document Reference

Contact

Contact Lists Lists of all contact names, organizations and contact details and mechanisms: Name

Organization / Role

Title

Contact Details

Recovery Team The following staff/functions are responsible for actioning these procedures or ensuring the procedures are actioned and recording any issues or problems encountered. Contact will be made via the normal escalation procedures. Name

Title

Contact Details


Recovery Team Checklist

To facilitate the execution of key activities in a timely manner, a checklist similar to the following should be used.

Task    Target Completion    Actual Completion
Confirm invocation
Initiate call tree and escalation procedures
Instigate and interface with any other recovery plans necessary (e.g. BCP, Crisis Management, Emergency Response Plan)
Arrange for backup media and documentation to be shipped to recovery site(s)
Establish recovery teams
Initiate recovery actions
Confirm progress reporting
Inform recovery team of reporting requirements
Confirm liaison requirements with all recovery teams
Advise customers and management of estimated recovery completion
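Where the checklist is kept electronically, each row can be held as a small record with the task, its target completion, the start and completion times, and who carried it out. The following is a minimal sketch under that assumption; the class name `ChecklistItem`, its fields and the helper `outstanding` are hypothetical and not prescribed by this template.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class ChecklistItem:
    task: str
    target_completion: datetime
    started: Optional[datetime] = None
    actual_completion: Optional[datetime] = None
    carried_out_by: Optional[str] = None

    def start(self, who: str, when: datetime) -> None:
        """Record who picked the task up and when."""
        self.carried_out_by = who
        self.started = when

    def complete(self, when: datetime) -> None:
        """Record the actual completion time of a started task."""
        if self.started is None:
            raise ValueError(f"'{self.task}' was never started")
        self.actual_completion = when

    def is_late(self, now: datetime) -> bool:
        """Late if finished after the target, or still open past the target."""
        finished = self.actual_completion or now
        return finished > self.target_completion


def outstanding(items: list[ChecklistItem]) -> list[str]:
    """Tasks that have no actual completion recorded yet."""
    return [item.task for item in items if item.actual_completion is None]
```

Recording start and completion times as they happen gives the recovery team an audit trail and makes it straightforward to report progress against the target completion times.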

Recovery procedure

Enter recovery instructions/procedures or references to all recovery procedures here. Content and format should be in line with company standards for procedures. If there are none, guidance should be issued by the Manager or Team Leader for the area responsible for the system, infrastructure, services or facility. The only guideline is that the instructions should be capable of being executed by an experienced professional without undue reliance on local knowledge.

Where necessary, references should be made to supporting documentation (and its location), diagrams and other information sources. This should include the document reference number (if it exists). It is the responsibility of the plan author to ensure that this information is maintained with this plan. If there is only a limited amount of supporting information, it may be easier to include it within the plan, provided the plan remains easy to read and follow and does not become too cumbersome.


3.13 Communication Plan

IT Services Communication Plan Process: IT Service Asset & Continuity Management

Status:

In draft Under Review Sent for Approval Approved Rejected

Version:

Release Date:


Communication Plan for IT Service Asset & Continuity Management

This document is not to be considered an exhaustive statement, as its topics have to be generic enough to suit any reader in any organization. However, the reader will be reminded of the key topics that have to be considered.

This document serves as a GUIDE FOR COMMUNICATIONS REQUIRED for the IT Service Asset & Continuity Management process. It provides a basis for completion within your own organization and contains suggestions regarding information to share with others.

The document is deliberately concise and broken into communication modules. This allows the reader to pick and choose information for e-mails, flyers, etc. from one or more modules if and when appropriate.

This document was:
Prepared by:          On:
And accepted by:      On:


Initial Communication

Sell the Benefits
The first step in communication is to answer the question that most people (quite rightly) ask when the IT department suggests a new system or a new way of working: WHY? It is here that we need to promote and sell the benefits. However, be cautious of using generic words.

Generic Benefit statements | Specific Organizational example
Improved Customer Service | This is important because…
Reduction in the number of Incidents | In recent times our incidents within IT have…
Provides quicker resolution of Incidents | Apart from the obvious benefits, the IT department in recent times has…
Improved Organizational learning | A recent example of … saw the individual and others in the company start to…

The above Communication module (or elements of) was/were distributed:
To:      On:
By:      On:


IT Service Asset & Continuity Management Goal

The Goal of IT Service Asset & Continuity Management can be promoted in the following manner.

Official Goal Statement: To recover IT Services, in the event of a disaster, within agreed business time scales so as to support the overall business continuity of the organization.

• High visibility and wide channels of communication are essential in this process. Gather specific requirements from nominated personnel. (Special Tip: Beware of gathering information only from Managers, as the resistance factor will be high.)
• Oversee the monitoring of the process to ensure that the business needs of IT are not impacted, while taking into account that changes are required to ensure continued high levels of IT Service Delivery and Support.
• Provide relevant reports to nominated personnel. (Special Tip: Beware of reporting only to Managers. If you speak to a lot of people regarding Service Support and Delivery, then you need to establish ways to report the outcomes and progress of the discussions to these people.)

Always bear in mind the “so what” factor when discussing areas like goals and objectives. If you cannot honestly and sensibly answer the question “so what”, then you are not selling the message in a way that is personal to the listener and gets their “buy-in”.

The above IT Service Asset & Continuity Management Goals module was distributed:
To:      On:
By:      On:


IT Service Asset & Continuity Management Activities

Intrusive & Hidden Activities
The list of actions in this module will have a direct impact on end users and IT Staff. They will be curious as to why IT is working with them in this manner, rather than the historical method of just “doing it”. There could be an element of suspicion and resistance, so consider different strategies to overcome this initial skepticism.

Initiation
• Interview and record the needs from the Business
• Promote and advertise ITSCM
• Show Business Benefits

Business Impact Analysis
• Show impact on business due to loss of service
• High Impact over a short time
• Low Impact over a long time

Risk Assessment
• Look at the threats and vulnerabilities
• List assets that may be a target and their importance to the business
• Communicate Countermeasures

Business Continuity Strategy
• Communicate the different recovery options
• Gradual
• Intermediate
• Immediate

Implementation
• Reciprocal Arrangements
• Recovery Plans
• Countermeasures

Operational Management
• Initial Testing
• Annual Testing
• Change Management

Information regarding activities was distributed:
To:      On:
By:      On:


IT Service Asset & Continuity Management Planning

Costs
Information relating to costs may be a topic that would be held back from general communication. Failure to convince people of the benefits will mean total rejection of the associated costs. If required, costs fall under several categories:

• Personnel – IT Service Asset & Continuity Management staff, technical management team (set-up and ongoing running of the technical infrastructure)
• Accommodation – Physical location (Set-up and ongoing)
• Software – Tools (Set-up and ongoing)
• Hardware – Infrastructure (Set-up)
• Education – Training (Set-up and ongoing)
• Procedures – external consultants etc. (Set-up)

The costs of implementing IT Service Asset & Continuity Management will be outweighed by the benefits. For example, many organizations have a negative perception of the IT Service Asset & Continuity Management process as it doesn’t seem to offer any visible services. To alleviate this, customers and end-users need to be constantly informed of the service being provided. This provides good customer service and adds a level of comfort to the users in the sense that they can “see” action taking place. A well run IT Service Asset & Continuity Management process will make major inroads into altering the perception of the IT Organization.

Details regarding the cost of IT Service Asset & Continuity Management were distributed:
To:      On:
By:      On:


3.14 Example E-mail Text

IT Services E-Mail Text Process: IT Service Asset & Continuity Management

Status: In draft / Under Review / Sent for Approval / Approved / Rejected

Version:

Release Date:

Note: SEARCH AND REPLACE
Search for any > as your input will be required.
Also review any yellow highlighted text.


Introduction
The next section of this document contains an example email text that can be distributed across your organization. Note that this is just one piece of text for one email. It is advisable to create a few different versions of the text below, which you can store in this document for future use. This is important because each email you send regarding your IT Service Asset & Continuity Management process should be different and targeted to the correct audience. This document also provides a method for keeping track of the communication you have made to the rest of the organization, and for keeping in focus the promises that have been made regarding this process.


Dear >

IT Service Asset & Continuity Management Program

The IT Department is embarking on a programme to ensure that, in the event of an unplanned and major service outage, we are able to respond and restore IT services.

What does this mean to you?
The IT Department continually strives to improve the service it delivers to its customers. The IT Services department provides internal support for . In order to improve the IT Services and ensure that they are aligned with the needs of the organization, we have decided to embark on a service improvement programme. This programme will result in the implementation of a process called IT Service Asset & Continuity Management.

Why the need for IT Service Asset & Continuity Management?
Organizations are required to operate and provide a service at all times. It’s that simple. Increasing competition and a growth in the requirement by consumers for instant or near real time response has fuelled the necessity for an agreed level of IT services to be provided following an interruption to the business. Such “interruptions” can range from the loss of a single application or a complex system failure all the way through to the loss of a building (e.g. through fire, flood, etc.).

We have defined the Goal for IT Service Asset & Continuity Management as follows:
The goal for IT Service Continuity is to support the Business Continuity Management process (following pre-defined losses in organizational ability), through the delivery of IT services, within agreed times and costs.

>

What is your involvement?
The IT Department will be creating a list of IT Services that it delivers. This will be captured in a Service Catalogue (SC). The list of services will then be presented to the different departments within . From this list, each department will be able to pick the services that it uses and, through our requirements gathering, make comments about the requirements for each service during times of major outage or loss.


From this, we will then be able to formulate agreements on the services being provided. These agreements are called Service Level Agreements and they include the requirements for continuity. This will help ensure that the IT Department is aligning its services with the business needs, provide a way to measure the services, set expectations of the services being delivered and, more importantly, provide an avenue for discovery in service improvement.

We have appointed an IT Service Continuity Manager to help drive this process. The IT Service Continuity Manager will be the interface between the IT Department and the Department heads within the organization. The IT Service Continuity Manager will work closely with the business in defining the necessary services and agreeing their level of availability.

The following can be considered a list of benefits to be derived from the process:
>

The commencement date of the new process is scheduled for: >
OR
Completion of the process will be: >

This is a detailed process and there may be some operational difficulties to overcome, but with your support, I am sure we can provide an extremely beneficial process to both the Business / and IT.

If you have any questions regarding this, please do not hesitate to contact me on >

>


3.15 Emergency Response Template

IT Services Emergency Response Template Process: IT Service Asset & Continuity Management

Status: In draft / Under Review / Sent for Approval / Approved / Rejected

Version:

Release Date:


Document Control

Author
Prepared by

Document Source
This document is located on the LAN under the path: I:/IT Services/Service Delivery/Emergency Response Plan/

Document Approval
This document has been approved for use by the following:
♦ , IT Services Manager
♦ , IT Service Delivery Manager
♦ , IT Service Continuity Process Manager
♦ , Customer representative or Service Level Manager

Amendment History
Issue | Date | Amendments | Completed By

Distribution List
When this procedure is updated, the following copyholders must be advised through email that an updated copy is available on the intranet site:

Business Unit | Stakeholders
IT |


Introduction

Purpose
The purpose of this document is to provide an emergency response template.

Scope
This document describes the following:
• An emergency response template
Note: It is assumed for each service described in this document that the supporting back-end technology is already in place and operational.

Audience
This document is relevant to all staff in

Ownership
IT Services has ownership of this document.

Related Documentation
Include in this section any related Service Level Agreement reference numbers and other associated documentation:
• IT Service Asset & Continuity Management Policies, Guidelines and Scope Document
• Business Continuity Strategy Template
• Risk Assessment
• Reciprocal Arrangements
• Relevant SLA and procedural documents
• IT Services Catalogue
• Relevant Technical Specification documentation
• Relevant User Guides and Procedures


Executive Overview
Describe the purpose, scope and organization of the document.

Scope
Not all IT Services may initially be included within the Emergency Response Plan. Use this section to outline what will be included and the timetable for other services to be included. Scope for the assessment may be determined by the business, therefore covering only a select few of the IT Services provided by the IT department that are seen as critical to the support of the business processes.

The emergency response plan is fairly simple in concept and should be used in conjunction with the Salvage Plan Template.


IT Service Emergency Response Summary
This section provides a brief summary of the information contained in the next sections of the document. The table below provides an example of information that can be captured to create a summary of the Emergency Response plans for the IT Services listed in this document.

Table column headings (as printed): Service Customer Description Response Recovery IT Contact Procedures Times Options Owner Number

This template can be distributed to the business, as it helps in setting the expectation of the level of service they will receive in the event that a disaster is experienced. There should be no use of technical terms in the above table.


Emergency Response Plan (ERP) – Service A
This section should be repeated for each Service.

Introduction
In this section provide some detail about the ERP. Include things like which aspects of the infrastructure are included, which services, why it is necessary etc.
>

Response Strategy
In this section detail the strategy being used to respond to the IT Disaster. Important things to cover are:
• Service Agreements to Respond
• Process of Response
• Escalation Strategies
• Key Personnel for the Service
• Priorities for restoration

Invocation The following personnel are authorised to invoke this plan: Business Sponsors > > >

>

>

>

IT Sponsors

>>

>>

>>


Dependencies
In this section list dependencies for this IT Service. Dependencies will be other systems, infrastructure, facilities, documentation etc. The table below provides a template for capturing this information:

Dependent Service | Dependent Components | Impact on Service A | Service Level Agreement # | Operational Level Agreement # | Underpinning Contracts | Dependent or contributor

Response Team
The people listed below are responsible for performing the actions listed in the Response Plan. They are to ensure that the procedures are carried out in the most efficient and effective manner possible.

Name | Title | Phone Number | Locations / Dept


Response Plan

Listed Procedures for Response for Service A:
Procedure Name | Description | Owner | Location

Response Plan for Service A:
Step | Action | Responsibility | Target Completion | Actual Completion
1 | Record disasters | Service Desk | |
2 | Provide disaster report | Avail. Mgt, Inc Mgt, Problem Mgt | |
3 | Alert Business | Service Delivery Manager | |
4 | Alert Salvage Team | Network Manager | |
5 | Perform initial investigation | Salvage Team | |
6 | Implement Salvage Procedures | Salvage Team, Service Delivery Manager | |

Equipment Needed:
IT Components (Configuration Items (CI))
CI # | Serial # | CI Name | Type | Sub-Type
SER345 | 15434563 | EMERO | Hardware | Server
RT5700 | 54444443 | CISCO-002 | Hardware | Router
RT4567 | 76547457 | CISCO-001 | Hardware | Router
MS001 | N/A | MS Office | Software | Microsoft


Appendices Include any applicable appendixes that are needed. Terminology Make sure that all terminology is captured and documented correctly.


3.16 Salvage Plan Template

IT Services Salvage Plan Template Process: IT Service Asset & Continuity Management

Status:

Version: 0.1

Release Date:


Document Control

Author
Prepared by

Document Source
This document is located on the LAN under the path: I:/IT Services/Service Delivery/Salvage Plan/

Document Approval
This document has been approved for use by the following:
♦ , IT Services Manager
♦ , IT Service Delivery Manager
♦ , IT Service Continuity Process Manager
♦ , Customer representative or Service Level Manager

Amendment History
Issue | Date | Amendments | Completed By

Distribution List
When this procedure is updated, the following copyholders must be advised through email that an updated copy is available on the intranet site:

Business Unit | Stakeholders
IT |


Introduction

Purpose
The purpose of this document is to provide a salvage plan template.

Scope
This document describes the following:
• A salvage plan template
Note: It is assumed for each service described in this document that the supporting back-end technology is already in place and operational.

Audience
This document is relevant to all staff in

Ownership
IT Services has ownership of this document.

Related Documentation
Include in this section any related Service Level Agreement reference numbers and other associated documentation:
• IT Service Asset & Continuity Management Policies, Guidelines and Scope Document
• Business Continuity Strategy Template
• Risk Assessment
• Reciprocal Arrangements
• Relevant SLA and procedural documents
• IT Services Catalogue
• Relevant Technical Specification documentation
• Relevant User Guides and Procedures


1. Executive Overview
Describe the purpose, scope and organization of the document.

2. Scope
Not all IT Services may initially be included within the Salvage Plan. Use this section to outline what will be included and the timetable for other services to be included. Scope for the assessment may be determined by the business, therefore covering only a select few of the IT Services provided by the IT department that are seen as critical to the support of the business processes.

The salvage plan is fairly simple in concept and should be used in conjunction with the Emergency Response Template.


3. Sample Salvage Plan

DISASTER PLANNING EXAMPLE SALVAGE ASSESSMENT WORKSHEET

IT Service Description: ___________________________
IT Service Owner: ___________________________

Business Process: ___________________________
Business Process Owner: ___________________________

Records Series Title: ____________________________
Note: This is the title for this salvage plan for this service.

Storage of Plan: Hardcopy ( )  Microfilm ( )  Electronic ( )  Other (specify) ________________

Salvage of Service Needed:
Yes ( )
No ( )

If Yes, By What Method:
Commercial Provider ( )
Backup ( )
Rebuild ( )
System Restore ( )

Listed Procedures for Service Salvage:
Procedure Name | Description | Owner | Location


Service Salvage Plan:
Step | Action | Responsibility
1 | Record disasters | Service Desk
2 | Provide disaster report | Avail. Mgt, Inc Mgt, Problem Mgt
3 | Alert Business | Service Delivery Manager
4 | Alert Salvage Team | Network Manager
5 | Perform initial investigation | Salvage Team
6 | Implement Salvage Procedures | Salvage Team, Service Delivery Manager

Equipment Needed:
IT Components (Configuration Items (CI))
CI # | Serial # | CI Name | Type | Sub-Type
SER345 | 15434563 | EMERO | Hardware | Server
RT5700 | 54444443 | CISCO-002 | Hardware | Router
RT4567 | 76547457 | CISCO-001 | Hardware | Router
MS001 | N/A | MS Office | Software | Microsoft


3.17 Vital Records Template

IT Services Vital Records Template Process: IT Service Asset & Continuity Management

Status:

Version: 0.1

Release Date:


Introduction

Purpose
The purpose of this document is to provide a vital records template.

Scope
This document describes the following:
• A Vital Records template
Note: It is assumed for each service described in this document that the supporting back-end technology is already in place and operational.

Audience
This document is relevant to all staff in

Ownership
IT Services has ownership of this document.

Related Documentation
Include in this section any related Service Level Agreement reference numbers and other associated documentation:
• IT Service Asset & Continuity Management Policies, Guidelines and Scope Document
• Business Continuity Strategy Template
• Risk Assessment
• Reciprocal Arrangements
• Relevant SLA and procedural documents
• IT Services Catalogue
• Relevant Technical Specification documentation
• Relevant User Guides and Procedures


Executive Overview
Describe the purpose, scope and organization of the document.

Scope
Each organisation and department should develop a vital records plan. The first part of the plan is a description of records that are vital to continued operation or for the protection of legal and financial rights of the organisation. The plan should also include specific measures for storing and periodically cycling (updating) copies of those records.

The description of vital records is based on identification and inventorying. Organisations may take the following steps to identify and inventory vital records:
• Consult with the official responsible for disaster coordination.
• Review the organisation's statutory and regulatory responsibilities and existing emergency plans for insights into the functions and records that may be included in the vital records inventory.
• Review documentation created for the contingency planning and risk assessment phase of emergency preparedness. The offices performing those functions are obvious focuses of an inventory.
• Review current file plans of offices that are responsible for performing critical functions or may be responsible for preserving rights.
• Review the organisation's records manual or records schedule to determine which records series potentially qualify as vital.

Organisations must exercise caution in designating records as vital and in conducting the vital records inventory. There are suggestions that from 1 to 7 percent of an organisation's records may be vital records. Only those records series or electronic information systems (or portions of them) most critical to emergency operations or the preservation of legal or financial rights should be so designated. Agencies must make difficult and judicious decisions in this regard.

The inventory of vital records should include:
• The name of the office responsible for the records series or electronic information system containing vital information
• The title of each records series or information system containing vital information
• Identification of each series or system that contains emergency-operating vital records or vital records relating to rights
• The medium on which the records are recorded
• The physical location for offsite storage of copies of the records series or system
• The frequency with which the records are to be cycled (updated)
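If the inventory is held electronically, the fields listed above map naturally onto a simple record structure, and the cycling frequency can be checked automatically. The sketch below is illustrative only: the class name `VitalRecord`, the field names, the `last_cycled` date and the helper `due_for_cycling` are assumptions made for this example and are not defined by this workbook.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class VitalRecord:
    """One inventory entry; fields mirror the inventory items listed above."""
    office: str            # office responsible for the series or system
    title: str             # title of the records series or information system
    category: str          # "emergency-operating" or "rights"
    medium: str            # medium on which the records are recorded
    offsite_location: str  # physical location of the offsite copy
    cycle_days: int        # how often the copy is to be cycled (updated), in days
    last_cycled: date      # assumed extra field: date of the last update


def due_for_cycling(records: List[VitalRecord], today: date) -> List[str]:
    """Titles of records whose offsite copy is due (or overdue) for an update."""
    return [
        r.title
        for r in records
        if today - r.last_cycled >= timedelta(days=r.cycle_days)
    ]
```

Running `due_for_cycling` on a regular schedule is one possible way to make sure the agreed cycling frequency is actually observed.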


VITAL RECORDS INVENTORY FORM

Department:
Division:
Sub-Division: _________________________
IT Service: ____________________________
Authorizing Signature: _______________
Date of Signature:

Records Title | Location (Building, Floor, Room) | Retention | Container | Format | Security Copy Location (Building, Floor, Room) | Security Copy Format


Appendices
Include any applicable appendixes that are needed. E.g.
• Mission statement and/or business objectives, which drove this BIA
• Relevant details of people who provided input

Terminology
Make sure that all terminology is captured and documented correctly. E.g.
CMDB: Configuration Management Data Base
ITSCM: Information Technology Services Continuity Management
SLA: Service Level Agreement
UC: Underpinning Contract


3.18 Roles and Responsibilities

IT Service Continuity Manager
The IT Service Continuity Manager is responsible for ensuring that the aims of IT Service Asset & Continuity Management are met. This includes such tasks and responsibilities as:

• Performing Business Impact Analyses for all existing and new services
• Implementing and maintaining the ITSCM process, in accordance with the overall requirements of the organization’s Business Continuity Management process, and representing the IT services function within the Business Continuity Management process
• Ensuring that all ITSCM plans, risks and activities underpin and align with all BCM plans, risks and activities, and are capable of meeting the agreed and documented targets under any circumstances
• Performing risk assessment and risk management to prevent disasters where cost-justifiable and where practical
• Developing and maintaining the organization’s continuity strategy
• Assessing potential service continuity issues and invoking the Service Continuity Plan if necessary
• Managing the Service Continuity Plan while it is in operation, including fail-over to a secondary location and restoration to the primary location
• Performing post mortem reviews of service continuity tests and invocations, and instigating corrective actions where required
• Developing and managing the ITSCM plans to ensure that, at all times, the recovery objectives of the business can be achieved
• Ensuring that all IT service areas are prepared and able to respond to an invocation of the continuity plans
• Maintaining a comprehensive IT testing schedule, including testing all continuity plans in line with business requirements and after every major business change
• Undertaking quality reviews of all procedures and ensuring that these are incorporated into the testing schedule
• Communicating and maintaining awareness of ITSCM objectives within the business areas supported and IT service areas
• Undertaking regular reviews, at least annually, of the Continuity Plans with the business areas to ensure that they accurately reflect the business needs
• Negotiating and managing contracts with providers of third-party recovery services
• Assessing changes for their impact on Service Continuity and Continuity Plans
• Attending CAB meetings when appropriate


3.19 Process Manager

IT Services Process Manager Process: IT Service Asset & Continuity Management

Status: In draft / Under Review / Sent for Approval / Approved / Rejected

Version:

Release Date:


Detailed responsibilities of the IT Service Asset & Continuity Management process owner

Description
The ITSCM Manager:
• Will develop and maintain the IT Service Asset & Continuity Management Process.
• Will develop, maintain and promote IT Service Asset & Continuity Management.
• Will coordinate process reviews utilizing independent parties to provide an objective view on the simplicity of the process and areas for improvement.
• Will be responsible for implementing any design improvements identified.
• Will chair the Recovery meetings that are used to identify and action recovery issues and to verify that all steps were completed and the objective of the process was achieved.
• Will arrange and run all IT Service Asset & Continuity Management reviews with the IT Service Asset & Continuity Management team. Where necessary, the reviews will include other IT Departments as well as key customers.
• Will control and review:
  • Any outstanding process related actions
  • Current targets for availability performance
  • The process mission statement
• Will manage IT Service Delivery during times of a disaster. This includes coordinating the disaster recovery team and liaising with the business.
• Will make available relevant, concise reports that are both timely and readable for Customers and Management.

Notes/Comments
Use the Notes/Comments column in different ways. If you are looking to apply for a process role, then you can check yourself against the list (with ticks, or look to update your resume). If you are looking to appoint a process manager or promote someone from within the organization, you can make notes about their abilities in the particular area.


Detailed skills of the IT Service Asset & Continuity Management process owner

Description
The ITSCM Manager:
• Will display a communication style based around information and escalation.
• Will have practical and quantifiable process management experience.
• Will be a Senior IT Manager.
• Will have a high degree of analytical skill, to be able to assess the impact of disasters on different business areas and people.
• Will have a high degree of analytical skill, to be able to help in the process of restoring service as quickly as possible in the event of a disaster.
• Will have the technical ability to read data from the IT Service Asset & Continuity Management process that will help with the identification of trends and improvements relating to disaster recovery.
• Will be able to run a meeting according to strict guidelines (not getting side-tracked on items that one person may be interested in).
• Must possess skills in influencing and negotiation.
• Must be able to communicate with people at all levels of the organization. This is especially important during a disaster.
• Must be able to demonstrate ways to “do things differently” that will improve the process.
• Must be able to think logically about disaster recovery issues that could affect the organization and design appropriate assessment and diagnosis activities.

Notes/Comments
Use the Notes/Comments column in different ways. If you are looking to apply for a process role, then you can check yourself against the list (with ticks, or look to update your resume). If you are looking to appoint a process manager or promote someone from within the organization, you can make notes about their abilities in the particular area.

This will provide a strong link into the Problem Management process and Service Level Management process.


3.20 Reports, KPIs and other Metrics

IT Services Reports, KPIs and other Metrics Process: IT Service Asset & Continuity Management

Status: In draft / Under Review / Sent for Approval / Approved / Rejected

Version:

Release Date:

Note: SEARCH AND REPLACE
Search for any > as your input will be required.
Also review any yellow highlighted text.


Reports and KPI Targets for IT Service Asset & Continuity Management

This document is not to be considered an exhaustive statement, as its topics have to be generic enough to suit any reader in any organization. However, the reader will be reminded of the key topics that have to be considered.

This document serves as a GUIDE ON SUITABLE KEY PERFORMANCE INDICATORS (KPIs) and REPORTS FOR MANAGEMENT for the IT Service Asset & Continuity Management process. It provides a basis for completion within your own organization and contains suggestions regarding the measures that would be meaningful for this process.

The metrics demonstrated are intended to show the reader the range of metrics that can be used. The message must also be clear that technology metrics must be heavily supplemented with non-technical and business focused metrics/KPIs/measures.

This document was:
Prepared by:          On:
And accepted by:      On:


Key performance indicators (KPIs)

Continuous improvement requires that each process has a plan about “how” and “when” to measure its own performance. While there can be no set guidelines for the timing (the “when”) of these reviews, the “how” question can be answered with metrics and measurements.

With regard to the timing of reviews, factors such as resource availability, cost and “nuisance factor” need to be accounted for. Many initiatives begin with good intentions to do regular reviews, but these fall away very rapidly. This is why the process owner must have the conviction to follow through on assessments, meetings and reviews. If the process manager feels that reviews are too seldom or too often, then the schedule should be changed to reflect that.

Establishing SMART targets is a key part of good process management. SMART is an acronym for:
• Simple
• Measurable
• Achievable
• Realistic
• Time Driven

Metrics help to ensure that the process in question is running effectively.


With regard to IT SERVICE ASSET & CONTINUITY MANAGEMENT the following metrics and associated targets should be considered. For each Key Performance Indicator, record a Target Value and Time Frame/Notes/Who.

Key Performance Indicator:
• Identification of Risks – in scope: risks that the organization has an element of control over.
• Identification of Risks – out of scope: risks that the organization has no element of control over.
• Meetings held (on time) to review performance.
• Costs of Service Continuity process decreasing vs. level of expected service.
• Number of tests carried out as part of the ITSCM process.
• The number of invocations of the ITSCM plan or element of the plan.
• Number of interviews held with business staff to discuss their continuity requirements.
• Number of education or awareness sessions held to brief IT and/or business staff on the ITSCM process.
• Number of changes that resulted in a change to the ITSCM plan.
• Number of changes detected that should have resulted in a change to the ITSCM plan, but didn’t.
• Others

Notes (some examples):
• Increasing dependency on business systems.
• Increased incidents of politically destabilizing activity.
A reducing number here may be a good indication, or at least the number should be stable.
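As a rough illustration of how one row of the KPI table could be recorded and checked over time, the sketch below stores an indicator with a target value and a series of readings, then reports whether the number is reducing, stable or increasing. The class name `Kpi`, its fields, the "lower is better" rule and the sample readings are all hypothetical and are not taken from this workbook.

```python
from dataclasses import dataclass, field


@dataclass
class Kpi:
    """One row of the KPI table: indicator, target value and readings over time."""
    name: str
    target: float
    readings: list[float] = field(default_factory=list)

    def trend(self) -> str:
        """Compare the latest reading with the one before it."""
        if len(self.readings) < 2:
            return "not enough data"
        previous, latest = self.readings[-2], self.readings[-1]
        if latest < previous:
            return "reducing"
        if latest > previous:
            return "increasing"
        return "stable"

    def meets_target(self) -> bool:
        """Assumes lower is better: the latest reading must be at or below the target."""
        return bool(self.readings) and self.readings[-1] <= self.target


# Hypothetical sample data: missed ITSCM plan updates counted per quarter.
missed_updates = Kpi(
    name="Changes that should have resulted in an ITSCM plan change but didn't",
    target=5,
    readings=[8, 6, 6],
)
print(missed_updates.trend(), missed_updates.meets_target())
```

For indicators where a higher number is better (for example, awareness sessions held), the comparison in `meets_target` would need to be reversed.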

Special Tip: Beware of using percentages in too many cases. It may even be better to use absolute values when the potential number of maximum failures is less than 100.
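The tip above can be turned into a simple reporting rule. The following is a minimal sketch, assuming a hypothetical helper `describe_failures` and a cut-off of 100 taken from the tip; the sample figures are made up.

```python
def describe_failures(failed: int, possible: int, threshold: int = 100) -> str:
    """Report an absolute count for small populations, otherwise a percentage."""
    if possible <= 0:
        raise ValueError("possible must be positive")
    if possible < threshold:
        # With fewer than `threshold` possible failures a percentage can mislead.
        return f"{failed} of {possible}"
    return f"{failed / possible:.1%}"


print(describe_failures(2, 12))     # small population: absolute values
print(describe_failures(30, 1200))  # large population: percentage
```

With only 12 recovery tests, "2 of 12" tells management more than "16.7%", which suggests a precision the data does not have.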


Reports for Management

Management reports help identify future trends and allow review of the “health” of the process. Setting a security level on certain reports may be appropriate, as may categorizing the report as Strategic, Operational or Tactical. The acid test for a relevant report is to have a sound answer to the question: “What decisions is this report helping management to make?”

Management reports for IT Service Asset & Continuity Management could include the following. For each Report, record Time Frame/Notes/Who.

• Expected growth in demand for the service (will generally be high at start-up, but then plateau).
• Serious outages and invocation steps of the Continuity plan.
• Backlog details of outstanding process activities (along with the potential negative impact of failing to complete the work in a timely manner), together with solutions for clearing the backlog.
• Simple breakdown of the Continuity process and the relationship between business and IT.
• Description of the “triggers” that begin the Continuity process and how these “triggers” are reviewed.
• Analysis and results of meetings completed.
• The situation regarding process staffing levels and any suggestions regarding redistribution, recruitment and training required.
• Human resource reporting, including hours worked against project/activity (including weekend/after hours work).
• Relevant financial information, to be provided in conjunction with Financial Management for IT Services.
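The acid test described above can be applied mechanically when a catalogue of reports is kept. The sketch below is one hedged way to do that: the class `ManagementReport`, its fields and the example decision text are hypothetical, while the three categories come from the text.

```python
from dataclasses import dataclass

CATEGORIES = {"Strategic", "Operational", "Tactical"}


@dataclass
class ManagementReport:
    title: str
    category: str             # expected to be one of CATEGORIES
    decision_supported: str   # answer to the acid-test question
    restricted: bool = False  # hypothetical flag standing in for a security level

    def passes_acid_test(self) -> bool:
        """Relevant only if the report names a decision it helps management make."""
        return self.category in CATEGORIES and bool(self.decision_supported.strip())


# Example entries; the decision text is illustrative only.
reports = [
    ManagementReport("Serious outages and invocation steps", "Operational",
                     "Whether the Continuity plan needs revising"),
    ManagementReport("Process staffing levels", "Tactical", ""),
]
for report in reports:
    verdict = "keep" if report.passes_acid_test() else "rethink"
    print(f"{report.title}: {verdict}")
```

A report that cannot name the decision it supports is flagged for rethinking rather than silently produced each period.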


3.21 Business and IT Flyers

IT Services
Business and IT Flyers
Process: IT Service Asset & Continuity Management

Status: In draft / Under Review / Sent for Approval / Approved / Rejected

Version:

Release Date:

Note: SEARCH AND REPLACE

Search for any > as your input will be required. Also review any yellow highlighted text.


The following pages provide two examples of flyers that can be printed and distributed throughout your organization. They are designed to be displayed in staff rooms. Note that they are examples, and your input is required to complete them. The important thing is to ensure that the message delivered in the flyer is appropriate to the audience that will be reading it, so think about how and where you will be distributing the flyers.


IT Services Department
Wanted: Continued IT Services

IT Service Asset & Continuity Management

Key Points:
• Ability to cope with an emergency
• Agreed Levels of Service
• Recovery options
• IT Supporting the business

The IT Department is embarking on an IT Service Asset & Continuity Management implementation Programme.

>

>

> Provide contact lists for the IT Department as well as the business managers that they can contact.>>

Where are you going to send

THE BENEFITS >

THE PROCESS

>>


IT Service Asset & Continuity Management
KEEPING “IT” GOING IN TIMES OF ADVERSITY

HELP US HELP YOU
Contact your immediate Manager to let them know what services you need if there is an unexpected and major interruption to IT delivery. We need to know about the Services YOU NEED.

IMPROVED SERVICE DELIVERY IS OUR GOAL
KNOW YOUR SERVICE RIGHTS

Sponsored by IT SERVICES
“Constantly improving and aligning to your needs”


4 IMPLEMENTATION PLAN

IT Services
Implementation Plan/Project Plan Skeleton Outline
Process: IT Service Asset & Continuity Management

Status:

Version: 0.1

Release Date:


Planning and implementation for IT Service Asset & Continuity Management

This document provides guidance for the planning and implementation of the IT Service Asset & Continuity Management ITIL process. It is not to be considered an extensive plan, as its topics have to be generic enough to suit any reader in any organization. However, the reader will certainly be reminded of the key topics that have to be considered for planning and implementation of this process.

Initial planning

When beginning the process planning, the following items must be completed. For each DESCRIPTION, record a CHECK (☺ or 2 or date).

Get agreement on the objective (use the ITIL definition), purpose, scope, and implementation approach (e.g. internal, outsourced, hybrid) for the process.

Assign a person to the key role of process manager/owner. This person is responsible for the process and all associated systems. This person will generally be the Network or Operations Manager or Service Delivery Manager.

Conduct a review of activities that would currently be considered as an activity associated with this process. Make notes and discuss the “re-usability” of that activity. The key activities of IT Service Asset & Continuity Management are:
• Business Impact Analysis
• Risk Assessment
• Designing for Recovery
• Procedural Testing

Create and gain agreement on a high-level process plan and a design for any associated process systems. NOTE: the plan need not be detailed. Too many initiatives get caught up in too much detail in the planning phase. KEEP THE MOMENTUM GOING.


Review the finances required for the process as a whole and any associated systems (expenditure including people, software, hardware, accommodation). Don’t forget that the initial expenditure may be higher than the ongoing costs. Don’t forget annual allowances for systems maintenance or customizations to systems by development staff.

Agree the policy regarding this process.
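The initial planning checklist lends itself to a very small tracking structure. The following is a sketch only: `ChecklistItem` and `outstanding` are hypothetical names, the descriptions are abbreviated from the checklist above, and the completion date is a made-up example.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class ChecklistItem:
    description: str
    checked_on: Optional[date] = None  # CHECK column: stays empty until the item is done


def outstanding(items: List[ChecklistItem]) -> List[str]:
    """Return the descriptions of items that have not been checked off yet."""
    return [item.description for item in items if item.checked_on is None]


# Descriptions are abbreviated from the checklist; the date is a made-up example.
plan = [
    ChecklistItem("Agree objective, purpose, scope and implementation approach", date(2024, 3, 1)),
    ChecklistItem("Assign process manager/owner"),
    ChecklistItem("Review current activities and their re-usability"),
    ChecklistItem("Agree high-level process plan"),
    ChecklistItem("Review finances"),
    ChecklistItem("Agree the policy"),
]
print(f"{len(outstanding(plan))} of {len(plan)} items outstanding")
```

Recording a date rather than a tick keeps a record of when each item was agreed, which matches the "date" option in the CHECK column.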

Create Strategic statements

Refer to Policies, Objectives and Scope for more template information regarding Policy, Objective and Scope statements.

Policy Statement

The policy establishes the “SENSE OF URGENCY” for the process. It helps us to think clearly about and agree on the reasons WHY effort is put into this process. Being able to answer this seemingly simple, but actually complex, question is a major stepping stone towards successful implementation.

The most common mistake is to give reasons regarding IT as the WHY. Reasons like “to make our IT department more efficient” are far too generic and don’t focus on the real issue behind why this process is needed. The statement must leave the reader in no doubt that the benefits of this process will be far reaching and contribute to the business in a clearly recognizable way.

Objective Statement

When you describe the end or ultimate goal for a unit of activity that is about to be undertaken, you are outlining the OBJECTIVE for that unit of activity. The activity may be some actions for just yourself or for a team of people. In either case, writing down the answer to “WHERE will this activity take me/us/the organization?” is a powerful exercise. There are many studies that indicate the simple act of putting a statement about the expected end result onto a piece of paper, then continually referring to it, makes achieving that end result realistic.


As a tip regarding the development of an objective statement: don’t get caught up in spending hours on this. Do it quickly and go with your instincts or first thoughts. BUT THEN wait a few days, review what you did for another short period of time, and THEN commit to the outcome of the second review as your statement.

Scope Statement

In defining the scope of this process we are answering the question of what activities and what “information interfaces” this process has. Don’t get caught up in trying to be too detailed about the information flow into and out of this process. What is important is that others realize that information does in fact flow.

For example, with regard to the IT SERVICE ASSET & CONTINUITY MANAGEMENT process we can create a simple table such as:

IT Service Asset & Continuity Management Information flows (Process to Process: Information)
• IT Service Asset & Continuity Management to Problem Management: ITSCM planning awareness and training
• Problem Management to IT Service Asset & Continuity Management: Historical information for planning
• IT Service Asset & Continuity Management to Change Management: RFCs for evaluation pertaining to effects on recovery
• Change Management to IT Service Asset & Continuity Management: ITSCM planning awareness and training
• IT Service Asset & Continuity Management to Service Level Management: Service Level Requirements
• Service Level Management to IT Service Asset & Continuity Management: ITSCM planning awareness and training
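An information-flow table of this kind can also be held as data, so that the inputs and outputs of any one process can be listed on demand. The sketch below is illustrative: `Flow`, `inputs_to` and `outputs_from` are hypothetical names, and only the two Problem Management rows from the table are entered as sample data.

```python
from typing import List, NamedTuple


class Flow(NamedTuple):
    source: str       # process the information comes from
    target: str       # process the information goes to
    information: str  # what is passed between them


ITSACM = "IT Service Asset & Continuity Management"

# Sample rows only; further rows from the table can be added in the same way.
FLOWS = [
    Flow(ITSACM, "Problem Management", "ITSCM planning awareness and training"),
    Flow("Problem Management", ITSACM, "Historical information for planning"),
]


def inputs_to(process: str, flows: List[Flow]) -> List[Flow]:
    """Flows whose information arrives at the given process."""
    return [flow for flow in flows if flow.target == process]


def outputs_from(process: str, flows: List[Flow]) -> List[Flow]:
    """Flows whose information leaves the given process."""
    return [flow for flow in flows if flow.source == process]


for flow in inputs_to(ITSACM, FLOWS):
    print(f"{flow.source} -> {flow.target}: {flow.information}")
```

In keeping with the advice above, the structure records only that information flows and between which processes, not the detailed content of each exchange.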


Steps for Implementation

There can be a variety of ways to implement this process. For a lot of organizations a staged implementation may be suited. For others a “big bang” implementation may be appropriate, where every area carries equal priority. In reality, however, we usually look at implementation according to pre-defined priorities. Consider the following options and then apply a suitable model to your own organization or case study. For each of the STEPS, record NOTES/RELEVANCE/DATES/WHO.

Define the Objective and Scope for IT Service Asset & Continuity Management.

Establish and agree on a clear definition for the words:
• Disaster
• Gradual Recovery
• Intermediate Recovery
• Immediate Recovery
This is one of the most interesting aspects. It can be very difficult to get everyone to agree to a definition, and it can be very difficult to establish the correct understanding of the definition. However, get this right, and the rest of the process is made easier.

Seek initial approval.

Establish and Define Roles and Responsibilities for the process. Appoint an ITSCM Manager.

Establish and Define the Scope for IT Service Asset & Continuity Management and the relationships with IT Services.

Establish IT Service Asset & Continuity Management Process.

Establish and Define Relationship with all other processes. This is another key aspect of the IT Service Asset & Continuity Management process. IT Service Asset & Continuity Management is where we are helping set assurance of IT Service capability in the event of a disaster. IT Service Asset & Continuity Management works closely with Service Level Management to achieve this.

Establish monitoring levels. Continuity of service as seen by the business is related to the service and not the components that make up the service.

Define reporting standards.

Publicize and market.
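Once the definitions in the steps above have been agreed, they can be written down in a form that leaves little room for later reinterpretation. The sketch below shows one possible way to do so. The hour limits are placeholders invented for illustration, not values from this workbook, and each organization would substitute its own agreed definitions; `RecoveryOption`, `AGREED_LIMITS_HOURS` and `option_for` are hypothetical names.

```python
from enum import Enum


class RecoveryOption(Enum):
    IMMEDIATE = "Immediate Recovery"
    INTERMEDIATE = "Intermediate Recovery"
    GRADUAL = "Gradual Recovery"


# PLACEHOLDER limits in hours. Replace with the definitions your organization agrees on.
AGREED_LIMITS_HOURS = {
    RecoveryOption.IMMEDIATE: 24,
    RecoveryOption.INTERMEDIATE: 72,
}


def option_for(required_recovery_hours: float) -> RecoveryOption:
    """Map the time within which a service must be restored to a recovery option."""
    if required_recovery_hours <= AGREED_LIMITS_HOURS[RecoveryOption.IMMEDIATE]:
        return RecoveryOption.IMMEDIATE
    if required_recovery_hours <= AGREED_LIMITS_HOURS[RecoveryOption.INTERMEDIATE]:
        return RecoveryOption.INTERMEDIATE
    return RecoveryOption.GRADUAL


# Hypothetical service requirements gathered from the business.
for service, hours in {"Payroll": 96, "Order entry": 4}.items():
    print(f"{service}: {option_for(hours).value}")
```

Keeping the limits in one place means that a change to the agreed definition is made once and applies to every service classified with it.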

The priority selection has to be made with other factors in mind, such as competitive analysis, any legal requirements, and the desires of “politically powerful influencers”.

Costs

The cost of process implementation is something that must be considered before, during and after the implementation initiative. The following points and table help to frame these considerations. (A variety of symbols have been provided to help you indicate required expenditure, rising or falling expenditure, level of satisfaction regarding costs in a particular area, etc.)

For each cost area, record an entry under Initial, During and Ongoing.

• Personnel: Costs of people for initial design of process, implementation and ongoing support.
• Accommodation: Costs of housing new staff and any associated new equipment and space for documents or process related concepts.
• Software: New tools required to support the process and/or the costs of migration from an existing tool or system to the new one. Maintenance costs.
• Hardware: New hardware required to support the process activities. IT hardware and even new desks for staff.
• Education: Re-education of existing staff to learn new techniques and/or learn to operate new systems.
• Procedures: Development costs associated with filling in the detail of a process activity. The step-by-step recipe guides for all involved and even indirectly involved personnel.
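The cost table can be kept as a simple grid of cost area against phase and totalled per phase for budgeting. The following is a minimal sketch: the phase names follow the table's Initial, During and Ongoing columns, while `total_by_phase` is a hypothetical helper and every figure is a placeholder rather than an estimate.

```python
PHASES = ("Initial", "During", "Ongoing")

# PLACEHOLDER figures for two of the cost areas; real values come from your own budget.
costs = {
    "Personnel": {"Initial": 40_000, "During": 25_000, "Ongoing": 10_000},
    "Software": {"Initial": 15_000, "During": 5_000, "Ongoing": 3_000},
}


def total_by_phase(cost_table: dict) -> dict:
    """Sum every cost area's figure for each phase."""
    totals = {phase: 0 for phase in PHASES}
    for per_phase in cost_table.values():
        for phase, amount in per_phase.items():
            totals[phase] += amount
    return totals


for phase, amount in total_by_phase(costs).items():
    print(f"{phase}: {amount:,}")
```

Laying the figures out this way makes it easy to see that initial expenditure may be higher than ongoing costs, as noted in the initial planning checklist.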

In most cases, costs for process implementation have to be budgeted for (or allocated) well in advance of expenditure. Part of this step involves deciding on a charging mechanism (if any) for the new services to be offered.

Build the team

Each process requires a process owner and, in most situations, a team of people to assist. The IT Service Asset & Continuity Management process is one of the processes in the Service Delivery set that shows very visible benefits from the outset and is very influential in setting the perception of IT Services among its customers and end users. Of course, a lot will depend on the timing of the implementation and whether it is to be staged or implemented as one exercise.

Refer to Roles and Responsibilities on page 167 for roles, responsibilities and tasks of involved personnel.

Analyze current situation and FLAG

Naturally, many organizations have existing procedures, processes and people in place and feel that the activities of IT Service Asset & Continuity Management are already being done. It is critical to identify these systems and consider their future role as part of the new process definition.


Examples of areas to review are listed below. For each Area, record Notes.

• Power teams
• Current formal procedures
• Current informal procedures
• Current role descriptions
• Existing organizational structure
• Spreadsheets, databases and other repositories
• Other…

Implementation Planning

After base decisions regarding the scope of the process and the overall planning activities are complete, we need to address the actual implementation of the process. It is likely that some current activity or work already being performed would fit under the banner of this process. Nevertheless, we can provide a comprehensive checklist of points that must be reviewed and done.

Implementation activities for IT Service Asset & Continuity Management

For each Activity, record Notes/Comments/Time Frame/Who.

Review current and existing IT Service Asset & Continuity Management practices in greater detail. Make sure you also review current process connections from these practices to other areas of IT Service Delivery and Support.

Review the ability of existing functions and staff. Can we “reuse” some of the skills to minimize training, education and time required for implementation?

Establish the accuracy and relevance of current processes, procedures and meetings. As part of this step, if any information is credible, document the transition from the current format to any new format that is selected.

Decide how best to select any vendor that will provide assistance in this process area (including tools, external consultancy or assistance to help with initial high workload during process implementation).

Establish a selection guideline for the evaluation and selection of tools required to support this process area (i.e. IT Service Asset & Continuity Management tools).

Purchase and install tools required to support this process (i.e. IT Service Asset & Continuity Management tool). Ensure adequate skills transfer and on-going support is catered for if external systems are selected.

Create any required business process interfaces for this process that can be provided by the automated tools (e.g. reporting: frequency, content).

Document and get agreement on roles, responsibilities and training plans.

Communicate with and provide necessary education and training for staff that covers the actual importance of the process and the intricacies of being part of the process itself.

An important point to remember is that if this process is to be implemented at the same time as other processes, it is crucial that the implementation plans and, importantly, the timing of work are complementary.

Cutover to new processes

The question of when a new process actually starts is not easy to answer. Most process activity evolves without rigid starting dates, and this is what we mean when we answer a question with “that’s just the way it’s done around here”. Ultimately we do want the new process to become the way things are done around here, so it may even be best not to set specific launch dates, as doing so sets the expectation that from the given date all issues relating to the process will disappear (not a realistic expectation).


5 FURTHER READING

For more information on other products available from The Art of Service, you can visit our website: http://www.theartofservice.com If you found this guide helpful, you can find more publications from The Art of Service at: http://www.amazon.com
