ITCP / ITDR Audit Program / Test Recovery Checklist

ROBERT K. DUGGAN, CPA, CIA, CISA


Presentation Transcript


  1. ITCP / ITDR Audit Program / Test Recovery Checklist
     ROBERT K. DUGGAN, CPA, CIA, CISA

  2. Why do we need to test the ITDR / ITCP?
     • The ITCP / DRP often doesn’t work.
     • We discover it doesn’t work when we really need it to work.
     • We pay a fortune to maintain it (Tiers 4-6: $400K to $2M and up).
     • DR test recoveries are fun!

  3. Tiers of Configuration
     • IBM sets Tiers 1-6 for CICS operating on z/OS.
     • Tiers are based on configuration: Tiers 1-3 have recovery times ranging from 1 week down to more than 24 hours.
     • Tiers 4-6 recover in under 24 hours. They range from large manufacturers and distributors with continuous processing needs and low downtime tolerance to instantaneous recovery (Tier 6: banks, zero downtime tolerance). See IBM.com for more information.
     • Today’s example is a Tier 4 scenario for medium to large organizations with a 24-hour RTO requirement for critical applications. (If you have a mainframe, you most likely need Tier 3 or higher.)
     • Under 24-hour recovery of critical platforms and applications: key success factors and evaluation steps are similar across the tiers.

  4. Tiers of Configuration
     • Determined by the Business Impact Analysis and Risk Assessment
     • RTO / RPO (Recovery Time Objective / Recovery Point Objective)
     • Recovery of critical platforms and applications: regardless of tier or platform, key success factors and evaluation steps are similar for all tiers. What changes is the configuration and the RTO.

  5. Today: 3 Levels of Assessment of the ITDR
     • Walkthru ("Tabletop"): scenario with roles and responsibilities
     • Functional Exercise: verify the effectiveness of the backup by platform
     • Off-Site Test Restore: verify the effectiveness of the IT DR plan off-site at the test center

  6. BCP / ITDR Key Concept: they are two different things, but ITDR and BCP are severely impaired without each other.

  7. Walkthru / Tabletop
     • Should occur well before the off-site test
     • Include the vendor team
     • Follow up with platform owners, the DR team, and the vendor team to resolve issues noted before the actual test restore
     • As part of planning, Audit interviews the platform support teams, the IT Director, and the assigned DR Manager to understand the objectives and where the process is on an evolutionary scale

  8. Major Gaps - DRP Walkthru
     • Call tree notification system is dysfunctional or not held at the vendor; call trees are incomplete or not defined
     • Persons who can declare a disaster are not defined, are poorly separated, or are the wrong people, so the vendor cannot take action under the contractual terms
     • Support teams, and backups for key members, are not defined
     • No approval process for changes to DR documents
     • DR documents are not current, or are not held at the vendor / on a secure website
     • Vendor is in the same geographic area

  9. Major Gaps - DRP Walkthru
     • Step-by-step instructions for platform owners and vendor operators are not crystal clear
     • No clear assignment of responsibilities or documented procedures for key platform owners
     • No clear assignment of responsibility for vendor personnel, or no appropriate training on the platforms
     • Backups for key personnel are not defined
     • Business impact analysis and risk assessment are not current, so the tier of recovery is insufficient. Example: a distributor switches from a call center to a web application / proprietary remote order entry system

  10. Major Gaps - Functional Exercise - Test Recovery
      • Vendor personnel or backup recovery personnel cannot restore the system
      • Port mapping / system documentation is not complete or up to date
      • Insufficient remote software / hardware support level
      • Vendor hardware is insufficient
      • Insufficient procedures / lack of clean, updated scripts
      • Poorly trained recovery personnel

  11. Major Gaps - Functional Exercise - Test Recovery
      • Backup not really effective: verify successful recovery of each platform using a checklist, and document the verification method (system and volume information in header screens). PS: Don’t ask for screenshots in the middle of a DR test. Just capture platform, LPAR, times, and volume information, then observe and confirm effective validation.
      • Application recovery is not verified during the 24-hour test; inaccurate RTO
      • Inaccurate system documentation leads to failure to meet the RTO
      • Port mapping is inaccurate or not maintained properly by hardware support
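
The capture advice in slide 11 (platform, LPAR, times, volume information) could be recorded in a simple per-platform structure. The following is only a minimal sketch and is not part of the original presentation: the class name `RecoveryCheck`, its fields, the helper `open_items`, and all sample values are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class RecoveryCheck:
    """One checklist row per platform (hypothetical layout, not from the slides)."""
    platform: str
    lpar: str
    volume_info: str                      # as read from the header screen
    restore_started: datetime
    restore_verified: Optional[datetime] = None   # when the observer confirmed the restore
    application_verified: bool = False            # application-level check done during the test

    def is_complete(self) -> bool:
        # A row counts as complete only when both the platform restore
        # and the application recovery were verified.
        return self.restore_verified is not None and self.application_verified


def open_items(checks: List[RecoveryCheck]) -> List[str]:
    """Return the platforms whose verification is still outstanding."""
    return [c.platform for c in checks if not c.is_complete()]


# Sample values are made up for illustration.
checks = [
    RecoveryCheck("z/OS", "LPAR1", "VOL001",
                  restore_started=datetime(2024, 1, 10, 9, 0),
                  restore_verified=datetime(2024, 1, 10, 15, 30),
                  application_verified=True),
    RecoveryCheck("AIX", "n/a", "hdisk0",
                  restore_started=datetime(2024, 1, 10, 9, 30)),
]
print(open_items(checks))  # prints ['AIX']
```

Keeping the row this small fits the advice not to interrupt the test for screenshots: the observer notes a few fields and moves on.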

  12. Major Gaps - Functional Exercise - Test Recovery
      • Restore personnel cannot follow scripts without assistance from the company platform team
      • Test results are not verified by the DR Test Manager / DR Manager, or the test leader is not independent or does not rotate from test to test
      • Teams do not complete the verification checklist or keep testing notes; this is an evolving process that needs to build
      • Teams do not update the DR instructions with lessons learned after the test restore. It is an expensive process, so there should be a post-restore review with a follow-up task list
      • Teams do not accurately capture the actual recovery time and recovery point (RT / RP) or evaluate them against the true RTO / RPO by platform and application
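
The last point of slide 12, comparing captured RT / RP against RTO / RPO by platform and application, comes down to simple date arithmetic. The sketch below is an illustration under stated assumptions, not the presenter's method: it assumes recovery time is measured from the disaster declaration to the moment the application is usable, and recovery point from the last good backup to the declaration. The names `RecoveryResult` and `evaluate` and all sample values are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecoveryResult:
    """Timings captured for one platform or application (hypothetical layout)."""
    name: str
    declared_at: datetime        # disaster declared / test clock started
    last_good_backup: datetime   # timestamp of the backup that was restored
    usable_at: datetime          # application verified as usable
    rto: timedelta               # objective from the business impact analysis
    rpo: timedelta

    @property
    def recovery_time(self) -> timedelta:
        # Assumption: RT runs from declaration to verified usability.
        return self.usable_at - self.declared_at

    @property
    def recovery_point(self) -> timedelta:
        # Assumption: RP is the data-loss window before the declaration.
        return self.declared_at - self.last_good_backup


def evaluate(r: RecoveryResult) -> dict:
    """Compare actual RT / RP with the stated RTO / RPO."""
    return {
        "name": r.name,
        "rt_hours": r.recovery_time.total_seconds() / 3600,
        "rp_hours": r.recovery_point.total_seconds() / 3600,
        "rto_met": r.recovery_time <= r.rto,
        "rpo_met": r.recovery_point <= r.rpo,
    }


# Made-up example: restored 26 hours after declaration from a 12-hour-old backup.
result = RecoveryResult(
    name="order entry",
    declared_at=datetime(2024, 1, 10, 8, 0),
    last_good_backup=datetime(2024, 1, 9, 20, 0),
    usable_at=datetime(2024, 1, 11, 10, 0),
    rto=timedelta(hours=24),
    rpo=timedelta(hours=24),
)
print(evaluate(result))
```

In this made-up case the RPO is met (12 hours against 24) but the RTO is missed (26 hours against 24), which is the kind of result a post-restore review would carry into its follow-up task list.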

  13. Some Resources
      • www.searchdisasterrecovery.com
      • www.IBM.com

  14. Questions:

  15. Thank you. Be sure to find me on LinkedIn.
