1 / 24

How to automatically test and validate your backup and recovery strategy

How to automatically test and validate your backup and recovery strategy. Ruben . Gaspar .Aparicio@cern.ch November 17 th , 2010. Agenda. Recovery platform principles Use Cases Conclusion. Recovery platform principles. Validate tape backupsets Isolation

tieve
Download Presentation

How to automatically test and validate your backup and recovery strategy

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. How to automatically test and validate your backup and recovery strategy Ruben.Gaspar.Aparicio@cern.ch November 17th, 2010

  2. Agenda Recovery platform principles Use Cases Conclusion

  3. Recovery platform principles • Validate tape backupsets • Isolation • no use of catalog: controlfile needs to have all backup information needed • capped tnsnames.ora • no user jobs must run • Automatic cleanup except otherwise configured or an error arises. • Flexible and easy to customize • Take advantage of a restored database: exports can be configured -> further validation • Spans several Oracle homes (9i,10g,11g) and OS: solaris & linux 32 & 64 bits. • Maximize recovery server: several recoveries at the same time • Easy to deploy: any server can be a recovery server

  4. Platform Requirements • Server with Linux (>RHE4) or Solaris (>8) • Oracle database server target release(s) for single instance: • 9i: 32 & 64 bits • 10g • 11gR1 & 11gR2 • Perl (v5.8.5) & bash shell should be available • TDP-Oracle libraries (v5.5.3) • Enough storage to carry intended recoveries on SAN or NAS.

  5. Component view Runs anywhere: ~2600 lines of Perl & Bash

  6. Software layout

  7. Skeleton of a recovery Restore & Recovery export

  8. Export sequence diagram Long Time Archive to Tape Copy to offsite server

  9. Recovery Server: Installation set up • Recoveries are carried out every week, for important db (cron job): • If recovery_wrapper.sh or similar not available:

  10. Traces • All actions are logged by Logger.pm • Last successful recovery scripts are kept: $dirtobackup='/ORA/dbs03/oradata/BACKUP'; • E-mail notifications

  11. Important configuration options ASM configuration

  12. Use Case I: user logical error. Table lost • Recover a table as it was on 16th Dec 2009 at 05:00am • Do we have an export that could fit?

  13. Use Case 1 cont. I • Set PITR: • Launch it: • It will dispatch following scripts in order: Metalink ID 433335.1

  14. Use Case 1 cont. II job_queue_processes is already 0

  15. Use Case 2: using a recovery as backup to disk Recovery Server Public interface Interconnect interface RAC Backup to disk. SATA disks 01 02 03 04

  16. Use Case 2 cont. I • Set-up • Run restore/recover: • Copy archived redo logs from production to recovery server if needed. • Catalog recovered db on production and check it out! Just create recovery scripts

  17. Use Case 4: Backup strategy validation Recovery Server Public interface Interconnect interface RAC Backup to disk. SATA disks 05 06 07 08 01 02 03 04

  18. Use case 4 cont. I • New scripts and templates introduced for VLDB backup • backup incremental level 0 … check logical database skip readonlyformat…; • RO tablespaces are backed up every night • Retention policy changed time window to redundancy • Validation possibilities: • Full restore: time/resource cost • Partial restore:

  19. Use Case 4 cont. II • Restore/recover the READ WRITE part of the database • Before open db offline RO tablespacedatafiles • Once db open, restore selected RO tablespaces

  20. Use Case 5: schema broken after application upgrade in a VLDB • If schema is self-contained • Kind of tablespace PITR but much simpler • $tblpitr • expdpdidn’t work in some cases:

  21. Use case 5 cont. I • db_restore.rcv • db_start.sql

  22. Use case 5 cont. II • Database in size: • Full: ~420Gb • Partial (recovering one schema): ~ 71Gb Partial recovery was 63% faster than Full recovery

  23. Conclusion • More than 3000 recoveries performed so far • Recovery platform shows useful to: • Can help to estimate real restore/recover time (SLA) • Validates regularly your tape backups • Helps to test your backup strategy • Helps to test your recovery strategy • Helps in a number of use cases i.e. recover from logical user errors • Maximize your recovery infrastructure -> take consistent exports • Total isolation from production • Easy installation • Open source project {http://sourceforge.net/projects/recoveryplat/} • It can be adapted to different tape vendor: netbackup, EMC network backup • Add new functionality

  24. Thank You !

More Related