High availability and site redundancy with Exchange 2007: Notes from the field

High availability and site redundancy with Exchange 2007: Notes from the field. Gareth Ireland Infrastructure Consultant. Session Objectives And Takeaways. Session Objectives: Understanding High Availability requirements and objectives of a business.

Presentation Transcript


  1. High availability and site redundancy with Exchange 2007: Notes from the field • Gareth Ireland, Infrastructure Consultant

  2. Session Objectives And Takeaways • Session Objectives: • Understanding High Availability requirements and objectives of a business • Understanding what to protect in an Exchange Server 2007 environment • Understanding Exchange Server 2007 features and solutions for protecting services and data • Understanding issues to consider in site resiliency solutions • Comparing an Exchange 2003 geo-cluster deployment to that of an Exchange 2007 solution

  3. Session Objectives And Takeaways • Session Objectives (cont.): • Practical demonstration of an Exchange Server 2007 High Availability deployment. • Key Takeaways: • New High Availability features and solutions reduce the chance of disaster • New Disaster Recovery features and solutions reduce the time of recovery when disasters do occur • Demystify the concepts of High Availability features in Exchange Server 2007

  4. High Availability Requirements of Business

  5. Types of Failures • Mid-Scale • Full server failure • Complete cluster failure • Large storage failure, e.g., SAN failure • Small-Scale • Accidentally deleted items • Deleted mailbox • Disk failure • Disk controller failure • Database corruption • Log corruption • Storage failure (DAS) • Large-Scale • Total site failure

  6. Exchange Server 2007 High Availability: Overview [Diagram labels: Exchange Organization; Internet; Edge Transport server role; Hub Transport server role; Unified Messaging server role; Client Access server role; CCR Cluster with Mailbox server role (Active) and Mailbox server role (Passive)]

  7. Exchange Server 2007 High Availability Solutions Matrix

  8. What to Protect and How

  9. Single Copy Cluster • Exchange Server 2003 • Requires shared storage • SMTP, OWA, and Mailbox are cluster-aware • Single copy of mailbox data • Up to 8-node Active/Passive • 2-node Active/Active • Geo-clusters required synchronized storage replication • Split-brain scenarios • Exchange Server 2007 • Requires shared storage • Mailbox only • Simple redundancy for other roles • Single copy of mailbox data • Up to 8-node Active/Passive • Active/Active cut • Improvements in install, management, behavior

  10. Single Copy Cluster Limitations • Deployment/operational cost and complexity • Recovery time varies based on backup technology, but can be lengthy and painful • Data redundancy requires integration of partner technology

  11. Local Continuous Replication (LCR) • Standalone server data availability • Data outages expensive to recover • Significant data loss (hours?) • Previous versions of Exchange required partner products for replication • What is LCR? • Data replication on a single server in a single datacenter • Enabled per storage group • Easy to configure

  12. Local Continuous Replication • Key things to know: • Per storage group, manual configuration • Adds overhead to server • Some configuration limitations • Benefits • Enables recovery in minutes • Enables recovery without data loss • Enables large mailboxes • Variety of storage and backup options • Decreases TCO by enabling I/O offload • Within reach of broad set of customers

  13. Standby Continuous Replication (Service Pack 1) [Diagram labels: FileShare; Logs; DB]

  14. Standby Continuous Replication [Diagram labels: CCR; MBX; Passive Node; MBX; SCC]

  15. Standby Continuous Replication • Designed for datacenter recovery • Enables standby configurations out of the box • No clustering required between servers • No single subnet requirement • Spans multiple AD sites • Granular configuration • Flexible configuration • Many-to-many • Manual activation

  16. Cluster Continuous Replication • Two-node Active/Passive failover cluster • File Share Witness (MNS Quorum) • No shared storage • Witness on Hub Transport • Automatic recovery • Continuous data replication • Full redundancy • One or two datacenter solution
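The quorum idea behind a two-node cluster with a file share witness can be sketched as a simple majority vote. This is a simplified model, not the cluster service's actual algorithm: it assumes each node and the witness carry exactly one vote, and the function name `has_quorum` is hypothetical.

```python
def has_quorum(nodes_up, witness_reachable, total_nodes=2):
    """Return True if the surviving voters form a strict majority.

    Assumption (not from the slides): every node and the file share
    witness hold one vote each, so a two-node cluster has three votes.
    """
    total_votes = total_nodes + 1                      # nodes plus the witness
    votes_present = nodes_up + (1 if witness_reachable else 0)
    return votes_present > total_votes // 2            # strict majority


if __name__ == "__main__":
    # One node lost, witness still reachable: the survivor keeps quorum.
    print(has_quorum(nodes_up=1, witness_reachable=True))
    # One node lost and the witness unreachable: no majority remains.
    print(has_quorum(nodes_up=1, witness_reachable=False))
```

Under this model the witness acts as a tie-breaker: a single surviving node can stay online only if it can still reach the file share.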

  17. Cluster Continuous Replication • File Share Witness: KB 921181 • Outage Management • Easy-to-use scheduled outage support • Automatic recovery of unscheduled outages • Symmetric failover • Resource requirements • Variety of backup options • Reduced backup TCO • Configuration limitations

  18. Cluster Continuous Replication Benefits • Fast recovery from data problems on active node • No single point of failure • Simplified hardware requirements • Simplified storage requirements • Simplified deployment • Exchange-provided replication solution • Enables Mailbox server failover to second datacenter • Improved management experience • Ability to offload VSS-based backups

  19. Cluster Continuous Replication: CCR failover behavior • Cluster service monitors the resources • Failure detection is not instantaneous • IP Address or Network Name resource failures cause failover • A machine, or network access to it, has failed completely • Exchange service failure or timeout doesn’t cause failover • The service is restarted on the same node • Database failure doesn’t cause failover • Don’t want to move 49 databases because 1 failed
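The failover rules on this slide amount to a small decision table. The sketch below only restates those rules; the enum and function names are hypothetical, and it does not model detection delays or the cluster service itself.

```python
from enum import Enum


class Failure(Enum):
    IP_ADDRESS = "IP Address resource"
    NETWORK_NAME = "Network Name resource"
    EXCHANGE_SERVICE = "Exchange service failure or timeout"
    DATABASE = "database failure"


class Action(Enum):
    FAILOVER = "fail over to the passive node"
    RESTART_SERVICE = "restart the service on the same node"
    STAY = "stay on the active node"


def react(failure):
    """Map a failure type to the reaction described on the slide."""
    if failure in (Failure.IP_ADDRESS, Failure.NETWORK_NAME):
        # The machine, or network access to it, has failed completely.
        return Action.FAILOVER
    if failure is Failure.EXCHANGE_SERVICE:
        return Action.RESTART_SERVICE
    # A single failed database should not move every other database.
    return Action.STAY


if __name__ == "__main__":
    for failure in Failure:
        print(f"{failure.value}: {react(failure).value}")
```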

  20. Continuous Replication: Available configurations [Diagram labels: LCR on a standalone server (Store, Replication Service, DB Copy); CCR on a cluster (Active Node with Store and Replication Service; Passive Node with Replication Service and DB Copy; logs pulled by the passive node)]

  21. Continuous Replication: Basic architecture • A ‘pull’ model • Exchange server creates log files normally • Log files are copied by Replication service • Share created on the active node • Exxnnnnnnnn.log files copied as they appear • Replication service keeps a copy of the database up-to-date • Inspects and replays log files • Exx.log is copied for handoff/failover
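The pull model above can be illustrated with a toy simulation. Everything here is a sketch under stated assumptions: the class and method names are hypothetical, log "files" are in-memory lists, the `E00` prefix is taken from the log names shown elsewhere in the deck, and the generation number is rendered as eight hexadecimal digits on the assumption that this matches the Exxnnnnnnnn.log pattern.

```python
def log_name(generation, prefix="E00"):
    """Build a closed-log file name (assumed: 8 hex digits after the prefix)."""
    return f"{prefix}{generation:08X}.log"


class ActiveNode:
    """Writes to an open log (Exx.log) and closes it into numbered generations."""

    def __init__(self):
        self.closed = {}        # generation -> list of records
        self.current_gen = 1
        self.current = []       # records in the open Exx.log

    def write(self, record):
        self.current.append(record)

    def roll_log(self):
        """Close the open log so it becomes a numbered generation file."""
        self.closed[self.current_gen] = list(self.current)
        self.current_gen += 1
        self.current = []


class PassiveCopy:
    """Pulls closed logs from the active node and replays them into its copy."""

    def __init__(self):
        self.copied = {}        # generation -> list of records
        self.database = []      # state of the database copy

    def pull(self, active):
        """Copy closed generations not yet seen, in order, then replay them."""
        new_gens = sorted(g for g in active.closed if g not in self.copied)
        for gen in new_gens:
            records = list(active.closed[gen])   # copy step
            self.copied[gen] = records
            self.database.extend(records)        # replay step
        return [log_name(g) for g in new_gens]


if __name__ == "__main__":
    active, passive = ActiveNode(), PassiveCopy()
    active.write("msg-a")
    active.roll_log()
    active.write("msg-b")            # still in the open log, not yet copied
    print(passive.pull(active), passive.database)
```

The open log is never pulled in this sketch, which mirrors the slide's point that Exx.log is only copied at handoff or failover.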

  22. Cluster Continuous Replication [Diagram: Active and Passive nodes. Online seed of the passive copy; logs (E0000000011.log, E0000000012.log) are copied from \\node1\GUID and verified; the passive DB is advanced by playing logs; E00.log is the open log]

  23. Cluster Continuous Replication Failover Scenarios • Scheduled outage • Scheduled outage to correct corruption (logs available) • Scheduled outage to correct corruption (no logs available) • Transport Dumpster • Store crash • OS blue screen • Incremental replay • Active network failure • Logs copied • Geographically dispersed cluster: single machine failure • Geographically dispersed cluster: datacenter failure

  24. Demo: Useful CCR cmdlets • Get-ClusteredMailboxServerStatus: status information of the cluster • Get-StorageGroupCopyStatus: complete status information of a CCR or LCR copy • Move-ClusteredMailboxServer: scheduled (lossless) move of the Exchange resource • Update-StorageGroupCopy: initiate or resync a CCR or LCR copy (use the Suspend-StorageGroupCopy and Resume-StorageGroupCopy cmdlets as required) • Get-TransportConfig and Set-TransportConfig: get and set the transport dumpster configuration

  25. Move-ClusteredMailboxServer: Scheduled outage • Passive node copies log files • Exx.log is in use • On move, Exx.log is copied • Designations are now reversed [Diagram: log generations E00 0000 0001 onward on Node 1 and Node 2, with the open E00 log advancing by generation]

  26. Failover: Unscheduled outage • Failover without copying all log files is called “lossy” • Passive DB is not completely up-to-date • Log generation numbers are reused • Log files have different content! • Database might be different! [Diagram: log generations on Node 1 and Node 2, with the same generation numbers present on both nodes]
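The consequence of a lossy failover, where generation numbers are reused with different content, can be shown with two small helper functions. This is an illustrative sketch only: both function names are hypothetical, logs are modelled as a dictionary from generation number to content, and the generation values in the usage example are made up rather than read off the slide.

```python
def lossy_failover(active_logs, passive_logs):
    """Compare log sets at the moment of an unscheduled failover.

    Both arguments map generation number -> log content.
    Returns (generations never copied, next generation the new active writes).
    """
    last_copied = max(passive_logs) if passive_logs else 0
    lost = sorted(g for g in active_logs if g > last_copied)
    return lost, last_copied + 1


def divergent_generations(old_active_logs, new_active_logs):
    """Generations present on both nodes whose content differs."""
    shared = old_active_logs.keys() & new_active_logs.keys()
    return sorted(g for g in shared if old_active_logs[g] != new_active_logs[g])


if __name__ == "__main__":
    old_active = {1: "a", 2: "b", 3: "c", 4: "d", 5: "e"}
    passive = {1: "a", 2: "b", 3: "c"}            # generations 4 and 5 never arrived
    lost, next_gen = lossy_failover(old_active, passive)

    new_active = dict(passive)
    new_active[next_gen] = "x"                    # reused number, new content
    print(lost, next_gen, divergent_generations(old_active, new_active))
```

When the old active node returns, any generation reported by `divergent_generations` is one it cannot simply keep, because the two nodes disagree about what that log contains.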

  27. Transport Dumpster • Transport Dumpster is a feature that is only enabled for use by Cluster Continuous Replication • The transport dumpster resubmits recently delivered mail from the Hub Transport servers after an unscheduled outage • It is enabled by default and should always be turned on when using CCR • The transport dumpster is enabled organization-wide by setting the amount of storage available per storage group and setting the time to retain mail in the dumpster • What it does: • The Hub Transport server maintains a queue of mail that was recently delivered to a clustered mailbox server • In the event of an unplanned failover, CCR automatically requests every Hub Transport server in the site to redeliver mail from the transport dumpster queue • The information store automatically deletes the duplicates and redelivers mail that was lost
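The redelivery-with-deduplication behaviour described above can be modelled in a few lines. This is a toy sketch, not Exchange's implementation: the class names are hypothetical, messages are identified by a made-up string ID, and the dumpster's size and retention limits are left out.

```python
class MailboxStore:
    """Toy information store that drops messages it already holds."""

    def __init__(self, messages=None):
        self.messages = dict(messages or {})   # message ID -> body
        self.duplicates_dropped = 0

    def accept(self, msg_id, body):
        if msg_id in self.messages:
            self.duplicates_dropped += 1       # duplicate, discard it
            return False
        self.messages[msg_id] = body
        return True


class HubTransport:
    """Toy Hub Transport server that remembers recently delivered mail."""

    def __init__(self):
        self.dumpster = []                     # list of (message ID, body)

    def deliver(self, store, msg_id, body):
        store.accept(msg_id, body)
        self.dumpster.append((msg_id, body))

    def redeliver(self, store):
        """Resubmit everything in the dumpster; return how many were recovered."""
        return sum(store.accept(msg_id, body) for msg_id, body in self.dumpster)


if __name__ == "__main__":
    hub = HubTransport()
    active = MailboxStore()
    for msg_id in ("m1", "m2", "m3"):
        hub.deliver(active, msg_id, f"body of {msg_id}")

    # Assume the passive copy had only replayed m1 before the lossy failover.
    passive = MailboxStore({"m1": "body of m1"})
    recovered = hub.redeliver(passive)
    print(recovered, passive.duplicates_dropped, sorted(passive.messages))
```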

  28. Types of Failures 2003 vs. 2007 Reviewed

  29. New Approaches for Site Disasters: Exchange Server 2007 Cluster Continuous Replication • Stretch CCR on Windows 2003 • 1 node per datacenter • Integrated data & server redundancy • Separate storage for each node in each site • Flexible hardware options • Mailbox server failover and switchover (manual & automatic) • File Share Witness quorum • Requirements • AD fix up for other Exchange roles on site failover • Windows 2003 still requires single subnet • Network pipe between datacenters must carry wide range of traffic

  30. Stretch CCR with Windows 2003 [Diagram: a Primary Data Center and a Standby Data Center, each with an Internet connection, an MX record, Edge, HUB, CAS, and DC/GC servers, a file share witness server (fswsvr), and one CCR mailbox node (MBX 1, MBX 2) hosting the CMS; AD sites Quincy and Redmond; public and private networks each on the same subnet; client URL //mail.tailspin.com/…] • Network load = Replication + HUB + CAS + Heartbeats + AD access + Client access + AD replication • Dedicated or non-dedicated?

  31. Demo Total site disaster

  32. Demo Lab Setup

  33. Questions & Answers

  34. Blogcasts, Webcasts, & Whitepapers • Support Webcast: Microsoft Exchange 2007 Disaster Recovery http://support.microsoft.com/kb/937563/en-us • LCR: http://msexchangeteam.com/archive/2006/05/24/427788.aspx • CCR: http://msexchangeteam.com/archive/2006/08/09/428642.aspx • SCR: http://msexchangeteam.com/archive/2007/02/23/435699.aspx

  35. Resources • Dial Tone Recovery Using an Alternate Server http://technet.microsoft.com/en-us/library/bb310785.aspx • File Share Witness for Cluster Continuous Replication http://support.microsoft.com/kb/921181 • Database Portability http://technet.microsoft.com/en-us/library/bb123954.aspx • Technical Communities, Webcasts, Blogs, Chats & User Groups http://www.microsoft.com/communities/default.mspx • Microsoft Learning and Certification http://www.microsoft.com/learning/default.mspx • Microsoft Developer Network (MSDN) & TechNet http://microsoft.com/msdn http://microsoft.com/technet • Trial Software and Virtual Labs http://www.microsoft.com/technet/downloads/trials/default.mspx

  36. Thank you http://www.microsoft.com/southafrica/ucs/2007
