
Server Cluster Futures

Elden Christensen

Program Manager

Windows Server Cluster Group eldenc @

Microsoft Corporation

Session Outline
  • Key Takeaways: Raising the quality bar for High Availability
  • Server Cluster Futures
    • Changes in Windows Server 2003 Service Pack 1
    • Microsoft Cluster Configuration Validation Tool
    • Planned Features for Windows codenamed “Longhorn”
Key Takeaways
  • Question: What does my storage need to support to work with clustering?
  • Support the following SCSI Commands
    • SCSI-3 SPC-2 Compliant
      • Reserve(6)
      • Release(6)
      • Logical Unit Reset
      • Unique IDs - Device Identification page 83h with Identifier Type 2 or 3
    • Update storage to support these commands
      • PERSISTENT RESERVE IN Read Keys (00h)
      • PERSISTENT RESERVE IN Read Reservation (01h)
      • PERSISTENT RESERVE OUT Reserve (01h)
        • Scope: LU_SCOPE (0h)
        • Type: Write Exclusive – Registrants Only (5h)
      • PERSISTENT RESERVE OUT Release (02h)
      • PERSISTENT RESERVE OUT Clear (03h)
      • PERSISTENT RESERVE OUT Preempt (04h)
      • PERSISTENT RESERVE OUT Register AND Ignore Existing Key (06h)
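As a rough illustration of the codes listed above, the following sketch collects the service actions from the slide and packs a PERSISTENT RESERVE OUT command descriptor block (CDB). The opcodes 5Eh/5Fh and the byte layout are not on the slide; they are recalled from the SCSI Primary Commands standard and should be checked against it. The 24-byte parameter list length is likewise an assumption for illustration.

```python
from enum import IntEnum

# Opcodes are NOT from the slide; they are taken from the SCSI SPC standard.
OP_PERSISTENT_RESERVE_IN = 0x5E
OP_PERSISTENT_RESERVE_OUT = 0x5F


class PrInAction(IntEnum):
    """PERSISTENT RESERVE IN service actions named on the slide."""
    READ_KEYS = 0x00
    READ_RESERVATION = 0x01


class PrOutAction(IntEnum):
    """PERSISTENT RESERVE OUT service actions named on the slide."""
    RESERVE = 0x01
    RELEASE = 0x02
    CLEAR = 0x03
    PREEMPT = 0x04
    REGISTER_AND_IGNORE_EXISTING_KEY = 0x06


LU_SCOPE = 0x0                          # Scope from the slide
WRITE_EXCLUSIVE_REGISTRANTS_ONLY = 0x5  # Type from the slide


def scope_type_byte(scope: int, pr_type: int) -> int:
    """Pack scope into the high nibble and type into the low nibble."""
    return ((scope & 0x0F) << 4) | (pr_type & 0x0F)


def build_pr_out_cdb(action: PrOutAction,
                     scope: int = LU_SCOPE,
                     pr_type: int = WRITE_EXCLUSIVE_REGISTRANTS_ONLY,
                     param_len: int = 24) -> bytes:
    """Build a 10-byte PERSISTENT RESERVE OUT CDB (layout is an assumption
    based on the SPC standard; param_len=24 is an illustrative default)."""
    cdb = bytearray(10)
    cdb[0] = OP_PERSISTENT_RESERVE_OUT
    cdb[1] = action & 0x1F               # service action in the low 5 bits
    cdb[2] = scope_type_byte(scope, pr_type)
    cdb[7] = (param_len >> 8) & 0xFF     # parameter list length, high byte
    cdb[8] = param_len & 0xFF            # parameter list length, low byte
    return bytes(cdb)


if __name__ == "__main__":
    print(build_pr_out_cdb(PrOutAction.RESERVE).hex())
```

The sketch only builds the bytes; actually sending a CDB to a device requires an operating system pass-through interface, which is outside what the slide covers.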
Key Takeaways (cont'd)
  • Question: What does my MPIO software need to support to work with clustering?
  • Update multi-path software to support new Persistent Reservations
    • All multi-path I/O ISVs and OEMs with custom MS MPIO DSMs must support registering and unregistering Persistent Reservations across all paths
  • See the storage MPIO DSM breakout sessions for more information on ‘How’
  • No current multi-path software will work on Longhorn!
  • You must update MPIO software to work with Longhorn Clustering
Key Takeaways (cont'd)
  • Question: What driver model do I need to support to work with Longhorn clustering?
    • Update
      • Will no longer support SCSIport-based miniport drivers with Server Clusters
        • Fibre Channel, SAS, and non-MS Software Initiator based iSCSI solutions
        • Does not include parallel-SCSI based solutions
        • Note: This is for Longhorn only
Shared Storage Management
  • Mechanism used to manage disks varies
    • Windows 2000 – SCSI Bus Reset to break reservations to a target
    • Windows 2003 RTM with SCSIPort – Bus Reset to break reservations
    • Windows 2003 RTM with StorPort – Logical Unit Reset to break reservations
      • Logical Unit first, then Target, then Bus reset by calling IOCTL_STORAGE_BREAK_RESERVATION
    • Windows 2003 SP1 with StorPort – Unique IDs to better identify disks to perform enhanced arbitration with fewer resets
    • Planned for Longhorn – Persistent Reservations with a completely new disk arbitration algorithm / mechanism for SAN utopia
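The StorPort escalation order described above (Logical Unit first, then Target, then Bus) can be pictured as a simple ladder that stops at the first reset that succeeds. This is only a conceptual sketch: the function name and the callback-based steps are hypothetical stand-ins, and it does not call IOCTL_STORAGE_BREAK_RESERVATION or any real driver interface.

```python
from typing import Callable, Optional, Sequence, Tuple

# Each step is (label, attempt), where attempt() returns True on success.
# The callables are hypothetical placeholders for the real reset operations.
ResetStep = Tuple[str, Callable[[], bool]]


def break_reservation(steps: Sequence[ResetStep]) -> Optional[str]:
    """Try each reset in order, from least to most disruptive.

    Returns the label of the first reset that succeeded, or None if
    every step failed.
    """
    for label, attempt in steps:
        if attempt():
            return label
    return None


if __name__ == "__main__":
    # Simulated outcome: the logical unit reset fails, the target reset works,
    # so the bus reset is never attempted.
    ladder = [
        ("logical unit reset", lambda: False),
        ("target reset", lambda: True),
        ("bus reset", lambda: True),
    ]
    print(break_reservation(ladder))
```

Ordering the steps from narrowest to widest scope is what keeps the more disruptive resets off the SAN unless the narrower ones fail.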
Cluster Improvements in SP1
  • Cluster Administrator performance improvements for large clusters
    • CluAdmin is now multithreaded
  • More SAN Friendly
    • No bus reset floods if LUNs support Unique IDs
  • Full support for iSCSI Clusters
  • Improved shared disk identification mechanism during Setup
  • Maintenance Mode for Disks
    • Allows Chkdsk, Formatting, Volume restores of Clustered disks
What is ClusPrep?
  • Runs a focused set of tests on a collection of servers that are intended to be a cluster
  • Potential problems with the hardware or configuration that could prevent clusters from working properly are caught before the cluster goes in production
    • Ensures that the solution you are about to deploy is rock solid
  • Currently in Beta1, planned to be provided as a download from
  • When executed on a configured cluster, it will do a software inventory, perform network testing, validate system configuration
  • Why? The vast majority of cluster installation and operation problem reports are from misconfigured systems
What Does ClusPrep Inventory?
  • Does a complete inventory of each server
    • Domain membership and role
    • CPU architecture
    • All systems have same QFE and SP level
    • All systems have same OS version (and support clustering)
    • System drivers
    • Analysis of unsigned drivers
    • PnP device inventory
    • List of running services and processes
    • Memory information
    • HBAs and NICs
    • Environment variables
    • BIOS information
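The "same QFE and SP level" and "same OS version" checks amount to comparing a few inventory fields across all servers. The sketch below shows one way such a comparison could work; the dictionary layout, field names, and sample values are invented for illustration and do not reflect ClusPrep's actual data format.

```python
def find_mismatches(inventory, keys=("os_version", "service_pack", "qfes")):
    """Return {field: {node: value}} for every field that differs between nodes.

    `inventory` maps a node name to a dict of collected values. The field
    names are hypothetical; values must be hashable (use frozenset for QFEs).
    """
    mismatches = {}
    for key in keys:
        values = {node: info[key] for node, info in inventory.items()}
        if len(set(values.values())) > 1:
            mismatches[key] = values
    return mismatches


if __name__ == "__main__":
    # Placeholder values only; "QFE-A" and "QFE-B" are not real update IDs.
    sample = {
        "node1": {"os_version": "5.2", "service_pack": 1,
                  "qfes": frozenset({"QFE-A", "QFE-B"})},
        "node2": {"os_version": "5.2", "service_pack": 0,
                  "qfes": frozenset({"QFE-A", "QFE-B"})},
    }
    for field, per_node in find_mismatches(sample).items():
        print(field, per_node)
```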

What Does ClusPrep Validate?
  • Verify 2 NICs per server
  • Each NIC has different IP address, and each is on a different subnet
  • Each server can communicate with every other
  • Verify shared disks accessible from all machines, visible only once, and uniquely identifiable
  • Verify network and disk I/O latencies
  • Verify bus reset or LUN reset
  • Verify SCSI reserve/release, reservation breaking, reservation defense
  • Verify online/offline (failover simulation)
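The first two network checks in the list above (two NICs per server, each with a different IP address on a different subnet) can be expressed with Python's standard `ipaddress` module. This is a minimal sketch, not ClusPrep's logic: the function name is hypothetical, and it reads "2 NICs" as "at least 2".

```python
import ipaddress


def validate_nics(interfaces):
    """Check one server's NIC configuration.

    `interfaces` is a list of "address/prefix" strings, one per NIC.
    Returns a list of problem descriptions; an empty list means the
    checks passed. Treating "2 NICs" as "at least 2" is an assumption.
    """
    problems = []
    if len(interfaces) < 2:
        problems.append("fewer than 2 NICs")

    parsed = [ipaddress.ip_interface(text) for text in interfaces]

    addresses = [iface.ip for iface in parsed]
    if len(set(addresses)) != len(addresses):
        problems.append("duplicate IP address")

    networks = [iface.network for iface in parsed]
    if len(set(networks)) != len(networks):
        problems.append("NICs share a subnet")

    return problems


if __name__ == "__main__":
    print(validate_nics(["10.0.0.1/24", "192.168.1.1/24"]))
    print(validate_nics(["10.0.0.1/24", "10.0.0.2/24"]))
```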
XML Based for Easy Collection and Remote Analysis

Hyperlinks to Testing Details

Flags Test Results

Easy Web Browser Interface

Cluster Plans in Longhorn
  • What’s Clustering in Longhorn all about?
    • Simplicity, Security, Stability
  • Clusters for people without PhDs
    • Easy to create, use, and manage
    • Reduce Clustering Total Cost of Ownership
  • Improved Security
  • Networking for the 21st century
  • Designed for Storage Area Networks


Subject to Change

Easy to Create Clusters
  • Improved Cluster Setup
    • Setup is streamlined and simplified
      • Create an entire cluster in one seamless step
    • Thorough cluster testing to ensure your cluster will function properly
      • All the power of a full cluster test suite in your hands to guarantee the actual cluster you are setting up will provide rock solid stability
    • Fully scriptable for automated deployments
Easy to Migrate Clusters
  • Cluster Migration Tool
    • Will assist migration of a cluster configuration from one cluster to another
    • Rolling upgrade of Windows 2003 to Longhorn cluster
      • Will be a “Roll Forward” model
Improved Management
  • All New Cluster Administrator Tool!
    • Designed to be task based and easy to use
    • Fewer dials-n-knobs to worry about
      • What’s all this IsAlive/LooksAlive stuff I don’t care about, just make my cluster work!
    • Tell us what you want to do and we’ll take care of the rest
      • I would like to make this File Share Highly Available…
New Cluster MMC Snap-in

Cluster Administrator Tool Today…

Task Based

New Cluster Administrator Roles

Cluster Roles

Attributes of the Role

  • Expanded tool functionality for better manageability
    • Cluster Administrator graphical tool
    • Command line (cluster.exe)
    • Fully scriptable with WMI
      • Enhanced WMI functionality over Windows 2003
  • Migration from legacy cluster debug logging (cluster.log) to Event Tracing for Windows (ETW)
  • Virtual Server Share Scoping
    • Just see the shares available through that Virtual Server
    • Removes user confusion when browsing Clusters
  • Ability to modify resource dependencies while resources are online
    • Facilitates scaling up disks while applications are online
  • Cluster VSS Writer for Backup & Restore
Enhanced Dependencies

IP Address Resource A

IP Address Resource B

Network Name Resource

  • Network Name resource stays up if either IP Address resource A or B is up
    • Today both resource A and B have to be online for the Network Name to be available to users
    • Allows redundant resources and scoping impact to dependent services and applications
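The difference between today's behavior (both IP Address resources required) and the planned behavior (either one is enough) is the difference between AND and OR over the dependency states. A minimal sketch, with a hypothetical function name and state representation:

```python
def network_name_available(ip_states, mode="or"):
    """Decide whether the Network Name resource can stay online.

    `ip_states` maps an IP Address resource name to True (online) or
    False (offline). mode="and" models today's behavior; mode="or"
    models the planned enhanced dependency. Names are illustrative only.
    """
    if mode == "and":
        return all(ip_states.values())
    if mode == "or":
        return any(ip_states.values())
    raise ValueError("mode must be 'and' or 'or'")


if __name__ == "__main__":
    states = {"IP Address Resource A": True, "IP Address Resource B": False}
    print("today (AND):  ", network_name_available(states, mode="and"))
    print("planned (OR): ", network_name_available(states, mode="or"))
```

With resource B offline, the AND rule takes the Network Name down while the OR rule keeps it available, which is the redundancy benefit the slide describes.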



Designed for SANs
  • Improved disk fencing for shared disks
    • Enhanced to use SCSI-3 Persistent Reservations
  • No more device resets!
    • No longer uses SCSI Bus Resets which can be disruptive on a SAN
    • Disks are never left in an unprotected state
    • Lower Risk of Volume Corruption
  • Improved Supportability
    • Superior disk discovery and recoverability mechanisms
Storage Enhancements
  • Support for GUID Partition Table (GPT) disks
    • Allows support for partitions larger than 2 TB
    • GPT provides improved redundancy and recoverability
    • Support for all platforms: x86, x64, and Itanium
  • Support for Hardware Snapshot restores of Clustered Disks
    • Improved disk Maintenance Mode will allow giving temporary exclusive access to online clustered disks to other applications
Quorum Improvements
  • New best-of-both-worlds quorum model
    • Hybrid of Majority Node Set (MNS) logic and Shared Disk Quorum model
    • This model will replace both of the existing models
  • Scales from
    • Small to large node clusters
    • Clusters with or without shared disks
    • Geographically dispersed clusters
  • Can achieve current “Classic” quorum or MNS quorum functionality
    • Shared quorum disk is optional
  • NO single point of failure
    • Can survive loss of the Quorum disk
New Hybrid Quorum Model
  • 2+ Nodes with Shared Storage
    • Majority based on replicas of cluster data
    • Example shows 3 total replicas, so this configuration is resilient to the loss of any 1 replica



Private Storage Device

Private Storage Device


Shared Storage Device

New Hybrid Quorum Model
  • 3+ Nodes without Shared Storage
    • Majority of replicas needed to operate cluster
    • Example shows 3 total replicas, so this configuration is resilient to the loss of any 1 replica
    • Same HA Characteristics as previous example
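Both examples rest on the same arithmetic: with 3 replicas, a majority is 2, so one replica can be lost. The sketch below assumes a strict-majority rule (more than half of all replicas), which the slides imply but do not spell out as a formula; the function names are illustrative.

```python
def votes_needed(total_replicas: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    return total_replicas // 2 + 1


def tolerated_losses(total_replicas: int) -> int:
    """How many replicas can be lost while a majority remains."""
    return total_replicas - votes_needed(total_replicas)


def has_quorum(total_replicas: int, online_replicas: int) -> bool:
    """True if the online replicas still form a majority."""
    return online_replicas >= votes_needed(total_replicas)


if __name__ == "__main__":
    for total in (2, 3, 4, 5):
        print(total, "replicas: need", votes_needed(total),
              "and tolerate the loss of", tolerated_losses(total))
```

Under this rule a 2-replica configuration tolerates no losses, which is why the two-node example adds the shared storage device as a third replica.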



Private Storage Devices


Enhanced Security
  • Pure Kerberos based authentication
    • No more legacy NTLM
    • Secure mutual authentication
    • Enhanced encryption
    • Better performance
  • Moved from datagram (UDP) protocols to secure TCP session oriented protocols
  • Auditing of Cluster Access
    • “Who failed over this group…?”
    • Logged to Security Event Log
    • Can be bubbled up through security tools or remote event management such as MOM
Networking for the 21st Century
  • Integrated with new LH TCP/IP Stack
  • Full IPv6 Support
    • Client Access via IPv6
    • Tunnel IPv6 address resources for IPv4 compatibility
    • Inter-node communication with IPv6
  • No more legacy dependencies on NetBIOS
    • Ready for NetBIOS-less environments
      • Simplifying the transport of SMB traffic
      • Removes WINS and NetBIOS name resolution broadcasts
      • Standardizing name resolution on DNS
Geographically Dispersed Clusters
  • No More Single-Subnet Limitation
    • Allow cluster nodes to communicate across network routers
    • No more having to connect nodes with VLANs!
  • Configurable Heartbeat Timeouts
    • Increase to Extend Geographically Dispersed Clusters over greater distances
    • Decrease to detect failures faster and take recovery actions for quicker failover
  • With the new Quorum model and 3 sites, the cluster can make “wiser decisions” about automatic failover
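The heartbeat trade-off above can be illustrated with a toy failure check: a node is treated as down once no heartbeat has arrived within the timeout. The function and parameter names are hypothetical, and the numbers are arbitrary examples rather than Longhorn defaults.

```python
def node_considered_down(seconds_since_last_heartbeat: float,
                         timeout_seconds: float) -> bool:
    """True once the silence from a node exceeds the configured timeout."""
    return seconds_since_last_heartbeat > timeout_seconds


if __name__ == "__main__":
    # A long-distance link delays a heartbeat by 3 seconds (arbitrary figure).
    delay = 3.0
    # A short timeout reacts quickly but treats the delay as a failure.
    print("timeout 2s:", node_considered_down(delay, timeout_seconds=2.0))
    # A longer timeout rides out the delay but detects real failures later.
    print("timeout 5s:", node_considered_down(delay, timeout_seconds=5.0))
```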
Call to Action
  • Ensure that your HBAs use Storport drivers
  • Storage needs to support the following SCSI commands:
    • Persistent Reservations
    • Unique ID support
  • MPIO/DSM vendors need to handle Persistent Reservations correctly
Community Resources
  • Windows Hardware & Driver Central (WHDC)
  • Technical Communities
  • Non-Microsoft Community Sites
  • Microsoft Public Newsgroups
  • Technical Chats and Webcasts
  • Microsoft Blogs
Additional Resources
  • Cluster Community Site
  • List of Cluster Newsgroups
  • Clustering Best Practices
  • Windows 2003 Cluster Technology Center
  • Frequently Asked Cluster Questions: