
HA Considerations in a WAS V5.0 Deployment

Guy Nirpaz, WebSphere Specialist, guy_nirpaz@il.ibm.com



  1. HA Considerations in a WAS V5.0 Deployment • Guy Nirpaz, WebSphere Specialist • guy_nirpaz@il.ibm.com

  2. Introduction To High Availability • High Availability: System and Infrastructure Designed to Minimize or Eliminate Outages • Continuous Availability: “Nonstop Service” • Service Levels: Providing a Defined Availability Level

  3. Typical Non-clustered Unix Availability • The “5 Nines”
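
The chart for this slide did not survive in the transcript. As a rough illustration of what “five nines” (99.999%) means in practice, the sketch below converts an availability percentage into allowed downtime per year. The class and method names are hypothetical, and a 365-day year is assumed.

```java
import java.util.Locale;

// Hypothetical helper, not part of WebSphere: converts an availability
// percentage into the downtime it permits over one 365-day year.
final class DowntimeBudget {
    static final double MINUTES_PER_YEAR = 365.0 * 24 * 60; // leap years ignored

    /** Minutes of downtime per year allowed by the given availability (in percent). */
    static double downtimeMinutesPerYear(double availabilityPercent) {
        return MINUTES_PER_YEAR * (100.0 - availabilityPercent) / 100.0;
    }

    public static void main(String[] args) {
        double[] levels = {99.0, 99.9, 99.99, 99.999};
        for (double level : levels) {
            System.out.printf(Locale.ROOT, "%.3f%% -> %.1f minutes/year%n",
                    level, downtimeMinutesPerYear(level));
        }
    }
}
```

Under these assumptions, five nines leaves only a few minutes of downtime per year, which is why the rest of the deck focuses on removing single points of failure.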

  4. SPOF Elimination Technologies • Hardware/OS Clustering • Example Products: HACMP, Sun Cluster, MC/ServiceGuard, MS Clustering, Veritas Cluster Server • Software Clustering • Example Products: UDB/DB2 ESE/DPF, Oracle RAC, Sybase Replication Server • Software Clustering Relies on Database Replication

  5. Active-Active / Active-Standby • [Diagram comparing the Active-Active and Active-Standby configurations; labels: Client, Server 1, Server 2, Active, Standby]

  6. Total System Availability • [Diagram labels, in request-path order: Browser, ISP Network, Switch/Router, Load balancer, Firewall, Http Server, Firewall, App Server, Dist. DB Server, SNA WAN Gateway, Host CICS/DB] • Per-element availability: 98.5%, 98.5%, 99.5%, 98.1%, 99.6%, 98.2%, 97.4%, 98.2%, 99.9%, 99.8%, 99.6%, 99.7% • Per-element unavailability: 1.5% + 1.5% + 0.5% + 1.9% + 0.4% + 1.8% + 2.6% + 1.8% + 0.1% + 0.2% + 0.4% + 0.3% = 13% • Total system availability including all elements is: 87%
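
The 87% on this slide is what you get by subtracting the summed per-element unavailability from 100%. A minimal sketch of that arithmetic follows, next to the product of the individual availabilities, which is the usual way to combine elements in series when failures are assumed independent. The class name and the independence assumption are mine, not the presenter's; the figures are the ones listed on the slide.

```java
// Hypothetical helper: combines per-element availability figures for a chain
// of components that must all be up for a request to succeed.
final class SerialAvailability {
    // The twelve availability percentages listed on slide 6.
    static final double[] SLIDE_6_AVAILABILITY = {
        98.5, 98.5, 99.5, 98.1, 99.6, 98.2, 97.4, 98.2, 99.9, 99.8, 99.6, 99.7
    };

    /** 100% minus the sum of the individual unavailabilities (the slide's approach). */
    static double sumApproximation(double[] availabilityPercent) {
        double unavailable = 0.0;
        for (double a : availabilityPercent) {
            unavailable += 100.0 - a;
        }
        return 100.0 - unavailable;
    }

    /** Product of the individual availabilities, assuming independent failures. */
    static double product(double[] availabilityPercent) {
        double total = 1.0;
        for (double a : availabilityPercent) {
            total *= a / 100.0;
        }
        return total * 100.0;
    }

    public static void main(String[] args) {
        System.out.println("Sum approximation: " + sumApproximation(SLIDE_6_AVAILABILITY));
        System.out.println("Product:           " + product(SLIDE_6_AVAILABILITY));
    }
}
```

The two results differ by well under one percentage point here. Either way, a chain of individually reliable elements ends up far below the availability of any single one.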

  7. WAS Terminology • Application Server – Application Server Process • Node – Physical Machine hosting 1 or more App Servers • Node Agent – Manages App Servers on a Node (1 per node) • Cell – Administrative Domain (1 or more nodes) • Deployment Manager – Manages Cell via Node Agents • Cluster – One or more App Servers in a Cell

  8. Topology Components • [Topology diagram; labels: HTTP Client, Internet, Firewall, WES, HTTP Server ×2, WAS 5 ND ×2, Node Agent ×2, WAS App ×6, Deployment Mgr., JMS, LDAP, Database(s)]

  9. HTTP • [Topology diagram with the same labels as on slide 8]

  10. HTTP Server Plug-in • Detects Failure • Marks Container as Unavailable • Tries Next Server in ServerCluster • [Diagram, “WebSphere Servlet Requests”; labels: HTTP Server with Plug-in (×2), App Server with Web Container (×2), HTTP Servlet Requests, HTTP/S Protocol Traffic]
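
The three plug-in behaviours on this slide (detect failure, mark unavailable, try the next server) can be sketched as a small routing loop. This is an illustration only, not the plug-in's actual implementation: the class, its methods, and the round-robin order are assumptions, and the later retry of a marked server (RetryInterval) is not modelled.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical sketch of "detect failure, mark unavailable, try next server".
final class FailoverRouter {
    private final List<String> servers;
    private final Set<String> markedDown = new HashSet<>();
    private int next = 0; // round-robin cursor (assumed selection policy)

    FailoverRouter(List<String> servers) {
        this.servers = new ArrayList<>(servers);
    }

    /**
     * Tries each server at most once, starting at the cursor.
     * Returns the server that accepted the request, or null if none did.
     */
    String route(Predicate<String> canConnect) {
        for (int attempt = 0; attempt < servers.size(); attempt++) {
            String candidate = servers.get(next);
            next = (next + 1) % servers.size();
            if (markedDown.contains(candidate)) {
                continue; // already known to be unavailable
            }
            if (canConnect.test(candidate)) {
                return candidate;
            }
            markedDown.add(candidate); // failure detected: mark and move on
        }
        return null;
    }

    boolean isMarkedDown(String server) {
        return markedDown.contains(server);
    }

    public static void main(String[] args) {
        FailoverRouter router = new FailoverRouter(List.of("clone1", "clone2"));
        // Pretend clone1 is dead; the request should land on clone2.
        System.out.println(router.route(s -> !s.equals("clone1")));
    }
}
```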

  11. Plug-in Configuration (plugin-cfg.xml) • Configuration Refresh • <Config RefreshInterval="60"> • ConnectTimeout • Number of seconds the plug-in tries to contact an app server process • Default is zero (not specified prior to V5.02) • With no ConnectTimeout, the plug-in will block many requests until the OS times out attempting to connect to dead machines • <Server CloneID="xxx" Name="clone2" ConnectTimeout="10"> • LoadBalanceWeight • Weight assigned to server for load distribution • Corresponds to Server Weight in SM • RetryInterval • Depends on # of Servers and the OS TCP/IP Timeout • Too Small & Performance Degrades • Recommended • 10 Seconds + (# of clones * TCP Timeout) • Note: Every RetryInterval, every server/process will block one request for ConnectTimeout seconds
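
The RetryInterval recommendation on this slide is simple arithmetic, sketched below. The class name and the sample values (4 clones, a 10-second timeout) are hypothetical; substitute the clone count and TCP timeout of your own environment.

```java
// Hypothetical helper for the slide's rule of thumb:
// RetryInterval = 10 seconds + (number of clones * TCP timeout).
final class RetryIntervalEstimate {
    static int recommendedRetryIntervalSeconds(int cloneCount, int tcpTimeoutSeconds) {
        if (cloneCount < 0 || tcpTimeoutSeconds < 0) {
            throw new IllegalArgumentException("values must be non-negative");
        }
        return 10 + cloneCount * tcpTimeoutSeconds;
    }

    public static void main(String[] args) {
        // Sample values only: 4 clones, 10-second timeout.
        System.out.println(recommendedRetryIntervalSeconds(4, 10) + " seconds");
    }
}
```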

  12. HA HTTP Server • [Diagram labels: HTTP Requests, WES (Primary), WES (Secondary), HTTP Server ×2, TCP/IP Traffic]

  13. Web Container – Servlet/JSP • [Topology diagram as on slide 8, with IP Sprayer in place of WES]

  14. Session Manager – Failover Options • Database • Utilize Database HA • Memory-To-Memory Replication • Built on Small, Fast 'pub/sub' Engine • Runs in an Existing Server Process • Separate Thread per Queue • Can be set up Peer-to-Peer or Client-Server; Default is Peer-to-Peer • Multiple 'channels' available for partitioning • Messages can be encrypted - DES or Triple DES • Does not provide a “persistent” store • Both Perform Essentially the Same!! • Object Serialization/Deserialization is 95% of Performance “Cost”

  15. Session Manager • Updates Occur • At End of the Servlet Service Method • The Default Prior to V5.0 • Manually • Requires Use of IBM Extension to HttpSession • At the End of a Specified Time Interval • V5.0/V5.01 Default • Time Interval Defaults to 120 Seconds • Recommended Tuning of 10 Seconds for Performance and Failover

  16. Handling single point of failure (SPOF) • Memory To Memory Replication – Configurations • Dedicated App Servers for HTTP Session Store • [Diagram labels: several WAS instances with Web Containers, several WAS (store) instances]

  17. EJB – RMI/IIOP • [Topology diagram as on slide 8, with IP Sprayer in place of WES]

  18. EJB – Two Major Steps • Locating the EJB Home in the Name Server: … InitialContext ic = new InitialContext(); Object o = ic.lookup("java:comp/env/ejbs/CustomerHome"); … • Invoking methods on remote objects: … Customer cust = customerHome.findByPrimaryKey(id); cust.getName(); …

  19. IIOP WLM v5.0 • Client bootstrap code: … Hashtable env = new Hashtable(); env.put(Context.INITIAL_CONTEXT_FACTORY, "com.ibm.websphere.naming.WsnInitialContextFactory"); env.put(Context.PROVIDER_URL, "corbaloc::myhost1:9810,:myhost1:9811,:myhost2:9810"); Context initialContext = new InitialContext(env); … • Request flow: 1: EJB Lookup (Initial Request) • 2: Indirect IOR • 3: Direct IOR • 4: IOR for EJB Server & WLM Context • 5: EJB Method Request • 6: EJB Method Result & Server Routing List • 7: Additional EJB Requests (WLM Info + IOR) • [Diagram labels: Client JVM Process, Name Service, LSD, NodeAgent, Server Cluster, AppServer1 to AppServer3 (EJB Container Server 1 to 3)]
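
The provider URL on this slide lists several bootstrap addresses so that the initial lookup does not depend on a single name server. A minimal sketch of assembling that environment from a host list follows. The class and method names are hypothetical; the factory class name and the URL format are copied from the slide. The sketch only builds the Hashtable and does not create the InitialContext, since that needs the WebSphere client libraries on the classpath.

```java
import java.util.Hashtable;
import java.util.List;
import javax.naming.Context;

// Hypothetical helper that builds the JNDI environment shown on the slide.
final class WlmBootstrapEnv {
    /** Joins host:port pairs into a multi-address corbaloc URL. */
    static String providerUrl(List<String> hostPorts) {
        if (hostPorts.isEmpty()) {
            throw new IllegalArgumentException("at least one bootstrap address is required");
        }
        return "corbaloc::" + String.join(",:", hostPorts);
    }

    /** Environment to pass to new InitialContext(env) inside a WebSphere client. */
    static Hashtable<String, String> environment(List<String> hostPorts) {
        Hashtable<String, String> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY,
                "com.ibm.websphere.naming.WsnInitialContextFactory");
        env.put(Context.PROVIDER_URL, providerUrl(hostPorts));
        return env;
    }

    public static void main(String[] args) {
        // Host names and ports are the sample values from the slide.
        System.out.println(providerUrl(List.of("myhost1:9810", "myhost1:9811", "myhost2:9810")));
    }
}
```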

  20. WebSphere IIOP Requests • EJB Requests • [Diagram labels: Java Client, App Server with Web Container, App Server with EJB Container (×2), IIOP Traffic]

  21. IIOP Failover • Handled by SM Runtime and ORB Plug-in • Application Servers • Deployment Manager/Node Agent is “parent” to All Application Servers • State Changes Are “pushed” to Application Servers • Clients • Updated with Response from Application Server • Epoch Number Change is Used as Indicator • Marks Server as Unavailable After Failures • com.ibm.ejs.wlm.MaxCommFailures on command line • com.ibm.ejs.wlm.UnusableInterval on command line • Time out for network requests (when machines fail) • OS TCP/IP timeout (prior to connection being established) • com.ibm.CORBA.requestTimeout on command line (after connection is established) • Avoid Dynamic Port Assignment after App Server Restart and for Firewalls • Configure End Points in SM (Admin Browser) • ORB_LISTENER_ADDRESS • com.ibm.CORBA.ListenerPort in V4 and before • SAS_*_LISTENER_ADDRESS and CSIv2_*_LISTENER_ADDRESS
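
To make the interplay of the two WLM properties concrete, here is a sketch of one plausible reading: a server is marked unusable after MaxCommFailures consecutive communication failures and is left alone for UnusableInterval seconds. This is an assumption about the semantics, not the WLM runtime's actual code; the class, its methods, and the sample values are hypothetical.

```java
// Hypothetical model of "mark server as unavailable after failures".
final class CommFailureTracker {
    private final int maxCommFailures;          // assumed meaning of MaxCommFailures
    private final long unusableIntervalMillis;  // assumed meaning of UnusableInterval
    private int consecutiveFailures = 0;
    private long unusableUntilMillis = Long.MIN_VALUE; // usable from the start

    CommFailureTracker(int maxCommFailures, long unusableIntervalSeconds) {
        if (maxCommFailures < 1) {
            throw new IllegalArgumentException("maxCommFailures must be at least 1");
        }
        this.maxCommFailures = maxCommFailures;
        this.unusableIntervalMillis = unusableIntervalSeconds * 1000L;
    }

    /** Records a failed request; marks the server unusable once the limit is hit. */
    void recordFailure(long nowMillis) {
        consecutiveFailures++;
        if (consecutiveFailures >= maxCommFailures) {
            unusableUntilMillis = nowMillis + unusableIntervalMillis;
            consecutiveFailures = 0;
        }
    }

    /** A successful request resets the failure count. */
    void recordSuccess() {
        consecutiveFailures = 0;
    }

    boolean isUsable(long nowMillis) {
        return nowMillis >= unusableUntilMillis;
    }

    public static void main(String[] args) {
        CommFailureTracker tracker = new CommFailureTracker(2, 300); // sample values only
        long now = System.currentTimeMillis();
        tracker.recordFailure(now);
        tracker.recordFailure(now);
        System.out.println("usable right after two failures? " + tracker.isUsable(now));
    }
}
```

Passing the clock in as a parameter keeps the sketch deterministic and easy to test.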

  22. IIOP Failover Extensions in V5.02 Enterprise • Dynamic WLM • Weights Adjusted Based On PMI Feedback • Adjusts Every 7 Seconds (same as Edge Server Default) • V5.02 Adjustment Based on • CPU Utilization - 88% • Response Time - 2% EJB + 2% Web • Concurrent Requests - 2% EJB + 2% Web • Total Request Rate - 2% EJB + 2% Web • Not Configurable/Customizable in V5.02 • Cross Domain/Cell Failover • Requires Identical Cluster in Two Cells (Mirroring) • Same Application and Cluster Name • Number of Cluster Members Can Differ

  23. EJB • Stateless Session Bean • WLM by infrastructure • No HA issues • Stateful Session Bean • WLM on lookup only • Doesn’t support auto-failover (according to J2EE Spec.) • Entity Bean • State is in persistent store (usually DB) • WLM with session and TX affinity • TX log for XA Transactions

  24. JMS • [Topology diagram as on slide 8, with IP Sprayer in place of WES]

  25. Embedded JMS Server • [Diagram labels: Deployment Manager, WebSphere JMS Server, WebSphere App Server ×3 (each with an MDB), Node Agent ×4]

  26. HA Embedded JMS Server • [Diagram labels: WebSphere App Server ×3, Node Agent ×3, MDB ×3, WebSphere JMS Server (Primary), WebSphere JMS Server (Secondary), Heartbeat, HA Cluster SW ×2, Mirrored Disks]

  27. HA WebSphere MQ • [Diagram labels: WebSphere App Server ×3, Node Agent ×3, MDB ×3, WebSphere MQ (Primary), WebSphere MQ (Secondary), Heartbeat, HA Cluster SW ×2, Mirrored Disks]

  28. Systems Management • [Topology diagram as on slide 8, with IP Sprayer in place of WES]

  29. V5 SM Failover – Deployment Manager • Deployment Manager failover not addressed in Network Deployment • Can be "Nannied" (Unix inittab or Windows Service) • Consequence of Deployment Manager failure: • Unable to broadcast configuration changes to Node Agents • Admin Console unavailable • wsadmin unavailable (unless manually directed to specific server) • In short, you cannot make any changes to the central configuration • Deployment Manager handles IIOP WLM routing table 'masters' • If the Deployment Manager is down, requests will still be routed to failed or stopped servers • Until the plug-in marks them as down • Performance degradation is possible if Deployment Manager stays down • IIOP WLM for New ORB Clients will “pin” to One Application Server • In V5.01 New ORB Clients Utilize “stale” Server ClusterList

  30. V5 SM Failover – Node Agent • Node Agents have copies of all configuration information • Can be "Nannied" (Unix inittab or Windows Service) • Consequence of Node Agent failure • Local Configuration May Not Reflect Global Configuration Until Synchronization (restart) • Application Servers Cannot Be Started Until Node Agent is Restarted • Application Servers Can Still Be Stopped Via the Command Line

  31. Simple HA Topology • [Diagram labels: Cell, Deployment Mgr., WAS 5 ND ×2, HTTP Server ×2, Node Agent ×2, WAS App ×6, DB HA Software ×2]

  32. HA “Gold Standard” • [Diagram labels: Cell 1 and Cell 2, each with WES, IP Sprayer, HTTP Server ×2, Deployment Mgr. ×2, WAS 5 ND ×2 (Node Agent and WAS App servers), MQ Clustering, Application Session & LDAP]

  33. HA “Gold Standard” • Two (or More) Cells • Hardware Isolation • Software Isolation • Planned Maintenance • Insurance Against Catastrophic Outage • May Require More Administrative Effort • Don’t Forget “Rule of 3” • With 2 of “Everything”, an Outage (Planned or Unplanned) Reduces Capacity by 50% • Is No Longer Fault Tolerant
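
The “Rule of 3” point on this slide comes down to two small calculations: how much capacity is left after an outage, and whether the remainder could survive a further failure. A minimal sketch, assuming identical servers that share load evenly (class and method names are hypothetical):

```java
// Hypothetical helper illustrating the "Rule of 3".
final class RuleOfThree {
    /** Fraction of capacity left when `failed` of `total` identical servers are out. */
    static double remainingCapacity(int total, int failed) {
        if (total <= 0 || failed < 0 || failed > total) {
            throw new IllegalArgumentException("need 0 <= failed <= total and total > 0");
        }
        return (double) (total - failed) / total;
    }

    /** True if at least two servers remain, so one more failure can be absorbed. */
    static boolean stillFaultTolerant(int total, int failed) {
        return total - failed >= 2;
    }

    public static void main(String[] args) {
        for (int total = 2; total <= 3; total++) {
            System.out.println(total + " servers, 1 down: capacity "
                    + remainingCapacity(total, 1)
                    + ", fault tolerant: " + stillFaultTolerant(total, 1));
        }
    }
}
```

With two of everything, a single planned or unplanned outage halves capacity and leaves no redundancy; a third instance keeps two thirds of capacity and one spare.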

  34. HA Equipment Requirements • Minimum “Price of Admission” for HA • Firewall Servers = 2 • IP/HTTP Sprayers = 2 • HTTP Servers = 2 • Application Servers = 2 • Deployment Manager = 2 • JMS Servers = 2 • Database Servers = 2 • LDAP Servers = 2 • Total = 16 • Some “Layer Compression” is Possible Through Collocation of Components • Security May Be Compromised • Administration More Difficult

  35. WebSphere HA Take-aways • Use the HA topology that suits your needs • Consider the entire system • WAS has built-in capabilities for HA – use them • Speak to IBM – we can help

  36. Additional Resources • Whitepapers • Establishing database failover support with High Availability Cluster Multi-Processing (HACMP) • http://www-4.ibm.com/software/webservers/appserv/hascenario.html • A Highly Available & Scalable LDAP Cluster in an IBM AIX Environment • http://www-1.ibm.com/servers/esdd/articles/ldap/index.html • WebSphere Connection Pooling • http://www-4.ibm.com/software/webservers/appserv/whitepapers/connection_pool.pdf • Server Clusters For High Availability in WebSphere Application Server Network Deployment Edition 5.0 • http://www-1.ibm.com/support/entdocview.wss?rs=180&context=SSEQTP&q=&uid=swg27002473&loc=en_US&cs=utf-8&lang=en • Implementing a Highly Available Infrastructure for WebSphere Application Server Network Deployment, Version 5.0 without Clustering • http://www7b.software.ibm.com/webapp/dd/transform.wss?URL=/wsdd/library/techarticles/0304_alcott/alcott.xml&xslURL=/wsdd/xsl/document.xsl&format=one-column • Redbooks (http://www.redbooks.ibm.com) • IBM WebSphere V5.0 Applications: Ensuring High Performance and Scalability SG24-6198-00

  37. Questions?

  38. Don’t forget to give us feedback • Presentation Code: A10
