11gR2 RAC/Grid Clusterware: Best Practices, Pitfalls, and Lessons Learned
Presented during the DOUG meeting held on 10/21/2010 at Dallas, TX, by Ramkumar Rajagopal.

Presentation Transcript


  1. 11gR2 RAC/Grid Clusterware: Best Practices, Pitfalls, and Lessons Learned. Presented during the DOUG meeting held on 10/21/2010 at Dallas, TX. Ramkumar Rajagopal

  2. Introduction • DBARAC is a specialty database consulting firm based in Austin, Texas, with expertise across a variety of industries. • Our people are experts in Oracle Real Application Clusters (RAC) focused solutions for managing large database systems. • We provide proactive database management services including, but not limited to, in-house and on-shore DBA support, remote DB support, database maintenance, and backup and recovery. • Our DBA experts provide specialized services in the areas of: • Root cause analysis • Capacity planning • Performance tuning • Database migration and consolidation • Broad industry expertise • High-availability RAC database specialists • End-to-end database support

  3. Introduction: Presenter • Senior Database Consultant, DBARAC • Oracle Database/Applications DBA since 1995 • Dell, JP Morgan Chase, Verizon • Presenter at Oracle OpenWorld 2007 • Author of Dell Power Solutions articles

  4. AGENDA • Introduction • Node eviction issue in 10g • What is “11GR2 Grid Clusterware”? • The Challenges • What’s different today? • “We’ve seen this before, smart guy…” • Architecture and Capacity Planning • Upgrade Paths • Pre-installation best practices • Grid Clusterware Installation • Clusterware Startup sequence • Post Install steps • RAC Database build steps • Summary • Q&A

  5. Why is a node evicted? • Split-brain condition • I/O fencing • CRS keeps the lowest-numbered node up • Node eviction detection
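A quick way to see how the cluster currently views membership, and the heartbeat timeout that governs eviction, is shown below; a minimal sketch using standard clusterware commands (node names and paths will differ per environment):

  $GRID_HOME/bin/olsnodes -n -s             # cluster nodes, node numbers, and up/down status
  $GRID_HOME/bin/crsctl get css misscount   # seconds of missed network heartbeats tolerated before eviction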

  6. Root Causes of Node Eviction • Network heartbeat lost • Voting disk problems • cssd is not healthy • oprocd • hangcheck-timer • cssd and oclsomon race to suicide
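When chasing these root causes, a few standard health checks help narrow things down; a sketch (the hostname in the log path is a placeholder, and the exact log message text varies by version):

  $GRID_HOME/bin/crsctl check css            # is cssd responding?
  $GRID_HOME/bin/crsctl query css votedisk   # voting disk locations and status
  grep -i heartbeat $GRID_HOME/log/myhost1/cssd/ocssd.log   # look for missed-heartbeat warnings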

  7. 11GR2 Grid Clusterware Improvements • Node eviction algorithm is enhanced • Prevents a split-brain problem without rebooting the node • Oracle High Availability Services Daemon (OHASD) • Will still reboot in some cases • Faster relocation of services on node failure in 11GR2
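After an incident, you can confirm the stack recovered on every node without a reboot; a minimal sketch with standard 11gR2 commands:

  $GRID_HOME/bin/crsctl check has            # Oracle High Availability Services on the local node
  $GRID_HOME/bin/crsctl check cluster -all   # CRS, CSS, and EVM status across all nodes
  $GRID_HOME/bin/crsctl stat res -t          # where resources and services are currently running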

  8. 9i/10g RAC Scenario • Several separate versions of databases • Several servers • Space/resource issues • Fewer resources • Provisioning takes time

  9. Top concerns • How many are using 11GR2 Grid clusterware? • Do you have more than one mission-critical database within a single RAC cluster? • Can you allocate resources dynamically to handle peak volumes of various application loads without downtime? • Issues with using shared infrastructure • Will my database availability and recovery suffer? • Will my database performance suffer? • How to manage a large clustered environment to meet SLAs for several applications?

  10. Why 11GR2 Grid CRS? • 11GR2 Grid Clusterware is… • An Architecture • An IT Strategy • Clusterware & ASM storage deployed together • Many, many Oracle Database Instances • Drives Consolidation

  11. Challenges • Skilled resources • Meeting SLAs • End-to-end testing not possible • Security controls • Capacity issues • Higher short-term costs

  12. What’s different today? • 11gR2 Grid CRS & ASM supports • 11GR2, 11GR1, 10gR1 and 10gR2 single instances • Powerful servers, 64-bit OS • Provisioning framework to deploy • Grid Control

  13. 11GR2 RAC DB Architecture Planning

  14. Capacity Planning • What are the current requirements? • What are the future growth requirements in the next 6-12 months? • Estimate the hardware requirements to meet the demand • Data retention requirements • Archiving and purging

  15. Capacity Planning metrics • Database metrics for capacity planning (see the AWR sketch below) • CPU & memory utilization • I/O rates • Device utilization • Queue length • Storage utilization • Response time • Transaction rate • Network packet loss • Network bandwidth utilization
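Most of the database-side metrics above can be trended straight out of AWR; a sketch assuming the Diagnostics Pack is licensed (metric names vary slightly by version):

  SELECT metric_name, metric_unit,
         ROUND(AVG(average), 2) AS avg_value,
         ROUND(MAX(maxval), 2)  AS peak_value
  FROM   dba_hist_sysmetric_summary
  WHERE  metric_name IN ('Host CPU Utilization (%)',
                         'Physical Read Total IO Requests Per Sec')
  GROUP  BY metric_name, metric_unit;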

  16. Capacity Planning Strategy • Examine existing engagement processes • Examine existing capacity of servers/storage • Define hardware/database scalability • Provisioning for adding capacity • Integration testing • Large clustered database • SLA requirements

  17. Comparison – 10g vs 11GR2 • Server consolidation • Database consolidation • Instance consolidation • Storage consolidation

  18. AGENDA so far… • Introduction • Node eviction issue in 10g • What is “11GR2 Grid clusterware”? • The Challenges • What’s different today? • “We’ve seen this before, smart guy…” • Architecture and Capacity Planning • Upgrade Paths • Pre-installation best practices • Grid Clusterware Installation • Clusterware Startup sequence • Post Install steps • RAC Database build steps • Summary • Q&A

  19. Upgrade Paths • Out-of-place clusterware upgrade • Rolling upgrade • Oracle 10gR2 - from 10.2.0.3 • Oracle 11gR1 - from 11.1.0.6
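As a sketch of the out-of-place rolling flow (the new Grid home path is illustrative): install the 11gR2 Grid software into its own home while the old clusterware keeps running, then run the upgrade root script one node at a time so the cluster stays available:

  # as root, on each node in turn
  /u01/app/11.2.0/grid/rootupgrade.sh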

  20. Pre-installation best practices • Network Requirements • Cluster Hardware Requirements • ASM Storage Requirements • Verification Checks

  21. Pre-Installation best practices: Network Configuration • SCAN - Single Client Access Name • Failover - faster relocation of services • Better load balancing • MTU size of the network adapter (NIC) • Forwarder, zone entries, and reverse lookup in DNS • Ping tests • Two dedicated interconnect switches for redundant interconnects • Run cluvfy
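Before launching the installer, it is worth confirming that the SCAN resolves to its three addresses in round-robin order and that node connectivity is clean; a sketch where rac-scan.example.com and the node names are placeholders:

  nslookup rac-scan.example.com    # should return up to three IPs, rotating between calls
  ./runcluvfy.sh comp nodecon -n node1,node2,node3 -verbose   # run from the installation media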

  22. Pre-install - Network - SCAN Configuration

  23. Pre Install Network - SCAN VIP Troubleshooting • SCAN Configuration: • $GRID_HOME/bin/srvctl config scan • SCAN Listener Configuration: • $GRID_HOME/bin/srvctl config scan_listener • SCAN Listener Resource Status: • $GRID_HOME/bin/crsctl stat res -w "TYPE = ora.scan_listener.type" • $GRID_HOME/network/admin/listener.ora • Local and remote listener parameters
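Once the SCAN checks out, each database registers with it via the remote_listener parameter; a minimal sketch (the SCAN name and port are placeholders):

  $GRID_HOME/bin/srvctl status scan_listener
  SQL> ALTER SYSTEM SET remote_listener = 'rac-scan.example.com:1521' SCOPE=BOTH SID='*';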

  24. Pre Install - Cluster Hardware requirements • OS/kernel version the same on all servers in the cluster • Minimum 32 GB of RAM • Minimum swap space 16 GB • Minimum Grid Home free space 16 GB • For each Oracle Home directory allocate 32 GB of space (32 GB per database) • Allocate adequate disk space for centralized backups • Allocate adequate storage for ASM diskgroups – DATA and FRA
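These minimums are easy to verify from the shell before starting; a quick sketch (the mount point is a placeholder):

  grep MemTotal /proc/meminfo    # physical RAM
  grep SwapTotal /proc/meminfo   # configured swap
  df -h /u01                     # free space for the Grid and Oracle homes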

  25. Cluster Hardware requirements continued… • Most cases: use UDP over 1 Gigabit Ethernet • For large databases - InfiniBand/IP or 10 Gigabit Ethernet • Use OS bonding/teaming to “virtualize” the interconnect • Set UDP send/receive buffers high enough (see the sysctl sketch below) • Crossover cables are not supported
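The UDP buffer sizing is typically done in /etc/sysctl.conf; a sketch using the values Oracle documents for 11gR2 on Linux (confirm against the install guide for your platform):

  net.core.rmem_default = 262144
  net.core.rmem_max = 4194304
  net.core.wmem_default = 262144
  net.core.wmem_max = 1048576

Apply without a reboot by running sysctl -p as root.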

  26. Pre Install - ASM Storage configuration • In 11gR2, ASM diskgroups are used for: • Grid infrastructure - OCR, voting disk, and ASM spfile • Database - DATA and FRA • OCR and voting disks for Grid clusterware • OCR can now be stored in Automatic Storage Management (ASM). • Add a second diskgroup for OCR using: • ./ocrconfig -add +DATA02 • Change the compatibility attributes of the new diskgroup to 11.2 as follows: • ALTER DISKGROUP DATA02 SET ATTRIBUTE 'compatible.asm' = '11.2'; • ALTER DISKGROUP DATA02 SET ATTRIBUTE 'compatible.rdbms' = '11.2';
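After relocating OCR into ASM, confirm integrity and voting disk placement; a minimal sketch:

  $GRID_HOME/bin/ocrcheck                    # OCR integrity and registered locations (run as root for the full check)
  $GRID_HOME/bin/crsctl query css votedisk   # voting disks now resolved from the ASM diskgroup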

  27. AGENDA so far… • Introduction • Node eviction issue in 10g • What is “11GR2 Grid clusterware”? • The Challenges • What’s different today? • “We’ve seen this before, smart guy…” • Architecture and Capacity Planning • Upgrade Paths • Pre-installation best practices • Grid Clusterware Installation • Clusterware Startup sequence • Post Install steps • RAC Database build steps • Q&A

  28. Hardware/Software details • 10gR2 architecture – 9 database servers, 25 TB storage • Original Database Version: 10.2.0.5 • Original RAC cluster version: 10.2.0.1 • Original Operating System: Red Hat Linux 5 AS, 64-bit • Storage Type: ASM & RAW storage • 11gR2 Grid architecture – 4 database servers, 40 TB storage • New Database Version: 11.2.0.2 • New Grid Clusterware/ASM version: 11.2.0.2 • New Operating System: Red Hat Linux 5 AS, 64-bit • Data migration steps using RMAN backup and restore and Data Pump export dump files

  29. 11GR2 Migration Steps • Install 11gR2 Grid clusterware and ASM • Install 11gR2 database binaries for each database separately • Create the 11gR2 database • Add additional ASM diskgroups • Install 11GR1/10gR2 database binaries • Create 11GR1/10gR2 databases • Take backup • Restore the data
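For the Data Pump leg of the data migration, a minimal sketch (the directory object, schema, and file names are placeholders):

  expdp system DIRECTORY=dpump_dir DUMPFILE=appdata.dmp LOGFILE=appdata_exp.log SCHEMAS=appuser
  impdp system DIRECTORY=dpump_dir DUMPFILE=appdata.dmp LOGFILE=appdata_imp.log SCHEMAS=appuser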

  30. Pre-verification checks - cluvfy • Before Clusterware installation • ./cluvfy stage -pre crsinst -n node1,node2,node3 -verbose • Before Database installation • ./cluvfy stage -pre dbinst -n node1,node2,node3 -fixup -verbose
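cluvfy can also validate the result once the stack is up; a sketch of the post-install check:

  ./cluvfy stage -post crsinst -n node1,node2,node3 -verbose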

  31. 11gR2 Grid Clusterware Installation – Step 1

  32. Step 2

  33. Step 3

  34. Step 4

  35. Step 5

  36. Step 6

  37. Step 7

  38. Step 8

  39. Step 8 cont.

  40. Step 9

  41. Step 9 cont.

  42. Step 10

  43. Step 11

  44. Step 11 cont.

  45. Step 12

  46. Step 12

  47. Step 13

  48. Step 14

  49. Runfixup.sh • root> /tmp/runfixup.sh • Response file being used is: /tmp/CVU_11.2.0.1.0_grid/fixup.response • Enable file being used is: /tmp/CVU_11.2.0.1.0_grid/fixup.enable • Log file location: /tmp/CVU_11.2.0.1.0_grid/orarun.log • Setting Kernel Parameters... • fs.file-max = 327679 • fs.file-max = 6815744 • net.ipv4.ip_local_port_range = 9000 65500 • net.core.wmem_max = 262144 • net.core.wmem_max = 1048576 • uid=501(grid) gid=502(oinstall) groups=502(oinstall),503(asmadmin),504(asmdba)

  50. Step 15
