
Configuring Service Manager for Performance and Scale

Concurrency Infrastructure Practice Lead, Nate Lasnoski, presented this material at the Microsoft Management Summit (MMS) in April 2013, in Las Vegas, NV. Lasnoski discusses configuration changes, tools, and strategies for improving the performance and scalability of a System Center Service Manager installation. Audio for this presentation is included after the last slide, in the embedded video. For more information: http://www.concurrency.com

Presentation Transcript


  1. For more information: www.concurrency.com

  2. Configuring Service Manager for Performance and Scale. Nathan Lasnoski, Infrastructure Architect, Microsoft MVP, Concurrency. SD-B312

  3. Experiences with Service Manager and Orchestrator. Visibility: understand what you deliver, how it's configured, and how you support it … one configuration database for 20,000+ servers, 1,000+ SQL databases, and work items. Process Improvement: optimize or re-shape processes for enterprise organizations … 1,000,000+ work instances, each of which saved an average of 15 minutes of IT time. Automation (Client or Datacenter): take manual processes and replace them with automation … 200,000+ user and asset automations, each of which saved an hour of time.

  4. What does performance mean? Console Operations: incidents, problems, and changes open and close quickly; read operations provide content quickly to IT users. Portal Performance: no or minimal “please wait” on the portal; submissions complete without failure. Automation Performance: automation execution time; application deployment in 1 vs. 5 minutes makes a difference. Report Performance: report data is delivered quickly and user expectations are set.

  5. What does scale mean? Massive User Counts: customers with 200,000+ users. Huge Datacenters: datacenters with 20,000+ servers. Geography: geographically dispersed users; “follow-the-sun”. Quantity of Requests: millions of incidents, requests, and changes.

  6. Agenda: Architecture for “Large” Scale Deployments (50,000+); SQL Server Configuration Tips; User Experience Optimization; Workflow and Connector Optimization; Data Warehouse Optimization.

  7. Deployment Architecture for Large Scale

  8. Assumptions for Large Scale Deployments. General Assumptions: geographically dispersed IT analysts and end users; constant availability without large maintenance windows; failover scenarios for an alternate site; need for offloading functions to scale as needs arise. Scale Assumptions: large scale of users (50,000+ users); large scale of work items (ex: millions of incidents per year); large scale of configuration items (hundreds of thousands of computers, software, servers, etc.).

  9. Hardware Recommendations for Large Scale. Management Servers (console and workflow servers): 4 – 8 core, 16 GB RAM. SQL Servers: 16 – 24 core, 32+ GB RAM, high performance disk. DW SQL Servers: 16 – 24 core, 32+ GB RAM, high performance disk.

  10. Hardware Recommendations for Large Scale. SQL Servers and DW SQL Servers: 8 – 16 core, 32+ GB RAM, high performance disk. SharePoint and Web Content Servers: 4 – 8 core, 16 – 32 GB RAM, 80 GB HD. Orchestrator Management and Runbook Servers: 4 – 8 core, 16 – 32 GB RAM, high performance disk.

  11. Example Architecture for 50,000+ Users. Management Servers: dedicated workflow mgmt. server; dedicated console servers with load balancer; dedicated mgmt. servers for Orchestrator. SQL Clustering for all components. AlwaysOn for operations databases.

  12. SQL Server Configuration

  13. SQL Server Configuration. TempDB: high performance LUN, split IO to log volume; multiple TempDB data files (1 per 2 cores), equally sized, but only one log file; TempDB performance is critical to Service Manager performance. ServiceManager Database: high performance LUN, split IO to log volume; recovery model set to SIMPLE by default; test performance of your disk: "sqlio -kW -t2 -s30 -o1 -frandom -b64 -BH -LS c:\Testfile.dat". RAM Allocation: configure SQL Server for 2 GB less than the total RAM on the SQL server.
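
The two sizing rules on this slide (one TempDB data file per two cores, and SQL Server memory capped at 2 GB below total RAM) are simple arithmetic. The sketch below is only an illustration: the function names are hypothetical, and rounding an odd core count down is an assumption the slide does not state.

```python
def tempdb_data_file_count(cores: int) -> int:
    """One TempDB data file per two cores (rounded down), never fewer than one."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return max(1, cores // 2)


def max_server_memory_mb(total_ram_gb: int, os_reserve_gb: int = 2) -> int:
    """SQL Server memory cap in MB, leaving os_reserve_gb for the operating system."""
    if total_ram_gb <= os_reserve_gb:
        raise ValueError("total RAM must exceed the OS reserve")
    return (total_ram_gb - os_reserve_gb) * 1024


if __name__ == "__main__":
    # Example: a 16-core SQL server with 32 GB RAM.
    print(tempdb_data_file_count(16))
    print(max_server_memory_mb(32))
```

For a 16-core, 32 GB server this gives 8 data files and a 30,720 MB memory cap; treat both as starting points to test rather than fixed values.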

  14. SQL Server Configuration. SQL Broker: drastic performance difference with workflows; validate that SQL Broker is set to 1, not 0. Checking SQL Broker: SELECT is_broker_enabled FROM sys.databases WHERE name = 'ServiceManager'; Setting SQL Broker (change window only): ALTER DATABASE ServiceManager SET SINGLE_USER WITH ROLLBACK IMMEDIATE; ALTER DATABASE ServiceManager SET ENABLE_BROKER; ALTER DATABASE ServiceManager SET MULTI_USER;

  15. Max Degree of Parallelism: defines parallel processing rules in SQL; default is “0”, which allows a single query to use all processor cores; better results in our environments with “1” to “4”. Learn more and set as required. Test, as your experience may vary. http://msdn.microsoft.com/en-us/library/ms181007(v=SQL.105).aspx SELECT name, value FROM sys.configurations WHERE name = 'max degree of parallelism'; Validate Read Committed Snapshot Isolation (RCSI). Validating isolation levels: SELECT name, is_read_committed_snapshot_on FROM sys.databases WHERE name = 'ServiceManager'; ALTER DATABASE ServiceManager SET READ_COMMITTED_SNAPSHOT ON;

  16. SQL Performance for ServiceManager DB. RAM Usage: Paging File % Usage should be less than 1%; Memory – Available MBytes should not be below 100 MB (indicates a starved OS). Disk Performance: Avg. Disk sec/Read and Avg. Disk sec/Write should typically be less than 20 msec (some spikes ok); Avg. Disk Queue Length can indicate disk IO issues, though it is less valuable than Avg. Disk sec/Read. CPU Performance: validate SQL service and CPU usage. Change Grooming Settings: minimize retention settings, ideally less than a week or two; create custom grooming rules.
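
A minimal sketch of how the counter thresholds on this slide could be checked against collected samples. The dictionary keys and the function name are hypothetical labels, not real performance counter paths, and how the samples are gathered is out of scope. Disk latency is expressed in seconds, so 20 msec is 0.020.

```python
from typing import Dict, List


def check_sql_counters(samples: Dict[str, float]) -> List[str]:
    """Return a warning for each sample that breaches a threshold from the slide."""
    warnings: List[str] = []

    # Paging file usage should stay below 1%.
    if samples.get("paging_file_pct_usage", 0.0) >= 1.0:
        warnings.append("Paging file usage is 1% or higher")

    # Available memory should not drop below 100 MB.
    if samples.get("available_mbytes", float("inf")) < 100:
        warnings.append("Available memory is below 100 MB (starved OS)")

    # Disk read/write latency should typically stay below 20 msec.
    for key in ("avg_disk_sec_read", "avg_disk_sec_write"):
        if samples.get(key, 0.0) >= 0.020:
            warnings.append(f"{key} is 20 msec or higher")

    return warnings


if __name__ == "__main__":
    print(check_sql_counters({"available_mbytes": 64, "avg_disk_sec_read": 0.035}))
```

Because the slide allows occasional latency spikes, a check like this is better applied to averages over a sampling window than to single readings.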

  17. SQL Logs and Recovery Model. Check your log size and expansion: does your log write, clear, and have expansion space? Check your recovery model: use SIMPLE, or configure a transaction log backup; suggestion is to use SIMPLE.

  18. High Availability SQL 2012 AlwaysOn. High availability with AlwaysOn: http://blogs.technet.com/b/babulalghule/archive/2013/02/17/how-to-install-service-manager-2012-sp1-with-a-sql-2012-alwayson-availability-groups.aspx

  19. DEMO: Checking your SQL Settings

  20. User Experience Optimization

  21. Pick the Right User Experience. Self-Service Portal: end user self-service interactions or IT interactions; Service Requests are excellent for “low / no training” interactions. Console: high bandwidth, low latency scenarios; RemoteApp necessary for high bandwidth scenarios; latency of 100 msec or less; 150 – 200 msec has 40% degradation. Web Consoles (GridPro and Cireson): excellent for low bandwidth and/or latency scenarios; some match with console experience (GridPro).

  22. Console Optimization: slower performance when maximized; apply SP1 for the console memory leak. Minimize Quantity of Views: delete views not needed; more views slow console load time. Use Search vs. Views: use searching vs. views to find data quickly. Configure the Global Operators Group: limits the users selectable when assigning work items (drastic performance improvement).

  23. View Optimization. Basic Views: bring back only information from one class; the fastest view to create. Type Projections / Combination Classes: combine classes to bring back information that includes relationships; use the smallest projection possible; download type projections for Incident, Problem, Change, and Service Request. Advanced Classes: do not use the (Advanced) class in views, use a type projection; the (Advanced) class can be used in searches that return small quantities. http://blogs.technet.com/b/servicemanager/archive/2010/12/02/faq-why-is-my-custom-incident-view-so-slow.aspx http://blogs.technet.com/b/servicemanager/archive/2011/09/19/new-change-request-type-projections-management-pack.aspx

  24. Scoped User Roles vs. Non-Scoped. Scoped: scoped user roles facilitate filtering based on groups and queues; minimize use of scoped user roles due to the additional table join in the database.

  25. Portal Performance. Portal and Icons: more icons mean longer load time; consider using a “start page” to point to Service Offerings. Scope Request Offerings: use user roles to scope access to service offerings and request offerings; the more service offerings and request offerings, the longer the load. Disable App Pool Recycling for SharePoint and SSP: default is nightly recycling, which causes slow initial performance; make sure not to set recycling to high memory usage. http://technet.microsoft.com/en-us/library/cc753179(v=WS.10).aspx

  26. Portal Performance. Use Searching in Request Offerings: don’t increase the size of the query results; use pre-search filters to allow searching of hundreds of thousands of CIs in seconds. http://blog.concurrency.com/infrastructure/service-manager-request-query-result-filtering/ Use Known Information: use previous questions and default values to filter queries; use known information (such as user name and relationship to computer); avoid MP ENUM questions as they cannot be used in post-selection filters.

  27. DEMO: Pre-Search Filters

  28. Server and Workflow Optimization

  29. Dedicated Servers. Use a Dedicated Workflow Server: always the first server in the management group by default. Use a Dedicated Orchestrator Target Server: any server in the management group; used for any Orchestrator interaction; use a dedicated account for Orchestrator automations.

  30. Workflow Tweaks. Implied Permissions Workflow: consider disabling implied permissions; utilize direct permissions with un-scoped operator role. New Priority Calculation: disable or use priority calculation rule; disable Incident_Adjust_PriorityAndResolutionTime_Custom_Rule.Add if using SLOs. First Assigned Relationship: disable “WorkItem_SetFirstAssignedTo_RelationshipAdd_Rule” if not used. SLO Application: limit to what is needed, as SLO application is performance heavy; Orchestrator can apply SLOs based on complex rule matrixes.

  31. Group Calculation Interval. Reduce Group Calculation Frequency: default is every 30 seconds; 600000 milliseconds = 10 minutes. HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\System Center\2010\Common\GroupCalcPollingIntervalMilliseconds
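
The registry value is expressed in milliseconds, so a ten-minute interval is 10 × 60 × 1,000 = 600,000. A small sketch of the conversion follows; the helper name is hypothetical, and writing the value to the registry is deliberately left out.

```python
# Registry value name as given on the slide.
VALUE_NAME = "GroupCalcPollingIntervalMilliseconds"


def polling_interval_ms(minutes: float) -> int:
    """Convert a polling interval in minutes to the millisecond value the registry expects."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return int(round(minutes * 60 * 1000))


if __name__ == "__main__":
    print(VALUE_NAME, polling_interval_ms(10))   # ten-minute interval
    print(VALUE_NAME, polling_interval_ms(0.5))  # the 30-second default
```

Computing the value instead of typing it by hand avoids an extra or missing zero, which would shift the interval by a factor of ten.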

  32. Troubleshooting Performance Delays. How to find a problematic workflow: use the queries in this blog post: http://blogs.technet.com/b/servicemanager/archive/2013/01/14/troubleshooting-workflow-performance-and-delays.aspx Setting the Watermark: artificially moving the watermark forward… only as a last resort.

  33. DEMO: SM Checking Workflows

  34. Orchestrator. When to Use Orchestrator: complex workflows, ex. incident categorization, routing, notifications; use vs. many individual Service Manager workflows; high capacity automation performance; performance distribution of workflow processes. Incident Categorization: Monitor Object or “Get” initiation; complex routing and translation.

  35. Orchestrator. Using Orchestrator with Notifications: complex logic and offloading notifications. http://blog.concurrency.com/infrastructure/scsm-notifications-with-orchestrator-roll-up/

  36. Orchestrator. Runbook Initiation Option 1: Service Manager workflow initiates runbook activity; easier and out of box. Runbook Initiation Option 2: Orchestrator initiator runbook to find active runbooks; drastically faster than Service Manager OOB initiation (about 4 times faster!!).

  37. DEMO: Using Orchestrator for Workflow

  38. Connector Configuration. Constrain Sync: only sync those groups, users, and configuration items you need. Null is Bad: do not sync “null” values on connectors; connectors fight with each other.

  39. Connector Configuration. DCM Synchronization: disable DCM workflow. http://blogs.technet.com/b/mihai/archive/2012/11/30/configuration-manager-connector-s-dcm-rule-can-cause-massive-performance-issues-in-service-manager.aspx Using the DNS Trick for Active Directory: DNS trick (fix in UR2). Sync Custom Variables: sync variables from Active Directory not covered by connector, or faster. http://blog.concurrency.com/featured-post/how-to-sync-other-properties-from-active-directory-to-service-manager-using-orchestrator/ (password last set variable)

  40. Workflow / Connector Scheduling. Option 1, configure in XML or PowerShell: can cause workflows to run more efficiently. Option 2, configure Orchestrator to turn jobs on and off: configure an Orchestrator job with a schedule to turn on and off / start / end.

  41. Batch Size. Tweak the batch size: can cause workflows to run more efficiently. <Rule ID="CIListRule" Enabled="true" Target="OMConnectorLibrary!Microsoft.EnterpriseManagement.LinkingFramework.OpsMgrConnector.SyncWorkflowTarget" ConfirmDelivery="false" Remotable="true" Priority="Normal" DiscardLevel="100"> <Category>Maintenance</Category> <DataSources> <DataSource ID="DS1" TypeID="Subscriptions!Microsoft.SystemCenter.CmdbInstanceSubscription.DataSourceModule"> <Subscription> <InstanceSubscription Type="$MPElement[Name='OMConnectorLibrary!Microsoft.EnterpriseManagement.LinkingFramework.OpsMgrConnector.OpsMgrCIs']$"> <UpdateInstance /> </InstanceSubscription> <!--<StartWatermark>1</StartWatermark>--> <PollingIntervalInSeconds>10</PollingIntervalInSeconds> <BatchSize>1000</BatchSize> </Subscription> </DataSource> </DataSources>
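
As a rough illustration of the edit this slide describes, the sketch below changes the `<BatchSize>` element in a simplified `<Subscription>` fragment using Python's standard XML library. The fragment is a cut-down stand-in for the rule on the slide, the function name is hypothetical, and the right batch size for a given environment has to be found by testing.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for the <Subscription> element shown on the slide.
SUBSCRIPTION_XML = """
<Subscription>
  <PollingIntervalInSeconds>10</PollingIntervalInSeconds>
  <BatchSize>1000</BatchSize>
</Subscription>
"""


def set_batch_size(xml_text: str, new_size: int) -> str:
    """Return the XML with the first <BatchSize> element set to new_size."""
    if new_size < 1:
        raise ValueError("batch size must be at least 1")
    root = ET.fromstring(xml_text.strip())
    node = next(root.iter("BatchSize"), None)
    if node is None:
        raise ValueError("no <BatchSize> element found")
    node.text = str(new_size)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(set_batch_size(SUBSCRIPTION_XML, 500))
```

Only the one element changes; the polling interval and the rest of the fragment pass through untouched.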

  42. Related Orchestrator Tips

  43. Orchestrator Topologies and Service Manager. Orchestrator: use a dedicated SQL server; use dedicated runbook servers; use dedicated runbook and management servers for specialized functions. Service Manager: target a specific Service Manager server, not used for consoles or workflows. Hardware: runbook servers, 4 core, 16 – 32 GB RAM; management servers, 4 core, 16 – 32 GB RAM; SQL servers, 4 core, 16 – 32 GB RAM.

  44. Orchestrator Topologies and Service Manager. Runbook Design: use smaller, more easily segmented runbooks; DO NOT use “activity specific” logging in production (it is substantially slower, by a 3 to 1 ratio). Runbook Execution: use dedicated runbook servers; use dedicated runbook servers for specialized functions. Clear the Logs: automatically clear logs, unless the cycle is too rapid, causing it to lock up the system; manually clear the logs or use an Orchestrator job: http://blog.concurrency.com/infrastructure/virtualization/manually-clearing-orchestrator-logs/

  45. Orchestrator Topologies and Service Manager. Beware of the “Licensing Issue”: make sure to upgrade to SP1; beware of the “license” issue in earlier Orchestrator builds. Hardware and Logging: make sure your Orchestrator servers are on great hardware, especially disk hardware; do not configure your logging improperly (use either simple or full with transaction log backups).

  46. DEMO: Manually Clearing the Logs

  47. Data Warehouse

  48. DW and SQL Performance. Scalability and SQL Sizing Importance: use a dedicated SQL server; allocate enough RAM and don’t starve the OS; understand Analysis Services impact and potentially offload; beware cube processing impact. Running Jobs Manually: how to execute a job to correct issues; order of events when updating the DW. Snapshots in CMDB: utilize snapshot driven CMDB reports for accessing non-DW data without performance impact; provide reports in a self-service driven model, such as through Reporting Services + SSP.

  49. Configure Transform Batch Size. Configure the Batch Size of the Transform: adjust to a larger batch. The transform module default batch size is 50,000 items. The batch size can be adjusted by inserting rows into the DWRepository.ETL.Configuration table as follows: insert into DWRepository.ETL.Configuration (ConfigurationFilter, ConfigurationPath, ConfiguredValueType, ConfiguredValue) values ('etl.Transform', 'BatchSize', 'Int32', '100000');
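
To see what a larger transform batch changes, the sketch below counts how many batches a given number of items would need at the default size of 50,000 and at the 100,000 used in the insert statement. It assumes the transform simply splits its work into equal-sized batches, which the slide does not spell out; the function name is hypothetical.

```python
def batches_needed(item_count: int, batch_size: int) -> int:
    """Number of batches required to process item_count items (ceiling division)."""
    if item_count < 0:
        raise ValueError("item_count cannot be negative")
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return -(-item_count // batch_size)


if __name__ == "__main__":
    items = 250_000  # example workload, chosen only for illustration
    print(batches_needed(items, 50_000))
    print(batches_needed(items, 100_000))
```

Fewer batches does not guarantee a faster transform, since each batch also needs more memory and log space, so a change like this is worth measuring before and after.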

  50. Questions!
