Planning for Scale and Capacity

Planning for Scale and Capacity. Kimmo Forss Lead IW Architect Microsoft Corporation. Simon Skaria Senior Program Manager Microsoft Corporation. Satish Matthew Program Manager Microsoft Corporation. James Petrosky Sr. Consultant Microsoft Corporation. Doron Bar-Caspi

Presentation Transcript


  1. Planning for Scale and Capacity • Kimmo Forss, Lead IW Architect, Microsoft Corporation • Simon Skaria, Senior Program Manager, Microsoft Corporation • Satish Matthew, Program Manager, Microsoft Corporation • James Petrosky, Sr. Consultant, Microsoft Corporation • Doron Bar-Caspi, Senior Program Manager, Microsoft Corporation

  2. Capacity Planning: The process of evaluating a technology against the needs of an organization, and making an educated decision about the procurement, design, and configuration of hardware and software to meet the demands specific to the system being installed.

  3. Common Questions • How much data can we store? • How many users can our environment support? • How much hardware do we need? • How many sites can we run on our servers? • How do we validate our design? • What is the SharePoint Capacity Planning Tool and how can I use it? • What tools can we use to measure performance? • How do we plan and monitor our storage needs & performance?

  4. SharePoint Planning Lifecycle • Usage Scenarios • Solution Design • Physical Configuration • Adaptive Refinements

  5. Session Objectives And Takeaways • Session Objectives • Discuss the SharePoint Planning process • Discuss the Components and Factors to Consider when Planning for Performance and Capacity in MOSS • Provide Recommendations and Best Practices • Describe the considerations when planning a SQL Server backend for a large SharePoint deployment. • List Best practices for disks and other hardware tuning to optimize SharePoint performance. • Demonstrate the SharePoint Capacity Planning Tool • Leave with a Better Understanding of the Planning Process, SharePoint Platform, & Recommendations • Describe the Process for Determining the Hardware and Topology Requirements

  6. Agenda: Planning for Scale and Capacity • Usage Scenarios • Solution Design • Physical Design & Configuration • Plan for Software Boundaries • Estimate Performance and Capacity Requirements (Throughput Targets) • Plan Hardware and Storage Requirements • Monitoring & Adaptive Refinements • SharePoint Capacity Planning Tool • Performance and capacity planning: The process of mapping your solution design to a farm size and set of hardware that will support your business goals.

  7. Usage Scenarios

  8. Usage Scenarios • Understanding the business needs, goals, and desired value added by the solution • Who are the end-users? • How will they use the solution? • What are the different use-case scenarios? • What does the data look like? • How it’s used (e.g. “High Impact”, “Archived”, etc.) • Business types (e.g. “HR”, “Sales”, etc.) • File types (e.g. “.docx”, “.pptx”, etc.) • Average size for each file type • Where are the end-users located?

  9. SharePoint 2007 Usage Scenarios • Business Intelligence • Collaboration • Forms and Business Processes • Enterprise Content Management • Web Content Management • Search

  10. Global Intranet Scenario • Centralized or Distributed • Global teams or local teams • Where to put the sites • End user experience • Bandwidth/Latency • Operational costs • Search • Offline • Network accelerators • Security • ACLs

  11. Web Publishing Scenario • Characteristics • Fewer content creators (may be external agencies) • Large number of viewers (authenticated or not) • Small number of central sites for content publishing • Approval workflow • Staging and deployment • Database Characteristics • Publishing databases mainly read • Design considerations • Design for end-to-end performance • Page size matters! (HTML, .js, .css, …) • More memory-intensive than CPU-intensive (caching) • Navigation and searchability

  12. Search Scenario • Characteristics • Multiple sources • Security trimming • Indexing Characteristics • Network/CPU intensive • Querying Characteristics • CPU/Memory/IO Intensive • Database Characteristics • Property store in SQL • Design considerations • 50 M items / catalog (farm)

  13. Solution Design

  14. Solution Design: An Example • Central authoring farm • Content is staged, reviewed and approved • Approved content pushed to the globally distributed data centers • My Sites available to users locally in their data center • Business intelligence reports for finance available via KPI dashboard • Global search available to everyone

  15. Physical Design & Configuration

  16. Physical Design Planning Continuum (axes: Planning and expertise; Functionality and complexity) • MOSS On-premise: Complete customizability; facilitates rich, complex KM scenarios; SMEs plan, design, and implement • Core WSS: For small orgs; simple on-premise • MOSS Online: Complete MOSS offering; limited customization • Office Live: Hosted for grassroots orgs; no planning necessary

  17. Physical Design Planning • Components • Software Boundaries • Throughput Targets • Data Capacity • Hardware • Planning Activities • Plan for Software Boundaries • Estimate Performance and Capacity Requirements • Plan Hardware and Storage Requirements • Test and Validate Your Design Performance and capacity planning: The process of mapping your solution design to a farm size and set of hardware that will support your business goals.

  18. Software Boundaries

  19. Plan for Software Boundaries: Physical Design • Object Categories • Software Scalability vs. Hardware Scalability • Test Results, Findings, and Recommendations from the Product Group • Test Environment • Test Results • Recommendations • Other Considerations

  20. Plan for Software Boundaries: Object Categories • Site Objects • Site Collections, Web sites, documents, document libraries, list items, document file size, etc. • People Objects • User profiles, security principals, etc. • Search Objects • Search indexes, indexed documents • Logical Architecture Objects • Shared Services Providers, Site Collections, Content Databases, Zones, etc. • Physical Objects • Servers: Index, WFE, Database, Application, etc.

  21. SharePoint Containment Hierarchy

  22. Plan for Software Boundaries: Software Scalability vs. Hardware Scalability • Software scalability • Recommendations for acceptable performance based on software behavior and characteristics • Hardware scalability • Does not change software behavior or characteristics, but can increase the overall throughput of a server farm and might be necessary to achieve acceptable performance as the number of objects approaches the recommended limits

  23. Plan for Software Boundaries: Product Group's Test Environment • Hardware Specifications: • Network: Gigabit Ethernet (one billion bits/sec) • Farm Configurations Tested:

  24. Plan for Software Boundaries: Test Results and Findings • Throughput vs. Number of Site Collections in One Content Database

  25. Demo: Moving Site Collections Using stsadm -o mergecontentdbs • Kimmo Forss, Lead IW Architect, Microsoft Corporation

  26. Plan for Software Boundaries: Test Results and Findings • Throughput differences between a flat document library and a document library with folders • See “Scaling to Extremely Large Lists and Performant Access Methods” at http://blogs.msdn.com/sharepoint/archive/2007/07/25/scaling-large-lists.aspx

  27. Plan for Software Boundaries: Recommendations & Guidelines (subset) • For all recommendations, visit “Plan for software boundaries (Office SharePoint Server)” at http://technet2.microsoft.com/Office/en-us/library/6a13cd9f-4b44-40d6-85aa-c70a8e5c34fe1033.mspx

  28. Information Architecture: Limit content DB • Soft limit* for the size of a content DB: 100 GB. In most cases, exceeding 100 GB is discouraged. • If you can justify going over 100 GB, make sure you: • Test your I/O subsystem for adequate performance. • Use a single site collection in this DB. • Test your backup solution at this size. For minimum downtime, we recommend adequate tools, such as a differential backup solution. * Your experience may vary: hardware and usage profile dependent.

  29. Information Architecture: Manage Large Lists • SharePoint supports large lists, but you must carefully plan how users view the lists to prevent performance impacts. • For best performance, do not go over 2,000 items in a list level (for example, the root of the list or a single folder). • If you must create and browse large lists, define and use customized filtered views that are configured to return fewer than 5,000 items.
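
The 2,000-items-per-level guideline above can be turned into a rough folder count. The sketch below is only an illustration: the function name `folders_needed` and the single-level foldering scheme are assumptions, not something the presentation prescribes.

```python
import math

# Per-level guideline quoted on the slide (root of the list or a single folder).
ITEMS_PER_LEVEL = 2000


def folders_needed(total_items, per_level=ITEMS_PER_LEVEL):
    """Return how many folders keep every level at or below `per_level` items.

    Assumes one level of folders under the list root (a hypothetical layout).
    Folders themselves count as items at the root level.
    """
    if total_items <= per_level:
        return 0  # everything fits at the root, no folders required
    folders = math.ceil(total_items / per_level)
    if folders > per_level:
        # The root would then hold too many folders; deeper nesting is needed.
        raise ValueError("one folder level is not enough; nest folders")
    return folders


if __name__ == "__main__":
    for count in (1500, 100_000):
        print(count, "items ->", folders_needed(count), "folders")
```

For a library of 100,000 documents this suggests 50 folders of 2,000 items each; treat that as a starting point and verify against your own usage profile.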

  30. Plan for Software Boundaries: Other Considerations • Throughput vs. number of Web servers • Test findings showed plateau at 5:1 (YMMV) • Perform tests in your environment • Other Recommendations • Carefully plan your site hierarchy and design • Minimize # of Web applications and application pools • Limit # of Shared Services Providers • Plan for database growth • Follow data and feature best practices and suggested limits.

  31. Usage Profiles

  32. Estimate Performance & Capacity: Usage Profiles • Determine Usage Profile • Usage profile == user community’s behavior • Distribution of requests across content • Operation types and frequency • Existing solution in place? Mine IIS logs • Leverage usage profiles provided in configurations tested by the Product Group as a starting point
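
Mining IIS logs for a usage profile can start with something as simple as counting requests per second. The following is a minimal sketch, assuming logs in the W3C extended format where a `#Fields:` directive names the columns and each entry carries `date` and `time` fields. The sample lines and function names are hypothetical.

```python
from collections import Counter

# Hypothetical sample in W3C extended log format, for illustration only.
SAMPLE_LOG = [
    "#Software: Microsoft Internet Information Services 6.0",
    "#Fields: date time cs-method cs-uri-stem sc-status",
    "2007-11-05 09:00:01 GET /default.aspx 200",
    "2007-11-05 09:00:01 GET /Pages/home.aspx 200",
    "2007-11-05 09:00:02 GET /_layouts/viewlsts.aspx 200",
]


def requests_per_second(lines):
    """Count log entries per (date, time) second."""
    fields = []
    counts = Counter()
    for raw in lines:
        line = raw.strip()
        if line.startswith("#Fields:"):
            # The directive lists the column names for the entries that follow.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue  # skip other directives, blanks, and headerless data
        row = dict(zip(fields, line.split()))
        if "date" in row and "time" in row:
            counts[(row["date"], row["time"])] += 1
    return counts


def peak_requests_per_second(counts):
    """Return the busiest one-second bucket, or 0 for an empty log."""
    return max(counts.values()) if counts else 0


if __name__ == "__main__":
    per_second = requests_per_second(SAMPLE_LOG)
    print("peak RPS in sample:", peak_requests_per_second(per_second))
```

Grouping the same rows by `cs-uri-stem` instead would give the distribution of requests across content that the slide mentions.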

  33. Estimate Performance & Capacity: Sample Usage Profile (WSS Collaboration)

  34. Estimate Performance & Capacity: Throughput Requirements • Estimating Throughput Targets • User response time, concurrency • Warning: Plan for peak concurrency • Throughput targets (in RPS) at various concurrency rates (recommended response time of 1 to 2 seconds)
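
The slide's table of throughput targets did not survive in this transcript. As a stand-in, here is a back-of-the-envelope sketch of how an RPS target can be derived from concurrency. The formula (active users times requests per active user per hour, divided by 3,600) and every input value are assumptions for illustration, not figures from the presentation.

```python
def target_rps(total_users, concurrency, requests_per_user_per_hour):
    """Estimate requests per second at a given concurrency rate.

    concurrency: fraction of total users active at the same time (0.0 to 1.0).
    requests_per_user_per_hour: assumed load generated by one active user.
    """
    active_users = total_users * concurrency
    return active_users * requests_per_user_per_hour / 3600.0


if __name__ == "__main__":
    # Hypothetical organization: 10,000 users, 36 requests per active user-hour.
    for rate in (0.05, 0.10, 0.25):
        rps = target_rps(10_000, rate, 36)
        print(f"{rate:.0%} concurrency -> {rps:.1f} RPS")
```

Because the warning on the slide is to plan for peak concurrency, the highest realistic concurrency rate is the one to size against.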

  35. Estimate Performance & Capacity: Other Factors • Other configuration factors that can influence throughput targets • Indexing (schedule indexing window off-hours) • Caching enabled? • Output Caching and Cache Profiles • Object Caching • Disk-based Caching for Binary Large Objects • If interested in learning more about caching in MOSS, the next session in this room will provide more info (AG306 Performance and Optimization Strategies for MOSS 2007) • Page customizations • Custom Web parts

  36. Estimate Performance & Capacity: Other Factors, Latency • Latency components • Server processing, SQL processing, # SQL round trips, AJAX processing, security trimming • Client processing • JavaScript, CSS, AJAX requests, HTML load, client machine specs • Wire transfer, bandwidth, size of download • Recommendations • Primary cause of latency problems: custom Web parts • Watch for: SQL round trips, unnecessary data, excessive client-side script • Re-use existing client code versus adding more • Design code for performance (use HTML and .NET best practices) • Profile your solutions

  37. Hardware & Storage

  38. Plan Hardware and Storage: How SharePoint Scales • Designed to grow with organization needs • Server resources: x86 (32-bit), x64, CPU, RAM, HDD • Recommend 64-bit for back-end services (SQL), which can leverage additional addressable memory • SQL: HDD configuration critical • Server Farm • Topology restrictions removed • WFE, Query, Index, Excel Calc, Project, SQL • Adopted WSS adage: content only limited by HW capability* • Sites: In WSS 3.0, portal sites are “just another WSS site”

  39. Plan Hardware and Storage: 64-bit vs. 32-bit Hardware • WSS 3.0 and MOSS 2007 can work on both • 32-bit and 64-bit hardware can be mixed within a farm (and even at the server role*) • 64-bit hardware recommended • This is the last version with 32-bit support • 32-bit can directly address only a 2 GB memory address space • 64-bit supports up to 1,024 GB memory (physical and/or addressable) • 32-bit may perform better when using <= 2 GB RAM, but we recommend 64-bit for future investments and scalability • Larger # of processors • 64-bit HW prioritization • SQL Server > Index > Excel > Search > WFE

  40. Plan Hardware and Storage: Single Server Example • One Server Configured as: • Web Front-End Server Role • Application Server Role • Database Server Role • Appropriate for limited use-scenarios including the following: • Installing Office SharePoint Server 2007 for evaluation purposes. • Deploying only Microsoft Windows SharePoint Services 3.0. • Deploying a subset of the Office SharePoint Server 2007 features. • Deploying Office SharePoint Server 2007 for a limited purpose (such as for a single department) or for a limited number of users.

  41. Plan Hardware and Storage: Multi-Server Farm Example (diagram: Web Server + Query Server, Application Server, Clustered SQL Server) • Optimizes performance of Web servers • Increases redundancy and reduces points of failure • Redundancy at WFE, Query, and Database server roles • Determine configuration based on your business needs and goals • Determine config of other Application roles (Excel Services, Index, Forms, etc.)

  42. Plan Hardware and Storage: Storage Considerations • Primary Metric: Document Storage • Plan for 1.2 – 1.5 x file system size for SQL Server • Note: metric is closely tied to RAID level used on SQL disks • Secondary Metric: Index Size • Index Server: (5 – 12% of total size of all indexed content) * 3 • Query Server: Same as Index Server
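
The two storage metrics above reduce to simple arithmetic. The sketch below applies them to a hypothetical 500 GB document corpus. It assumes "file system size" means the size of the documents as stored on disk before they move into SQL Server; the function names are illustrative.

```python
def sql_storage_range_gb(file_system_gb, low=1.2, high=1.5):
    """Primary metric: SQL Server storage is 1.2 to 1.5 x the file system size."""
    return file_system_gb * low, file_system_gb * high


def index_storage_range_gb(indexed_content_gb, low=0.05, high=0.12, multiplier=3):
    """Secondary metric: (5 to 12% of all indexed content) * 3.

    The slide sizes the query server the same as the index server.
    """
    return (indexed_content_gb * low * multiplier,
            indexed_content_gb * high * multiplier)


if __name__ == "__main__":
    corpus_gb = 500  # hypothetical corpus size
    sql_low, sql_high = sql_storage_range_gb(corpus_gb)
    idx_low, idx_high = index_storage_range_gb(corpus_gb)
    print(f"SQL Server storage: {sql_low:.0f} to {sql_high:.0f} GB")
    print(f"Index server (and each query server): {idx_low:.0f} to {idx_high:.0f} GB")
```

These ranges cover usable capacity only; as the slide notes, the raw disk requirement also depends on the RAID level chosen for the SQL disks.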

  43. Plan Hardware and Storage: Storage Considerations • Read and review the new whitepaper “Performance Recommendations for Storage Planning and Monitoring” at http://technet2.microsoft.com/Office/en-us/library/ca472046-7d4a-4f17-92b1-c88a743a5e3c1033.mspx?mfr=true • Discourages content DBs > 100 GB (if larger, then try to limit to a single site collection per DB) • Stresses design to respect published software boundaries • Manage lists and libraries with many items (>2,000) • Start with a dedicated server running SQL Server 2005 • Separate and prioritize your data among disks • Physical storage & RAID recommendations

  44. Plan for Hardware and Storage: Planning for SQL Server is a Must • Tests and deployment experience show that a healthy SQL Server is the basis for a healthy SharePoint farm. • A sub-optimal SQL Server will affect the other components in the farm. • Slow response from SQL Server will cause WFE requests to build up in a queue and will cause unpredictable symptoms. • I/O subsystem hardware plays a significant role.

  45. Plan for Hardware & Storage: SQL box memory • “4 GB is the minimum required memory, 8 GB is recommended for medium size deployments, and 16 GB and above is recommended for large deployments.” • What influences the amount of RAM? • Number and size of content databases • Number of concurrent requests to SQL • Total user base • Size and width of commonly used lists • Remember: Minimum is where we start…

  46. Plan for Hardware & Storage: DAS vs. SAN • Both types of storage can scale, perform, and serve a multi-TB farm. • So… where’s the difference? • Ease of management • Growth potential • Advanced capabilities (snapshots, remote replication) • The ability to share with other applications • Price…

  47. Plan for Hardware & Storage: SQL Server disks • When prioritizing data among faster disks, use the following ranking: • TempDB data and transaction logs • Database transaction log files • Search database • Database data files • Note: In a heavily read-oriented portal, prioritize data over logs.

  48. Plan for Hardware & Storage: SQL Server files • Best Practices: • Allocate TempDB on RAID 1 (or RAID 1 variants) • Separate data and logs across disks • For TempDB, create multiple data files, up to the number of CPU cores • Pre-grow files (don’t rely on Autogrow) • Allocate dedicated disks for Search

  49. Plan for Hardware & Storage: SQL Server Disks • Disk array design is critical to good SQL performance • In general, more spindles (disks) = better performance • Calculate array performance. Plan for 0.75 to 1 IOPS per GB of array for content, and 1.5 to 2 IOPS per GB for temp, search, and log • How to calculate array performance: Spindle IOPS capability * Number of spindles = Total IOPS • Common spindle capabilities: • U320 SCSI: 10K = 100 IOPS, 15K = 130 IOPS • Fibre Channel: 10K = 130 IOPS, 15K = 200 IOPS • SAS: 10K = 165 IOPS, 15K = 260 IOPS • Example: FC 15K disk x 10 disks = 200 * 10 = 2,000 IOPS • RAID 1 versus RAID 5: read speed is the same; writes are about twice as slow on RAID 5
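
The array calculation above is easy to script when comparing disk options. This sketch uses the per-spindle figures quoted on the slide; the function names and the 1,000 GB example are illustrative assumptions, and RAID write penalties are deliberately ignored.

```python
import math

# Per-spindle IOPS capabilities as quoted on the slide.
SPINDLE_IOPS = {
    ("U320 SCSI", "10K"): 100,
    ("U320 SCSI", "15K"): 130,
    ("Fibre Channel", "10K"): 130,
    ("Fibre Channel", "15K"): 200,
    ("SAS", "10K"): 165,
    ("SAS", "15K"): 260,
}


def array_iops(interface, rpm, spindles):
    """Total IOPS = spindle IOPS capability * number of spindles."""
    return SPINDLE_IOPS[(interface, rpm)] * spindles


def spindles_needed(capacity_gb, iops_per_gb, interface, rpm):
    """Spindles required to reach a target IOPS-per-GB density.

    Ignores RAID overhead, so treat the result as a lower bound.
    """
    target_iops = capacity_gb * iops_per_gb
    return math.ceil(target_iops / SPINDLE_IOPS[(interface, rpm)])


if __name__ == "__main__":
    # The slide's own example: ten 15K Fibre Channel disks.
    print(array_iops("Fibre Channel", "15K", 10))
    # Hypothetical 1,000 GB content array at 0.75 IOPS per GB on 15K SAS.
    print(spindles_needed(1000, 0.75, "SAS", "15K"))
```

The first call reproduces the slide's 2,000 IOPS example. The second shows why spindle count, not just capacity, drives the disk order: capacity alone might fit on fewer disks than the IOPS target requires.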
