
Dealing with large Content Scenarios in SharePoint Server 2007



Presentation Transcript


  1. Dealing with large Content Scenarios in SharePoint Server 2007: Architecture, Challenges, and Strategies. Abrar Chisti, Microsoft Corporation

  2. Agenda • Overview • Manageability • Planning • Availability • Case Study • Takeaways

  3. Content Database Growth • Use as Document Repository • Multiple versions of documents • 70-95% of size is File Stream • Storage of large multimedia files • Lack of Governance/Site Quotas • One Large Site Collection • Lack of Planning

  4. Is SharePoint the Right Solution? • SharePoint sites evolve organically. • Database capacity planning is often overlooked • Limited or no Governance • One or more large content database(s) • Difficult for IT to maintain • IO Throughput and Latency are affected

  5. Manageability

  6. Plan for Manageability • Limit Content Database Size to <= 100 GB • If Content DB Size is > 100 GB • Use Differential/Incremental Backups • SQL Server 2005/2008 • DPM 2007 • Test & Baseline IO Sub-System • Set DB Auto-growth to Fixed Value • Split Sites in Content DB to multiple Content DBs
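
The size guideline and the fixed auto-growth setting on this slide can be sketched in a few lines of Python. This is a minimal illustration, not a farm script: the database names, sizes, and the 512 MB growth increment are hypothetical, and the generated T-SQL should be reviewed before it is run against a real server.

```python
from dataclasses import dataclass

SIZE_LIMIT_GB = 100  # guideline taken from the slide


@dataclass
class ContentDb:
    name: str          # database name (hypothetical examples below)
    logical_file: str  # logical name of the primary data file
    size_gb: float     # current size in GB


def over_limit(dbs, limit_gb=SIZE_LIMIT_GB):
    """Return the names of databases larger than the guideline."""
    return [db.name for db in dbs if db.size_gb > limit_gb]


def fixed_growth_sql(db, growth_mb=512):
    """Build T-SQL that sets a fixed (non-percentage) auto-growth increment.

    The 512 MB default is an assumption; size it from your own IO baseline.
    """
    return (
        f"ALTER DATABASE [{db.name}] "
        f"MODIFY FILE (NAME = N'{db.logical_file}', FILEGROWTH = {growth_mb}MB);"
    )


dbs = [
    ContentDb("WSS_Content_Teams", "WSS_Content_Teams", 62.0),
    ContentDb("WSS_Content_Docs", "WSS_Content_Docs", 180.5),
]
print(over_limit(dbs))           # candidates for splitting
print(fixed_growth_sql(dbs[1]))  # statement to review before running
```

Databases returned by `over_limit` are the candidates for differential backups or for splitting into multiple content databases.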

  7. Backup & Restore Options

  8. How to Manage Content • Split Content Database • Move Site Collections between Databases • Move Sites into Site Collections (Re-Parent) • May need to promote sub sites to sites • May need to move site collections between web applications • Use OOB or 3rd Party Tools • stsadm -o export/import • stsadm -o backup/restore • stsadm -o mergecontentdb • Content Deployment API (Selective)
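
As a rough sketch of the `stsadm -o backup/restore` option listed above, the following Python builds the two command lines for a single site collection. The site URL and backup path are hypothetical, only the `-url` and `-filename` parameters are used, and the steps in between (choosing the target content database, removing the original site collection) are deliberately left out. Check the parameters against the stsadm help output on your own farm before running anything.

```python
def stsadm(operation, **params):
    """Build an argument list of the form: stsadm -o <operation> -key value ..."""
    args = ["stsadm", "-o", operation]
    for key, value in params.items():
        args += [f"-{key}", value]
    return args


def backup_restore_cmds(site_url, backup_file):
    """Return the backup command followed by the matching restore command."""
    return [
        stsadm("backup", url=site_url, filename=backup_file),
        stsadm("restore", url=site_url, filename=backup_file),
    ]


# Hypothetical site collection URL and backup location.
cmds = backup_restore_cmds("http://intranet/sites/teamA", r"C:\backups\teamA.bak")
for cmd in cmds:
    print(" ".join(cmd))
```

Building the commands as argument lists keeps them easy to log or review, and they can be passed to `subprocess.run` unchanged once verified.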

  9. How to Limit Storage • Document Libraries • Limit # of Versions. • Archive or Delete Old Sites • Archive or Delete Unused Sites • Impose Site Quotas • Different types of quotas – Small/Med/Large • Take into Consideration Recycle Bin • Manage Lists for Performance
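
The interaction between quota tiers and the Recycle Bin can be estimated with simple arithmetic. The sketch below is an illustration under stated assumptions: the tier sizes are invented (the slide names Small/Med/Large but gives no numbers), and the second-stage Recycle Bin is modeled as an extra 50 percent on top of the quota, which should be checked against the setting on your web application.

```python
# Hypothetical quota tiers in MB; the slide names the tiers but not their sizes.
QUOTA_TIERS_MB = {"small": 500, "medium": 2_000, "large": 10_000}


def worst_case_mb(tier, second_stage_percent=50):
    """Quota plus the second-stage Recycle Bin allowance (assumed 50 percent)."""
    quota = QUOTA_TIERS_MB[tier]
    return quota + quota * second_stage_percent / 100


def database_worst_case_gb(tiers):
    """Worst-case size of a content database hosting the given site tiers."""
    return sum(worst_case_mb(tier) for tier in tiers) / 1024


print(worst_case_mb("medium"))
print(round(database_worst_case_gb(["large"] * 10), 1))
```

Under these assumptions, ten "large" site collections in one content database could already exceed a 100 GB guideline, which is why quotas and database allocation are best planned together.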

  10. Upgrade Hardware/Software • Ensure Latest SP/Patch • Use Dedicated SQL Server • Use 64 Bit Architectures and 64 Bit OS • Use MS Hardware Recommendations • Use SQL Server connection alias when you configure your farm • Increase Bus Bandwidth

  11. Take Advantage of SQL Server 2008 Capabilities • Performance - Implement database backup compression. • Availability - Implement log stream compression. • Security - Implement Transparent Data Encryption (TDE). • Resource management - Use SQL Server 2008 Resource Governor • Be Aware of DB Migration Considerations

  12. Content Archival/Reduction • Use Database Snapshots • Use Records Repository Implementation • Externalize (BLOB) storage

  13. Database Snapshot • Provides “snapshot” of Content DB at given instant. • Requires Same DB Server Instance • Refers to the Original Database • Uses “Copy on write” mechanism • Need to create Separate Web App.
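
The "copy on write" idea can be shown with a toy model. This is not how SQL Server is implemented internally; it is a minimal Python sketch, with hypothetical class names, of the general mechanism: the snapshot starts empty, refers to the original database for unchanged pages, and keeps the old version of a page only when that page is first overwritten.

```python
class Database:
    """Toy page store that notifies its snapshots before each write."""

    def __init__(self, pages):
        self.pages = dict(pages)
        self.snapshots = []

    def create_snapshot(self):
        snap = Snapshot(self)
        self.snapshots.append(snap)
        return snap

    def write(self, page_id, data):
        for snap in self.snapshots:
            snap.preserve(page_id)  # copy the old page before it changes
        self.pages[page_id] = data


class Snapshot:
    """Toy snapshot: stores only pages that changed after it was created."""

    def __init__(self, source):
        self.source = source
        self.page_ids = set(source.pages)  # pages that existed at creation
        self.saved = {}                    # old versions, filled on first write

    def preserve(self, page_id):
        if page_id in self.page_ids and page_id not in self.saved:
            self.saved[page_id] = self.source.pages[page_id]

    def read(self, page_id):
        if page_id not in self.page_ids:
            raise KeyError(page_id)  # page did not exist at snapshot time
        if page_id in self.saved:
            return self.saved[page_id]
        return self.source.pages[page_id]  # unchanged: read from the source


db = Database({1: "contract v1", 2: "invoice v1"})
snap = db.create_snapshot()
db.write(1, "contract v2")
print(snap.read(1), "|", db.pages[1])
```

The dependency on the source for unchanged pages is what the slide's "Refers to the Original Database" and "Requires Same DB Server Instance" bullets describe.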

  14. Records Repository

  15. Remote/External Blob Storage • Reduce Storage Costs • External Blob Storage API • Remote Blob Storage API • SQL Server 2008 has support for RBS • Can write BLOB directly using RBI • http://blogs.msdn.com/sqlrbs/

  16. External Blob Based Solution • BLOB IO is moved to the Web Front End • Supports compression and encryption

  17. Planning

  18. Plan for Software Boundaries • Bottom Up Approach • Plan for SQL Storage • SharePoint Performance Recommendations • # of Site Collections per Content DB: 50,000 • # of Site Collections per Web Application: 150,000 • 100 Content DBs per Web Application • Use Multiple SQL Servers for Higher Scalability
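
The three limits on this slide lend themselves to a simple planning check. The sketch below is a hypothetical helper, not a SharePoint API: it takes the planned number of site collections in each content database of one web application and reports which of the slide's recommendations the plan would exceed.

```python
MAX_SITE_COLLECTIONS_PER_DB = 50_000
MAX_SITE_COLLECTIONS_PER_WEB_APP = 150_000
MAX_CONTENT_DBS_PER_WEB_APP = 100


def boundary_violations(site_collections_per_db):
    """Check one web application's plan against the slide's recommendations.

    site_collections_per_db: list with one site-collection count per content DB.
    """
    problems = []
    if len(site_collections_per_db) > MAX_CONTENT_DBS_PER_WEB_APP:
        problems.append(
            f"{len(site_collections_per_db)} content DBs exceeds "
            f"{MAX_CONTENT_DBS_PER_WEB_APP} per web application"
        )
    for index, count in enumerate(site_collections_per_db):
        if count > MAX_SITE_COLLECTIONS_PER_DB:
            problems.append(
                f"content DB {index}: {count} site collections exceeds "
                f"{MAX_SITE_COLLECTIONS_PER_DB}"
            )
    total = sum(site_collections_per_db)
    if total > MAX_SITE_COLLECTIONS_PER_WEB_APP:
        problems.append(
            f"{total} site collections exceeds "
            f"{MAX_SITE_COLLECTIONS_PER_WEB_APP} per web application"
        )
    return problems


print(boundary_violations([40_000, 60_000]))
```

An empty list means the plan stays within all three recommendations.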

  19. Storage Architecture • Use Appropriate Disk and SAN interface • SCSI vs IDE vs SATA vs SAS • Consideration - Hot Swap, Multiple IO, Speed, Capacity, Protocol • Use Appropriate Disks and RAID Arrays • Faster Disks/Arrays • Separate Disks for TempDB, ContentDB, and Trans Logs • Multiple Data Files for Large Content and Search DBs • Distribute files across Disks

  20. Content Database Allocation • SharePoint Allocation of Content DBs • Pre-Allocate Pool of DBs • Round Robin Scheme between DBs • Based on Delta between Max sites and Current sites • Example • Site Collection Per Database • Create Database with 100 GB (using the ALTER DATABASE command) • Leverage Managed Paths
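
The allocation rule on this slide (choose by the delta between maximum and current sites) can be modeled in a few lines. This is a simplified sketch of the behavior as the slide describes it, with hypothetical database names and limits; it is not SharePoint's actual code.

```python
def pick_database(dbs):
    """Pick the database with the largest gap between max and current sites.

    dbs: list of dicts with "name", "max_sites" and "current_sites" keys.
    Ties go to the first database in the list.
    """
    candidates = [db for db in dbs if db["max_sites"] > db["current_sites"]]
    if not candidates:
        raise RuntimeError("no content database has capacity left")
    return max(candidates, key=lambda db: db["max_sites"] - db["current_sites"])


def create_site_collection(dbs):
    """Allocate one new site collection and return the chosen database name."""
    db = pick_database(dbs)
    db["current_sites"] += 1
    return db["name"]


# Hypothetical pre-allocated pool of three equally sized databases.
pool = [
    {"name": "Content_A", "max_sites": 2, "current_sites": 0},
    {"name": "Content_B", "max_sites": 2, "current_sites": 0},
    {"name": "Content_C", "max_sites": 2, "current_sites": 0},
]
order = [create_site_collection(pool) for _ in range(6)]
print(order)
```

With equal maximums, the delta rule produces the round-robin pattern the slide mentions. In this model, setting `max_sites` to 1 on a database reserves it for a single site collection.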

  21. Availability

  22. Clustering • SAN or Shared Disks • Use Windows/SQL Clustering for HA • Dedicated Disks or DAS • Use SQL Server Mirroring

  23. Redundancy across Data Centers • Log Shipping • Synchronous Mirroring • Asynchronous Mirroring • SQL Server 2008 Log Compression

  24. High Availability Farm

  25. Monitoring

  26. Monitoring • Processor: % Processor Time: _Total. On the computer that is running SQL Server, keep this counter between 50 percent and 75 percent. • System: Processor Queue Length: (N/A). Threshold: 2 x the number of CPU cores. • Memory: Available MBytes: (N/A). Monitor this counter to ensure that at least 20 percent of the total physical RAM remains available. • Memory: Pages/sec: (N/A). Monitor this counter to ensure that it remains below 100.
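
The four thresholds above can be expressed as a small check over one counter sample. The function and the sample's key names below are hypothetical; how the values are collected (Perfmon logs, a monitoring agent) is outside this sketch.

```python
def check_sql_host(sample, cores, total_ram_mb):
    """Compare one counter sample against the thresholds from the slide.

    sample keys (hypothetical names): cpu_percent, processor_queue_length,
    available_mb, pages_per_sec.
    """
    warnings = []
    if sample["cpu_percent"] > 75:
        warnings.append("% Processor Time above 75 percent")
    if sample["processor_queue_length"] > 2 * cores:
        warnings.append("Processor Queue Length above 2 x cores")
    if sample["available_mb"] < 0.20 * total_ram_mb:
        warnings.append("Available MBytes below 20 percent of RAM")
    if sample["pages_per_sec"] >= 100:
        warnings.append("Pages/sec not below 100")
    return warnings


sample = {
    "cpu_percent": 82.0,
    "processor_queue_length": 5,
    "available_mb": 2_500,
    "pages_per_sec": 40,
}
print(check_sql_host(sample, cores=8, total_ram_mb=16_384))
```

A single sample says little on its own; the thresholds are more useful when applied to averages over a baseline period.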

  27. Disk Counters • Logical Disk: Disk Transfers/sec • Logical Disk: Disk Read Bytes/sec & Disk Write Bytes/sec • Logical Disk: Avg. Disk sec/Read (Read Latency) & Avg. Disk sec/Write (Write Latency) • Logical Disk: Avg. Disk Bytes/Read & Avg. Disk Bytes/Write • Physical Disk: % Disk Time • Logical Disk: Current Disk Queue Length • Logical Disk: Average Disk Reads/Sec and Logical Disk

  28. Performance Monitoring • Perfmon • Analyze Logs using CodePlex tools • Favorite Web Monitoring (3rd Party) solution. • System Center Operations Manager (SC-OM) • SharePoint Monitoring Toolkit • http://blogs.msdn.com/sharepoint/archive/2007/12/10/announcing-new-system-center-operations-manager-2007-packs-for-wss-3-0-and-moss-2007.aspx

  29. Case Study Large Automotive Loan Origination Application

  30. Large Storage Scenario (Phase I) • Ability to house 10.5 million content items (1+TB). • System input with "normal" input load, defined as 27,000 documents per day (1 day = 10 hours). • Simulate user load to represent 200 users simultaneously accessing the system to: • Use search to find elements of document metadata. • View a document (scanned TIFF image). • Update elements of document metadata.
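
The Phase I figures translate into rates that are easier to reason about when sizing the IO subsystem. The arithmetic below uses only the numbers on the slide, with one assumption: "1+TB" is treated as exactly 1 TB in decimal units, so the average item size is a lower bound.

```python
ITEMS = 10_500_000          # content items to house
CORPUS_BYTES = 1 * 1000**4  # "1+TB" treated as exactly 1 TB (assumption)
DOCS_PER_DAY = 27_000       # "normal" input load
HOURS_PER_DAY = 10          # 1 day = 10 hours

docs_per_hour = DOCS_PER_DAY / HOURS_PER_DAY   # ingestion rate per hour
docs_per_second = docs_per_hour / 3600         # ingestion rate per second
avg_item_kb = CORPUS_BYTES / ITEMS / 1000      # average item size in KB
days_to_fill = ITEMS / DOCS_PER_DAY            # working days at normal load

print(f"{docs_per_hour:.0f} documents/hour")
print(f"{docs_per_second:.2f} documents/second")
print(f"{avg_item_kb:.1f} KB average item size (lower bound)")
print(f"{days_to_fill:.1f} working days to reach the target at normal load")
```

The sustained input rate is under one document per second, so the test is dominated by the size of the corpus and the concurrent user load rather than by ingestion speed.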

  31. Phase II • Ability to house 50 million content items (5+TB). • 35 million TIFF images. • 15 million Microsoft Office documents • Determine the maximum number of users the solution could support. • Users perform the following tasks: • Use search to find elements of document content (full-text) and metadata. • View a document (scanned TIFF image or Microsoft Office document).

  32. Architectural Overview Logical Architecture – Phase I

  33. Architectural Overview LUN/DB Matrix

  34. Takeaways • Optimize Performance • Planning & Monitoring • Plan for Scale • Plan for Availability • Plan for Manageability

  35. References
      • SQL Server Database Optimization: http://technet.microsoft.com/en-us/library/cc263261.aspx
      • Plan for Software Boundaries: http://technet.microsoft.com/en-us/library/cc262787.aspx
      • Move Site Collections to new Content Database: http://technet.microsoft.com/en-us/library/cc825328.aspx
      • Enable SharePoint 2010 to Use Remote BLOB Storage: http://technet.microsoft.com/en-us/library/ee748641(office.14).aspx/
      • Content Deployment API (PRIME): http://msdn.microsoft.com/en-us/library/cc264073.aspx
      • Integration of SQL Server 2008 and SharePoint: http://msdn.microsoft.com/en-us/library/cc264073.aspx
      • Use Database Snapshots for Archiving Sites: http://technet.microsoft.com/en-us/library/cc706872.aspx
      • Configure Availability in SharePoint Farm: http://technet.microsoft.com/en-us/library/dd207311.aspx
      • Case Study for Large Content Scenario: http://technet.microsoft.com/en-us/library/cc262067.aspx
      • Scaling Storage Architecture: http://www.knowledgelake.com/whitepaper/Scaling%20SharePoint%202007%20-%20Storage%20Architecture.pdf

  36. Tools Availability • SPUsed Space Info • SPSiteInfo • Content Deployment Wizard • Migrate from other source systems. • Other tools in CodePlex • 3rd Party • Metalogix, Quest, Tzunami, AvePoint, StoragePoint, KnowledgeLake
