
SharePoint Saturday DC 2017 Correlation or Bust?

Learn about using correlation IDs, diagnostic logging, log levels, usage and health data collection, and more in SharePoint. Presented by Toby McGrail, Senior SharePoint Technical Architect.



Presentation Transcript


  1. SharePoint Saturday DC 2017 Correlation or Bust? Toby McGrail Senior SharePoint Technical Architect

  2. Agenda • Introduction • What is a Correlation ID? • Diagnostic Logging – Overview • Log Levels – Event Logging • Log Levels – Trace Logging • Usage and Health Data Collection • Health Analyzer • Developer Dashboard • Developer Browser Tools • Warm Up App Pools/Sites • Tools • PowerShell Commands • ULS Viewer • Search and Crawl Logs • Monitoring • Custom Error Pages • IIS App Pools and Sites • Counters and Thresholds • Network Troubleshooting • Client/Browser Issues • Recap • Questions?

  3. Introduction – Who Am I? • Toby McGrail – Senior SharePoint Technical Architect, DXC Technology • 12 years of SharePoint infrastructure and consulting experience • Over 25 years of IT experience • Specializing in SharePoint architecture for the US public sector; migration specialist

  4. What is a Correlation ID? • Correlation IDs are GUIDs (Globally Unique Identifiers) assigned to the events that occur during the lifecycle of a resource request. • When a problem occurs, the Correlation ID is commonly surfaced in the error shown to the person who initiated the request, or through the Developer Dashboard if it is configured properly. June 5, 2017
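As a minimal sketch of putting a Correlation ID to work, the helper below pulls the GUID out of whatever error text a user forwards to support. The function name `Get-CorrelationIdFromText` and the sample message are invented for illustration; neither is part of SharePoint.

```powershell
# Hypothetical helper (not part of SharePoint): extract the first GUID
# from an error message that a user copied into an email or ticket.
function Get-CorrelationIdFromText {
    param([Parameter(Mandatory = $true)][string] $Text)

    # Standard 8-4-4-4-12 hexadecimal GUID layout.
    $pattern = '[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}'
    $match = [regex]::Match($Text, $pattern)
    if ($match.Success) {
        return [guid] $match.Value
    }
    return $null
}

# Made-up sample of the text an end user might forward to support.
$sample = 'Sorry, something went wrong. Correlation ID: 5ca5269c-8de5-4091-9a1e-2f34c0a1d7be'
$correlationId = Get-CorrelationIdFromText -Text $sample
```

On a farm server, the resulting GUID could then be handed to `Merge-SPLogFile` through its `-Correlation` parameter to gather the matching trace entries from every server into one file.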

  5. Diagnostic Logging – Overview • The primary goal of monitoring is to ensure a healthy SharePoint environment so that you can achieve service performance objectives such as short response times. • You can use the monitoring features in SharePoint Central Administration and PowerShell scripts to monitor the SharePoint environment and services. • Logs and reports track SharePoint environment and service status. • You can read the logs from the logging database. The advantage of using the logging database is that you can configure your view and export the logs to Excel. • The logs and reports from Central Administration help you understand how the SharePoint 2013 system is running, analyze and repair problems, and view metrics for the sites. • Log Levels • Trace Logs • Event Throttling

  6. Log Levels – Event Logging • It is important that you choose an appropriate severity level. The severity level of an event is displayed in the Windows Event Log and is used by administrators and by monitoring tools to indicate how severe or important an event is. Choosing an appropriate level is a key part of the health and monitoring design for your component or system. • Now to the Levels • Critical Error - Events that demand the immediate attention of the system administrator. They are generally directed at the global (system-wide) level, such as System or Application. They can also be used to indicate that an application or system has failed or stopped responding. • Error - Events that indicate problems, but in a category that does not require immediate attention. • Warning - Events that provide forewarning of potential problems; although not a response to an actual error, a warning indicates that a component or application is not in an ideal state and that some further actions could result in a critical error. • Information - Events that pass noncritical information to the administrator, similar to a note that says: "For your information." • Verbose - Verbose status, such as progress or success messages.

  7. Log Levels – Trace (ULS) • When writing a trace log by using the ULS API, you must specify a severity level. The severity level is displayed in the ULS trace log and is commonly used by reporting or filtering tools. For this reason, it is important to choose an appropriate level. • Now to the Levels • Unexpected - Similar to an Assert (an assumption in code that a condition is true at a particular point), this message indicates that a logic check failed that is atypical, or the message returns an unexpected error code. These generally represent code bugs that should be investigated and fixed. • Monitorable - Traces that indicate a problem, but do not need immediate investigation. The intent is to collect data and analyze it over time, looking for problem trends. • High - General functional detail, the high priority events that happen in the environment. Examples include global configuration modifications, service start and stop, timer jobs completed, and so on. • Medium - Useful to help support or test teams debug customer or environmental issues. These likely include messages indicating that individual features have succeeded or failed, such as creating a new list, modifying a page, and so on. • Verbose - Useful primarily to help developers debug low-level code failures. Not generally useful to anyone who does not have access to source code or symbols. Most event tracing that does not need to be enabled all the time should be set at the Verbose level.
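The event and trace levels described on the last two slides can be changed per category from PowerShell. The following is a sketch only, assuming it runs on a SharePoint server; the `SharePoint Foundation:General` category and the chosen severities are example values, not a recommendation from the talk.

```powershell
# Assumes a SharePoint server where the SharePoint snap-in is installed.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Raise one category to Verbose tracing while reproducing a problem.
# 'SharePoint Foundation:General' is only an example Area:Category pair.
Set-SPLogLevel -TraceSeverity Verbose -EventSeverity Information -Identity 'SharePoint Foundation:General'

# Confirm the current setting for that category.
Get-SPLogLevel -Identity 'SharePoint Foundation:General'

# ... reproduce the issue and collect the logs here ...

# Return every category to its default level so the logs do not balloon.
Clear-SPLogLevel
```

Resetting with `Clear-SPLogLevel` afterwards matters because Verbose tracing left on permanently is exactly the kind of setting that fills log disks.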

  8. Usage and Health Data Collection • SharePoint stores usage and health information in files and in a database. • This consumes disk space and can have a large effect on performance. These files can fill up the server's disk if logging is not configured correctly, so always set a limit; left unlimited, your disk space will disappear rapidly. • It needs to be managed closely and includes: • Health Data Collection – Lots of timer jobs to monitor and maintain • Log Collection – Timer job that copies events from the files into the database
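Setting the limits mentioned above might look like the sketch below. It assumes a SharePoint server session, and the sizes and retention period are illustrative numbers to adapt, not values given in the presentation.

```powershell
# Assumes a SharePoint server where the SharePoint snap-in is installed.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Cap trace-log disk usage; 20 GB and 14 days are illustrative values only.
Set-SPDiagnosticConfig -LogMaxDiskSpaceUsageEnabled -LogDiskSpaceUsageGB 20 -DaysToKeepLogs 14

# Cap the usage-log files as well; 5 GB is an illustrative value.
Set-SPUsageService -UsageLogMaxSpaceGB 5

# Review the resulting settings.
Get-SPDiagnosticConfig
Get-SPUsageService
```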

  9. Health Analyzer • Identifies possible problems and gives the farm admin possible solutions • Some of the solutions offer a Repair Now option; however, in most cases it doesn't work or is not a best practice • Applies a set of rules that can be extended or, in most environments, customized to the needs of the farm • Rules are grouped into categories including • Security • Performance • Configuration • Availability • Timer jobs perform these monitoring tasks and collect the monitoring data • Some of these notifications are more time-consuming than helpful • Some of the alerts, however, are very useful for finding potential issues that you would otherwise only find by monitoring the ULS logs

  10. Developer Dashboard • Don't be fooled by the name: it's more a tool to help you troubleshoot problems and performance issues • Easily troubleshoot problems with page rendering • Three modes that you need to be aware of • Off – Not displayed • On – Rendered on each and every page • OnDemand – Hidden until you manually click the Developer Dashboard icon • Granular control over visibility – shown by default to users who have customization permissions • Great way to monitor custom code when the developer uses the SPMonitoredScope class; it's a great idea to make your solutions use it. • Use PowerShell to enable the Developer Dashboard in SP2013 and 2016. • $ds = [Microsoft.SharePoint.Administration.SPWebService]::ContentService.DeveloperDashboardSettings; $ds.DisplayLevel = 'OnDemand'; $ds.TraceEnabled = $true; $ds.Update()

  11. Warm Up App Pools and Sites • SharePoint App Pools are part of IIS and recycle automatically by default, so the first request after a recycle is slow; warming them up is essential to running fast on first load. • Create a warm-up script that runs as a scheduled task every morning. • Run the task about 30 minutes before the first person comes into the office. For example, I have it run at 5:30 AM EST. • Warm up all web applications and site collections for more reliability. • Customize your script for your environment and run it with PowerShell! • Sample Script
    #------------------
    # Ensure the SharePoint snap-in has been loaded
    #------------------
    if ((Get-PSSnapin -Name "Microsoft.SharePoint.PowerShell" -ErrorAction SilentlyContinue) -eq $null) {
        Add-PSSnapin "Microsoft.SharePoint.PowerShell"
    }
    #------------------
    # Simple method to write status code with a colour
    #------------------
    function Write-Status([Microsoft.PowerShell.Commands.WebResponseObject] $response) {
        $foregroundColor = "DarkRed"
        if ($response.StatusCode -eq 200) {
            $foregroundColor = "DarkGreen"
        }
        Write-Host ([string]::Format("{0} (Status code: {1})", $response.StatusDescription, $response.StatusCode)) -ForegroundColor $foregroundColor
    }
    #------------------
    # Warm up all web applications
    #------------------
    Get-SPWebApplication | ForEach-Object {
        Write-Host ([string]::Format("WebApplication request fired for {0} [{1}]. ", $_.DisplayName, $_.Url)) -NoNewline
        Write-Status -response (Invoke-WebRequest $_.Url -UseDefaultCredentials -UseBasicParsing)
    }
    #------------------
    # Since the root of web applications use different templates than other site collections,
    # also load other sites of different types. This ensures their assemblies also get loaded in memory.
    #------------------
    $additionalUrls = @(
        "https://sharepoint.jmlfdc.mil/sites/search",
        "https://sharepoint.jmlfdc.mil",
        "https://sitename.com/sites/blog",
        "https://sitename/sites/SPTOBY"
    )
    $additionalUrls | ForEach-Object {
        Write-Host ([string]::Format("Additional web request fired for Url: {0}. ", $_)) -NoNewline
        Write-Status -response (Invoke-WebRequest $_ -UseDefaultCredentials -UseBasicParsing)
    }
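Scheduling the warm-up script for 5:30 AM could be sketched as follows. This assumes the ScheduledTasks module (Windows Server 2012 or later) and an elevated session; the script path and task name are hypothetical placeholders, not names from the talk.

```powershell
# Assumption: the warm-up script has been saved at this hypothetical path.
$scriptPath = 'C:\Scripts\Warmup-SharePoint.ps1'

# Launch Windows PowerShell with the script as its only job.
$action = New-ScheduledTaskAction -Execute 'powershell.exe' `
    -Argument "-NoProfile -File `"$scriptPath`""

# Run every day at 5:30 AM, ahead of the first users.
$trigger = New-ScheduledTaskTrigger -Daily -At '5:30AM'

# 'SharePoint Warm Up' is an illustrative task name.
Register-ScheduledTask -TaskName 'SharePoint Warm Up' -Action $action -Trigger $trigger `
    -Description 'Requests SharePoint sites so the first user does not pay the start-up cost'
```

Because the script calls `Invoke-WebRequest` with `-UseDefaultCredentials`, the task would need to run under an account that can read the sites being warmed up.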

  12. Tools • Troubleshooting tools are key: they make your job easier and help you resolve issues faster. Resolutions are not always easy, but having the right tools helps. Here are some of the tools that I use • Wireshark – The world's foremost and most widely used network protocol analyzer • Developer Dashboard – Built into SharePoint • Fiddler – The free web debugging proxy for any browser, system or platform • Developer Browser Tools (F12) • Performance Monitor – Performance data of servers and workstations • ULS Viewer – The easiest way to look through ULS logs

  13. PowerShell Commands • PowerShell is a vital part of SharePoint administration and architecture. Here are a few commands you should use • Merge-SPLogFile -Path "C:\Logs\FarmMergedLog.log" -Overwrite • Get-SPDeletedSite | select Path, SiteId • Find errors in a content database - Test-SPContentDatabase -Name WSS_Content_DB -WebApplication • Give SPShell access - Add-SPShellAdmin -UserName domain\username -database (Get-SPContentDatabase -WebApplication) • Create new site - New-SPSite -Url http://localhost/Sites/NewSiteCollection -OwnerAlias username • List all items in a site - Get-SPWeb -Identity | Select -Expand Lists | Select -Expand Items | select Name, Url • Get a list of failed timer jobs - Get-SPTimerJob | Select -Expand HistoryEntries | Where {$_.Status -ne "Succeeded"} | group JobDefinitionTitle • SharePoint configuration after upgrade - PSConfig.exe -cmd upgrade -inplace b2b -wait -force • Start the SharePoint services - net start SPTraceV4; net start SPWriterV4; net start SPAdminV4; net start SPTimerV4; net start w3svc • Get all service applications - Get-SPServiceApplication • Configure ULS and data collection through PowerShell • Set-SPDiagnosticConfig -LogLocation D:\DiagnosticLogs • Set-SPDiagnosticConfig -LogMaxDiskSpaceUsageEnabled

  14. ULS Viewer • ULS Viewer is a Windows application that provides a simplified view of ULS log files in SharePoint 2013 • Easiest way to read or parse through the trace logs • Allows you to view them in real time • Filter on columns, specific keywords or, most helpful of all, the Correlation ID! • Very basic yet powerful all-in-one tool
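When ULS Viewer is not at hand, the same Correlation ID filter can be approximated with plain PowerShell. This is a rough sketch, not a replacement for the tool; `Find-CorrelationInLog` is a hypothetical helper name, and it simply returns every log line that contains the GUID.

```powershell
# Hypothetical helper: return the lines in a folder of ULS .log files
# that mention the given Correlation ID.
function Find-CorrelationInLog {
    param(
        [Parameter(Mandatory = $true)][string] $LogFolder,
        [Parameter(Mandatory = $true)][guid] $CorrelationId
    )

    # Select-String is case-insensitive by default; -SimpleMatch treats
    # the GUID as literal text rather than as a regular expression.
    Get-ChildItem -Path $LogFolder -Filter '*.log' |
        Select-String -SimpleMatch -Pattern $CorrelationId.ToString() |
        ForEach-Object { $_.Line }
}
```

A call would look like `Find-CorrelationInLog -LogFolder <your ULS log folder> -CorrelationId <guid>`, where the folder is whatever location the farm's diagnostic logging is configured to use.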

  15. Search and Crawl Logs • Crawl logs are vital to keeping search running effectively and performing at its best. Fix the following issues immediately when you see them in the crawl logs • Top-level documents, especially start addresses • Virtual servers • Content DB • Crawl Health Reports give you valuable information on how search is performing • Query Health Reports – Queries are what the user sees, so keeping query errors to a minimum is key! • CPU and memory load issues will cause search to slow down and even stop • The Error Breakdown page is very useful and lists all issues immediately

  16. Monitoring • Monitoring SharePoint is often overlooked in smaller SharePoint farms, but don't let this be the case. Not monitoring your farm leads to more issues that can and should be avoided • An HTTP "ping" is a useful check but doesn't help when troubleshooting • Remember that SharePoint implements custom error messages (AKA the Correlation ID error message or the "Working on it" message) • The most common error codes, 404 and 401, can be hidden by these custom messages • Monitor your timer jobs, scheduled tasks and ULS logs • Develop a page that checks SharePoint services every twenty minutes for upper management to view
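A service check of the kind suggested on this slide could start from the sketch below. `Get-NotRunningService` is a hypothetical helper name; it only filters the objects piped to it, so the same function works on real `Get-Service` output or on test data.

```powershell
# Hypothetical helper: pass through only the services that are not running.
function Get-NotRunningService {
    param([Parameter(ValueFromPipeline = $true)] $Service)
    process {
        # Status is an enum on real service objects; compare its text form.
        if ("$($Service.Status)" -ne 'Running') { $Service }
    }
}

# On a SharePoint server this could be run from a scheduled task, using the
# service names listed on the PowerShell Commands slide:
#   Get-Service -Name SPTraceV4, SPWriterV4, SPAdminV4, SPTimerV4, W3SVC -ErrorAction SilentlyContinue |
#       Get-NotRunningService
```

Anything the filter returns is a candidate for an alert or a line on the status page.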

  17. Custom Error Pages • Create custom pages to allow for more in-depth logging. • Example: HTTP throttling for a performance issue • A custom error page gives admins and support the important data they need to help the user • Correlation ID • Web Front End Server • Time of Error • User affected • Log Name

  18. IIS App Pools and Sites • Common issues with SharePoint App Pools • IIS resets not done correctly • No recycling or restarting of App Pools • IIS website is stopped • Create a task to recycle App Pools daily and restart them once a week. Also set them to restart automatically • Use IIS logging to see why App Pools and sites have stopped or are not responding.
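Checking for stopped App Pools and sites can be scripted with the WebAdministration module that ships with IIS. The sketch below assumes an elevated session on a web front end; it only starts what is stopped and does not replace the recycling schedule described above.

```powershell
# Assumes the WebAdministration module that ships with IIS; run elevated.
Import-Module WebAdministration

# Start any application pool that is currently stopped.
Get-ChildItem IIS:\AppPools |
    Where-Object { $_.State -eq 'Stopped' } |
    ForEach-Object {
        Write-Host "Starting stopped app pool: $($_.Name)"
        Start-WebAppPool -Name $_.Name
    }

# Do the same for stopped IIS sites.
Get-ChildItem IIS:\Sites |
    Where-Object { $_.State -eq 'Stopped' } |
    ForEach-Object {
        Write-Host "Starting stopped site: $($_.Name)"
        Start-Website -Name $_.Name
    }
```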

  19. Counters and Thresholds • Processor Utilization – Not to exceed 80 percent, ideally under 50 percent • Available Memory – Greater than 10 percent • Disk Latency – Less than 25 ms, ideally 15 ms • For SQL Server, disk latency should be more like 10 ms
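The thresholds on this slide can be turned into a simple check. This is a sketch under the assumption that the readings have already been collected (for example with `Get-Counter`); `Test-PerformanceThreshold` is a hypothetical helper name, and it applies only the hard limits, not the "ideal" values.

```powershell
# Hypothetical helper: compare one set of readings with the slide's limits
# and return a list of the limits that were breached.
function Test-PerformanceThreshold {
    param(
        [double] $CpuPercent,
        [double] $AvailableMemoryPercent,
        [double] $DiskLatencyMs
    )

    $issues = @()
    if ($CpuPercent -gt 80)             { $issues += 'Processor utilization exceeds 80 percent' }
    if ($AvailableMemoryPercent -le 10) { $issues += 'Available memory is 10 percent or less' }
    if ($DiskLatencyMs -ge 25)          { $issues += 'Disk latency is 25 ms or more' }

    # The leading comma keeps the result an array even when it is empty.
    return ,$issues
}

# Illustrative readings: only the processor limit is breached here.
$result = Test-PerformanceThreshold -CpuPercent 85 -AvailableMemoryPercent 30 -DiskLatencyMs 12
```

Recording these results over time is one way to build the performance baseline the recap slide calls for.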

  20. Network Troubleshooting • SharePoint is fast on the server but slow on the client • Slow only across VPN clients • Slow on both server and client: a communication issue with SQL Server is the most likely cause • Networking Tools • Microsoft Network Monitor • Wireshark
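Before reaching for a packet capture, a quick timing comparison can help place a problem in one of the scenarios above. The sketch below uses placeholder values: the site URL is one of the sample addresses from the warm-up script, and `SQLSERVER01` is a hypothetical host name.

```powershell
# Placeholder values: replace with a real site URL and SQL Server host name.
$siteUrl   = 'https://sitename.com/sites/blog'
$sqlServer = 'SQLSERVER01'

# Time one page request. Run this on a web front end and again on a client,
# then compare the two numbers.
$elapsed = Measure-Command {
    Invoke-WebRequest -Uri $siteUrl -UseDefaultCredentials -UseBasicParsing | Out-Null
}
Write-Host ("Request took {0:N0} ms" -f $elapsed.TotalMilliseconds)

# From a SharePoint server, check that the SQL Server port answers.
# 1433 is the default port for a default SQL Server instance.
Test-NetConnection -ComputerName $sqlServer -Port 1433
```

If the request is fast on the server and slow on the client, the network path between them is the next thing to capture with Wireshark or Network Monitor.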

  21. Client and Browser Issues • Is the issue across the network, or do just one or a few users experience it? • Make sure that all clients are at the organization-approved browser level • SharePoint relies heavily on JavaScript • Older browsers lead to poor user adoption and/or support problems • IE9 and above are faster, more reliable and have more functionality • Firefox version 5 or later. Not all SharePoint features work in Firefox • Chrome is my favorite and loads faster than most browsers

  22. Recap • Know your environment – Troubleshooting starts here! • Performance baselines help detect and limit issues and problems • Monitoring is the key! • Pay attention to log files – both Event and ULS logs. ULS Viewer should become your best friend next to PowerShell. • Tools • Developer Dashboard • Browser Tools • Fiddler • Wireshark • ULS Viewer • Diagnose one issue at a time! Don't always trust Google when implementing a solution. Thoroughly test in dev and/or test environments before moving a fix to the production farm • PowerShell is the superpower of SharePoint administration. Know the basics and use scripts to keep your engine running at optimal speed and performance.

  23. Questions? • Have you seen any issues that we have not covered? • Don't forget to fill out a survey • Visit our wonderful sponsors • My Blog • http://tobymcgrail.com:2020/SPADMIN • Contact Information: • Toby McGrail – toby.mcgrail@dxc.com • Twitter - @SPTOBY1

  24. Thank you.

  25. Housekeeping… • You must be present to win at the wrap-up… • Remember to stop by to say hi to our sponsors

  26. Thanks to our Sponsors!!!

  27. Join us at #SharePint after the conference! Why? To network with fellow SharePoint professionals What? SharePint!!! When? 4:45 PM Where? Announced at Conference Wrap-Up
