
Windows Server 2003: An Update




  1. Windows Server 2003: An Update Demand Technology Software 1020 Eighth Avenue South, Suite 6, Naples, FL 34102 phone: (239) 261-8945 fax: (239) 261-5456 e-mail: markf@demandtech.com http://www.demandtech.com

  2. Outline • Windows Server 2003 Overview • Support for big Iron • 64-bit processors • ccNUMA architecture multiprocessors • LRU-based memory management • WSRM policy-based performance manager • New web services application processing architecture (IIS 6.0 Web Gardens) • Assessment

  3. Big Iron support

  4. Windows 2003 architecture • No major architectural changes • processr.sys support for processor-specific optimizations • ccNUMA support • New LRU-based memory management • Identical 32 and 64-bit versions • Multiple service host (svchost) process address spaces • For purposes of extended security granularity

  5. Windows 2003 architecture • Multiple service host (svchost) process address spaces • How to identify them: C:\>tasklist /svc /fi "Modules eq srvsvc.dll"

  6. Windows 2003 architecture • Multiple service host (svchost) process address spaces • How to identify them Module identification function:

  7. Windows evolution • Common NT code base now in use across all Microsoft platforms • Windows XP for workstations • Windows Server 2003 • Windows CE • Embedded Windows • xBox??? • Which allows .NET Framework programs to execute everywhere

  8. .NET Framework (for applications) [Diagram; recoverable labels: .NET Languages, Enterprise Services (Clustering), ADO.NET, XML Support, Other services, Forms, ASP.NET, XML, SOAP, Common Language Runtime (CLR), COM+, MSMQ, AD, WMI, Win32]

  9. Windows 2003 evolution: new in WinXP

  10. Windows 2003 evolution: new in Win2003

  11. Windows 2003 evolution: new in Win2003

  12. Beyond Windows 2003 • Longhorn Preview • Production versions not slated to ship until 2006 or 2007 • Workstation-oriented UI changes • Designed around 64-bit machine requirements • New WinFS file system • Comprehensive “flat” view, alongside traditional tree view • New performance monitoring interface will gradually replace Performance Library DLLs

  13. Windows Server 2003 Extended Processor Support • Hardware Abstraction Layer (HAL) • Process & Thread Context switching • Interrupt processing • Synchronization primitives • SMP processor signaling • Virtual memory addressing • Processr.sys • CPU-specific optimizations, e.g., • ccNUMA • Hyperthreaded Processors • Power-conserving portables [Diagram labels: Ntoskrnl.exe, Processr.sys, HAL]

  14. Windows Server 2003 Extended Processor Support • 32-bit Intel • Currently @ 3 GHz • Power-conserving Pentium 4 M • Both conventional & HT multiprocessors • New, improved measurement interface • Support is currently available only in Intel vTune • 64-bit Intel • ccNUMA support • AMD-64 • Expected in Service Pack 1

  15. Intel 686 microarchitecture • Increased execution parallelism; • remove some of the instruction sequence dependencies • Instructions are translated into RISC micro-instructions • Pool of 40 GP pseudo-Registers • Micro-ops can be executed out of sequence • Executed instructions are retired

  16. Intel 686 microarchitecture (single threaded, superscalar)

  17. Pentium 4 Hyperthreading • Two instruction streams scheduled to execute concurrently in the same pipeline. • When one instruction stream stalls, instructions from the other thread can be executed. • Implementation: • External interface is replicated • Internal resources are partitioned and/or shared

  18. Pentium 4 Hyperthreading [Diagram labels: Duplicated, Partitioned]

  19. Pentium 4 Hyperthreading • Two simultaneously executing threads can saturate internal execution pipeline resources! • Windows 2003 support • A new Win32 API call, GetLogicalProcessorInformation, retrieves information about logical processors and related hardware. • Scheduler support: spread the load across physical processors first, then schedule logical processors when the physical processors are all “busy” • HALT the processor in Idle mode • processr.sys function
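
The scheduling rule on this slide (physical processors first, logical processors only once every physical processor is busy) can be illustrated with a small Python model. This is a minimal sketch of the policy as stated, not the Windows scheduler itself; `LogicalCpu`, `pick_logical_cpu`, and `place_threads` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class LogicalCpu:
    core: int      # index of the physical processor
    sibling: int   # index of the logical processor within that core
    busy: bool = False


def pick_logical_cpu(cpus):
    """Return an idle logical CPU, preferring cores with the fewest busy siblings."""
    busy_per_core = {}
    for cpu in cpus:
        busy_per_core[cpu.core] = busy_per_core.get(cpu.core, 0) + int(cpu.busy)
    idle = [cpu for cpu in cpus if not cpu.busy]
    if not idle:
        return None  # every logical processor is already occupied
    # Fully idle physical processors sort first; ties go to the lowest index.
    return min(idle, key=lambda c: (busy_per_core[c.core], c.core, c.sibling))


def place_threads(n_threads, n_cores, siblings_per_core=2):
    """Place ready threads one at a time and report (core, sibling) choices."""
    cpus = [LogicalCpu(core, s)
            for core in range(n_cores)
            for s in range(siblings_per_core)]
    placement = []
    for _ in range(n_threads):
        cpu = pick_logical_cpu(cpus)
        if cpu is None:
            break  # remaining threads would wait in the Ready Queue
        cpu.busy = True
        placement.append((cpu.core, cpu.sibling))
    return placement


if __name__ == "__main__":
    # Two hyperthreaded physical processors, three ready threads.
    print(place_threads(3, 2))
```

With two hyperthreaded processors, the first two threads land on different physical processors; only the third shares a core with another thread.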

  20. Intel 786 IA-64 architecture • Itanium-2: 1.6 GHz and higher • EPIC: Explicitly Parallel Instruction Computing • Explicit parallelism (up to 3 instructions in a bundle) • Predication • Speculation • Massive Resources • Extended Register sets • New instruction set! • Requires compilers that can take advantage of the architecture

  21. AMD-64 • AMD’s 64-bit extension to IA-32 • 8 additional GPRs in Long Mode • Microarchitecture similar to IA-32 • Native IA-32 instruction execution in Legacy Mode

  22. Symmetric Multiprocessing (SMP)

  23. Multiprocessing scalability • Queued spin lock support • During spin lock execution, micro-ops are discharged into the instruction execution pipeline faster than they can be executed • Leads to resource shortages • When the spin lock test finally succeeds, large portions of the pipeline have to be flushed

  24. Multiprocessing scalability • Queued spin lock support • Uses PAUSE instruction • Slows down loop execution slightly, to a rate that is synchronized to memory bus access • Allows the processor to detect immediately a change in the value of the loop synchronization variable • KeAcquireInStackQueuedSpinLockAtDpcLevel, KeReleaseInStackQueuedSpinLockFromDpcLevel

  25. ccNUMA Support • Cache-coherent Non-uniform Memory Access • Consists of multiprocessor nodes • Processors (usually 4) • Local memory (shared memory bus) • Memory controller (for remote access) • Introduces another level of cache coherence • Memory accesses are non-uniform when comparing Next Level Cache hits to main memory references • Remote memory access latency is 3-5 times longer than local memory access latency • Overcomes the congestion problems of a single bus architecture if there is sufficient node affinity in the workload
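
To see why node affinity matters, the 3-5x remote latency penalty can be turned into an average memory access time. The Python sketch below is a simple weighted average under stated assumptions: the 100 ns local latency in the example is an illustrative figure, not one taken from the slides, and `effective_latency` is a hypothetical helper name.

```python
def effective_latency(local_ns, remote_ratio, local_fraction):
    """Average memory latency for a given share of node-local references.

    local_ns       -- latency of a local memory access (assumed value)
    remote_ratio   -- remote latency as a multiple of local (3-5 per the slide)
    local_fraction -- fraction of references satisfied from local memory
    """
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    remote_ns = local_ns * remote_ratio
    return local_fraction * local_ns + (1.0 - local_fraction) * remote_ns


if __name__ == "__main__":
    # Assumed 100 ns local latency, remote access 4x slower.
    for share in (1.0, 0.9, 0.5):
        avg = effective_latency(100.0, 4.0, share)
        print(f"{share:.0%} local -> {avg:.0f} ns average")
```

Under these assumptions, a workload with 90% local references averages roughly 130 ns per access, while one with only 50% local references averages roughly 250 ns.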

  26. ccNUMA Support • Threads scheduled on their ideal node • Real memory allocated locally, whenever possible • Maintains multiple Pools & Available Bytes queues

  27. ccNUMA Support • Traditionally, the drawback to NUMA was that applications had to be re-coded to take full advantage of the architecture • Performance is sensitive to the long latency associated with remote memory access

  28. Multiprocessor partitioning • May be the only way to optimize large, n-way multiprocessors • e.g., HP Superdome, Unisys ES7000 16-64 processor machines • But it requires commitment! • Understanding your current workload CPU processing requirements • Continuous monitoring of the workload on a per processor basis • Ensure that excessive numbers of threads from “loved ones” are not in the Ready Queue • Periodic review of the partitioning scheme

  29. Multiprocessor partitioning • The workload must be concentrated enough on its dedicated CPUs that it benefits from cache warm starts, but not so concentrated that it causes excessive processor queuing.

  30. Multiprocessor partitioning • The interrupt workload should not be so concentrated on its dedicated CPUs that interrupt processing is subject to excessive interrupt pending time delays • A concentration of 5-30% for % Interrupt Time is probably ideal

  31. Multiprocessor partitioning • Reskit Interrupt Filter tool:

  32. Multiprocessor partitioning • WSRM policies:

  33. Multiprocessor partitioning • WSRM policies: • Caution: • Not designed to be used with any application that performs its own partitioning • If an app sets a Processor Affinity mask, WSRM will honor it!

  34. Multiprocessor partitioning

  35. Win64 Virtual Memory

  36. Page replacement • Windows 2003 uses a form of the popular LRU page replacement algorithm • Pages in process working sets are aged using the access bits maintained by the hardware • Older pages without their access bits set are “trimmed” first • Recently trimmed pages are retained in the Standby List (aka the page cache) and returned to the process working set via transition faults. • Eventually, older trimmed pages on the Standby List are “re-purposed” and placed on the Free List

  37. Page replacement

  38. Page replacement • Age of a Page • Most Recent: access bit is set • Older: access bit was set at the last page trimming scan • Even older: Not accessed at the time of the last scan • Oldest: Recently trimmed pages marked “in transition” that are kept in the Standby List (page cache) • Page trimming working set scans • Threshold-driven: the actual rate is not reported • Efficiency is very important! • Only enough pages to replenish the Standby List are trimmed
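
The aging scheme described on the page replacement slides can be modeled in a few lines of Python. This is a simplified sketch of the idea only (access bits, scan-based aging, trimming to a standby list, transition faults, and re-purposing), not the actual Windows memory manager; the class and method names are hypothetical.

```python
from collections import OrderedDict


class WorkingSet:
    """Toy model of access-bit aging with a standby list."""

    def __init__(self):
        self.pages = {}               # page -> [access_bit, age_in_scans]
        self.standby = OrderedDict()  # trimmed pages, oldest first
        self.transition_faults = 0

    def touch(self, page):
        """Reference a page; the 'hardware' sets its access bit."""
        if page in self.pages:
            self.pages[page][0] = True
        elif page in self.standby:
            # Soft (transition) fault: the page returns from the standby list.
            del self.standby[page]
            self.pages[page] = [True, 0]
            self.transition_faults += 1
        else:
            self.pages[page] = [True, 0]  # first reference (hard fault)

    def scan(self):
        """Trimming scan: reset the age of accessed pages, age the rest."""
        for entry in self.pages.values():
            if entry[0]:
                entry[0], entry[1] = False, 0
            else:
                entry[1] += 1

    def trim(self, count):
        """Move up to `count` of the oldest unreferenced pages to standby."""
        aged = [p for p, e in self.pages.items() if e[1] > 0]
        aged.sort(key=lambda p: self.pages[p][1], reverse=True)
        trimmed = aged[:count]
        for page in trimmed:
            del self.pages[page]
            self.standby[page] = True
        return trimmed

    def repurpose(self, count):
        """Take the oldest standby pages for reuse (they join the Free List)."""
        freed = []
        while self.standby and len(freed) < count:
            page, _ = self.standby.popitem(last=False)
            freed.append(page)
        return freed


if __name__ == "__main__":
    ws = WorkingSet()
    for p in "abc":
        ws.touch(p)
    ws.scan()
    ws.touch("a"); ws.touch("b"); ws.scan()
    ws.touch("a"); ws.scan()
    print(ws.trim(1))            # the page untouched for the most scans
    ws.touch("c")                # comes back via a transition fault
    print(ws.transition_faults)
```

The model trims only as many pages as requested, mirroring the point that just enough pages are trimmed to replenish the Standby List.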

  39. Page replacement • Available Bytes = (free + zero + standby) • Windows Server 2003 utilizes RAM much more efficiently than previous versions of the OS • Previously, the only way pages were aged was using the transition fault mechanism • Applications which attempt to allocate as much real memory as they can get require special consideration • MS Exchange • MS SQL Server
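
The Available Bytes identity above is simple arithmetic over three page lists. A minimal Python sketch follows, assuming the 4 KB page size of 32-bit x86 (other platforms use different page sizes); the function name and the sample page counts are illustrative only.

```python
PAGE_SIZE = 4096  # bytes per page on 32-bit x86 (assumption for this example)


def available_bytes(free_pages, zero_pages, standby_pages, page_size=PAGE_SIZE):
    """Available Bytes = (free + zero + standby), converted from pages to bytes."""
    return (free_pages + zero_pages + standby_pages) * page_size


if __name__ == "__main__":
    # Illustrative page counts, not measurements.
    total = available_bytes(free_pages=1000, zero_pages=500, standby_pages=2500)
    print(f"{total / (1024 * 1024):.1f} MB available")
```

Because Standby List pages count toward the total, a large Available Bytes value can include recently trimmed pages that are still recoverable through transition faults.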

  40. Page replacement

  41. SQL Server memory tuning • For processes that perform their own working set management: • CreateMemoryResourceNotification to create a memory resource notification object • Sends two events to “listening” processes: • LowMemoryResourceNotification • HighMemoryResourceNotification
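
How a self-managing process might react to the low and high memory notifications can be sketched as a small Python simulation. It does not call the Win32 API; `memory_state` stands in for the two notification objects, and its thresholds, like the `SelfManagedCache` class and its sizes, are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class SelfManagedCache:
    """An application cache that resizes itself on memory notifications."""
    size_mb: int
    min_mb: int
    max_mb: int
    step_mb: int

    def on_memory_state(self, low_memory, high_memory):
        if low_memory:
            # Memory is scarce: give pages back before the OS trims them.
            self.size_mb = max(self.min_mb, self.size_mb - self.step_mb)
        elif high_memory:
            # Memory is plentiful: the cache may grow again.
            self.size_mb = min(self.max_mb, self.size_mb + self.step_mb)
        return self.size_mb


def memory_state(available_mb, low_mb=32, high_mb=64):
    """Simulated notification state; the thresholds are assumed values."""
    return available_mb <= low_mb, available_mb >= high_mb


if __name__ == "__main__":
    cache = SelfManagedCache(size_mb=512, min_mb=128, max_mb=1024, step_mb=128)
    for available in (200, 20, 20, 50, 100):
        low, high = memory_state(available)
        print(available, "MB available ->", cache.on_memory_state(low, high), "MB cache")
```

When neither condition is signaled, the cache holds its current size, which keeps the process from oscillating between growing and shrinking.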

  42. IIS memory tuning • IIS 6.0 memory caching • Kernel cache (new) • HTTP responses • System File Cache • .htm, .jpg, .gif files, etc. • IIS Object cache • File handles • Active Server Pages • Script engines • Templates

  43. IIS memory tuning • Monitoring IIS 6.0 memory caching • Kernel cache (new) • Web Server Cache object • System File Cache • Cache object (MDL interface) • IIS Object cache • IIS Global object • Active Server Pages • Active Server Pages object

  44. IIS memory tuning • Tuning parameters for controlling IIS memory caching: • MaxCachedFileSize – defaults to 256 KB • MemCacheSize – defaults to dynamic sizing • ObjectCacheTTL - default is 30 seconds • objects include File handles, Directories • .htm, .gif, and .jpg files are cached in the system file cache • OpenFilesInCache - default is 1000 per 32 MB (obsolete)

  45. IIS ASP memory tuning • Tuning parameters for controlling IIS Active Server Pages memory caching: • AspTemplateCache – defaults to 250 files in IIS 5.0 • AspScriptEngineCacheMax – defaults to 120 • Allow each processing Thread to cache its script engine • AspScriptFileCacheSize – defaults to 500 • Allow each processing Thread to cache its precompiled script engine • Monitor Active Server Pages: ASP Templates Cached, Template Cache Hit Rate, Free Script Engines in Cache

  46. New Win 2003 tools • Resource Kit tools are now supported! • Consume • Kernrate • Poolmon • Windows NT kernel Measurement Interface (Trace) reports

  47. Win 2003 performance monitoring • NTSMF support • Configuration file • Analogous to Disable Performance Counters Registry flag • Use it with Perflib DLLs that have persistent problems and pernicious side-effects • .NET Framework perflibs • Lotus Notes Perflib • Logical & Physical Disk counters are always on! • IPv6 and TCPv6 support • Discourage use of WMI interface
