
DB203 - Windows Server 2012 R2 & SQL Server 2014 Infrastruktur

Presentation Transcript


  1. DB203 - Windows Server 2012 R2 & SQL Server 2014 Infrastruktur • Michael Frandsen • Principal Consultant, MentalNote • michaelf@mentalnote.dk

  2. Agenda • SQL Server storage challenges • The SAN legacy • Traditional interconnects • SMB past • New ”old” interconnects • File Shares – the new Black • “DIY” Shared Storage • Microsoft vNext

  3. Bio - Michael Frandsen I have worked in the IT industry for just over 21 years, 17 of which have been spent as a consultant. My typical clients are Fortune 500 companies, most of them global corporations. I have a close relationship with Microsoft R&D in Redmond: with the Windows team for 19 years, ever since the first beta of Windows NT 3.1, and with SQL Server for 18 years, since the first version Microsoft built themselves, v4.21a. I hold various advisory positions in Redmond and am involved in vNext versions of Windows, Hyper-V, SQL Server and Office/SharePoint. Specialty areas: • Architecture & design • High performance • Storage • Low Latency • Kerberos • Scalability (scale-up & scale-out) • Consolidation (especially SQL Server) • High-Availability • VLDB • Data Warehouse platforms • BI platforms • High Performance Computing (HPC) clusters • Big Data platforms & architecture

  4. Bio - Michael Frandsen

  5. SQL Server storage challenges • Capacity • Fast • Shared • Reliable

  6. The SAN legacy • Because it’s expensive … it must be fast • SAN Vendor sales pitch • SAN typical • SAN non-match

  7. The SAN legacy • Shared storage or Direct Attached SAN

  8. The SAN legacy • Widespread misconception

  9. The SAN legacy • Complex stack [diagram: SQL Server read-ahead rate -> CPU cores -> MPIO/DSM algorithm -> FC HBA port rate -> FC switch port rate with WWN zoning -> storage controller ports, cache and XOR engine -> LUN read rate -> disk feed rate]

  10. SAN Bottleneck Typical SAN load: Low to medium I/O processor load (top - slim rectangles) Low cache load (Middle - big rectangles) Low disk spindle load (lower half - squares)

  11. SAN Bottleneck Typical Data Warehouse / BI / VLDB SAN load: High I/O processor load – maxed out (top - slim rectangles) High cache load (Middle - big rectangles) Low disk spindle load (lower half - squares)

  12. SAN Bottleneck Ideal Data Warehouse / BI / VLDB SAN load: Low to medium I/O processor load (top - slim rectangles) Low to medium cache load (Middle - big rectangles) High disk spindle load (lower half - squares)

  13. Traditional interconnects • Fibre Channel • Stalled at 8Gb/s for many years • 16Gb/s FC still very exotic • Strong movement towards FCoE (Fibre Channel over Ethernet) • iSCSI • Started in low-end storage arrays • Many still 1Gb/s • 10Gb/E storage arrays typically have few ports compared to FC • NAS • NFS, SMB, etc.

  14. File Share reliability Is this mission critical technology?

  15. SMB 1.0 - 100+ Commands • Protocol negotiation, user authentication and share access (NEGOTIATE, SESSION_SETUP_ANDX, TRANS2_SESSION_SETUP, LOGOFF_ANDX, PROCESS_EXIT, TREE_CONNECT, TREE_CONNECT_ANDX, TREE_DISCONNECT) • File, directory and volume access (CHECK_DIRECTORY, CLOSE, CLOSE_PRINT_FILE, COPY, CREATE, CREATE_DIRECTORY, CREATE_NEW, CREATE_TEMPORARY, DELETE, DELETE_DIRECTORY, FIND_CLOSE, FIND_CLOSE2, FIND_UNIQUE, FLUSH, GET_PRINT_QUEUE, IOCTL, IOCTL_SECONDARY, LOCK_AND_READ, LOCK_BYTE_RANGE, LOCKING_ANDX, MOVE, NT_CANCEL, NT_CREATE_ANDX, NT_RENAME, NT_TRANSACT, NT_TRANSACT_CREATE, NT_TRANSACT_IOCTL, NT_TRANSACT_NOTIFY_CHANGE, NT_TRANSACT_QUERY_QUOTA, NT_TRANSACT_QUERY_SECURITY_DESC, NT_TRANSACT_RENAME, NT_TRANSACT_SECONDARY, NT_TRANSACT_SET_QUOTA, NT_TRANSACT_SET_SECURITY_DESC, OPEN, OPEN_ANDX, OPEN_PRINT_FILE, QUERY_INFORMATION, QUERY_INFORMATION_DISK, QUERY_INFORMATION2, READ, READ_ANDX, READ_BULK, READ_MPX, READ_RAW, RENAME, SEARCH, SEEK, SET_INFORMATION, SET_INFORMATION2, TRANS2_CREATE_DIRECTORY, TRANS2_FIND_FIRST2, TRANS2_FIND_NEXT2, TRANS2_FIND_NOTIFY_FIRST, TRANS2_FIND_NOTIFY_NEXT, TRANS2_FSCTL, TRANS2_GET_DFS_REFERRAL, TRANS2_IOCTL2, TRANS2_OPEN2, TRANS2_QUERY_FILE_INFORMATION, TRANS2_QUERY_FS_INFORMATION, TRANS2_QUERY_PATH_INFORMATION, TRANS2_REPORT_DFS_INCONSISTENCY, TRANS2_SET_FILE_INFORMATION, TRANS2_SET_FS_INFORMATION, TRANS2_SET_PATH_INFORMATION, TRANSACTION, TRANSACTION_SECONDARY, TRANSACTION2, TRANSACTION2_SECONDARY, UNLOCK_BYTE_RANGE, WRITE, WRITE_AND_CLOSE, WRITE_AND_UNLOCK, WRITE_ANDX, WRITE_BULK, WRITE_BULK_DATA, WRITE_COMPLETE, WRITE_MPX, WRITE_MPX_SECONDARY, WRITE_PRINT_FILE, WRITE_RAW) • Other (ECHO, TRANS_CALL_NMPIPE, TRANS_MAILSLOT_WRITE, TRANS_PEEK_NMPIPE, TRANS_QUERY_NMPIPE_INFO, TRANS_QUERY_NMPIPE_STATE, TRANS_RAW_READ_NMPIPE, TRANS_RAW_WRITE_NMPIPE, TRANS_READ_NMPIPE, TRANS_SET_NMPIPE_STATE, TRANS_TRANSACT_NMPIPE, TRANS_WAIT_NMPIPE, TRANS_WRITE_NMPIPE) 14 distinct WRITE operations ?!??

  16. SMB 2.0 - 19 Commands • Protocol negotiation, user authentication and share access (NEGOTIATE, SESSION_SETUP, LOGOFF, TREE_CONNECT, TREE_DISCONNECT) • File, directory and volume access (CANCEL, CHANGE_NOTIFY, CLOSE, CREATE, FLUSH, IOCTL, LOCK, QUERY_DIRECTORY, QUERY_INFO, READ, SET_INFO, WRITE) • Other (ECHO, OPLOCK_BREAK) • TCP is a required transport • SMB2 no longer supports NetBIOS over IPX, NetBIOS over UDP or NetBEUI

  17. SMB 2.1 • Performance improvements • Up to 1 MB MTU to better utilize 10Gb/E • ! Disabled by default ! • The real benefit requires application support • E.g. Robocopy in Windows 7 / Server 2008 R2 is multi-threaded • Defaults to 8 threads, range 1-128
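
A concrete way to see the multi-threading the slide refers to is robocopy's /MT switch; a minimal sketch with placeholder source and destination paths:

```powershell
# Multi-threaded copy over SMB with robocopy (Windows 7 / Server 2008 R2 and later).
# /MT defaults to 8 threads and accepts 1-128; paths below are placeholders.
robocopy C:\Data \\fs1\backup /E /MT:32 /R:2 /W:5
```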

  18. SQL Server SMB support • < 2008 • Using UNC path could be enabled with trace flag • Not officially supported scenario • No support for system databases • No support for failover clustering • 2008 R2 • UNC path fully supported by default • No support for system databases • No support for failover clustering

  19. Two things happened SQL Server 2012 Windows Server 2012

  20. SQL Server 2012 • UNC support expanded • System Databases supported on SMB • Failover Clustering supports SMB as shared storage • … and TempDB can now reside on NON-shared storage  • Mark Souza commented: Great Suggestion!
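
As a rough illustration of the expanded UNC support (the instance, share and database names below are hypothetical), a user database can be created with its files sitting directly on an SMB share:

```powershell
# Create a database whose files live on an SMB file share (SQL Server 2012 and later).
# SQL01, \\fs1\sqldata and SalesDB are made-up names; the SQL Server service account
# needs Full Control on both the share and the underlying NTFS folder.
Import-Module SQLPS -DisableNameChecking

Invoke-Sqlcmd -ServerInstance "SQL01" -Query @"
CREATE DATABASE SalesDB
ON PRIMARY ( NAME = SalesDB_data, FILENAME = '\\fs1\sqldata\SalesDB.mdf' )
LOG ON     ( NAME = SalesDB_log,  FILENAME = '\\fs1\sqldata\SalesDB.ldf' )
"@
```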

  21. Windows Server 2012 • InfiniBand • NIC Teaming • SMB 3.0 • RDMA • Multichannel • SMB Direct

  22. New “old” interconnects InfiniBand characteristics • Been around since 2001 • Used mainly for HPC clusters and Super Computing • High throughput • RDMA capable • Low latency • Quality of service • Failover • Scalable

  23. InfiniBand throughput Network Bottleneck Alleviation: InfiniBand (“Infinite Bandwidth”) and High-speed Ethernet (10/40/100 GE) • Bit serial differential signaling • Independent pairs of wires to transmit independent data (called a lane) • Scalable to any number of lanes • Easy to increase clock speed of lanes (since each lane consists only of a pair of wires) • Theoretically, no perceived limit on the bandwidth

  24. InfiniBand throughput Network Speed Acceleration with IB and HSE

  25. InfiniBand throughput Most commercial implementations use 4x lanes: 56 Gb/s (FDR) with 64/66-bit encoding • 6.8 GB/s per port • SDR - Single Data Rate • DDR - Double Data Rate • QDR - Quad Data Rate • FDR - Fourteen Data Rate • EDR - Enhanced Data Rate • HDR - High Data Rate • NDR - Next Data Rate
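
A quick back-of-the-envelope check of the per-port figure, using the FDR signalling rate and the 64/66-bit encoding from the slide:

```powershell
# FDR InfiniBand: 14.0625 Gb/s signalling rate per lane, 4x lanes, 64b/66b encoding.
$lanes    = 4
$laneGbps = 14.0625
$encoding = 64 / 66
$gbPerSec = $lanes * $laneGbps * $encoding / 8
"{0:N1} GB/s effective per 4x FDR port" -f $gbPerSec    # ~6.8 GB/s
```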

  26. InfiniBand throughput Trends in I/O Interfaces with Servers • PCIe Gen2 4x: 2 GB/s data rate, 1.5 GB/s effective rate • PCIe Gen2 8x: 4 GB/s data rate, 3 GB/s effective rate (I/O links have their own headers and other overheads!)
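
The gap between the data rate and the effective rate comes from the link's own 8b/10b encoding plus packet overhead; roughly (the ~75% protocol efficiency below is an assumed figure chosen to match the slide's numbers):

```powershell
# PCIe Gen2: 5 GT/s per lane with 8b/10b encoding -> 0.5 GB/s of raw data per lane;
# TLP/DLLP headers and other protocol overhead trim that further (assumed ~75% here).
$perLaneGBs = 5 * (8 / 10) / 8        # 0.5 GB/s per lane after encoding
$efficiency = 0.75                    # assumed protocol efficiency
"PCIe Gen2 x8: {0} GB/s data rate, ~{1} GB/s effective" -f (8 * $perLaneGBs), (8 * $perLaneGBs * $efficiency)
```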

  27. InfiniBand throughput Low-level Uni-directional Bandwidth Measurements • InfiniBand uses RDMA (Remote Direct Memory Access) • HSE can support RoCE (RDMA over Converged Ethernet) • RoCE makes a huge impact on small I/O

  28. InfiniBand latency Ethernet Hardware Acceleration • Interrupt Coalescing • Improves throughput, but degrades latency • Jumbo Frames • No latency impact; Incompatible with existing switches • Hardware Checksum Engines • Checksum performed in hardware -> significantly faster • Shown to have minimal benefit independently • Segmentation Offload Engines (a.k.a. Virtual MTU) • Host processor “thinks” that the adapter supports large Jumbo frames, but the adapter splits it into regular sized (1500-byte) frames • Supported by most HSE products because of its backward compatibility -> considered “regular” Ethernet

  29. InfiniBand latency IB Hardware Acceleration • Some IB models have multiple hardware accelerators • E.g., Mellanox IB adapters • Protocol Offload Engines • Completely implement ISO/OSI layers 2-4 (link layer, network layer and transport layer) in hardware • Additional hardware-supported features also present • RDMA, Multicast, QoS, Fault Tolerance, and many more

  30. InfiniBand latency HSE vs IB • Fastest 10Gb/E NICs: 1-5 µs • Fastest 10Gb/E switch: 2.3 µs • QDR IB: 100 ns => 0.1 µs • FDR IB: 160 ns => 0.16 µs (slight increase due to 64/66 encoding) • Fastest HSE RoCE end to end: 3+ µs • Fastest IB RDMA end to end: <1 µs

  31. InfiniBand latency Links & Repeaters • Traditional adapters built for copper cabling • Restricted by cable length (signal integrity) • For example, QDR copper cables are restricted to 7m • Optical cables with copper-to-optical conversion hubs • Up to 100m length • 550 picoseconds copper-to-optical conversion latency • That's 0.00055 µs or 0.00000055 ms

  32. File Shares – the new Black Why file shares? • Massively increased stability • Cleaned up protocol • Transparent Failover between cluster nodes • with no service outage! • Massively increased functionality • Multichannel • RDMA and SMB Direct • Massively decreased complexity • No more MPIO, DSM, Zoning, HBA tuning, Fabric zoning etc.

  33. New protocol - SMB 3.0 • Which SMB protocol version is used is negotiated per connection: client and server settle on the highest dialect both sides support
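
On Windows 8 / Windows Server 2012 and later the negotiated dialect can be read straight off the SMB client; a minimal check (the share path is a placeholder):

```powershell
# Touch the share, then list active SMB connections and the dialect negotiated for each.
Get-ChildItem \\fs1\sqldata | Out-Null
Get-SmbConnection | Select-Object ServerName, ShareName, Dialect, NumOpens
```

A dialect of 3.00 corresponds to Windows Server 2012 and means the SMB 3.x features discussed here (Multichannel, SMB Direct, Transparent Failover) are available; 3.02 corresponds to Windows Server 2012 R2.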

  34. Transparent Failover SQL Server or Hyper-V Server • Failover transparent to server apps • Zero downtime • Small IO delay during failover • Supports • Planned moves • Load balancing • OS restart • Unplanned failures • Client redirection (Scale-Out only) • Supports both file and directory operations • Requires: • Windows Server 2012 Failover Clusters • Both the server running the application and the file server cluster must be Windows Server 2012 [diagram: a client connected to \\fs1\share fails over from File Server Node A to Node B; connections and handles are auto-recovered and application IO continues with no errors]
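
Transparent Failover presumes the share is created as continuously available on the Windows Server 2012 file server cluster; a minimal sketch with made-up share, path and account names:

```powershell
# On the clustered file server role, create a continuously available share so that open
# file handles survive a failover between nodes. All names below are hypothetical.
New-SmbShare -Name "sqldata" -Path "E:\Shares\sqldata" -FullAccess "CONTOSO\sqlsvc" -ContinuouslyAvailable $true
```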

  35. SMB Multichannel • Full Throughput • Bandwidth aggregation with multiple NICs • Multiple CPU cores engaged when using Receive Side Scaling (RSS) • Automatic Failover • SMB Multichannel implements end-to-end failure detection • Leverages NIC teaming if present, but does not require it • Automatic Configuration • SMB detects and uses multiple network paths [diagram: four sample configurations - a single RSS-capable 10GbE NIC, multiple 1GbE NICs, multiple 10GbE NICs in a NIC team, and multiple RDMA NICs; vertical lines are logical channels, not cables]
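
Whether Multichannel actually engaged can be verified from the SMB client with a few standard cmdlets:

```powershell
# Is Multichannel enabled on this client?
Get-SmbClientConfiguration | Select-Object EnableMultiChannel

# Which local interfaces does SMB consider RSS- or RDMA-capable?
Get-SmbClientNetworkInterface

# Which client/server NIC pairs has SMB actually selected for the active connections?
Get-SmbMultichannelConnection
```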

  36. SMB Multichannel 1 session, without Multichannel • No failover • Can’t use full 10Gbps • Only one TCP/IP connection • Only one CPU core engaged [diagram: single RSS-capable 10GbE NIC on client and server; CPU utilization lands on a single core]

  37. SMB Multichannel 1 session, with Multichannel • No failover • Full 10Gbps available • Multiple TCP/IP connections • Receive Side Scaling (RSS) helps distribute load across CPU cores [diagram: single RSS-capable 10GbE NIC on client and server; CPU utilization spread across cores]

  38. SMB Multichannel 1 session, without Multichannel • No automatic failover • Can’t use full bandwidth • Only one NIC engaged • Only one CPU core engaged [diagram: client and server each with two RSS-capable 10GbE NICs and switches; only one NIC/path carries traffic]

  39. SMB Multichannel 1 session, with Multichannel • Automatic NIC failover • Combined NIC bandwidth available • Multiple NICs engaged • Multiple CPU cores engaged [diagram: client and server each with two RSS-capable 10GbE NICs and switches; both NICs/paths carry traffic]

  40. SMB Multichannel Performance • Pre-RTM results using four 10GbE NICs simultaneously • Linear bandwidth scaling • 1 NIC – 1150 MB/sec • 2 NICs – 2330 MB/sec • 3 NICs – 3320 MB/sec • 4 NICs – 4300 MB/sec • Leverages NIC support for RSS (Receive Side Scaling) • Bandwidth for small IOs is bottlenecked on CPU
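
A quick check of how close the quoted pre-RTM numbers come to linear scaling (measurements taken from the slide):

```powershell
# Scaling efficiency of the quoted pre-RTM SMB Multichannel throughput numbers (MB/s).
$measured = @{ 1 = 1150; 2 = 2330; 3 = 3320; 4 = 4300 }
foreach ($nics in 1..4) {
    $efficiency = $measured[$nics] / (1150 * $nics)
    "{0} NIC(s): {1} MB/s, {2:P0} of linear scaling" -f $nics, $measured[$nics], $efficiency
}
```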

  41. RDMA in SMB 3.0 SMB over TCP and RDMA • Application (Hyper-V, SQL Server) does not need to change • SMB client makes the decision to use SMB Direct at run time • NDKPI provides a much thinner layer than TCP/IP • Nothing flows via regular TCP/IP any longer • Remote Direct Memory Access performed by the network interfaces [diagram: application -> SMB client -> SMB Direct/NDKPI -> RDMA NIC transfers directly into file server memory over Ethernet and/or InfiniBand, bypassing the TCP/IP stack]
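
SMB Direct needs no application-side changes; a minimal sketch of how to confirm that the NICs and SMB both report RDMA capability:

```powershell
# Confirm that the NICs expose RDMA (Network Direct) and that SMB sees them as RDMA-capable;
# when both ends do, the SMB client switches to SMB Direct on its own - nothing to configure
# in SQL Server or Hyper-V.
Get-NetAdapterRdma | Where-Object Enabled
Get-SmbClientNetworkInterface | Where-Object RdmaCapable
Get-SmbServerNetworkInterface | Where-Object RdmaCapable
```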

  42. SMB Direct and SMB Multichannel 1 session, without Multichannel • No automatic failover • Can’t use full bandwidth • Only one NIC engaged • RDMA capability not used [diagram: client and server each with two RDMA-capable NICs and switches (dual 10GbE or dual 54Gb InfiniBand); only one path carries traffic]

  43. SMB Direct and SMB Multichannel 1 session, with Multichannel • Automatic NIC failover • Combined NIC bandwidth available • Multiple NICs engaged • Multiple RDMA connections [diagram: client and server each with two RDMA-capable NICs and switches (dual 10GbE or dual 54Gb InfiniBand); both paths carry traffic]

  44. “DIY” Shared Storage New paradigm for SQL Server storage design • Direct Attached Storage (DAS) • Now with flexibility • Converting DAS to shared storage • Fast RAID controllers will be shared storage • NAND Flash PCIe cards (e.g. Fusion-io) will be shared storage

  45. New Paradigm designs [diagram: three SQL Servers connected to a file server backed by Fusion-io PCIe flash disks]

  46. New Paradigm designs [diagram: three SQL Servers in front of a two-node file server cluster, backed by NAND flash shared storage alongside traditional SAN shared storage]

  47. New Paradigm designs

  48. Demo Storage Spaces
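
The demo itself is not captured in the transcript; as a rough sketch of the kind of Storage Spaces build-out such a demo typically walks through (pool, virtual disk and volume label names are made up):

```powershell
# Pool the poolable disks, carve out a mirrored virtual disk and format it as an NTFS volume.
$disks     = Get-PhysicalDisk -CanPool $true
$subsystem = Get-StorageSubSystem | Select-Object -First 1

New-StoragePool -FriendlyName "SqlPool" -StorageSubSystemFriendlyName $subsystem.FriendlyName -PhysicalDisks $disks
New-VirtualDisk -StoragePoolFriendlyName "SqlPool" -FriendlyName "SqlData" -ResiliencySettingName Mirror -UseMaximumSize

Get-VirtualDisk -FriendlyName "SqlData" | Get-Disk |
    Initialize-Disk -PartitionStyle GPT -PassThru |
    New-Partition -AssignDriveLetter -UseMaximumSize |
    Format-Volume -FileSystem NTFS -NewFileSystemLabel "SqlData" -Confirm:$false
```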

  49. SQL Server storage challenges • Capacity • Fast • Shared • Reliable

  50. SQL Server virtualization challenges • Servers with lots of I/O • Servers using all RAM and CPU resources • Servers using more than 4 cores • Servers using large amounts of RAM
