
Storport Drivers from the Ground Up

Steve Hagan, Windows Driver Project Lead, LSI Corporation, Storage Components Group (v-sthaga@microsoft.com)


Presentation Transcript


  1. Storport Drivers from the Ground Up • Steve Hagan • Windows Driver Project Lead • LSI Corporation • Storage Components Group • v-sthaga@microsoft.com

  2. Agenda • Purpose of this Session • Objectives of this Session • Overall Storport Architecture • Storport Architectural Improvements • Storport Specific APIs • Starting and Completing I/Os • Adapter and Device Queue Management • Error Handling and Recovery • LSI_U3 Sample Driver and Demo • Call to Action • Resources

  3. Purpose of this Session • ScsiPort is DEAD!! • In maintenance mode only, being phased out • Will be gone completely (sooner rather than later) • No ScsiPort in-box drivers in Windows 7 • ScsiPort supported only for OS upgrades, third-party drivers • Storport is the SCSI command set port driver moving forward • Plan to migrate existing or create new storage drivers using Storport • LSI developed a sample Storport driver for the Windows 7 WDK • Ported from Server 2008 in-box ScsiPort driver • This is NOT a detailed “How to write a Storport driver” session

  4. Objectives of this Session • Describe overall Storport driver architecture • Highlight areas of Storport drivers that provide • Improved storage / system performance • Improved error handling / recovery • Describe Storport-specific APIs • Demonstrate porting techniques • ScsiPort to Storport (LSI_U3 Sample Driver) • Demonstrate driver building / installation / operation

  5. Storport Architectural Improvements

  6. Overall Storport Architecture (Diagram labels: Storport, Miniport, Hardware; paths shown: Initialize, PnP/Power, I/O, WMI, Errors)

  7. Storport Callback Routines • Initialization: DriverEntry, HwFindAdapter, HwInitialize • I/O, Errors and WMI (Main I/O Path): HwBuildIo, HwStartIo, HwInterrupt • PnP/Power and Errors: HwTimer, HwAdapterControl, HwResetBus

  8. Storport Architectural Improvements - Performance • Full-Duplex Operation – Main I/O path improvement • Separate StartIo and ISR threads • Start and complete I/Os at the same time (on a single adapter) • BuildIo routine (optional, but highly recommended) • Runs before StartIo for each SRB with NO locks • Allows most processing to start an I/O to be done in parallel • LSI experience: 75% to 90% of StartIo code can move to BuildIo • Attend “Storport Smorgasbord” session for more details (Diagram: timeline of CPU1 and CPU2 running BuildIo, StartIo, and ISR work for different I/Os at the same time)

  9. Storport Architectural Improvements - Performance • Access to Storport Scatter/Gather (SG) List – Main I/O path improvement • More efficient building of miniport SG list • Eliminates the multiple calls to ScsiPortGetPhysicalAddress • ScsiPort: One call for each contiguous area in the user data buffer • Only one call to StorPortGetScatterGatherList • Walk through array of SG elements (physical address, length pairs) • MSI / MSI-X (Message Signaled Interrupt) Support • Provides capability for multi-level interrupts (specialized ISRs) • Much more efficient for PCI-Express devices • Most PCIe chipsets will share emulated line based interrupts

  10. Storport Architectural Improvements - Performance • I/O Queue Management • Storport will continue to issue I/Os in its queue until told to stop • ScsiPort: Must call Next[Lu]Request after each I/O • Better performance: eliminates an extra port driver call per I/O • Adapter level queue management • StorPortPause/Resume and StorPortBusy/Ready • Used for error handling, adapter recovery, adapter resources exhausted • Device level queue management • StorPortPause/ResumeDevice and StorPortDeviceBusy/Ready • Used for device errors, device firmware update, device queue full • StorPortSetDeviceQueueDepth • Sets maximum number of I/Os issued to a Logical Unit

  11. I/O Queue Management (Diagram comparing ScsiPort and Storport queuing; labels: Device Queues, Adapter Queue, 254 Max Requests Per LUN, 254 Max Requests Per Adapter, Pause, Resume or Timeout, Miniport / Device)

  12. Storport Architectural Improvements – Error Handling • Thread Synchronization • Full duplex mode requires mechanisms to synchronize when needed • Error handling, adapter restart, adapter firmware update • Simple synchronization – StorPortSynchronizeAccess • Call from StartIo thread, synchronizes with ISR thread • Advanced synchronization • StorPortAcquireSpinLock, StorPortReleaseSpinLock • Can acquire and release DPC, StartIo, and ISR locks • Locks are hierarchical and Storport acquires certain locks before calling miniport callback routines

  13. Storport Architectural Improvements – Error Handling • Hierarchical Resets • ScsiPort only used bus reset to recover timed out I/O • Very disruptive to all devices on parallel SCSI bus • Unnecessary resets to devices on serial buses • Storport supports LUN, target, and bus resets • Starts with LUN reset. If unsuccessful, escalates to next level. • Least disruption if LUN reset succeeds • Multiple LUN / target resets can be processed concurrently (Diagram labels: Bus (Physical or Logical), Target, LUN)

  14. Storport Architectural Improvements – Error Handling • Auto Sense Support • Storport always provides SRBs with auto sense supported • Driver still needs to support disabling of auto sense (crash dump) • All serial storage protocols (packetized) do auto sense • Miniport caching of sense data required if auto sense not supported • IOCTLs – SRB_FUNCTION_IO_CONTROL • Storport supports multiple concurrent IOCTL SRBs • IOCTL SRB not tied to a LUN; miniport handles timeout of IOCTL • IOCTL SRB is isolated from I/O error handling and recovery • Greater flexibility in how miniport provides IOCTL support

  15. Storport-Specific APIs

  16. BuildIo Routine • Storport-specific API – Optional, but highly recommended • Called before StartIo for each SRB issued • Same callback interface as StartIo, but NO locks are held • All processors could be running BuildIo concurrently • Perform as much StartIo processing as possible, without modifying any shared memory • SrbExtension used for isolated per I/O memory buffer • I/O details, HW interface command structure, scatter/gather list, etc. • StartIo does any required modification of shared memory • LSI experience: 75% - 90% of StartIo work can be moved to BuildIo • Increased parallelism of I/O start processing
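The BuildIo / StartIo split described on this slide can be sketched in plain user-mode C. This is only an illustration of the idea, not miniport code: `IoContext`, `Adapter`, `build_io`, and `start_io` are hypothetical names, `IoContext` stands in for the SrbExtension, and the eight-slot ring is an invented stand-in for the hardware queue.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-I/O scratch area, standing in for SrbExtension. */
typedef struct {
    uint8_t  cdb[16];      /* command bytes prepared for the hardware */
    uint32_t data_length;  /* transfer length for this request */
    int      built;        /* set once the lock-free phase has run */
} IoContext;

/* Hypothetical shared adapter state, touched only in the serialized phase. */
typedef struct {
    IoContext *ring[8];    /* requests handed to the "hardware" */
    unsigned   producer;   /* next free ring slot */
} Adapter;

/* Lock-free phase: reads the request and writes only its own context. */
int build_io(IoContext *ctx, const uint8_t *cdb, size_t cdb_len, uint32_t len)
{
    assert(ctx != NULL && cdb != NULL);
    if (cdb_len > sizeof ctx->cdb)
        return 0;
    memset(ctx, 0, sizeof *ctx);
    memcpy(ctx->cdb, cdb, cdb_len);
    ctx->data_length = len;
    ctx->built = 1;
    return 1;
}

/* Serialized phase: the only place shared adapter state is modified. */
int start_io(Adapter *a, IoContext *ctx)
{
    assert(a != NULL && ctx != NULL);
    if (!ctx->built || a->producer >= 8)
        return 0;
    a->ring[a->producer++] = ctx;
    return 1;
}
```

Because `build_io` writes nothing outside its own `IoContext`, any number of processors could run it at once; only the short `start_io` step needs the serialization Storport provides around StartIo.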

  17. StorPortGetScatterGatherList • Storport-specific API: returns a structure describing the entire scatter/gather list for Srb->DataBuffer
      PSTOR_SCATTER_GATHER_LIST StorPortGetScatterGatherList ( IN PVOID HwDeviceExtension, IN PSCSI_REQUEST_BLOCK Srb );
      • Returned structure
      struct STOR_SCATTER_GATHER_LIST { ULONG NumberOfElements; ULONG_PTR Reserved; STOR_SCATTER_GATHER_ELEMENT List[ ]; };
      • Individual scatter/gather element
      struct STOR_SCATTER_GATHER_ELEMENT { STOR_PHYSICAL_ADDRESS PhysicalAddress; ULONG Length; ULONG_PTR Reserved; };

  18. StorPortGetScatterGatherList
      ScsiPort:
        do {
            // get physical address & length of next element
            PhysAddress = ScsiPortGetPhysicalAddress( DeviceExtension, Srb, VirtualBufferPointer, &ElementLength);
            // limit element length if necessary
            if ( ElementLength > RemainingDataCount) { ElementLength = RemainingDataCount; }
            // reduce the remaining data count
            RemainingDataCount -= ElementLength;
            // save length and lower 32-bits of address
            *iovPtr++ = scriptCmd | ElementLength;
            *iovPtr++ = PhysAddress.LowPart;
            // if using 64-bit addresses, save high 32-bits
            if (do64bit) *iovPtr++ = PhysAddress.HighPart;
            // bump number of elements
            numElements++;
            // bump virtual address to get next element
            VirtualBufferPointer = (PUCHAR) VirtualBufferPointer + ElementLength;
        } while ( RemainingDataCount != 0);
      StorPort:
        // get pointer to StorPort scatter/gather list
        pSpSGStruct = StorPortGetScatterGatherList( DeviceExtension, Srb);
        numElements = pSpSGStruct->NumberOfElements;
        pSpSGL = pSpSGStruct->List;
        // build the SG move instructions
        for ( loop = 0; loop < numElements; loop++ ) {
            // save length and lower 32-bits of address
            *iovPtr++ = scriptCmd | pSpSGL->Length;
            *iovPtr++ = pSpSGL->PhysicalAddress.LowPart;
            // if using 64-bit addresses, save high 32-bits
            if (do64bit) *iovPtr++ = pSpSGL->PhysicalAddress.HighPart;
            // bump SGL pointer
            pSpSGL++;
        }
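The StorPort-side loop can be tried outside a driver with mocked types. The sketch below is an assumption-laden stand-in: `SgElement` and `SgList` are simplified substitutes for the WDK structures (fixed-size array, no Reserved fields), and `build_hw_sg` is a hypothetical helper. It adds one thing the slide code omits, a bounds check on the output buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the Storport SG types (not the WDK definitions). */
typedef struct {
    uint64_t physical_address;
    uint32_t length;
} SgElement;

typedef struct {
    uint32_t  number_of_elements;
    SgElement list[4];   /* fixed size here to keep the sketch simple */
} SgList;

/* Emit one length word and one or two address words per element.
 * Returns the number of 32-bit words written, or 0 if out would overflow
 * (an empty list also returns 0). */
size_t build_hw_sg(const SgList *sgl, uint32_t script_cmd, int do64bit,
                   uint32_t *out, size_t out_words)
{
    size_t n = 0;
    size_t per_element = do64bit ? 3 : 2;

    assert(sgl != NULL && out != NULL);
    for (uint32_t i = 0; i < sgl->number_of_elements; i++) {
        const SgElement *e = &sgl->list[i];
        if (out_words - n < per_element)
            return 0;
        out[n++] = script_cmd | e->length;                        /* length word */
        out[n++] = (uint32_t)(e->physical_address & 0xFFFFFFFFu); /* low 32 bits */
        if (do64bit)
            out[n++] = (uint32_t)(e->physical_address >> 32);     /* high 32 bits */
    }
    return n;
}
```

The loop makes no port-driver calls at all, which is the point of the comparison: the single StorPortGetScatterGatherList call has already produced every (address, length) pair.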

  19. BuildIo Routine – Per I/O Structures
      SCSI Request Block:
        typedef struct _SCSI_REQUEST_BLOCK {
            USHORT Length; UCHAR Function; UCHAR SrbStatus; UCHAR ScsiStatus;
            UCHAR PathId; UCHAR TargetId; UCHAR Lun;
            UCHAR QueueTag; UCHAR QueueAction; UCHAR CdbLength; UCHAR SenseInfoBufferLength;
            ULONG SrbFlags; ULONG DataTransferLength; ULONG TimeOutValue;
            PVOID DataBuffer; PVOID SenseInfoBuffer;
            struct _SCSI_REQUEST_BLOCK *NextSrb;
            PVOID OriginalRequest; PVOID SrbExtension;
            union { ULONG InternalStatus; ULONG QueueSortKey; ULONG LinkTimeoutValue; };
            UCHAR Cdb[16];
        } SCSI_REQUEST_BLOCK, *PSCSI_REQUEST_BLOCK;
      SrbExtension (Driver Defined): • Per I/O Context • I/O Command Block (H/W I/F) • Scatter/Gather List
      • SrbExtension length is defined in DriverEntry, but can be modified in FindAdapter to support desired maximum I/O size

  20. StartIo Routine • Same callback interface as ScsiPort • Perform I/O start tasks that must be serialized • Shared memory modification, H/W accesses, etc. • Runs asynchronously with ISR thread • Good multithreaded programming practices must be followed • StartIo and ISR thread memory modifications must be isolated • Threads can be synchronized if necessary (Synchronization APIs) • Best performance is achieved if no synchronization is required in the main I/O path

  21. Adapter Queue Management • Used to stop issuing new I/Os to entire adapter • StorPortPause/Resume used for error handling, adapter recovery
      BOOLEAN StorPortPause ( IN PVOID HwDeviceExtension, IN ULONG TimeOut );
      BOOLEAN StorPortResume ( IN PVOID HwDeviceExtension );
      • StorPortPause will automatically resume after TimeOut seconds • Calls are cumulative (multiple pauses need multiple resumes) • StorPortBusy/Ready used to stop I/Os when resources are exhausted
      BOOLEAN StorPortBusy ( IN PVOID HwDeviceExtension, IN ULONG RequestsToComplete );
      BOOLEAN StorPortReady ( IN PVOID HwDeviceExtension );
      • I/Os automatically resume after RequestsToComplete I/Os complete • There is some latency associated with these routines
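The "calls are cumulative" rule is easy to get wrong, so here is a small user-mode model of it. It is a sketch of the counting behavior the slide describes, not of Storport's implementation: `AdapterQueue`, `queue_pause`, `queue_resume`, and `queue_can_issue` are hypothetical names, and the TimeOut auto-resume is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of an adapter queue with cumulative pauses. */
typedef struct {
    unsigned pause_count;   /* pauses not yet matched by a resume */
} AdapterQueue;

/* Each pause adds one to the outstanding pause count. */
void queue_pause(AdapterQueue *q)
{
    assert(q != NULL);
    q->pause_count++;
}

/* Each resume removes one pause; returns 0 if the queue was not paused. */
int queue_resume(AdapterQueue *q)
{
    assert(q != NULL);
    if (q->pause_count == 0)
        return 0;
    q->pause_count--;
    return 1;
}

/* New I/Os flow only when every pause has been matched by a resume. */
int queue_can_issue(const AdapterQueue *q)
{
    assert(q != NULL);
    return q->pause_count == 0;
}
```

In this model, two error paths that each pause the adapter must each resume it before I/O restarts, which matches the "multiple pauses need multiple resumes" bullet.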

  22. Device Queue Management • Used to stop issuing new I/Os to a specific LUN • StorPortPause/ResumeDevice pause and resume I/O stream to LUN
      BOOLEAN StorPortPauseDevice ( IN PVOID HwDeviceExtension, IN UCHAR PathId, IN UCHAR TargetId, IN UCHAR Lun, IN ULONG TimeOut );
      BOOLEAN StorPortResumeDevice ( IN PVOID HwDeviceExtension, IN UCHAR PathId, IN UCHAR TargetId, IN UCHAR Lun );
      • StorPortDeviceBusy/Ready stop and restart I/Os to a LUN until some number of I/Os complete
      BOOLEAN StorPortDeviceBusy ( IN PVOID HwDeviceExtension, IN UCHAR PathId, IN UCHAR TargetId, IN UCHAR Lun, IN ULONG RequestsToComplete );
      BOOLEAN StorPortDeviceReady ( IN PVOID HwDeviceExtension, IN UCHAR PathId, IN UCHAR TargetId, IN UCHAR Lun );
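The Busy/Ready behavior (hold new I/Os until RequestsToComplete I/Os have completed) can be modeled the same way. This is a user-mode sketch under stated assumptions: `LunQueue` and the `lun_*` functions are hypothetical, and clamping the request count to the number of outstanding I/Os is a choice made for this model, not documented Storport behavior.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one LUN's queue state. */
typedef struct {
    unsigned outstanding;         /* I/Os issued and not yet completed */
    unsigned completions_needed;  /* nonzero while the LUN is "busy" */
} LunQueue;

/* Hold new I/Os until requests_to_complete outstanding I/Os finish. */
void lun_busy(LunQueue *l, unsigned requests_to_complete)
{
    assert(l != NULL);
    /* Assumption of this sketch: never wait for more completions than exist. */
    if (requests_to_complete > l->outstanding)
        requests_to_complete = l->outstanding;
    l->completions_needed = requests_to_complete;
}

/* Explicitly clear the busy state before the count runs out. */
void lun_ready(LunQueue *l)
{
    assert(l != NULL);
    l->completions_needed = 0;
}

/* Returns 1 if a new I/O was issued, 0 if the LUN is still busy. */
int lun_issue(LunQueue *l)
{
    assert(l != NULL);
    if (l->completions_needed != 0)
        return 0;
    l->outstanding++;
    return 1;
}

/* Record one completion; the busy state clears itself at zero. */
void lun_complete(LunQueue *l)
{
    assert(l != NULL);
    if (l->outstanding != 0)
        l->outstanding--;
    if (l->completions_needed != 0)
        l->completions_needed--;
}
```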

  23. StorPortSetDeviceQueueDepth • Sets maximum number of concurrent I/Os Storport issues to a LUN • This is a LUN queue depth, not Target ID • Use to increase LUN queue depth from Storport default of 20
      BOOLEAN StorPortSetDeviceQueueDepth ( IN PVOID HwDeviceExtension, IN UCHAR PathId, IN UCHAR TargetId, IN UCHAR Lun, IN ULONG Depth );
      • Can be called after miniport determines LUN exists (receipt of valid Inquiry data) • LSI_U3 sample driver sets each LUN queue depth to 31

  24. MSI/MSI-X Initialization • Supported by Storport in Vista SP1, Server 2008 and later versions • Masking of MSI interrupts is an optional feature. MSI-X masking is a mandatory feature (via the MSI-X Vector Table) • Requires additions in: • FindAdapter • HwInitialize
      // set interrupt mode and interrupt routine for MSI
      ConfigInfo->InterruptSynchronizationMode = InterruptSynchronizeAll;
      ConfigInfo->HwMSInterruptRoutine = HwScsiMSIIsr;
      if ( StorPortGetMSIInfo( DeviceExtension, 0, &DeviceExtension->MsiInfo) == STOR_STATUS_SUCCESS )
      {
          // MSI is supported - set MSI_Enabled flag
          DeviceExtension->MSI_Enabled = TRUE;
      }

  25. MSI/MSI-X Completing IOs • MSI/MSI-X interrupts behave differently than H/W line interrupts • PCI H/W line-based interrupts are level-sensitive • As long as an interrupt is pending the interrupt is signaled • Miniport can leave ISR and it will be called again if interrupt is pending • MSI/MSI-X interrupts are edge-sensitive • For some hardware, once the ISR is called due to an MSI/MSI-X interrupt, all pending interrupts must be processed (clearing the interrupt register) before another MSI/MSI-X interrupt will be issued by the H/W
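The edge-sensitive behavior means an ISR should keep looping until nothing is pending, because no further message arrives for work left behind. A minimal user-mode sketch of that drain loop follows; `FakeHw` and `msi_isr` are hypothetical, and the one-bit-per-completion status register is an invented hardware model, not the LSI register layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Invented hardware model: one pending bit per completed I/O slot. */
typedef struct {
    uint32_t pending;    /* stand-in for an interrupt status register */
    unsigned completed;  /* completions the ISR has processed */
} FakeHw;

/* Returns 1 if any work was found, 0 otherwise. The outer loop re-reads the
 * status after every pass so completions that arrive while earlier ones are
 * being processed are handled before the ISR returns. */
int msi_isr(FakeHw *hw)
{
    int handled = 0;
    uint32_t status;

    assert(hw != NULL);
    while ((status = hw->pending) != 0) {
        hw->pending &= ~status;       /* acknowledge the bits taken this pass */
        while (status != 0) {
            status &= status - 1;     /* clear the lowest set bit */
            hw->completed++;          /* complete one I/O */
        }
        handled = 1;
    }
    return handled;
}
```

With a level-sensitive line interrupt the outer loop would be optional, since the ISR is simply called again while the line stays asserted.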

  26. ISR - Completing IOs • Line interrupt ISR uses same callback interface as ScsiPort • MSI/MSI-X interrupt ISR provides the MSI message ID
      BOOLEAN HwMSInterruptRoutine ( IN PVOID HwDeviceExtension, IN ULONG MessageID );
      • MessageID is an index into the adapter MSI/MSI-X vector table • Not needed if synchronization mode is InterruptSynchronizeAll • Used to get additional info via StorPortGetMSIInfo • Used to acquire MSI spin locks (if synchronized per message) • Synchronization mode of InterruptSynchronizePerMessage • Multiple ISR routines can be active, each holding a Message ID spinlock • Use StorPortAcquire[Release]MSISpinLock to synchronize ISRs

  27. Synchronization - StorPortSynchronizeAccess • Synchronizes current StartIo thread with the ISR thread • SynchronizedAccessRoutine – routine to be called after ISR lock acquired • Context – pointer to a variable or structure to pass to routine • SynchronizedAccessRoutine can return a BOOLEAN, which is returned by StorPortSynchronizeAccess • Cannot be called from ISR thread (deadlock) • Schedule a timer routine and synchronize from that routine • Timer routines run holding the StartIo lock
      BOOLEAN StorPortSynchronizeAccess ( IN PVOID HwDeviceExtension, IN PSTOR_SYNCHRONIZED_ACCESS SynchronizedAccessRoutine, IN PVOID Context );
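The calling pattern (routine plus Context, with the routine's BOOLEAN passed back to the caller) can be shown with a user-mode mock. Nothing here is the Storport implementation: `DevExt`, `synchronize_access`, and `drain_completions` are hypothetical, and a plain flag stands in for the ISR spin lock.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*SyncRoutine)(void *device_ext, void *context);

/* Hypothetical device extension with state shared between StartIo and ISR. */
typedef struct {
    int      isr_lock_held;  /* flag standing in for the ISR spin lock */
    unsigned completions;    /* counter the ISR side would also update */
} DevExt;

/* Mock of the pattern: take the lock, run the routine, pass its result back. */
int synchronize_access(DevExt *ext, SyncRoutine routine, void *context)
{
    int result;

    assert(ext != NULL && routine != NULL);
    ext->isr_lock_held = 1;
    result = routine(ext, context);
    ext->isr_lock_held = 0;
    return result;
}

/* Example routine: read and clear the shared counter while "locked". */
int drain_completions(void *device_ext, void *context)
{
    DevExt   *ext = device_ext;
    unsigned *out = context;

    assert(ext->isr_lock_held);
    *out = ext->completions;
    ext->completions = 0;
    return *out != 0;
}
```

Keeping the synchronized routine this short matters in a real driver, since the ISR cannot run on any processor while the routine holds the lock.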

  28. Synchronization - StorPortSynchronizeAccess (Diagram: CPU1 runs ISR calls while CPU2 runs StartIo and SyncExec) • ISR lock acquired before each ISR call • Storport acquires StartIo lock before calling StartIo • StartIo calls StorPortSynchronizeAccess, waits for ISR lock • SyncExec now has both StartIo and ISR locks • SyncExec returns, releasing the ISR lock • Release StartIo lock

  29. Synchronization – Acquire / Release Routines • Individual routines available to acquire and release locks
      VOID StorPortAcquireSpinLock ( IN PVOID HwDeviceExtension, IN STOR_SPINLOCK SpinLock, IN PVOID LockContext, IN PSTOR_LOCK_HANDLE LockHandle );
      VOID StorPortReleaseSpinLock ( IN PVOID HwDeviceExtension, IN PSTOR_LOCK_HANDLE LockHandle );
      • Locks are hierarchical and must be acquired and released in order • DPC → StartIo → ISR • Storport acquires locks before calling certain miniport callbacks • StartIo, Timer, ResetBus routines – StartIo lock • Initialize, ISR, AdapterControl (StopAdapter) – ISR lock • All other miniport callbacks – No locks
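The DPC → StartIo → ISR ordering rule can be checked mechanically. The sketch below is a user-mode model with hypothetical names (`LockLevel`, `LockState`, `lock_acquire`, `lock_release`); it assumes a lock may be taken only if it ranks above every lock already held, and that release happens in the reverse order of acquisition.

```c
#include <assert.h>
#include <stddef.h>

/* Lock ranks in acquisition order: DPC, then StartIo, then ISR. */
typedef enum { LOCK_NONE = 0, LOCK_DPC = 1, LOCK_STARTIO = 2, LOCK_ISR = 3 } LockLevel;

typedef struct {
    LockLevel held[3];  /* stack of locks currently held */
    int       depth;    /* number of valid entries in held[] */
} LockState;

/* Returns 1 if taking `level` respects the hierarchy, 0 otherwise. */
int lock_acquire(LockState *s, LockLevel level)
{
    LockLevel top;

    assert(s != NULL);
    top = s->depth ? s->held[s->depth - 1] : LOCK_NONE;
    if (level <= top || level > LOCK_ISR || s->depth >= 3)
        return 0;               /* out of order or unknown lock */
    s->held[s->depth++] = level;
    return 1;
}

/* Returns 1 only if `level` is the most recently acquired lock. */
int lock_release(LockState *s, LockLevel level)
{
    assert(s != NULL);
    if (s->depth == 0 || s->held[s->depth - 1] != level)
        return 0;
    s->depth--;
    return 1;
}
```

A callback that already runs under the StartIo lock (StartIo, Timer, ResetBus) would begin with `LOCK_STARTIO` on the stack, so in this model it can still take the ISR lock but never the DPC lock.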

  30. Synchronization – MSI Message Locks • Routines available to acquire and release locks for individual MSI message IDs
      ULONG StorPortAcquireMSISpinLock ( IN PVOID HwDeviceExtension, IN ULONG MessageId, IN PULONG OldIrql );
      VOID StorPortReleaseMSISpinLock ( IN PVOID HwDeviceExtension, IN ULONG MessageId, IN ULONG OldIrql );
      • Used if synchronization mode is InterruptSynchronizePerMessage • Miniport must acquire lock on each entry into ISR, release on exit • MessageId – Acquire the lock on this MSI Message ID • OldIrql – IRQL level to return to when this lock is released

  31. Hierarchical Resets • Storport has hierarchical resets: If one reset fails, escalate to the next • LUN → Target → Bus • Bus resets are issued via the HwResetBus callback • LUN and target resets are issued via SRBs • SRB_FUNCTION_RESET_LOGICAL_UNIT (PathId, TargetId, Lun) • SRB_FUNCTION_RESET_DEVICE (PathId, TargetId) • LUN and target resets are timed by Storport – 30 second timeout • Storport can issue multiple LUN and target resets concurrently • A timed-out LUN reset will be followed by a target reset that is issued using the same SRB • A bus reset can be issued while a LUN or Target reset is active
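The LUN → Target → Bus escalation can be written as a short loop. This sketch models what the port driver does on the miniport's behalf; it is not code a miniport would contain. `ResetLevel`, `escalate_reset`, and `reset_succeeds_at` are hypothetical, and the timeouts and SRB reuse described on the slide are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Reset scopes from least to most disruptive. */
typedef enum { RESET_LUN, RESET_TARGET, RESET_BUS, RESET_FAILED } ResetLevel;

/* Callback that attempts one reset; returns nonzero on success. */
typedef int (*ResetFn)(ResetLevel level, void *ctx);

/* Try each scope in turn and stop at the first one that succeeds. */
ResetLevel escalate_reset(ResetFn try_reset, void *ctx)
{
    assert(try_reset != NULL);
    for (int level = RESET_LUN; level <= RESET_BUS; level++) {
        if (try_reset((ResetLevel)level, ctx))
            return (ResetLevel)level;
    }
    return RESET_FAILED;
}

/* Test double: resets succeed at or above the level stored in ctx. */
int reset_succeeds_at(ResetLevel level, void *ctx)
{
    const ResetLevel *threshold = ctx;
    return level >= *threshold;
}
```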

  32. WMI Support • WMI provides management data to applications and services • Data blocks read / written • Methods executed • Asynchronous event notifications • Better defined and structured than using IOCTLs • Windows-defined header files for Fibre Channel (FC) and Serial Attached SCSI (SAS) host bus adapter (HBA) API • hbaapi.h, hbapiwmi.h – see \WinDDK\<build>\inc\ddk • Additional WMI functionality with custom MOF and header files • Storport miniports must support WMI API • Initialize WmiLibContext structure with GUIDs and callbacks supported • Process SRB_FUNCTION_WMI via ScsiPortWmiDispatchFunction • Callback for requested data block / method is called • More info at http://msdn.microsoft.com/en-us/library/ms803223.aspx

  33. Event Tracing for Windows - ETW • ETW much more robust than DebugPrint statements • 2-dimensional filtering with user-defined categories and levels • Trace info captured in a file via WMI, no debugger needed • Very low overhead, can be included in retail (production) drivers • Trace info tokenized: readable strings not included in output file • Post capture, trace file can be processed or viewed on any system with header files created during driver build • All trace entries are timestamped to the millisecond • Miniport API very easy to implement • Fill in STORAGE_TRACE_INIT_INFO structure, call WPP_INIT_TRACING macro during DriverEntry • When you want to trace an event, action, or status: • DoStorageTraceEtw( Level, Category, “<printf type string>”, <data values for string>); • Tracing cleanup routine automatically called at driver shutdown
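"2-dimensional filtering" means a trace statement fires only when both its level and its category are enabled. The following user-mode sketch shows one common way to express that test; the level values, category bits, `TraceFilter`, and `trace_enabled` are all hypothetical and are not taken from the WDK tracing headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical severity levels: lower numbers are more severe. */
enum { TRACE_ERROR = 1, TRACE_WARNING = 2, TRACE_INFO = 3, TRACE_VERBOSE = 4 };

/* Hypothetical user-defined categories, one bit each. */
enum { CAT_INIT = 0x1, CAT_IO = 0x2, CAT_RESET = 0x4 };

typedef struct {
    unsigned max_level;      /* most verbose level currently enabled */
    uint32_t category_mask;  /* enabled category bits */
} TraceFilter;

/* A trace is emitted only when BOTH dimensions pass the filter. */
int trace_enabled(const TraceFilter *f, unsigned level, uint32_t category)
{
    assert(f != NULL);
    return level <= f->max_level && (f->category_mask & category) != 0;
}
```

Checking the filter before formatting any arguments is what keeps disabled trace statements cheap enough to leave in a retail build.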

  34. Event Tracing for Windows - Output

  35. LSI_U3 Sample Driver

  36. LSI_U3 Sample Driver Project • Port of LSI production SYM_U3 ScsiPort based driver to StorPort • Supports LSI Ultra160 Parallel SCSI adapters • Older technology SCSI controller hardware (minimal intelligence) • Very simple custom 8-bit RISC processor, 8K internal RAM (circa 1999) • H/W and scripts limited to use 256 queue tag values per adapter • H/W and scripts designed to ScsiPort capabilities • Adjustments in miniport match StorPort advanced functionality to limited H/W capabilities • Porting required about 3 weeks (develop/test) • Will be the Storport miniport sample in the next version of the WDK

  37. LSI_U3 Porting Activities • Summary of ScsiPort to StorPort porting activities • Remove all NextRequest and NextLuRequest calls • Rename all ScsiPortxxx calls to StorPortxxx • Add BuildIo routine (moved about 75% of code from StartIo) • Convert driver to full duplex operation • Add queue management routines (error handling, adapter restart) • Add LUN and Target reset support • Add code to assign queue tags internally (H/W limitation) • Add synchronization routines for bus reset and full adapter restart • Add support for SRB_FUNCTION_POWER (adapter shutdown via SRB) • Add StorPortSetDeviceQueueDepth call—LUN queue depth of 31
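The "assign queue tags internally" step implies some allocator over the 256 tag values the hardware supports per adapter. One possible shape is a bitmap, sketched below in user-mode C. This is not the LSI_U3 implementation: `TagPool`, `tag_alloc`, and `tag_free` are hypothetical, and a real miniport would call them only from a serialized context such as StartIo or under a lock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TAG_COUNT 256   /* per-adapter tag limit stated on the slides */

/* One bit per tag: set means the tag is in use. */
typedef struct {
    uint32_t used[TAG_COUNT / 32];
} TagPool;

/* Returns the lowest free tag (0..255), or -1 if all tags are in use. */
int tag_alloc(TagPool *p)
{
    assert(p != NULL);
    for (int w = 0; w < TAG_COUNT / 32; w++) {
        if (p->used[w] == 0xFFFFFFFFu)
            continue;                       /* this word is full */
        for (int b = 0; b < 32; b++) {
            if (!(p->used[w] & (1u << b))) {
                p->used[w] |= 1u << b;
                return w * 32 + b;
            }
        }
    }
    return -1;
}

/* Returns 1 if the tag was in use and is now free, 0 otherwise. */
int tag_free(TagPool *p, int tag)
{
    uint32_t bit;

    assert(p != NULL);
    if (tag < 0 || tag >= TAG_COUNT)
        return 0;
    bit = 1u << (tag % 32);
    if (!(p->used[tag / 32] & bit))
        return 0;                           /* double free or never allocated */
    p->used[tag / 32] &= ~bit;
    return 1;
}
```

When `tag_alloc` returns -1, the adapter-level StorPortBusy call described earlier is the natural way to hold off new I/Os until some complete.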

  38. LSI_U3 Code Review

  39. LSI_U3 Demo

  40. Call to Action • Accept the fact that ScsiPort is DEAD!! • Don’t continue to live in a state of denial! • Get familiar with Storport with the resources on the next slide • Try porting an existing ScsiPort driver if you have one • Use the new WDK when available • Try out the LSI_U3 sample driver (LSI U160 adapters are available) • Develop high-performance Storport storage drivers

  41. Additional Resources • White Papers • Designing RAID Adapters to Work with Windows http://www.microsoft.com/whdc/device/storage/RAID_design.mspx • WDK Documentation • Storport Driver Support Routines http://msdn.microsoft.com/en-us/library/ms807277.aspx • Storport Driver Miniport Routines http://msdn.microsoft.com/en-us/library/ms807164.aspx • Handling WMI SRBs in Storage Miniport Drivers http://msdn.microsoft.com/en-us/library/ms803223.aspx

  42. Related Sessions

  43. Questions?
