Optimizing Applications For Remote File Access Over WAN

Presentation Transcript

ES23

Optimizing Applications For Remote File Access Over WAN

 Mathew George

Sr. Software Engineer

Microsoft Corporation


Agenda

  • Introduction and Motivation

  • Understanding the problem

  • Improvements to the platform

  • Application guidelines

    • Optimizing “throughput oriented” applications

    • Optimizing “interactive” applications

    • General considerations

  • Summary


Introduction: Why care about file access over a WAN?

  • Storage consolidation is moving storage away from the application

    • Branch office servers being consolidated to the data center

    • Data center to data center movement of data for disaster recovery and content distribution

    • Cloud storage

  • Trends driving storage consolidation

    • WAN bandwidth is increasing and cost/bandwidth is decreasing

    • Cost reductions, better resource utilization, centralized management, and better uptime


Introduction: What makes an application WAN unfriendly?

  • WAN bandwidth is still very valuable – it costs money

  • Chatty applications can lead to long end user wait times when running over a WAN

  • A bad programming model can cause app hangs when running over a WAN

    • Humans typically want a response time of less than 5 sec

    • A hung application is as bad as a crashed application


Introduction: Why should I change my application?

  • Apps often assume files are local and hence that data and metadata access is fast

    • This assumption is invalid over a WAN

  • Improvements made to the platform (Windows Vista and Windows 7) could expose new app bottlenecks

  • Changes to APIs require apps to make use of them

  • What kind of gains can I expect?

    • 2-10x for throughput oriented applications

    • 30% traffic reduction and response time improvement for interactive applications


Understanding The Problem: Understanding the network parameters

  • Bandwidth and Round trip Latency (RTT)

  • Bandwidth delay product (BDP)

    • Example: the BDP of a 100 ms, 100 Mbps link is 1.25 MB

  • Network utilization

    • Max theoretical utilization is the amount of outstanding data divided by the BDP

    • Example: an app posting 64 KB of data at a time on that link gets ~5% utilization
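
The two examples above can be checked with a few lines of arithmetic. This is a minimal sketch: the function names are illustrative and not part of any API, and it assumes 1 MB = 10^6 bytes and 64 KB = 65,536 bytes.

```python
def bdp_bytes(bandwidth_bps: float, rtt_seconds: float) -> float:
    """Bandwidth delay product in bytes: the data the link can hold in flight."""
    return bandwidth_bps * rtt_seconds / 8


def max_utilization(outstanding_bytes: float, bdp: float) -> float:
    """Upper bound on link utilization for a given amount of outstanding data."""
    return min(1.0, outstanding_bytes / bdp)


bdp = bdp_bytes(100e6, 0.100)           # 100 Mbps link, 100 ms round trip
util = max_utilization(64 * 1024, bdp)  # a single 64 KB request in flight
print(f"BDP = {bdp / 1e6:.2f} MB, max utilization = {util:.1%}")
```

With these assumptions the bound comes out at roughly 5%, in line with the slide.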

[Figure: Connectivity from branch offices to nearest data center, Fall 2007. Panels: Continental Bandwidth (Mb/s) and Continental Latencies (ms).]


Understanding The Problem: The SMB file I/O stack

  • SMB (CIFS) and SMB2 – our core file sharing protocols

    • SMB/CIFS had limitations running on high BDP networks

    • The SMB2 protocol introduced with Windows Vista is optimized for high BDP networks

  • Clients have the ability to cache file data and metadata

    • The cache manager and the SMB redirector manage the caching

  • The Win32 API provides the application interface

    • Synchronous versus overlapped I/O

    • Cached (Buffered) versus non-cached

    • Handle versus path based APIs

    • I/O cancellation APIs

[Figure: The SMB file I/O stack. SMB client: Application, Win32/NT File I/O APIs, Client Cache, SMB Redirector, Network Stack. File server: Network Stack, SMB Server, Server Cache, Filesystem, Disk. The two sides are connected by the network.]


Understanding The Problem: Copying a large file

  • Copy a 20 MB file from local disk to a remote server over a 1 Gbps, 100 ms link (Windows Server 2003)

  • Observations

    • Operation takes 38 sec

    • Link BDP = 1 Gbps * 100 ms = 100 Mb ( ~ 12 MB )

    • Throughput is 20 MB/38 sec = 4.2 Mbps

    • Network utilization is 4.2 Mbps/ 1 Gbps = 0.42%

  • Analysis

    • The CopyFile API posts a single 64 KB buffer at a time

    • Max theoretical utilization is 64 KB / 12 MB = 0.53%

    • Observed utilization is lower because of other overheads
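
Another way to read these numbers is to count round trips. The sketch below assumes each 64 KB request costs exactly one round trip and nothing else, and takes 20 MB as 20 × 2^20 bytes; the function name is illustrative.

```python
import math


def serial_copy_lower_bound(file_bytes: int, chunk_bytes: int, rtt_seconds: float) -> float:
    """Minimum copy time when only one chunk is in flight per round trip."""
    round_trips = math.ceil(file_bytes / chunk_bytes)
    return round_trips * rtt_seconds


bound = serial_copy_lower_bound(20 * 1024 * 1024, 64 * 1024, 0.100)
print(f"lower bound = {bound:.0f} s")
```

Under these assumptions the copy needs 320 round trips, so it cannot finish in much less than about 32 seconds regardless of link bandwidth. The observed 38 seconds is consistent with that bound plus other overheads.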


Understanding The Problem: Opening a Word document

  • Open a 300KB Office 2003 document across a simulated 100 Mbps, 100 ms link

  • Observations

    • About 23 seconds to open the file

    • Approximately 1200 SMB frames seen on the wire including traffic in both directions

    • Same file opened and closed repeatedly

    • Data read multiple times from the server

    • Significant metadata traffic

  • Analysis

    • The bulk of the data transfer is caused by SMB losing the ability to cache data because of multiple opens

    • Windows Explorer and Office 2003 interfere with each other by doing I/O on the same file


Enabling High Throughput Applications: Platform Improvements

  • SMB2 Redirector and Server

    • Protocol supports larger buffer sizes for data and metadata operations

    • Dynamic scaling of the number of outstanding operations based on network BDP

    • Support for deeper I/O pipelines

    • Automatic pipelining of large I/O requests

  • Network (TCP/IP) stack

    • Larger TCP window sizes

    • High BDP optimizations

    • Windows Vista and later OS releases

  • Cache manager

    • Larger I/O sizes



Enabling High Throughput Applications: Platform Improvements

  • Significant optimizations to the CopyFile API.

    • Windows Server 2008 and later OS releases have these optimizations

    • Uses 1 MB I/O requests (as opposed to 64 KB)

    • Issues up to 8 outstanding I/O operations (as opposed to 1 at a time)

    • All inbox file copy tools (copy, xcopy, robocopy, and Windows Explorer) see gains

[Figure: Robocopy throughput comparison between Windows Server 2003 and Windows Server 2008 transferring a 4.5 GB file over a 1 Gbps WAN link. Pull = copy from server to local disk; Push = copy from local disk to server.]


Enabling High Throughput Applications: Application Guidelines

  • Use overlapped I/O instead of synchronous

    • Use the FILE_FLAG_OVERLAPPED option to CreateFile

    • ReadFile, WriteFile and DeviceIoControl APIs

      • Wait for completion (GetOverlappedResult)

      • Completion callback (“Ex” versions of the API)

      • Use completion ports for even better throughput

    • Issue sufficient I/O to fill the network BDP

      • Limit I/Os based on end to end response time and resources

    • Works best when buffering is turned off

    • Helps SMB1, SMB2 as well as local I/O
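
The idea of keeping several I/O requests in flight can be sketched in a portable way with a thread pool. This is only an analogue, not Win32 overlapped I/O: a real implementation on Windows would issue ReadFile calls on a single handle opened with FILE_FLAG_OVERLAPPED. The function names, the 1 MB chunk size and the limit of 8 outstanding reads are assumptions chosen for illustration, and the sketch opens the file once per chunk purely for simplicity.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def read_chunk(path, offset, size):
    """Read one chunk at the given offset (one handle per call, for simplicity)."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(size)


def pipelined_read(path, chunk_size=1 << 20, max_outstanding=8):
    """Read a whole file while keeping up to max_outstanding reads in flight."""
    total = os.path.getsize(path)
    offsets = range(0, total, chunk_size)
    with ThreadPoolExecutor(max_workers=max_outstanding) as pool:
        # map() returns results in offset order even if reads finish out of order
        chunks = pool.map(lambda off: read_chunk(path, off, chunk_size), offsets)
        return b"".join(chunks)
```

Bounding the pool size reflects the guideline to limit the number of I/Os based on response time and resources rather than issuing everything at once.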

[Figure: The effect of pipelining on network utilization, showing idle versus utilized link time for non-pipelined and pipelined I/O.]


Enabling High Throughput Applications: Application Guidelines

  • Large I/O sizes allow the OS to process the request more efficiently

    • Fewer passes through the I/O stack

    • OS can segment the request into optimal sized chunks and pipeline each individual chunk

      • Due to limitations in the SMB1 stack, use a 60 KB chunk when reading data and a 64 KB chunk when writing

      • With SMB2 (or for local files), I/O sizes of around 1 MB work well

      • Very large I/O sizes (> 16 MB) can result in memory fragmentation and resource shortages

  • Use the CopyFile API for large data transfers

    • When dealing with lots of small files, use multiple threads to issue parallel CopyFile calls

    • For Windows 7, we are adding a multithreading option to the robocopy tool
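
For the many-small-files case, the parallel copy pattern might look like the sketch below. It uses Python's `shutil.copyfile` rather than the Win32 CopyFile API, and the function name and the default of 8 workers are assumptions for illustration only.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def copy_small_files(src_dir, dst_dir, max_workers=8):
    """Copy every regular file directly under src_dir into dst_dir in parallel."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    files = [p for p in src_dir.iterdir() if p.is_file()]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() waits for every copy and re-raises the first error, if any
        list(pool.map(lambda p: shutil.copyfile(p, dst_dir / p.name), files))
    return len(files)
```

Each copy of a small file is dominated by round trips rather than bandwidth, so running several copies at once overlaps those waits.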


Enabling High Throughput Applications: Application Guidelines

  • Take advantage of the cache manager by doing buffered I/O

    • Useful in scenarios where the app cannot do asynchronous or large I/Os

    • Can hide the delays caused by slow disks

    • Provide hints when opening the file

      • FILE_FLAG_RANDOM_ACCESS

      • FILE_FLAG_SEQUENTIAL_SCAN

  • Minimize extending writes

    • Set the file size first before writing data

  • Be cautious about

    • Opening files with the FILE_FLAG_WRITE_THROUGH option

    • Making frequent calls to FlushFileBuffers
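
The "set the file size first" guideline can be sketched as follows. The function name is hypothetical, and how the Python runtime maps `truncate` onto the underlying OS calls is an assumption; the point is only the ordering: size the file once, then write into it.

```python
def write_preallocated(path, chunks, total_size):
    """Set the final file size once, then fill the file in without extending it."""
    with open(path, "wb") as f:
        f.truncate(total_size)  # extend the file to its final size up front
        for chunk in chunks:
            f.write(chunk)      # each write lands inside the already-sized file
```

This requires knowing the total size in advance, which is usually the case when copying or downloading a file.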


Developing Responsive Applications: Understanding the Windows I/O model

  • Handle based I/O

    • A handle is obtained by opening a file (via the CreateFile API)

    • All I/O operations are performed on the handle (Example: ReadFile, WriteFile, LockFile, GetFileInformationByHandle, ReadDirectoryChanges)

    • The handle is closed after use

  • Path based APIs

    • A sequence of 2 or more handle based primitives

      GetFileAttributes = Open + QueryAttributes + Close

    • Similarly, SetFileAttributes, DeleteFile, and MoveFile involve multiple I/O operations
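
The difference between the two styles can be illustrated in Python, with the caveat that how these calls map onto Win32 and SMB operations depends on the runtime and OS; the function names are made up for this sketch.

```python
import os


def size_and_mtime_by_path(path):
    """Two path-based queries: each one has to resolve and open the path again."""
    return os.path.getsize(path), os.path.getmtime(path)


def size_and_mtime_by_handle(f):
    """One query on an already-open file returns both values."""
    info = os.fstat(f.fileno())
    return info.st_size, info.st_mtime
```

When the application already holds the file open for reading or writing, the handle-based form avoids the extra open and close that each path-based call implies.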


Developing Responsive Applications: Understanding caching in the SMB context

  • Data caching

    • Keeping a copy of previously read data

    • Holding onto data written by the application and “lazily” flushing the data to the server

    • Win32 file I/O is buffered (cached) by default, except if opened with FILE_FLAG_NO_BUFFERING or FILE_FLAG_WRITE_THROUGH options.

  • Metadata caching

    • File attributes and directory listings can be cached

  • Handle caching

    • The SMB client holds the handle open after the application has closed the file.


Developing Responsive Applications: Understanding caching in the SMB context

  • Maintaining cache coherency

    • Multiple clients accessing the same data

    • SMB uses “opportunistic locks” (Oplocks)

    • Completely hidden from the application

  • Oplocks tell the client what it can cache

    • Granted by the server when a file is opened

    • A BATCH oplock allows the SMB client to cache reads, writes and the handle (exclusive)

      • SMB client can cache data even after the app closes the file

    • A Level II oplock allows the client to cache reads (shared)

      • SMB client cannot cache data after the app closes the file

  • Oplocks can be revoked by the server

    • Client loses the ability to cache


Developing Responsive Applications: Data caching lost by opening multiple handles

Sequence between client and server (batch oplock):

  • CreateFile( GENERIC_READ | GENERIC_WRITE ): the server grants a batch oplock and the create completes

  • ReadFile: the read completes and the data is cached

  • CloseHandle: the close is not sent out on the wire

  • CreateFile( GENERIC_READ | GENERIC_WRITE ): the cached handle is re-used

  • WriteFile: the data is written to the cache

  • CreateFile( GENERIC_READ ): the server breaks the oplock, the client flushes cached data and the cache is destroyed, then the create completes

  • WriteFile: no more caching!


Developing Responsive Applications: SMB2 leasing in Windows 7

  • Enhancement to the SMB2 protocol in Windows 7 to support better caching semantics

    • Better support existing applications

    • Layered applications are hard to change

    • Mitigate cross application interference

  • Allows full caching when multiple handles are opened by the same “client”

  • A new lease level which allows multiple clients to cache reads as well as handles

    • Multiple clients can hold on to cached data after app closes handle


Developing Responsive Applications: SMB2 leasing in action

Sequence between client and server (SMB2 lease):

  • CreateFile( GENERIC_READ | GENERIC_WRITE ): the create completes and a lease is granted

  • ReadFile: the read completes and the data is cached

  • CloseHandle: the close is not sent out on the wire

  • CreateFile( GENERIC_READ | GENERIC_WRITE ): the cached handle is re-used

  • WriteFile: the data is written to the cache

  • CreateFile( GENERIC_READ ): the create completes

  • WriteFile: the data is written to the cache
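
The contrast between the batch oplock sequence and the lease sequence can be replayed with a toy model. This is a deliberately simplified sketch, not the SMB protocol: the class, its fields and the operation names recorded in `wire` are all invented for illustration, and it tracks a single file on a single client.

```python
class ToyRedirector:
    """Toy model of client caching for one file. Not the real SMB protocol."""

    def __init__(self, leasing):
        self.leasing = leasing     # True models an SMB2 lease, False a batch oplock
        self.open_modes = []       # access modes of handles the app has open
        self.cached_handle = None  # mode of a handle kept after the app closed it
        self.caching = True        # may data still be cached?
        self.data_cached = False   # has file data been read into the cache?
        self.wire = []             # operations actually sent to the server

    def create(self, mode):
        if not self.open_modes and self.cached_handle == mode:
            self.cached_handle = None          # cached handle is re-used
        else:
            conflict = bool(self.open_modes) and mode not in self.open_modes
            if conflict and self.caching and not self.leasing:
                self.wire.append("flush")      # oplock break: flush, stop caching
                self.caching = False
                self.data_cached = False
            self.wire.append("create")
        self.open_modes.append(mode)

    def close(self, mode):
        self.open_modes.remove(mode)
        if self.caching and not self.open_modes:
            self.cached_handle = mode          # close is not sent on the wire
        else:
            self.wire.append("close")

    def read(self):
        if not (self.caching and self.data_cached):
            self.wire.append("read")           # must fetch from the server
            self.data_cached = self.caching

    def write(self):
        if not self.caching:
            self.wire.append("write")          # no cache: goes straight to server


def slide_sequence(leasing):
    """Replay the application calls from the slides and return the wire traffic."""
    r = ToyRedirector(leasing)
    r.create("rw"); r.read(); r.close("rw")
    r.create("rw"); r.write()
    r.create("r"); r.write()
    return r.wire


print("batch oplock:", slide_sequence(leasing=False))
print("SMB2 lease:  ", slide_sequence(leasing=True))
```

In the model, the same application calls produce extra flush and write traffic in the oplock case once the second, different open arrives, while the lease case keeps writing to the cache.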


Developing Responsive Applications: More Windows 7 caching enhancements

  • Transparent cache

    • A secondary on-disk cache to augment the client’s in-memory cache

    • Uses the Windows offline files infrastructure.

    • Selectively enabled based on network latency and throughput.

  • BranchCache

    • A peer cache which works in conjunction with the “offline files” cache.

    • Uses hashes generated by the server to fetch data from peers.


Developing Responsive Applications: Windows 7 BranchCache in action

[Figure: Windows 7 clients (Client 1 and Client 2) in a branch connected to the server over a high latency, low-bandwidth WAN link.]

  • First access to a file on the server pulls down the file over the slow WAN link (WAN access)

  • Subsequent access from the same client is satisfied from the transparent cache (local machine access)

  • Second access to the same file from another user in the branch is satisfied from the peer (local subnet access)


Developing Responsive Applications: Application guidelines for effective caching

  • Avoid multiple open handles to the same file at the same time

    • Use the handle based APIs if possible

    • With SMB2 leasing, opening multiple handles on Windows 7 is no longer a problem

  • Make use of SMB “handle collapsing”.

    • Identical handles to the same file can be “collapsed” (same access mode, share mode and create options)

    • Particularly useful for SMB1 because a collapsed open implies that oplocks are not broken.


Developing Responsive Applications: Application guidelines for effective caching

  • Provide hints to the cache manager and the SMB redirector when opening the file

    • FILE_FLAG_SEQUENTIAL_SCAN tells the cache manager to cache data just ahead of where the application is reading

    • Incorrect hints can result in poor caching behavior.

  • Other caveats

    • Write-only opens are not cached

    • Byte range locks cause loss of all caching if there are multiple handles open


Developing Responsive Applications: Platform Support for Metadata Caching

  • Metadata queries have significant cost

    • Each query may take up to 3 round trips

    • Around 40% of SMB roundtrips are for file metadata

  • Windows SMB clients can cache file metadata

    • Metadata caching is best effort and there are very limited consistency guarantees

    • Metadata caches expire after a fixed time

    • SMB1 client caches only file attributes, timestamps, and file sizes by default

    • SMB2 client caches directory enumeration in addition


Developing Responsive Applications: Application guidelines for metadata access

  • Maximize use of the metadata cache

    • GetFileAttributes, GetFileSize, GetFileTime are cached

    • Directory enumerations via FindFirstFile/FindNextFile are cached for SMB2 only

    • Only the FileBasicInfo, FileStandardInfo and FileNameInfo classes supported by the GetFileInformationByHandleEx API are cached

  • Avoid repeated queries for non-cached metadata by caching at the application level

  • Use the GetFileInformationByHandle(Ex) API

  • Use large buffers for variable length queries

    • Security descriptors, stream enumeration

    • Use the GetFileInformationByHandleEx API to enumerate directories (SMB2 on Windows 7 only!)
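
Caching at the application level, as suggested above for metadata the platform does not cache, could be as simple as remembering query results for a short time. The class below is a hypothetical sketch: its name, the 5 second default expiry and the use of `os.stat` as the underlying query are all assumptions.

```python
import os
import time


class StatCache:
    """Remember metadata query results per path for ttl seconds."""

    def __init__(self, ttl=5.0, stat=os.stat, clock=time.monotonic):
        self.ttl = ttl
        self._stat = stat      # the real (possibly remote) metadata query
        self._clock = clock
        self._entries = {}     # path -> (expiry time, query result)

    def stat(self, path):
        now = self._clock()
        hit = self._entries.get(path)
        if hit is not None and hit[0] > now:
            return hit[1]                     # answered locally, no round trip
        result = self._stat(path)             # one real query
        self._entries[path] = (now + self.ttl, result)
        return result

    def invalidate(self, path):
        """Drop a cached entry, for example after the app modifies the file."""
        self._entries.pop(path, None)
```

As with the platform's own metadata cache, the trade-off is staleness: a cached answer can be out of date until it expires or is invalidated.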


Developing Responsive Applications: General considerations

  • Support I/O cancellation

    • Starting with Windows Vista, creates can be cancelled via the CancelSynchronousIo API

      • CreateFile calls can sometimes incur connection establishment and authentication delays

      • Majority of app hangs involve a code path trying to open the file

    • Use overlapped I/O whenever possible

      • Does not block

      • Can be selectively cancelled via CancelIoEx API

  • Don’t do blocking network I/O on your main application thread

  • Don’t pipeline too much data
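
Keeping blocking I/O off the main thread can be sketched with a small, bounded worker pool. This is an illustration only: the helper names and the pool size of 4 are assumptions, and unlike CancelSynchronousIo or CancelIoEx, the Python standard library offers no way to abort a call that is already running, so the sketch merely stops waiting for it.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError

# A small bounded pool also caps how much I/O is in flight at once.
_io_pool = ThreadPoolExecutor(max_workers=4)


def run_with_deadline(fn, timeout_seconds, *args):
    """Run a blocking call on a worker thread; return (finished, result)."""
    future = _io_pool.submit(fn, *args)
    try:
        return True, future.result(timeout=timeout_seconds)
    except TimeoutError:
        # cancel() only stops a task that has not started; a running call continues
        future.cancel()
        return False, None


def read_file(path):
    """A blocking open plus read, the kind of call that can stall over a WAN."""
    with open(path, "rb") as f:
        return f.read()
```

A caller would use `run_with_deadline(read_file, 5.0, path)` and, when the first element is False, tell the user the operation is still pending instead of appearing hung.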


Summary

  • For throughput oriented applications

    • Fill the network BDP using asynchronous I/O and large I/O chunks.

    • Use the CopyFile API when applicable.

    • Use multithreading when operating on a large number of small files.

  • For interactive apps

    • Use the handle based APIs.

    • Help the system cache data effectively by watching your open patterns.

    • Watch out for metadata queries.

    • Support cancellation


Performance Monitoring Tools

  • Process monitor

    • Effectively track I/O issued by the application.

    • Monitor file, registry, thread and process activity.

    • Available at http://technet.microsoft.com/en-us/sysinternals/bb896645.aspx

  • Netmon 3

    • A network sniffer to monitor traffic

    • Parsers are available for both SMB and SMB2 protocols.

    • Available at http://blogs.technet.com/netmon/


Conclusion

  • Be aware that your application will be used over a slow network even though you didn’t design for it

  • We are constantly improving the platform

  • The guidelines presented here are applicable to other WAN scenarios also

  • Use the APIs provided by the system to your advantage

  • Understanding how the system works can help us write well behaved apps


Evals & Recordings

Please fill out your evaluation for this session at:

This session will be available as a recording at:

www.microsoftpdc.com



Q&A

Please use the microphones provided



© 2008 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.

The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.