Writing Rock-Solid Reliable Applications For Windows Vista And The CLR

Björn Levidow, Group Program Manager

Brian Grunkemeyer, Software Design Engineer

FUN308

Microsoft Corporation


What You Will See

  • Customer-Focused Reliability Attributes

  • Windows Vista and CLR reliability goals

  • Windows Vista and CLR reliability features

    • Detailed resiliency discussion

    • Features and Tools

  • Summary

  • Call to Action

The Microsoft platform supports developing reliable applications, both native and managed


Customer-Focused Reliability Attributes

  • Resilient: The system continues to provide service in the face of internal or external disruptions. Examples: crashes, hangs …

  • Recoverable: After disruption the system is easily restored to a previously known state with no data loss. Examples: data corruption

  • Controlled: Provides timely and expected service whenever needed. Examples: degraded response

  • Undisruptable: Required changes and upgrades do not impact the service. Examples: update disruptions

  • Production Ready: At release the system contains a minimum number of bugs, requiring a limited number of predictable patches/fixes. Examples: patch size, frequency

  • Predictable: It works as advertised; what worked before works now. Examples: compatibility failures


Addressing Customer-Focused Reliability Attributes

  • Resilient (crashes, hangs …)

    • Process/App Domain Recycling

    • SafeHandle

  • Recoverable (data corruption)

    • Transactional file system/Registry

    • Common log file system

  • Controlled (degraded response)

    • Resource Exhaustion Diagnostics

    • I/O cancellation

  • Undisruptable (update disruptions)

    • Restart Manager

  • Production Ready (patch size, frequency)

    • /Analyze, Safe C++ libraries, FxCop

    • App Verifier, Managed Debugging Assistant

  • Predictable (compatibility failures)

    • Good versioning and installation practices

Slide callouts: "Requires Application Design Consideration" and "OS or CLR features to plug into your app"


Windows Vista Reliability Objectives

  • No loss of work, time, data or control

  • No Hangs, No Crashes, No Reboots

  • Reducing user disruptions and increasing availability

  • How we raised the bar on Windows Vista reliability

    • New processes to minimize bugs and design issues

    • Enhanced feedback using Windows Error Reporting for identifying product problems during development

    • New reliability features


CLR Reliability Objectives

  • Write resilient applications

  • Improve application availability

  • Reduce user disruptions and increase availability

    • Resiliency against failures, crashes and hangs

    • Availability is great today. Let’s make it even better

  • How we raised the bar on CLR reliability

    • Tested product with fault injection

    • New reliability features

    • Hardened managed libraries


How Much Reliability Do I Need? Different bars for different environments

  • Reliability of most software meets customer needs

    • A few bad apples spoil the overall experience

  • Reliability needs differ based on your application

    • Console applications and simple apps like calc.exe

    • Sophisticated application (Word, Photoshop)

    • Library code

    • Highly available server code

  • Library code’s reliability bar is dictated by the applications that use the library

    • Car


Writing Reliable Code: Reliability Has A Cost

  • Writing reliable unmanaged code takes work

    • Requires discipline to handle out of memory problems

    • Failures in multi-threaded apps are hard to handle

    • Requires extensive testing (fault injection, stress runs)

  • Writing reliable managed code takes work

    • Under the covers, the CLR manages your code

    • Eliminates entire classes of bugs, like dangling pointers, memory leaks, most buffer overruns, etc.

    • However, CLR-induced failure points aren’t obvious

    • Asynchronous exceptions: OutOfMemoryException and ThreadAbortException


Customer-Focused Reliability Attributes

  • Resilient: The system continues to provide service in the face of internal or external disruptions. Examples: crashes, hangs …


How Do We Get Resiliency? Resiliency Approaches

  • Isolated extensibility models

    • Keep extensions in their own process space

    • Enables recycling

  • Process Recycling

    • Operating System resources are guaranteed to be freed

    • Relatively cheap and relatively easy

    • Requires a stateless, almost transactional model


Process Recycling: Hosted programming model example

  • ASP.NET hosts applications

  • Uses process recycling for resiliency

  • Worker processes may encounter a resource leak or deadlock, and the host will kill them

    • Bugs could be anywhere in the process

  • Server is resilient to these failures

    • Session state must live in a database or out-of-proc

      • In-process session state is lost. Controllable via web.config

    • Cheap and good enough for a web server


AppDomain Recycling: Another hosted programming model

[Diagram: a SQL Server process containing the Default AppDomain, AppDomain 2, and AppDomain 3]

  • Application Domains are a unit of isolation

    • Static variables are per-appdomain

    • Avoid* mutating any cross-AD or cross-process state

  • SQL unloads and recycles AppDomains

    • Mitigates state corruption

    • Higher availability

    • SQL is transacted => no database corruption

    • Operating System (OS) resources must be freed, but the OS is AD-ignorant

      AppDomain unloading must be clean!


Problems For Hosted Code: How does a host hurt your reliability?

  • Hosted libraries make tradeoffs to guarantee availability

  • Thread aborts between two machine instructions

  • OutOfMemoryExceptions more common when hosted

  • Typical cleanup techniques aren’t guaranteed!

    • Finalizers and finally’s may be aborted

  • Hosted managed libraries should be hardened

    • Prevent leaking resources in aggressive hosts

    • Using hardened code is very forgiving

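IntPtr handle = CreateFile(…);    // C# source

call native int CreateFile(…)     // IL: the handle is returned on the evaluation stack
stloc.2                           // IL: the handle is stored into a local

A thread abort that lands between the call and stloc.2 leaks the handle, because nothing has stored it yet.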


SafeHandle: Reliably releasing a handle

  • A reliable, convenient wrapper for OS handles

  • CLR guarantees your release code will run

    • Critical finalization

  • Benefits

    • Avoids races with your own finalizer

    • Reduced object graph promotion during GC

    • Type-safe manipulation of handles

  • Small perf costs

    • Another 20 bytes on x86, 32 bytes on 64 bit

    • Ref count when a thread is actively using a SafeHandle


SafeHandle Demo

Brian Grunkemeyer

Software Design Engineer

Common Language Runtime


Constrained Execution Regions: Limited guaranteed execution

  • For building hosts and changing cross-AD state

  • Hoist CLR-induced failures and delay thread aborts

  • Constraints on your code

    • Only call methods with reliability contracts

    • No allocations, virtual calls, acquiring locks, etc.

  • Perf and complexity cost

RuntimeHelpers.PrepareConstrainedRegions();
try {
    // Arbitrary code: may fail
}
finally {
    // Constrained code: No virtual calls or allocs
}


When To Use SafeHandle And CERs

  • Use SafeHandle when

    • Writing libraries hosted in environments that use AppDomain recycling

    • Using P/Invoke to acquire OS resources

  • Use CERs when

    • Writing hosted code that manipulates cross-AppDomain or cross-machine state

      • Still need to design for a power failure

    • Handling corner cases that SafeHandle doesn’t support

      • Marshaling out handles stored in a struct


Customer-Focused Reliability Attributes

  • Recoverable: After disruption the system is easily restored to a previously known state with no data loss. Examples: data corruption


Writing Recoverable Applications

  • Writing bug-free apps is Nirvana, but…

    • Nobody’s perfect 

    • Not all software controls nuclear power plants

    • Even if you get there, external factors affect you

      • Software installs, resource exhaustion, power failures

      • User uses your app in an unexpected way

  • So, writing recoverable apps is necessary

    • Expect the unexpected!

      • Apps should be journaled and designed to recover

    • Use transactions and journaling to persist data

    • Save data and state most important to your applications

  • Word is a good example

    • Saves user docs every 3 minutes to minimize loss

    • Document recovery as well
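
The "save data so the app can recover" advice above can be sketched in portable C++. This is a minimal illustration and not TxF or CLFS: it writes the new contents to a scratch file and then renames that file over the target, so a crash in the middle of a save should leave either the old file or the new one, never a half-written one. The names save_recoverably and load and the ".tmp" suffix are assumptions made for this sketch. It does not force data to disk, so surviving a power failure would need additional platform-specific flushing.

```cpp
#include <filesystem>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: save `contents` to `target` via a scratch file, so a
// crash mid-save does not leave a partially written target.
void save_recoverably(const fs::path& target, const std::string& contents) {
    fs::path temp = target;
    temp += ".tmp";  // assumed naming convention for the scratch file

    {
        std::ofstream out(temp, std::ios::binary | std::ios::trunc);
        if (!out) throw std::runtime_error("cannot open scratch file");
        out << contents;
        out.flush();
        if (!out) throw std::runtime_error("cannot write scratch file");
    }  // the stream is closed here, before the rename

    // Publish step: replace the old file with the fully written new one.
    fs::rename(temp, target);
}

// Hypothetical helper: read the whole file back as a string.
std::string load(const fs::path& target) {
    std::ifstream in(target, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

A caller would invoke save_recoverably on a timer, in the spirit of the Word autosave example, and call load at startup to recover the last complete save.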


Transactions And Journaling: Tools to help build recoverable apps

  • Win32

    • File and Registry Transactions (TxF)

    • Common Log File System (CLFS)

  • Managed

    • System.Transactions

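// Native (Win32)
SetCurrentTransaction(HANDLE hTransaction)

// Managed (C#); requires: using System.Transactions;
using (TransactionScope scope = new TransactionScope(
           TransactionScopeOption.Required,
           new TransactionOptions(),
           EnterpriseServicesInteropOption.Full))
{
    // EnterTransactionScope and ExitTransactionScope are helpers named on the slide; they are not defined here.
    if (!EnterTransactionScope()) throw new TransactionException("Bad");
    // Write to one or many files, etc.
    if (!ExitTransactionScope()) throw new TransactionException("Bad");
    scope.Complete();  // Vote to commit; without this call the transaction rolls back on dispose.
}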


Customer-Focused Reliability Attributes

  • Controlled: Provides timely and expected service whenever needed. Examples: degraded response


Resource Exhaustion Diagnosis

  • Give users control of their system by allowing them to take action before a low resource condition impacts them

    • Automatic detection and diagnosis of near-exhaustion of commit limit and memory leaks on client SKUs

    • Provide options for manual and automatic resolution to avoid exhaustion

  • Impact on Windows Vista applications

    • If GUI app uses lots of VM, will show up on list of applications to be closed by user

    • If service or CMD app, will be shut down by Windows when exhaustion has been hit

  • What you need to do

    • Be mindful of memory utilization: e.g. trim working set when unused


I/O Cancellation Support

  • Apps shouldn’t hang

    • Apps should provide a cancel button

    • Ever see Outlook hang while downloading mail?

  • New Win32 Cancellation APIs for Windows Vista

    • Cancel specific async I/O requests for a file handle

    • Cancel synchronous requests from another thread

  • No managed support until “Orcas”

    • Look for the CancellationRegion class

  • Caveats

    • Operation is only marked for cancellation

    • Some “meta APIs” aren’t cancelable (e.g. CopyFile; use CopyFileEx instead)

    • Slightly tricky to use

BOOL CancelIoEx(HANDLE hFile, LPOVERLAPPED lpOverlapped)

BOOL CancelSynchronousIo(HANDLE hThread)


Customer-Focused Reliability Attributes

  • Undisruptable: Required changes and upgrades do not impact the service. Examples: update disruptions


Minimize Reboots When Installing Software

  • Use the Restart Manager APIs

    • Shuts down only required apps and services

    • Automatically detects and shuts down services in shared processes with a file in use

    • Prevents the need for a machine restart after apps or services have been shut down

      • Groups application, service and machine restarts

    • Design app “freeze-dry” functionality to return user to the state they were in before the restart

      • Use P/Invoke for managed applications

RegisterApplicationRestart( GetCommandLine(), 0 ); // Native

Users experience minimal disruption from application and patch installs for your application


Customer-Focused Reliability Attributes

  • Production Ready: At release the system contains a minimum number of bugs, requiring a limited number of predictable patches/fixes. Examples: patch size, frequency


Windows Error Reporting During Development

  • Errors are reported to Microsoft in real-time by customer choice (crashes, hangs)

  • Automatic analysis and signature matching to known issues

  • Problems available to registered developers through the Developer Portal

  • Known fixes provided to customers in real-time

  • APIs for failing quickly and reporting an error

    • Or, simply let an exception go unhandled, in both managed and native

Environment.FailFast(String reason); // Managed “panic button”


Reliability Best Practices

  • If a crash occurs, report the issue via Windows Error Reporting

    • Don’t use the IsBadWritePtr family of APIs

    • Turns debuggable crash into silent process exit

    • Replace the API with a simple `if (p == NULL)` check

  • Write multi-threaded code correctly

    • Use synchronization primitives for stopping and pausing threads

    • Don’t call TerminateThread

    • Avoid calling Thread.Abort

    • Don’t call Thread.Suspend
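
The advice above to use synchronization primitives for stopping and pausing threads can be sketched in portable C++. This is one possible shape, not code from the talk: the Worker class and its member names are assumptions made for illustration. The thread checks a stop flag and a pause flag under a mutex and waits on a condition variable, so it always exits at a known point with no locks held, which is what TerminateThread and Thread.Abort cannot guarantee.

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical worker that can be paused, resumed, and stopped cooperatively.
class Worker {
public:
    Worker() : thread_([this] { run(); }) {}
    ~Worker() { stop(); }

    void pause()  { set_paused(true); }
    void resume() { set_paused(false); }

    // Ask the thread to exit and wait for it; safe to call more than once.
    void stop() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stop_requested_ = true;
        }
        wake_.notify_all();
        if (thread_.joinable()) thread_.join();
    }

    long iterations() const { return iterations_.load(); }

private:
    void set_paused(bool value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            paused_ = value;
        }
        wake_.notify_all();
    }

    void run() {
        for (;;) {
            {
                std::unique_lock<std::mutex> lock(mutex_);
                // Sleep while paused; wake up on resume or stop.
                wake_.wait(lock, [this] { return stop_requested_ || !paused_; });
                if (stop_requested_) return;  // the lock is released on return
            }
            ++iterations_;  // stand-in for one unit of real work
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    bool stop_requested_ = false;
    bool paused_ = false;
    std::atomic<long> iterations_{0};
    std::thread thread_;  // declared last so the other members exist before run() starts
};
```

The thread only ever stops between units of work, so any state it shares with the rest of the process is left consistent. A pause request takes effect after the current unit of work finishes.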


Recommended Tools For Making Code Production Ready

  • Unmanaged

    • Safe C++ Libraries (CRT, MFC, ATL)

    • C++ Compiler static analysis (/analyze)

    • C++ Compiler’s buffer overrun cookie (/GS)

    • Application Verifier

  • Managed

    • FxCop

    • Managed Debugging Assistants


Summary

The Microsoft platform supports developing reliable applications, both native and managed

  • What is Reliability?

    • Customer taxonomy

  • Windows Vista and CLR reliability goals

  • Windows Vista and CLR reliability features

    • Detailed resiliency discussion

    • Features and Tools


Call To Action

  • Design for resiliency as discussed

    • Use SafeHandle to free OS handles

  • Use Windows Vista’s transactions for recoverability

  • Use Windows Vista’s new Restart Manager APIs to minimize disruptions

  • Support cancellation to give users control

  • Use all the tools at your disposal to make your code production ready

    • E.g. FxCop, /Analyze, Windows Error Reporting


More Information: Managed Resiliency Features

  • At PDC

    • Add-Ins and Versioning - FUN 309: “Designing managed addins for reliability, security, and versioning” w/ Jim Miller

    • Versioning – FUN 314: “Architecting your apps for the future”

  • After PDC

    • High-level overview: http://msdn.microsoft.com/msdnmag/issues/05/10/Reliability/

    • SafeHandle: http://blogs.msdn.com/bclteam/archive/2005/03/16/396900.aspx

    • Constrained Execution Regions: http://blogs.msdn.com/bclteam/archive/2005/06/14/429181.aspx

    • Chris Brumme’s Hosting & Reliability blog posts: http://blogs.msdn.com/cbrumme/archive/2004/02/21/77595.aspx

    • http://blogs.msdn.com/cbrumme/archive/2003/06/23/51482.aspx


More Information: Windows Vista Reliability Features

  • At PDC

    • Journaling – FUN034: Improving reliability with the new System.Transactions classes, file system, and registry transactions

    • Restart Manager and Versioning – FUN222: Windows Vista and "Longhorn" Server: What's New in Windows Installer (MSI) and ClickOnce

    • Feedback – FUN313: Windows Vista: Improving Quality through Windows Feedback Data

    • I/O cancellation – FUN302: Programming with Concurrency (Part 1): Concepts, Patterns, and Best Practices

  • After PDC

    • http://msdn.microsoft.com/windowsvista/reliability/

    • http://www.microsoft.com/technet/windowsvista/webcasts.mspx

    • Resource Exhaustion: http://www.microsoft.com/technet/windowsvista/evaluate/admin/mntreli.mspx

    • I/O Cancellation: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/fileio/fs/cancelsynchronousio_func.asp


© 2005 Microsoft Corporation. All rights reserved.

This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.

