Writing Highly Available .Net Framework Applications
Future of CLR in .Net 2.0
Sriram Ramamurthy

Introduction
- Customize CLR for Application Scenarios
- High Degree of Availability
  - Process must live for a very long time
- Provide features for the host to handle exceptional conditions
- Application Domains: Isolation & Unloading
  - Host can remove code from an erroneous process & continue execution

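As a rough illustration of the isolation-and-unloading idea above, the sketch below runs a stand-in add-in inside its own application domain and unloads that domain afterwards while the host keeps running. It assumes the .NET Framework (2.0 or later), since application domain creation is not supported on .NET Core and later. The names `AddInHostSketch`, `RunAddIn`, and `"AddInDomain"` are hypothetical and do not come from the slides.

```csharp
using System;

// Minimal sketch, assuming .NET Framework 2.0+ (AppDomain.CreateDomain is
// not supported on .NET Core / .NET 5+).
public static class AddInHostSketch
{
    // Hypothetical add-in entry point; it executes inside the isolated domain.
    private static void RunAddIn()
    {
        Console.WriteLine("Add-in running in: " + AppDomain.CurrentDomain.FriendlyName);
    }

    public static void Main()
    {
        // Create a separate domain so add-in code can be removed later.
        AppDomain addInDomain = AppDomain.CreateDomain("AddInDomain");
        try
        {
            // Run the add-in code in the new domain.
            addInDomain.DoCallBack(new CrossAppDomainDelegate(RunAddIn));
        }
        catch (Exception ex)
        {
            // The host observes the failure instead of crashing.
            Console.WriteLine("Add-in failed: " + ex.Message);
        }
        finally
        {
            // Remove the add-in's code and data; the process stays alive.
            AppDomain.Unload(addInDomain);
        }

        Console.WriteLine("Host still running in: " + AppDomain.CurrentDomain.FriendlyName);
    }
}
```

A real host would normally configure the domain (security, base directory) before loading add-in assemblies; that setup is omitted here.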
Advantages
- Host runtime code: Reliable
- Handle Resource Exhaustion & Exceptional Conditions
- How to handle add-ins that might not be written properly?

Goals
- Unload Application Domain without leaking any resources
- Customize the handling of various exceptional conditions
  - E.g. System.OutOfMemoryException
- Customizing Escalation Policy

Application Domain Isolation & Process Lifetimes
- Process should not crash under exceptional conditions
- Why build such a complex infrastructure?
- Why not simply write managed code to handle all exceptions properly?
  - Writing reliable managed code that handles all exceptions is impractical

Application Domain Isolation & Process Lifetimes
- CLR Model of executing managed code
  - May throw an exception on any line of code
- Unexpected memory and runtime operations
  - Memory Allocation
  - MSIL that still has to be JITed
  - Boxing of Value Types, e.g. Hashtable.Add("Entry1", 5);
  - Using PInvoke, e.g. IntPtr semHandle = CreateSemaphore(…);

Application Domain Isolation & Process Lifetimes
- .Net Framework assemblies & eXtensible applications: if practical
  - What about add-ins?
- CLR 1.0, 1.1: no guarantee for high availability
  - No such need due to lack of CLR hosts
  - Microsoft ASP.Net: Process Recycling Model

ASP.Net & IIS
- Multiple processes: load balancing incoming requests
- High Demand: more processes created
- Low Demand: processes idle or killed
- High Scalability achieved: Process recycling model
- Web applications: request or connection is stateless

ASP.Net & IIS
- Process hangs or fails
- Process: kill safely without affecting application state
- User: "Try Again Later" error message
- Refresh browser and resend request to a different process

CLR Design Decisions
- Works well: Web Servers
- Does not work well: Database Servers
  - High per-process state: starting a new process becomes expensive
- .Net 1.0, 1.1: CLR Host (ASP.Net)
- .Net 2.0: CLR Host (SQL Server 2005)
  - Shall support long-lived processes

Failure Escalation
- .Net 1.0, 1.1: certain unhandled exceptions will be swallowed
  - Does not terminate process
  - Silent Failures & Process Corruption
- .Net 2.0: all unhandled exceptions will bubble up, affecting the entire process
  - Make failures more apparent & easier to debug

Escalation Policy - Failures
- Failure to allocate resource: Memory or resources managed by OS
- Failure to allocate resource in critical region of code:
  - Block of code shared b/w multiple threads
  - Code that relies on state from another thread cannot be cleaned up by terminating the running thread; integrity not guaranteed
- E.g. SQL Server:
  - Abort thread if failure to allocate resource
  - Unload Application Domain if thread is in critical region

Escalation Policy - Failures
- How does the CLR know if code is in a critical region?
- CLR detects when executing code waits on a synchronization primitive (mutex, event, semaphore or locks)
- If a resource failure occurs in a region that depends on a sync primitive, the code is in a critical region

CLR Catch
- CLR ability to detect code waiting on a sync primitive is limited
  - System.Threading: mutex & events
  - CLR tracks locks created in the managed world
- Add-ins given full trust in CAS can use PInvoke to create sync primitives by calling Win32 APIs
  - Unknown to CLR: outside realm of managed code
  - Won't be reported as critical region code if a failure occurs

Escalation Policy - Failures
- Fatal Runtime Error: Internal error, cannot continue to execute managed code
  - Exit process or disable CLR
- Orphaned Lock: Sync primitive is created but never freed
  - E.g. Mutex or Monitor created on a thread that is aborted before the lock is freed
  - Lock is Orphaned and can never be freed
  - Results in Resource Exhaustion

Escalation Policy - Actions
- Throw an exception: Default action for resource failures
  - E.g. StackOverflowException, OutOfMemoryException
- Gracefully Abort Thread: Throws ThreadAbortException on terminating thread
  - CLR gives add-in chance to free resources by running code in finally blocks

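The graceful-abort behavior can be seen in a small experiment. The sketch below assumes the .NET Framework, where `Thread.Abort` is supported (it throws `PlatformNotSupportedException` on .NET Core and later). The names `GracefulAbortSketch` and `AddInWork` are hypothetical stand-ins for add-in code.

```csharp
using System;
using System.Threading;

// Minimal sketch, assuming .NET Framework (Thread.Abort is unsupported on
// .NET Core / .NET 5+).
public static class GracefulAbortSketch
{
    // Hypothetical add-in work loop that never finishes on its own.
    private static void AddInWork()
    {
        try
        {
            while (true)
            {
                Thread.Sleep(100);
            }
        }
        catch (ThreadAbortException)
        {
            // The abort is observed here; the CLR re-raises the exception
            // automatically when this catch block ends.
            Console.WriteLine("Abort requested");
        }
        finally
        {
            // A graceful abort still runs finally blocks, so resources
            // can be released here.
            Console.WriteLine("Cleanup in finally");
        }
    }

    public static void Main()
    {
        Thread worker = new Thread(AddInWork);
        worker.Start();
        Thread.Sleep(500);

        worker.Abort(); // graceful abort: raises ThreadAbortException on the worker
        worker.Join();
        Console.WriteLine("Worker terminated");
    }
}
```

A rude abort, by contrast, gives no such guarantee that the finally block runs.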
Escalation Policy - Actions
- Rudely Abort Thread:
  - No guarantee about cleaning up add-in code
  - Use to remove threads that do not gracefully abort
- Gracefully Unload Application Domain:
  - Gracefully abort all threads
  - Free CLR data structures associated with domain
  - Finalizer is run for all objects in domain

Escalation Policy - Actions
- Rudely Unload Application Domains:
  - Rude abort of all threads
  - CLR data structures are freed
  - No guarantee that Finalizers run
- Gracefully exit Process: Gracefully unload application domains
- Rudely exit Process: Rudely unload application domains
  - TerminateProcess (Win32 API)
- Disable the CLR: Prevent execution of managed code
  - Process is still alive and can continue other work
  - E.g. SQL Server Process

Escalation Policy - Operations
- Specify Timeouts for operations
- Indicate actions that should occur
- Diagram: Escalation Policy of SQL Server 2005 Host

Critical Finalization, Safe Handles & Constrained Execution Region
- Ensure application domains unload without leaking resources
- Guarantee native handles held will be closed properly
- Framework classes: wrappers around native handles
  - E.g. System.IO, System.Net
- Dispose Pattern & Object Finalizers: no guarantee that they run

Critical Finalization, Safe Handles & Constrained Execution Region
- Critical Finalizer:
  - CLR will always run it
  - Guaranteed to complete
  - System.Runtime.ConstrainedExecution.CriticalFinalizerObject
- Safe Handle:
  - Wrapper around native handle
  - BCL rewritten in .Net 2.0 using Safe Handles
  - System.Runtime.InteropServices.SafeHandle

Critical Finalization, Safe Handles & Constrained Execution Region
- CER: How is it that it always runs and always completes?
  - Block of code in which exceptions are never thrown due to lack of resources
- CLR Steps:
  - Prepare CER
  - Restrict Operations inside CER

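One way to see the "prepare" and "restrict" steps in code is the .NET Framework pattern of calling `RuntimeHelpers.PrepareConstrainedRegions()` immediately before a `try` block, which marks the following `finally` block as a constrained execution region. This is a minimal sketch under that assumption. The names `CerSketch` and `Commit` are hypothetical, and on .NET Core and later the call is obsolete and has no effect.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.ConstrainedExecution;

// Minimal sketch, assuming .NET Framework 2.0+.
public static class CerSketch
{
    private static int _committed;

    // The reliability contract states that this method does not corrupt
    // state and always succeeds; code called from a CER is expected to
    // carry such a contract.
    [ReliabilityContract(Consistency.WillNotCorruptState, Cer.Success)]
    private static void Commit()
    {
        // Restricted operations only: no allocation, no boxing, no locks.
        _committed = 1;
    }

    public static void Main()
    {
        // Must immediately precede the try block. The CLR prepares the
        // finally block ahead of time (for example by JIT-compiling it).
        RuntimeHelpers.PrepareConstrainedRegions();
        try
        {
            // Ordinary code; this part is NOT inside the CER.
        }
        finally
        {
            // This block is the CER.
            Commit();
        }

        Console.WriteLine(_committed);
    }
}
```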
Guidelines for Writing Highly Available Managed Code
- Use Safe Handles to Encapsulate Native Handles:
  - Use classes in System.Runtime.InteropServices
  - Write a custom class
    - Create a class derived from System.Runtime.InteropServices.SafeHandle
    - Provide a constructor that enables callers to associate a native handle
    - Implement ReleaseHandle method
    - Implement IsInvalid Property

Safe Handles
- Derive from CriticalFinalizerObject
- Classes derived from SafeHandle require permission to call unmanaged code
- Constructor has ownsHandle parameter
- Annotate with SuppressUnmanagedCodeSecurityAttribute

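The custom-class steps listed on the two slides above might look roughly like the following for a Win32 semaphore handle. This is a sketch under stated assumptions, not the presenter's code: it is Windows-only, and the type names `SemaphoreSafeHandle`, `NativeMethods`, and `SafeHandleDemo` are hypothetical. `CreateSemaphore` and `CloseHandle` are the real kernel32 functions.

```csharp
using System;
using System.Runtime.ConstrainedExecution;
using System.Runtime.InteropServices;
using System.Security;

// Minimal Windows-only sketch.
[SuppressUnmanagedCodeSecurity]
internal static class NativeMethods
{
    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    internal static extern IntPtr CreateSemaphore(
        IntPtr attributes, int initialCount, int maximumCount, string name);

    [DllImport("kernel32.dll", SetLastError = true)]
    [ReliabilityContract(Consistency.WillNotCorruptState, Cer.Success)]
    internal static extern bool CloseHandle(IntPtr handle);
}

// Hypothetical wrapper; SafeHandle itself derives from CriticalFinalizerObject.
public sealed class SemaphoreSafeHandle : SafeHandle
{
    // Constructor lets callers associate an existing native handle and say
    // whether this wrapper owns (and must release) it.
    public SemaphoreSafeHandle(IntPtr existingHandle, bool ownsHandle)
        : base(IntPtr.Zero, ownsHandle)
    {
        SetHandle(existingHandle);
    }

    // CreateSemaphore reports failure by returning NULL.
    public override bool IsInvalid
    {
        get { return handle == IntPtr.Zero; }
    }

    // Runs from the critical finalizer or Dispose, so it must not fail.
    [ReliabilityContract(Consistency.WillNotCorruptState, Cer.Success)]
    protected override bool ReleaseHandle()
    {
        return NativeMethods.CloseHandle(handle);
    }
}

public static class SafeHandleDemo
{
    public static void Main()
    {
        IntPtr raw = NativeMethods.CreateSemaphore(IntPtr.Zero, 0, 1, null);
        using (SemaphoreSafeHandle semaphore = new SemaphoreSafeHandle(raw, true))
        {
            Console.WriteLine(semaphore.IsInvalid ? "invalid handle" : "valid handle");
        } // Dispose calls ReleaseHandle for a valid, owned handle.
    }
}
```

Production code often declares the PInvoke to return the SafeHandle type directly, which closes the small window between obtaining the raw handle and wrapping it. The constructor form is used here because it is the one the slide describes.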
Guidelines for Writing Highly Available Managed Code
- Use only Synchronization Primitives provided by .Net
  - Code is shared or in Critical Region: use sync primitives
  - System.Threading: Monitor, Mutex, ReaderWriterLock
  - Custom primitives: CLR cannot detect shared state, Escalation Policy cannot be used

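A short sketch of the guideline above, protecting shared state with `System.Threading.Monitor` so that the lock is one the CLR can track. The names `ManagedLockSketch`, `Gate`, and `Increment` are hypothetical.

```csharp
using System;
using System.Threading;

// Minimal sketch: shared state guarded by a managed primitive.
public static class ManagedLockSketch
{
    private static readonly object Gate = new object();
    private static int _counter;

    private static void Increment()
    {
        for (int i = 0; i < 1000; i++)
        {
            Monitor.Enter(Gate);
            try
            {
                _counter++; // shared state, touched only under the lock
            }
            finally
            {
                Monitor.Exit(Gate); // released even if the body throws
            }
        }
    }

    public static void Main()
    {
        Thread first = new Thread(Increment);
        Thread second = new Thread(Increment);
        first.Start();
        second.Start();
        first.Join();
        second.Join();

        Console.WriteLine(_counter); // prints 2000
    }
}
```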
Guidelines for Writing Highly Available Managed Code
- Ensure calls to Unmanaged Code return to CLR:
  - A thread can enter a state that prevents the CLR from aborting it
  - E.g. a PInvoke call to an unmanaged API that waits infinitely on a sync primitive or blocks
  - CLR has no control of unmanaged code
  - Provide timeout values
  - Regain control and ask CLR to abort thread

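The timeout guideline can be sketched with the Win32 `WaitForSingleObject` function: instead of passing an infinite timeout, the wait is split into bounded slices so the thread returns to managed code between calls. This is a Windows-only illustration. The helper `WaitWithTimeout` and its parameters are hypothetical, and the event handle is borrowed from a managed `ManualResetEvent` only to have something to wait on.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Threading;

// Minimal Windows-only sketch.
public static class BoundedNativeWaitSketch
{
    private const uint WAIT_OBJECT_0 = 0x00000000;
    private const uint WAIT_TIMEOUT = 0x00000102;

    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern uint WaitForSingleObject(IntPtr handle, uint milliseconds);

    // Hypothetical helper: waits in bounded slices and returns true only if
    // the handle was signaled within maxAttempts * sliceMilliseconds.
    public static bool WaitWithTimeout(IntPtr nativeHandle, uint sliceMilliseconds, int maxAttempts)
    {
        for (int attempt = 0; attempt < maxAttempts; attempt++)
        {
            uint result = WaitForSingleObject(nativeHandle, sliceMilliseconds);
            if (result == WAIT_OBJECT_0)
            {
                return true;
            }
            if (result != WAIT_TIMEOUT)
            {
                return false; // wait failed or mutex abandoned: stop waiting
            }
            // The thread is back in managed code here, so the CLR can act
            // on a pending abort before the next native call.
        }
        return false;
    }

    public static void Main()
    {
        // An event that is never signaled, to exercise the timeout path.
        using (ManualResetEvent never = new ManualResetEvent(false))
        {
            IntPtr raw = never.SafeWaitHandle.DangerousGetHandle();
            bool signaled = WaitWithTimeout(raw, 100, 3);
            Console.WriteLine(signaled ? "signaled" : "timed out; control is back in managed code");
        }
    }
}
```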
Guidelines for Writing Highly Available Managed Code
- Annotate Your Libraries with Host Protection Attribute:
  - Host Protection prevents use of APIs that violate the host's programming model
  - Prevent add-ins from using any API that allows them to share state across threads
  - Reduce resource failures and application domain unloads
  - Use custom attribute HostProtectionAttribute
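As an illustration of the annotation step, the sketch below marks a library method that exposes shared state and synchronization, so that a host which disallows those categories can block callers. It assumes the .NET Framework, where `System.Security.Permissions.HostProtectionAttribute` is enforced by hosts such as SQL Server. The class `SharedRegistry` and its `Publish` method are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Permissions;

// Minimal sketch, assuming .NET Framework 2.0+.
public static class SharedRegistry
{
    private static readonly Dictionary<string, object> Items =
        new Dictionary<string, object>();

    // Declares that this API shares state across threads and synchronizes.
    // SecurityAction.LinkDemand is the action used with HostProtection.
    [HostProtection(SecurityAction.LinkDemand, SharedState = true, Synchronization = true)]
    public static void Publish(string key, object value)
    {
        lock (Items)
        {
            Items[key] = value;
        }
    }
}
```

The attribute has no effect unless the host enables host protection for the declared categories; in an ordinary unhosted process the method behaves like any other.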