diff options
Diffstat (limited to 'docs/main/ExceptionHandling.html')
-rw-r--r-- | docs/main/ExceptionHandling.html | 606 |
1 files changed, 606 insertions, 0 deletions
diff --git a/docs/main/ExceptionHandling.html b/docs/main/ExceptionHandling.html new file mode 100644 index 0000000..9c7c615 --- /dev/null +++ b/docs/main/ExceptionHandling.html @@ -0,0 +1,606 @@ +<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" + "http://www.w3.org/TR/html4/strict.dtd"> +<html> +<head> + <title>Exception Handling in LLVM</title> + <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> + <meta name="description" + content="Exception Handling in LLVM."> + <link rel="stylesheet" href="llvm.css" type="text/css"> +</head> + +<body> + +<div class="doc_title">Exception Handling in LLVM</div> + +<table class="layout" style="width:100%"> + <tr class="layout"> + <td class="left"> +<ul> + <li><a href="#introduction">Introduction</a> + <ol> + <li><a href="#itanium">Itanium ABI Zero-cost Exception Handling</a></li> + <li><a href="#sjlj">Setjmp/Longjmp Exception Handling</a></li> + <li><a href="#overview">Overview</a></li> + </ol></li> + <li><a href="#codegen">LLVM Code Generation</a> + <ol> + <li><a href="#throw">Throw</a></li> + <li><a href="#try_catch">Try/Catch</a></li> + <li><a href="#cleanups">Cleanups</a></li> + <li><a href="#throw_filters">Throw Filters</a></li> + <li><a href="#restrictions">Restrictions</a></li> + </ol></li> + <li><a href="#format_common_intrinsics">Exception Handling Intrinsics</a> + <ol> + <li><a href="#llvm_eh_exception"><tt>llvm.eh.exception</tt></a></li> + <li><a href="#llvm_eh_selector"><tt>llvm.eh.selector</tt></a></li> + <li><a href="#llvm_eh_typeid_for"><tt>llvm.eh.typeid.for</tt></a></li> + <li><a href="#llvm_eh_sjlj_setjmp"><tt>llvm.eh.sjlj.setjmp</tt></a></li> + <li><a href="#llvm_eh_sjlj_longjmp"><tt>llvm.eh.sjlj.longjmp</tt></a></li> + <li><a href="#llvm_eh_sjlj_lsda"><tt>llvm.eh.sjlj.lsda</tt></a></li> + <li><a href="#llvm_eh_sjlj_callsite"><tt>llvm.eh.sjlj.callsite</tt></a></li> + </ol></li> + <li><a href="#asm">Asm Table Formats</a> + <ol> + <li><a href="#unwind_tables">Exception Handling Frame</a></li> + <li><a href="#exception_tables">Exception Tables</a></li> + </ol></li> + <li><a href="#todo">ToDo</a></li> +</ul> +</td> +</tr></table> + +<div class="doc_author"> + <p>Written by <a href="mailto:jlaskey@mac.com">Jim Laskey</a></p> +</div> + + +<!-- *********************************************************************** --> +<div class="doc_section"><a name="introduction">Introduction</a></div> +<!-- *********************************************************************** --> + +<div class="doc_text"> + +<p>This document is the central repository for all information pertaining to + exception handling in LLVM. It describes the format that LLVM exception + handling information takes, which is useful for those interested in creating + front-ends or dealing directly with the information. Further, this document + provides specific examples of what exception handling information is used for + in C/C++.</p> + +</div> + +<!-- ======================================================================= --> +<div class="doc_subsection"> + <a name="itanium">Itanium ABI Zero-cost Exception Handling</a> +</div> + +<div class="doc_text"> + +<p>Exception handling for most programming languages is designed to recover from + conditions that rarely occur during general use of an application. To that + end, exception handling should not interfere with the main flow of an + application's algorithm by performing checkpointing tasks, such as saving the + current pc or register state.</p> + +<p>The Itanium ABI Exception Handling Specification defines a methodology for + providing outlying data in the form of exception tables without inlining + speculative exception handling code in the flow of an application's main + algorithm. Thus, the specification is said to add "zero-cost" to the normal + execution of an application.</p> + +<p>A more complete description of the Itanium ABI exception handling runtime + support of can be found at + <a href="http://www.codesourcery.com/cxx-abi/abi-eh.html">Itanium C++ ABI: + Exception Handling</a>. A description of the exception frame format can be + found at + <a href="http://refspecs.freestandards.org/LSB_3.0.0/LSB-Core-generic/LSB-Core-generic/ehframechpt.html">Exception + Frames</a>, with details of the DWARF 3 specification at + <a href="http://www.eagercon.com/dwarf/dwarf3std.htm">DWARF 3 Standard</a>. + A description for the C++ exception table formats can be found at + <a href="http://www.codesourcery.com/cxx-abi/exceptions.pdf">Exception Handling + Tables</a>.</p> + +</div> + +<!-- ======================================================================= --> +<div class="doc_subsection"> + <a name="sjlj">Setjmp/Longjmp Exception Handling</a> +</div> + +<div class="doc_text"> + +<p>Setjmp/Longjmp (SJLJ) based exception handling uses LLVM intrinsics + <a href="#llvm_eh_sjlj_setjmp"><tt>llvm.eh.sjlj.setjmp</tt></a> and + <a href="#llvm_eh_sjlj_longjmp"><tt>llvm.eh.sjlj.longjmp</tt></a> to + handle control flow for exception handling.</p> + +<p>For each function which does exception processing, be it try/catch blocks + or cleanups, that function registers itself on a global frame list. When + exceptions are being unwound, the runtime uses this list to identify which + functions need processing.<p> + +<p>Landing pad selection is encoded in the call site entry of the function + context. The runtime returns to the function via + <a href="#llvm_eh_sjlj_longjmp"><tt>llvm.eh.sjlj.longjmp</tt></a>, where + a switch table transfers control to the appropriate landing pad based on + the index stored in the function context.</p> + +<p>In contrast to DWARF exception handling, which encodes exception regions + and frame information in out-of-line tables, SJLJ exception handling + builds and removes the unwind frame context at runtime. This results in + faster exception handling at the expense of slower execution when no + exceptions are thrown. As exceptions are, by their nature, intended for + uncommon code paths, DWARF exception handling is generally preferred to + SJLJ.</p> +</div> + +<!-- ======================================================================= --> +<div class="doc_subsection"> + <a name="overview">Overview</a> +</div> + +<div class="doc_text"> + +<p>When an exception is thrown in LLVM code, the runtime does its best to find a + handler suited to processing the circumstance.</p> + +<p>The runtime first attempts to find an <i>exception frame</i> corresponding to + the function where the exception was thrown. If the programming language + (e.g. C++) supports exception handling, the exception frame contains a + reference to an exception table describing how to process the exception. If + the language (e.g. C) does not support exception handling, or if the + exception needs to be forwarded to a prior activation, the exception frame + contains information about how to unwind the current activation and restore + the state of the prior activation. This process is repeated until the + exception is handled. If the exception is not handled and no activations + remain, then the application is terminated with an appropriate error + message.</p> + +<p>Because different programming languages have different behaviors when + handling exceptions, the exception handling ABI provides a mechanism for + supplying <i>personalities.</i> An exception handling personality is defined + by way of a <i>personality function</i> (e.g. <tt>__gxx_personality_v0</tt> + in C++), which receives the context of the exception, an <i>exception + structure</i> containing the exception object type and value, and a reference + to the exception table for the current function. The personality function + for the current compile unit is specified in a <i>common exception + frame</i>.</p> + +<p>The organization of an exception table is language dependent. For C++, an + exception table is organized as a series of code ranges defining what to do + if an exception occurs in that range. Typically, the information associated + with a range defines which types of exception objects (using C++ <i>type + info</i>) that are handled in that range, and an associated action that + should take place. Actions typically pass control to a <i>landing + pad</i>.</p> + +<p>A landing pad corresponds to the code found in the <i>catch</i> portion of + a <i>try</i>/<i>catch</i> sequence. When execution resumes at a landing + pad, it receives the exception structure and a selector corresponding to + the <i>type</i> of exception thrown. The selector is then used to determine + which <i>catch</i> should actually process the exception.</p> + +</div> + +<!-- ======================================================================= --> +<div class="doc_section"> + <a name="codegen">LLVM Code Generation</a> +</div> + +<div class="doc_text"> + +<p>At the time of this writing, only C++ exception handling support is available + in LLVM. So the remainder of this document will be somewhat C++-centric.</p> + +<p>From the C++ developers perspective, exceptions are defined in terms of the + <tt>throw</tt> and <tt>try</tt>/<tt>catch</tt> statements. In this section + we will describe the implementation of LLVM exception handling in terms of + C++ examples.</p> + +</div> + +<!-- ======================================================================= --> +<div class="doc_subsection"> + <a name="throw">Throw</a> +</div> + +<div class="doc_text"> + +<p>Languages that support exception handling typically provide a <tt>throw</tt> + operation to initiate the exception process. Internally, a throw operation + breaks down into two steps. First, a request is made to allocate exception + space for an exception structure. This structure needs to survive beyond the + current activation. This structure will contain the type and value of the + object being thrown. Second, a call is made to the runtime to raise the + exception, passing the exception structure as an argument.</p> + +<p>In C++, the allocation of the exception structure is done by + the <tt>__cxa_allocate_exception</tt> runtime function. The exception + raising is handled by <tt>__cxa_throw</tt>. The type of the exception is + represented using a C++ RTTI structure.</p> + +</div> + +<!-- ======================================================================= --> +<div class="doc_subsection"> + <a name="try_catch">Try/Catch</a> +</div> + +<div class="doc_text"> + +<p>A call within the scope of a <i>try</i> statement can potentially raise an + exception. In those circumstances, the LLVM C++ front-end replaces the call + with an <tt>invoke</tt> instruction. Unlike a call, the <tt>invoke</tt> has + two potential continuation points: where to continue when the call succeeds + as per normal; and where to continue if the call raises an exception, either + by a throw or the unwinding of a throw.</p> + +<p>The term used to define a the place where an <tt>invoke</tt> continues after + an exception is called a <i>landing pad</i>. LLVM landing pads are + conceptually alternative function entry points where an exception structure + reference and a type info index are passed in as arguments. The landing pad + saves the exception structure reference and then proceeds to select the catch + block that corresponds to the type info of the exception object.</p> + +<p>Two LLVM intrinsic functions are used to convey information about the landing + pad to the back end.</p> + +<ol> + <li><a href="#llvm_eh_exception"><tt>llvm.eh.exception</tt></a> takes no + arguments and returns a pointer to the exception structure. This only + returns a sensible value if called after an <tt>invoke</tt> has branched + to a landing pad. Due to code generation limitations, it must currently + be called in the landing pad itself.</li> + + <li><a href="#llvm_eh_selector"><tt>llvm.eh.selector</tt></a> takes a minimum + of three arguments. The first argument is the reference to the exception + structure. The second argument is a reference to the personality function + to be used for this <tt>try</tt>/<tt>catch</tt> sequence. Each of the + remaining arguments is either a reference to the type info for + a <tt>catch</tt> statement, a <a href="#throw_filters">filter</a> + expression, or the number zero (<tt>0</tt>) representing + a <a href="#cleanups">cleanup</a>. The exception is tested against the + arguments sequentially from first to last. The result of + the <a href="#llvm_eh_selector"><tt>llvm.eh.selector</tt></a> is a + positive number if the exception matched a type info, a negative number if + it matched a filter, and zero if it matched a cleanup. If nothing is + matched, the behaviour of the program + is <a href="#restrictions">undefined</a>. This only returns a sensible + value if called after an <tt>invoke</tt> has branched to a landing pad. + Due to codegen limitations, it must currently be called in the landing pad + itself. If a type info matched, then the selector value is the index of + the type info in the exception table, which can be obtained using the + <a href="#llvm_eh_typeid_for"><tt>llvm.eh.typeid.for</tt></a> + intrinsic.</li> +</ol> + +<p>Once the landing pad has the type info selector, the code branches to the + code for the first catch. The catch then checks the value of the type info + selector against the index of type info for that catch. Since the type info + index is not known until all the type info have been gathered in the backend, + the catch code will call the + <a href="#llvm_eh_typeid_for"><tt>llvm.eh.typeid.for</tt></a> intrinsic + to determine the index for a given type info. If the catch fails to match + the selector then control is passed on to the next catch. Note: Since the + landing pad will not be used if there is no match in the list of type info on + the call to <a href="#llvm_eh_selector"><tt>llvm.eh.selector</tt></a>, then + neither the last catch nor <i>catch all</i> need to perform the check + against the selector.</p> + +<p>Finally, the entry and exit of catch code is bracketed with calls + to <tt>__cxa_begin_catch</tt> and <tt>__cxa_end_catch</tt>.</p> + +<ul> + <li><tt>__cxa_begin_catch</tt> takes a exception structure reference as an + argument and returns the value of the exception object.</li> + + <li><tt>__cxa_end_catch</tt> takes no arguments. This function:<br><br> + <ol> + <li>Locates the most recently caught exception and decrements its handler + count,</li> + <li>Removes the exception from the "caught" stack if the handler count + goes to zero, and</li> + <li>Destroys the exception if the handler count goes to zero, and the + exception was not re-thrown by throw.</li> + </ol> + <p>Note: a rethrow from within the catch may replace this call with + a <tt>__cxa_rethrow</tt>.</p></li> +</ul> + +</div> + +<!-- ======================================================================= --> +<div class="doc_subsection"> + <a name="cleanups">Cleanups</a> +</div> + +<div class="doc_text"> + +<p>To handle destructors and cleanups in <tt>try</tt> code, control may not run + directly from a landing pad to the first catch. Control may actually flow + from the landing pad to clean up code and then to the first catch. Since the + required clean up for each <tt>invoke</tt> in a <tt>try</tt> may be different + (e.g. intervening constructor), there may be several landing pads for a given + try. If cleanups need to be run, an <tt>i32 0</tt> should be passed as the + last <a href="#llvm_eh_selector"><tt>llvm.eh.selector</tt></a> argument. + However, when using DWARF exception handling with C++, a <tt>i8* null</tt> + <a href="#restrictions">must</a> be passed instead.</p> + +</div> + +<!-- ======================================================================= --> +<div class="doc_subsection"> + <a name="throw_filters">Throw Filters</a> +</div> + +<div class="doc_text"> + +<p>C++ allows the specification of which exception types can be thrown from a + function. To represent this a top level landing pad may exist to filter out + invalid types. To express this in LLVM code the landing pad will + call <a href="#llvm_eh_selector"><tt>llvm.eh.selector</tt></a>. The + arguments are a reference to the exception structure, a reference to the + personality function, the length of the filter expression (the number of type + infos plus one), followed by the type infos themselves. + <a href="#llvm_eh_selector"><tt>llvm.eh.selector</tt></a> will return a + negative value if the exception does not match any of the type infos. If no + match is found then a call to <tt>__cxa_call_unexpected</tt> should be made, + otherwise <tt>_Unwind_Resume</tt>. Each of these functions requires a + reference to the exception structure. Note that the most general form of an + <a href="#llvm_eh_selector"><tt>llvm.eh.selector</tt></a> call can contain + any number of type infos, filter expressions and cleanups (though having more + than one cleanup is pointless). The LLVM C++ front-end can generate such + <a href="#llvm_eh_selector"><tt>llvm.eh.selector</tt></a> calls due to + inlining creating nested exception handling scopes.</p> + +</div> + +<!-- ======================================================================= --> +<div class="doc_subsection"> + <a name="restrictions">Restrictions</a> +</div> + +<div class="doc_text"> + +<p>The semantics of the invoke instruction require that any exception that + unwinds through an invoke call should result in a branch to the invoke's + unwind label. However such a branch will only happen if the + <a href="#llvm_eh_selector"><tt>llvm.eh.selector</tt></a> matches. Thus in + order to ensure correct operation, the front-end must only generate + <a href="#llvm_eh_selector"><tt>llvm.eh.selector</tt></a> calls that are + guaranteed to always match whatever exception unwinds through the invoke. + For most languages it is enough to pass zero, indicating the presence of + a <a href="#cleanups">cleanup</a>, as the + last <a href="#llvm_eh_selector"><tt>llvm.eh.selector</tt></a> argument. + However for C++ this is not sufficient, because the C++ personality function + will terminate the program if it detects that unwinding the exception only + results in matches with cleanups. For C++ a <tt>null i8*</tt> should be + passed as the last <a href="#llvm_eh_selector"><tt>llvm.eh.selector</tt></a> + argument instead. This is interpreted as a catch-all by the C++ personality + function, and will always match.</p> + +</div> + +<!-- ======================================================================= --> +<div class="doc_section"> + <a name="format_common_intrinsics">Exception Handling Intrinsics</a> +</div> + +<div class="doc_text"> + +<p>LLVM uses several intrinsic functions (name prefixed with "llvm.eh") to + provide exception handling information at various points in generated + code.</p> + +</div> + +<!-- ======================================================================= --> +<div class="doc_subsubsection"> + <a name="llvm_eh_exception">llvm.eh.exception</a> +</div> + +<div class="doc_text"> + +<pre> + i8* %<a href="#llvm_eh_exception">llvm.eh.exception</a>( ) +</pre> + +<p>This intrinsic returns a pointer to the exception structure.</p> + +</div> + +<!-- ======================================================================= --> +<div class="doc_subsubsection"> + <a name="llvm_eh_selector">llvm.eh.selector</a> +</div> + +<div class="doc_text"> + +<pre> + i32 %<a href="#llvm_eh_selector">llvm.eh.selector</a>(i8*, i8*, i8*, ...) +</pre> + +<p>This intrinsic is used to compare the exception with the given type infos, + filters and cleanups.</p> + +<p><a href="#llvm_eh_selector"><tt>llvm.eh.selector</tt></a> takes a minimum of + three arguments. The first argument is the reference to the exception + structure. The second argument is a reference to the personality function to + be used for this try catch sequence. Each of the remaining arguments is + either a reference to the type info for a catch statement, + a <a href="#throw_filters">filter</a> expression, or the number zero + representing a <a href="#cleanups">cleanup</a>. The exception is tested + against the arguments sequentially from first to last. The result of + the <a href="#llvm_eh_selector"><tt>llvm.eh.selector</tt></a> is a positive + number if the exception matched a type info, a negative number if it matched + a filter, and zero if it matched a cleanup. If nothing is matched, the + behaviour of the program is <a href="#restrictions">undefined</a>. If a type + info matched then the selector value is the index of the type info in the + exception table, which can be obtained using the + <a href="#llvm_eh_typeid_for"><tt>llvm.eh.typeid.for</tt></a> intrinsic.</p> + +</div> + +<!-- ======================================================================= --> +<div class="doc_subsubsection"> + <a name="llvm_eh_typeid_for">llvm.eh.typeid.for</a> +</div> + +<div class="doc_text"> + +<pre> + i32 %<a href="#llvm_eh_typeid_for">llvm.eh.typeid.for</a>(i8*) +</pre> + +<p>This intrinsic returns the type info index in the exception table of the + current function. This value can be used to compare against the result + of <a href="#llvm_eh_selector"><tt>llvm.eh.selector</tt></a>. The single + argument is a reference to a type info.</p> + +</div> + +<!-- ======================================================================= --> +<div class="doc_subsubsection"> + <a name="llvm_eh_sjlj_setjmp">llvm.eh.sjlj.setjmp</a> +</div> + +<div class="doc_text"> + +<pre> + i32 %<a href="#llvm_eh_sjlj_setjmp">llvm.eh.sjlj.setjmp</a>(i8*) +</pre> + +<p>The SJLJ exception handling uses this intrinsic to force register saving for + the current function and to store the address of the following instruction + for use as a destination address by <a href="#llvm_eh_sjlj_longjmp"> + <tt>llvm.eh.sjlj.longjmp</tt></a>. The buffer format and the overall + functioning of this intrinsic is compatible with the GCC + <tt>__builtin_setjmp</tt> implementation, allowing code built with the + two compilers to interoperate.</p> + +<p>The single parameter is a pointer to a five word buffer in which the calling + context is saved. The front end places the frame pointer in the first word, + and the target implementation of this intrinsic should place the destination + address for a + <a href="#llvm_eh_sjlj_longjmp"><tt>llvm.eh.sjlj.longjmp</tt></a> in the + second word. The following three words are available for use in a + target-specific manner.</p> + +</div> + +<!-- ======================================================================= --> +<div class="doc_subsubsection"> + <a name="llvm_eh_sjlj_lsda">llvm.eh.sjlj.lsda</a> +</div> + +<div class="doc_text"> + +<pre> + i8* %<a href="#llvm_eh_sjlj_lsda">llvm.eh.sjlj.lsda</a>( ) +</pre> + +<p>Used for SJLJ based exception handling, the <a href="#llvm_eh_sjlj_lsda"> + <tt>llvm.eh.sjlj.lsda</tt></a> intrinsic returns the address of the Language + Specific Data Area (LSDA) for the current function. The SJLJ front-end code + stores this address in the exception handling function context for use by the + runtime.</p> + +</div> + +<!-- ======================================================================= --> +<div class="doc_subsubsection"> + <a name="llvm_eh_sjlj_callsite">llvm.eh.sjlj.callsite</a> +</div> + +<div class="doc_text"> + +<pre> + void %<a href="#llvm_eh_sjlj_callsite">llvm.eh.sjlj.callsite</a>(i32) +</pre> + +<p>For SJLJ based exception handling, the <a href="#llvm_eh_sjlj_callsite"> + <tt>llvm.eh.sjlj.callsite</tt></a> intrinsic identifies the callsite value + associated with the following invoke instruction. This is used to ensure + that landing pad entries in the LSDA are generated in the matching order.</p> + +</div> + +<!-- ======================================================================= --> +<div class="doc_section"> + <a name="asm">Asm Table Formats</a> +</div> + +<div class="doc_text"> + +<p>There are two tables that are used by the exception handling runtime to + determine which actions should take place when an exception is thrown.</p> + +</div> + +<!-- ======================================================================= --> +<div class="doc_subsection"> + <a name="unwind_tables">Exception Handling Frame</a> +</div> + +<div class="doc_text"> + +<p>An exception handling frame <tt>eh_frame</tt> is very similar to the unwind + frame used by dwarf debug info. The frame contains all the information + necessary to tear down the current frame and restore the state of the prior + frame. There is an exception handling frame for each function in a compile + unit, plus a common exception handling frame that defines information common + to all functions in the unit.</p> + +<p>Todo - Table details here.</p> + +</div> + +<!-- ======================================================================= --> +<div class="doc_subsection"> + <a name="exception_tables">Exception Tables</a> +</div> + +<div class="doc_text"> + +<p>An exception table contains information about what actions to take when an + exception is thrown in a particular part of a function's code. There is one + exception table per function except leaf routines and functions that have + only calls to non-throwing functions will not need an exception table.</p> + +<p>Todo - Table details here.</p> + +</div> + +<!-- ======================================================================= --> +<div class="doc_section"> + <a name="todo">ToDo</a> +</div> + +<div class="doc_text"> + +<ol> + + <li>Testing/Testing/Testing.</li> + +</ol> + +</div> + +<!-- *********************************************************************** --> + +<hr> +<address> + <a href="http://jigsaw.w3.org/css-validator/check/referer"><img + src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a> + <a href="http://validator.w3.org/check/referer"><img + src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a> + + <a href="mailto:sabre@nondot.org">Chris Lattner</a><br> + <a href="http://llvm.org">LLVM Compiler Infrastructure</a><br> + Last modified: $Date$ +</address> + +</body> +</html> |