diff options
Diffstat (limited to 'docs/Statepoints.rst')
-rw-r--r-- | docs/Statepoints.rst | 580 |
1 files changed, 580 insertions, 0 deletions
diff --git a/docs/Statepoints.rst b/docs/Statepoints.rst new file mode 100644 index 0000000..9741c93 --- /dev/null +++ b/docs/Statepoints.rst @@ -0,0 +1,580 @@ +===================================== +Garbage Collection Safepoints in LLVM +===================================== + +.. contents:: + :local: + :depth: 2 + +Status +======= + +This document describes a set of experimental extensions to LLVM. Use +with caution. Because the intrinsics have experimental status, +compatibility across LLVM releases is not guaranteed. + +LLVM currently supports an alternate mechanism for conservative +garbage collection support using the ``gcroot`` intrinsic. The mechanism +described here shares little in common with the alternate ``gcroot`` +implementation and it is hoped that this mechanism will eventually +replace the gc_root mechanism. + +Overview +======== + +To collect dead objects, garbage collectors must be able to identify +any references to objects contained within executing code, and, +depending on the collector, potentially update them. The collector +does not need this information at all points in code - that would make +the problem much harder - but only at well-defined points in the +execution known as 'safepoints' For most collectors, it is sufficient +to track at least one copy of each unique pointer value. However, for +a collector which wishes to relocate objects directly reachable from +running code, a higher standard is required. + +One additional challenge is that the compiler may compute intermediate +results ("derived pointers") which point outside of the allocation or +even into the middle of another allocation. The eventual use of this +intermediate value must yield an address within the bounds of the +allocation, but such "exterior derived pointers" may be visible to the +collector. Given this, a garbage collector can not safely rely on the +runtime value of an address to indicate the object it is associated +with. If the garbage collector wishes to move any object, the +compiler must provide a mapping, for each pointer, to an indication of +its allocation. + +To simplify the interaction between a collector and the compiled code, +most garbage collectors are organized in terms of three abstractions: +load barriers, store barriers, and safepoints. + +#. A load barrier is a bit of code executed immediately after the + machine load instruction, but before any use of the value loaded. + Depending on the collector, such a barrier may be needed for all + loads, merely loads of a particular type (in the original source + language), or none at all. + +#. Analogously, a store barrier is a code fragement that runs + immediately before the machine store instruction, but after the + computation of the value stored. The most common use of a store + barrier is to update a 'card table' in a generational garbage + collector. + +#. A safepoint is a location at which pointers visible to the compiled + code (i.e. currently in registers or on the stack) are allowed to + change. After the safepoint completes, the actual pointer value + may differ, but the 'object' (as seen by the source language) + pointed to will not. + + Note that the term 'safepoint' is somewhat overloaded. It refers to + both the location at which the machine state is parsable and the + coordination protocol involved in bring application threads to a + point at which the collector can safely use that information. The + term "statepoint" as used in this document refers exclusively to the + former. + +This document focuses on the last item - compiler support for +safepoints in generated code. We will assume that an outside +mechanism has decided where to place safepoints. From our +perspective, all safepoints will be function calls. To support +relocation of objects directly reachable from values in compiled code, +the collector must be able to: + +#. identify every copy of a pointer (including copies introduced by + the compiler itself) at the safepoint, +#. identify which object each pointer relates to, and +#. potentially update each of those copies. + +This document describes the mechanism by which an LLVM based compiler +can provide this information to a language runtime/collector, and +ensure that all pointers can be read and updated if desired. The +heart of the approach is to construct (or rewrite) the IR in a manner +where the possible updates performed by the garbage collector are +explicitly visible in the IR. Doing so requires that we: + +#. create a new SSA value for each potentially relocated pointer, and + ensure that no uses of the original (non relocated) value is + reachable after the safepoint, +#. specify the relocation in a way which is opaque to the compiler to + ensure that the optimizer can not introduce new uses of an + unrelocated value after a statepoint. This prevents the optimizer + from performing unsound optimizations. +#. recording a mapping of live pointers (and the allocation they're + associated with) for each statepoint. + +At the most abstract level, inserting a safepoint can be thought of as +replacing a call instruction with a call to a multiple return value +function which both calls the original target of the call, returns +it's result, and returns updated values for any live pointers to +garbage collected objects. + + Note that the task of identifying all live pointers to garbage + collected values, transforming the IR to expose a pointer giving the + base object for every such live pointer, and inserting all the + intrinsics correctly is explicitly out of scope for this document. + The recommended approach is to use the :ref:`utility passes + <statepoint-utilities>` described below. + +This abstract function call is concretely represented by a sequence of +intrinsic calls known collectively as a "statepoint relocation sequence". + +Let's consider a simple call in LLVM IR: + +.. code-block:: llvm + + define i8 addrspace(1)* @test1(i8 addrspace(1)* %obj) + gc "statepoint-example" { + call void ()* @foo() + ret i8 addrspace(1)* %obj + } + +Depending on our language we may need to allow a safepoint during the execution +of ``foo``. If so, we need to let the collector update local values in the +current frame. If we don't, we'll be accessing a potential invalid reference +once we eventually return from the call. + +In this example, we need to relocate the SSA value ``%obj``. Since we can't +actually change the value in the SSA value ``%obj``, we need to introduce a new +SSA value ``%obj.relocated`` which represents the potentially changed value of +``%obj`` after the safepoint and update any following uses appropriately. The +resulting relocation sequence is: + +.. code-block:: llvm + + define i8 addrspace(1)* @test1(i8 addrspace(1)* %obj) + gc "statepoint-example" { + %0 = call i32 (void ()*, i32, i32, ...)* @llvm.experimental.gc.statepoint.p0f_isVoidf(void ()* @foo, i32 0, i32 0, i32 5, i32 0, i32 -1, i32 0, i32 0, i32 0, i8 addrspace(1)* %obj) + %obj.relocated = call coldcc i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(i32 %0, i32 9, i32 9) + ret i8 addrspace(1)* %obj.relocated + } + +Ideally, this sequence would have been represented as a M argument, N +return value function (where M is the number of values being +relocated + the original call arguments and N is the original return +value + each relocated value), but LLVM does not easily support such a +representation. + +Instead, the statepoint intrinsic marks the actual site of the +safepoint or statepoint. The statepoint returns a token value (which +exists only at compile time). To get back the original return value +of the call, we use the ``gc.result`` intrinsic. To get the relocation +of each pointer in turn, we use the ``gc.relocate`` intrinsic with the +appropriate index. Note that both the ``gc.relocate`` and ``gc.result`` are +tied to the statepoint. The combination forms a "statepoint relocation +sequence" and represents the entitety of a parseable call or 'statepoint'. + +When lowered, this example would generate the following x86 assembly: + +.. code-block:: gas + + .globl test1 + .align 16, 0x90 + pushq %rax + callq foo + .Ltmp1: + movq (%rsp), %rax # This load is redundant (oops!) + popq %rdx + retq + +Each of the potentially relocated values has been spilled to the +stack, and a record of that location has been recorded to the +:ref:`Stack Map section <stackmap-section>`. If the garbage collector +needs to update any of these pointers during the call, it knows +exactly what to change. + +The relevant parts of the StackMap section for our example are: + +.. code-block:: gas + + # This describes the call site + # Stack Maps: callsite 2882400000 + .quad 2882400000 + .long .Ltmp1-test1 + .short 0 + # .. 8 entries skipped .. + # This entry describes the spill slot which is directly addressable + # off RSP with offset 0. Given the value was spilled with a pushq, + # that makes sense. + # Stack Maps: Loc 8: Direct RSP [encoding: .byte 2, .byte 8, .short 7, .int 0] + .byte 2 + .byte 8 + .short 7 + .long 0 + +This example was taken from the tests for the :ref:`RewriteStatepointsForGC` utility pass. As such, it's full StackMap can be easily examined with the following command. + +.. code-block:: bash + + opt -rewrite-statepoints-for-gc test/Transforms/RewriteStatepointsForGC/basics.ll -S | llc -debug-only=stackmaps + + + + + +Intrinsics +=========== + +'llvm.experimental.gc.statepoint' Intrinsic +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Syntax: +""""""" + +:: + + declare i32 + @llvm.experimental.gc.statepoint(func_type <target>, + i64 <#call args>. i64 <unused>, + ... (call parameters), + i64 <# deopt args>, ... (deopt parameters), + ... (gc parameters)) + +Overview: +""""""""" + +The statepoint intrinsic represents a call which is parse-able by the +runtime. + +Operands: +""""""""" + +The 'target' operand is the function actually being called. The +target can be specified as either a symbolic LLVM function, or as an +arbitrary Value of appropriate function type. Note that the function +type must match the signature of the callee and the types of the 'call +parameters' arguments. + +The '#call args' operand is the number of arguments to the actual +call. It must exactly match the number of arguments passed in the +'call parameters' variable length section. + +The 'unused' operand is unused and likely to be removed. Please do +not use. + +The 'call parameters' arguments are simply the arguments which need to +be passed to the call target. They will be lowered according to the +specified calling convention and otherwise handled like a normal call +instruction. The number of arguments must exactly match what is +specified in '# call args'. The types must match the signature of +'target'. + +The 'deopt parameters' arguments contain an arbitrary list of Values +which is meaningful to the runtime. The runtime may read any of these +values, but is assumed not to modify them. If the garbage collector +might need to modify one of these values, it must also be listed in +the 'gc pointer' argument list. The '# deopt args' field indicates +how many operands are to be interpreted as 'deopt parameters'. + +The 'gc parameters' arguments contain every pointer to a garbage +collector object which potentially needs to be updated by the garbage +collector. Note that the argument list must explicitly contain a base +pointer for every derived pointer listed. The order of arguments is +unimportant. Unlike the other variable length parameter sets, this +list is not length prefixed. + +Semantics: +"""""""""" + +A statepoint is assumed to read and write all memory. As a result, +memory operations can not be reordered past a statepoint. It is +illegal to mark a statepoint as being either 'readonly' or 'readnone'. + +Note that legal IR can not perform any memory operation on a 'gc +pointer' argument of the statepoint in a location statically reachable +from the statepoint. Instead, the explicitly relocated value (from a +``gc.relocate``) must be used. + +'llvm.experimental.gc.result' Intrinsic +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Syntax: +""""""" + +:: + + declare type* + @llvm.experimental.gc.result(i32 %statepoint_token) + +Overview: +""""""""" + +``gc.result`` extracts the result of the original call instruction +which was replaced by the ``gc.statepoint``. The ``gc.result`` +intrinsic is actually a family of three intrinsics due to an +implementation limitation. Other than the type of the return value, +the semantics are the same. + +Operands: +""""""""" + +The first and only argument is the ``gc.statepoint`` which starts +the safepoint sequence of which this ``gc.result`` is a part. +Despite the typing of this as a generic i32, *only* the value defined +by a ``gc.statepoint`` is legal here. + +Semantics: +"""""""""" + +The ``gc.result`` represents the return value of the call target of +the ``statepoint``. The type of the ``gc.result`` must exactly match +the type of the target. If the call target returns void, there will +be no ``gc.result``. + +A ``gc.result`` is modeled as a 'readnone' pure function. It has no +side effects since it is just a projection of the return value of the +previous call represented by the ``gc.statepoint``. + +'llvm.experimental.gc.relocate' Intrinsic +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Syntax: +""""""" + +:: + + declare <pointer type> + @llvm.experimental.gc.relocate(i32 %statepoint_token, + i32 %base_offset, + i32 %pointer_offset) + +Overview: +""""""""" + +A ``gc.relocate`` returns the potentially relocated value of a pointer +at the safepoint. + +Operands: +""""""""" + +The first argument is the ``gc.statepoint`` which starts the +safepoint sequence of which this ``gc.relocation`` is a part. +Despite the typing of this as a generic i32, *only* the value defined +by a ``gc.statepoint`` is legal here. + +The second argument is an index into the statepoints list of arguments +which specifies the base pointer for the pointer being relocated. +This index must land within the 'gc parameter' section of the +statepoint's argument list. + +The third argument is an index into the statepoint's list of arguments +which specify the (potentially) derived pointer being relocated. It +is legal for this index to be the same as the second argument +if-and-only-if a base pointer is being relocated. This index must land +within the 'gc parameter' section of the statepoint's argument list. + +Semantics: +"""""""""" + +The return value of ``gc.relocate`` is the potentially relocated value +of the pointer specified by it's arguments. It is unspecified how the +value of the returned pointer relates to the argument to the +``gc.statepoint`` other than that a) it points to the same source +language object with the same offset, and b) the 'based-on' +relationship of the newly relocated pointers is a projection of the +unrelocated pointers. In particular, the integer value of the pointer +returned is unspecified. + +A ``gc.relocate`` is modeled as a ``readnone`` pure function. It has no +side effects since it is just a way to extract information about work +done during the actual call modeled by the ``gc.statepoint``. + +.. _statepoint-stackmap-format: + +Stack Map Format +================ + +Locations for each pointer value which may need read and/or updated by +the runtime or collector are provided via the :ref:`Stack Map format +<stackmap-format>` specified in the PatchPoint documentation. + +Each statepoint generates the following Locations: + +* Constant which describes number of following deopt *Locations* (not + operands) +* Variable number of Locations, one for each deopt parameter listed in + the IR statepoint (same number as described by previous Constant) +* Variable number of Locations pairs, one pair for each unique pointer + which needs relocated. The first Location in each pair describes + the base pointer for the object. The second is the derived pointer + actually being relocated. It is guaranteed that the base pointer + must also appear explicitly as a relocation pair if used after the + statepoint. There may be fewer pairs then gc parameters in the IR + statepoint. Each *unique* pair will occur at least once; duplicates + are possible. + +Note that the Locations used in each section may describe the same +physical location. e.g. A stack slot may appear as a deopt location, +a gc base pointer, and a gc derived pointer. + +The ID field of the 'StkMapRecord' for a statepoint is meaningless and +it's value is explicitly unspecified. + +The LiveOut section of the StkMapRecord will be empty for a statepoint +record. + +Safepoint Semantics & Verification +================================== + +The fundamental correctness property for the compiled code's +correctness w.r.t. the garbage collector is a dynamic one. It must be +the case that there is no dynamic trace such that a operation +involving a potentially relocated pointer is observably-after a +safepoint which could relocate it. 'observably-after' is this usage +means that an outside observer could observe this sequence of events +in a way which precludes the operation being performed before the +safepoint. + +To understand why this 'observable-after' property is required, +consider a null comparison performed on the original copy of a +relocated pointer. Assuming that control flow follows the safepoint, +there is no way to observe externally whether the null comparison is +performed before or after the safepoint. (Remember, the original +Value is unmodified by the safepoint.) The compiler is free to make +either scheduling choice. + +The actual correctness property implemented is slightly stronger than +this. We require that there be no *static path* on which a +potentially relocated pointer is 'observably-after' it may have been +relocated. This is slightly stronger than is strictly necessary (and +thus may disallow some otherwise valid programs), but greatly +simplifies reasoning about correctness of the compiled code. + +By construction, this property will be upheld by the optimizer if +correctly established in the source IR. This is a key invariant of +the design. + +The existing IR Verifier pass has been extended to check most of the +local restrictions on the intrinsics mentioned in their respective +documentation. The current implementation in LLVM does not check the +key relocation invariant, but this is ongoing work on developing such +a verifier. Please ask on llvmdev if you're interested in +experimenting with the current version. + +.. _statepoint-utilities: + +Utility Passes for Safepoint Insertion +====================================== + +.. _RewriteStatepointsForGC: + +RewriteStatepointsForGC +^^^^^^^^^^^^^^^^^^^^^^^^ + +The pass RewriteStatepointsForGC transforms a functions IR by replacing a +``gc.statepoint`` (with an optional ``gc.result``) with a full relocation +sequence, including all required ``gc.relocates``. To function, the pass +requires that the GC strategy specified for the function be able to reliably +distinguish between GC references and non-GC references in IR it is given. + +As an example, given this code: + +.. code-block:: llvm + + define i8 addrspace(1)* @test1(i8 addrspace(1)* %obj) + gc "statepoint-example" { + call i32 (void ()*, i32, i32, ...)* @llvm.experimental.gc.statepoint.p0f_isVoidf(void ()* @foo, i32 0, i32 0, i32 5, i32 0, i32 -1, i32 0, i32 0, i32 0) + ret i8 addrspace(1)* %obj + } + +The pass would produce this IR: + +.. code-block:: llvm + + define i8 addrspace(1)* @test1(i8 addrspace(1)* %obj) + gc "statepoint-example" { + %0 = call i32 (void ()*, i32, i32, ...)* @llvm.experimental.gc.statepoint.p0f_isVoidf(void ()* @foo, i32 0, i32 0, i32 5, i32 0, i32 -1, i32 0, i32 0, i32 0, i8 addrspace(1)* %obj) + %obj.relocated = call coldcc i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(i32 %0, i32 9, i32 9) + ret i8 addrspace(1)* %obj.relocated + } + +In the above examples, the addrspace(1) marker on the pointers is the mechanism +that the ``statepoint-example`` GC strategy uses to distinguish references from +non references. Address space 1 is not globally reserved for this purpose. + +This pass can be used an utility function by a language frontend that doesn't +want to manually reason about liveness, base pointers, or relocation when +constructing IR. As currently implemented, RewriteStatepointsForGC must be +run after SSA construction (i.e. mem2ref). + + +In practice, RewriteStatepointsForGC can be run much later in the pass +pipeline, after most optimization is already done. This helps to improve +the quality of the generated code when compiled with garbage collection support. +In the long run, this is the intended usage model. At this time, a few details +have yet to be worked out about the semantic model required to guarantee this +is always correct. As such, please use with caution and report bugs. + +.. _PlaceSafepoints: + +PlaceSafepoints +^^^^^^^^^^^^^^^^ + +The pass PlaceSafepoints transforms a function's IR by replacing any call or +invoke instructions with appropriate ``gc.statepoint`` and ``gc.result`` pairs, +and inserting safepoint polls sufficient to ensure running code checks for a +safepoint request on a timely manner. This pass is expected to be run before +RewriteStatepointsForGC and thus does not produce full relocation sequences. + +As an example, given input IR of the following: + +.. code-block:: llvm + + define void @test() gc "statepoint-example" { + call void @foo() + ret void + } + + declare void @do_safepoint() + define void @gc.safepoint_poll() { + call void @do_safepoint() + ret void + } + + +This pass would produce the following IR: + +.. code-block:: llvm + + define void @test() gc "statepoint-example" { + %safepoint_token = call i32 (void ()*, i32, i32, ...)* @llvm.experimental.gc.statepoint.p0f_isVoidf(void ()* @do_safepoint, i32 0, i32 0, i32 0) + %safepoint_token1 = call i32 (void ()*, i32, i32, ...)* @llvm.experimental.gc.statepoint.p0f_isVoidf(void ()* @foo, i32 0, i32 0, i32 0) + ret void + } + +In this case, we've added an (unconditional) entry safepoint poll and converted the call into a ``gc.statepoint``. Note that despite appearances, the entry poll is not necessarily redundant. We'd have to know that ``foo`` and ``test`` were not mutually recursive for the poll to be redundant. In practice, you'd probably want to your poll definition to contain a conditional branch of some form. + + +At the moment, PlaceSafepoints can insert safepoint polls at method entry and +loop backedges locations. Extending this to work with return polls would be +straight forward if desired. + +PlaceSafepoints includes a number of optimizations to avoid placing safepoint +polls at particular sites unless needed to ensure timely execution of a poll +under normal conditions. PlaceSafepoints does not attempt to ensure timely +execution of a poll under worst case conditions such as heavy system paging. + +The implementation of a safepoint poll action is specified by looking up a +function of the name ``gc.safepoint_poll`` in the containing Module. The body +of this function is inserted at each poll site desired. While calls or invokes +inside this method are transformed to a ``gc.statepoints``, recursive poll +insertion is not performed. + +If you are scheduling the RewriteStatepointsForGC pass late in the pass order, +you should probably schedule this pass immediately before it. The exception +would be if you need to preserve abstract frame information (e.g. for +deoptimization or introspection) at safepoints. In that case, ask on the +llvmdev mailing list for suggestions. + + +Bugs and Enhancements +===================== + +Currently known bugs and enhancements under consideration can be +tracked by performing a `bugzilla search +<http://llvm.org/bugs/buglist.cgi?cmdtype=runnamed&namedcmd=Statepoint%20Bugs&list_id=64342>`_ +for [Statepoint] in the summary field. When filing new bugs, please +use this tag so that interested parties see the newly filed bug. As +with most LLVM features, design discussions take place on `llvmdev +<http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev>`_, and patches +should be sent to `llvm-commits +<http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits>`_ for review. + |