diff options
Diffstat (limited to 'docs')
-rw-r--r-- | docs/AliasAnalysis.html | 1067 | ||||
-rw-r--r-- | docs/AliasAnalysis.rst | 702 | ||||
-rw-r--r-- | docs/subsystems.rst | 7 |
3 files changed, 708 insertions, 1068 deletions
diff --git a/docs/AliasAnalysis.html b/docs/AliasAnalysis.html deleted file mode 100644 index 6388235..0000000 --- a/docs/AliasAnalysis.html +++ /dev/null @@ -1,1067 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" - "http://www.w3.org/TR/html4/strict.dtd"> -<html> -<head> - <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> - <title>LLVM Alias Analysis Infrastructure</title> - <link rel="stylesheet" href="_static/llvm.css" type="text/css"> -</head> -<body> - -<h1> - LLVM Alias Analysis Infrastructure -</h1> - -<ol> - <li><a href="#introduction">Introduction</a></li> - - <li><a href="#overview"><tt>AliasAnalysis</tt> Class Overview</a> - <ul> - <li><a href="#pointers">Representation of Pointers</a></li> - <li><a href="#alias">The <tt>alias</tt> method</a></li> - <li><a href="#ModRefInfo">The <tt>getModRefInfo</tt> methods</a></li> - <li><a href="#OtherItfs">Other useful <tt>AliasAnalysis</tt> methods</a></li> - </ul> - </li> - - <li><a href="#writingnew">Writing a new <tt>AliasAnalysis</tt> Implementation</a> - <ul> - <li><a href="#passsubclasses">Different Pass styles</a></li> - <li><a href="#requiredcalls">Required initialization calls</a></li> - <li><a href="#interfaces">Interfaces which may be specified</a></li> - <li><a href="#chaining"><tt>AliasAnalysis</tt> chaining behavior</a></li> - <li><a href="#updating">Updating analysis results for transformations</a></li> - <li><a href="#implefficiency">Efficiency Issues</a></li> - <li><a href="#limitations">Limitations</a></li> - </ul> - </li> - - <li><a href="#using">Using alias analysis results</a> - <ul> - <li><a href="#memdep">Using the <tt>MemoryDependenceAnalysis</tt> Pass</a></li> - <li><a href="#ast">Using the <tt>AliasSetTracker</tt> class</a></li> - <li><a href="#direct">Using the <tt>AliasAnalysis</tt> interface directly</a></li> - </ul> - </li> - - <li><a href="#exist">Existing alias analysis implementations and clients</a> - <ul> - <li><a href="#impls">Available <tt>AliasAnalysis</tt> implementations</a></li> - <li><a href="#aliasanalysis-xforms">Alias analysis driven transformations</a></li> - <li><a href="#aliasanalysis-debug">Clients for debugging and evaluation of - implementations</a></li> - </ul> - </li> - <li><a href="#memdep">Memory Dependence Analysis</a></li> -</ol> - -<div class="doc_author"> - <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p> -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="introduction">Introduction</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>Alias Analysis (aka Pointer Analysis) is a class of techniques which attempt -to determine whether or not two pointers ever can point to the same object in -memory. There are many different algorithms for alias analysis and many -different ways of classifying them: flow-sensitive vs flow-insensitive, -context-sensitive vs context-insensitive, field-sensitive vs field-insensitive, -unification-based vs subset-based, etc. Traditionally, alias analyses respond -to a query with a <a href="#MustMayNo">Must, May, or No</a> alias response, -indicating that two pointers always point to the same object, might point to the -same object, or are known to never point to the same object.</p> - -<p>The LLVM <a -href="http://llvm.org/doxygen/classllvm_1_1AliasAnalysis.html"><tt>AliasAnalysis</tt></a> -class is the primary interface used by clients and implementations of alias -analyses in the LLVM system. This class is the common interface between clients -of alias analysis information and the implementations providing it, and is -designed to support a wide range of implementations and clients (but currently -all clients are assumed to be flow-insensitive). In addition to simple alias -analysis information, this class exposes Mod/Ref information from those -implementations which can provide it, allowing for powerful analyses and -transformations to work well together.</p> - -<p>This document contains information necessary to successfully implement this -interface, use it, and to test both sides. It also explains some of the finer -points about what exactly results mean. If you feel that something is unclear -or should be added, please <a href="mailto:sabre@nondot.org">let me -know</a>.</p> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="overview"><tt>AliasAnalysis</tt> Class Overview</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>The <a -href="http://llvm.org/doxygen/classllvm_1_1AliasAnalysis.html"><tt>AliasAnalysis</tt></a> -class defines the interface that the various alias analysis implementations -should support. This class exports two important enums: <tt>AliasResult</tt> -and <tt>ModRefResult</tt> which represent the result of an alias query or a -mod/ref query, respectively.</p> - -<p>The <tt>AliasAnalysis</tt> interface exposes information about memory, -represented in several different ways. In particular, memory objects are -represented as a starting address and size, and function calls are represented -as the actual <tt>call</tt> or <tt>invoke</tt> instructions that performs the -call. The <tt>AliasAnalysis</tt> interface also exposes some helper methods -which allow you to get mod/ref information for arbitrary instructions.</p> - -<p>All <tt>AliasAnalysis</tt> interfaces require that in queries involving -multiple values, values which are not -<a href="LangRef.html#constants">constants</a> are all defined within the -same function.</p> - -<!-- ======================================================================= --> -<h3> - <a name="pointers">Representation of Pointers</a> -</h3> - -<div> - -<p>Most importantly, the <tt>AliasAnalysis</tt> class provides several methods -which are used to query whether or not two memory objects alias, whether -function calls can modify or read a memory object, etc. For all of these -queries, memory objects are represented as a pair of their starting address (a -symbolic LLVM <tt>Value*</tt>) and a static size.</p> - -<p>Representing memory objects as a starting address and a size is critically -important for correct Alias Analyses. For example, consider this (silly, but -possible) C code:</p> - -<div class="doc_code"> -<pre> -int i; -char C[2]; -char A[10]; -/* ... */ -for (i = 0; i != 10; ++i) { - C[0] = A[i]; /* One byte store */ - C[1] = A[9-i]; /* One byte store */ -} -</pre> -</div> - -<p>In this case, the <tt>basicaa</tt> pass will disambiguate the stores to -<tt>C[0]</tt> and <tt>C[1]</tt> because they are accesses to two distinct -locations one byte apart, and the accesses are each one byte. In this case, the -LICM pass can use store motion to remove the stores from the loop. In -constrast, the following code:</p> - -<div class="doc_code"> -<pre> -int i; -char C[2]; -char A[10]; -/* ... */ -for (i = 0; i != 10; ++i) { - ((short*)C)[0] = A[i]; /* Two byte store! */ - C[1] = A[9-i]; /* One byte store */ -} -</pre> -</div> - -<p>In this case, the two stores to C do alias each other, because the access to -the <tt>&C[0]</tt> element is a two byte access. If size information wasn't -available in the query, even the first case would have to conservatively assume -that the accesses alias.</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="alias">The <tt>alias</tt> method</a> -</h3> - -<div> -<p>The <tt>alias</tt> method is the primary interface used to determine whether -or not two memory objects alias each other. It takes two memory objects as -input and returns MustAlias, PartialAlias, MayAlias, or NoAlias as -appropriate.</p> - -<p>Like all <tt>AliasAnalysis</tt> interfaces, the <tt>alias</tt> method requires -that either the two pointer values be defined within the same function, or at -least one of the values is a <a href="LangRef.html#constants">constant</a>.</p> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="MustMayNo">Must, May, and No Alias Responses</a> -</h4> - -<div> -<p>The NoAlias response may be used when there is never an immediate dependence -between any memory reference <i>based</i> on one pointer and any memory -reference <i>based</i> the other. The most obvious example is when the two -pointers point to non-overlapping memory ranges. Another is when the two -pointers are only ever used for reading memory. Another is when the memory is -freed and reallocated between accesses through one pointer and accesses through -the other -- in this case, there is a dependence, but it's mediated by the free -and reallocation.</p> - -<p>As an exception to this is with the -<a href="LangRef.html#noalias"><tt>noalias</tt></a> keyword; the "irrelevant" -dependencies are ignored.</p> - -<p>The MayAlias response is used whenever the two pointers might refer to the -same object.</p> - -<p>The PartialAlias response is used when the two memory objects are known -to be overlapping in some way, but do not start at the same address.</p> - -<p>The MustAlias response may only be returned if the two memory objects are -guaranteed to always start at exactly the same location. A MustAlias response -implies that the pointers compare equal.</p> - -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="ModRefInfo">The <tt>getModRefInfo</tt> methods</a> -</h3> - -<div> - -<p>The <tt>getModRefInfo</tt> methods return information about whether the -execution of an instruction can read or modify a memory location. Mod/Ref -information is always conservative: if an instruction <b>might</b> read or write -a location, ModRef is returned.</p> - -<p>The <tt>AliasAnalysis</tt> class also provides a <tt>getModRefInfo</tt> -method for testing dependencies between function calls. This method takes two -call sites (CS1 & CS2), returns NoModRef if neither call writes to memory -read or written by the other, Ref if CS1 reads memory written by CS2, Mod if CS1 -writes to memory read or written by CS2, or ModRef if CS1 might read or write -memory written to by CS2. Note that this relation is not commutative.</p> - -</div> - - -<!-- ======================================================================= --> -<h3> - <a name="OtherItfs">Other useful <tt>AliasAnalysis</tt> methods</a> -</h3> - -<div> - -<p> -Several other tidbits of information are often collected by various alias -analysis implementations and can be put to good use by various clients. -</p> - -<!-- _______________________________________________________________________ --> -<h4> - The <tt>pointsToConstantMemory</tt> method -</h4> - -<div> - -<p>The <tt>pointsToConstantMemory</tt> method returns true if and only if the -analysis can prove that the pointer only points to unchanging memory locations -(functions, constant global variables, and the null pointer). This information -can be used to refine mod/ref information: it is impossible for an unchanging -memory location to be modified.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="simplemodref">The <tt>doesNotAccessMemory</tt> and - <tt>onlyReadsMemory</tt> methods</a> -</h4> - -<div> - -<p>These methods are used to provide very simple mod/ref information for -function calls. The <tt>doesNotAccessMemory</tt> method returns true for a -function if the analysis can prove that the function never reads or writes to -memory, or if the function only reads from constant memory. Functions with this -property are side-effect free and only depend on their input arguments, allowing -them to be eliminated if they form common subexpressions or be hoisted out of -loops. Many common functions behave this way (e.g., <tt>sin</tt> and -<tt>cos</tt>) but many others do not (e.g., <tt>acos</tt>, which modifies the -<tt>errno</tt> variable).</p> - -<p>The <tt>onlyReadsMemory</tt> method returns true for a function if analysis -can prove that (at most) the function only reads from non-volatile memory. -Functions with this property are side-effect free, only depending on their input -arguments and the state of memory when they are called. This property allows -calls to these functions to be eliminated and moved around, as long as there is -no store instruction that changes the contents of memory. Note that all -functions that satisfy the <tt>doesNotAccessMemory</tt> method also satisfies -<tt>onlyReadsMemory</tt>.</p> - -</div> - -</div> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="writingnew">Writing a new <tt>AliasAnalysis</tt> Implementation</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>Writing a new alias analysis implementation for LLVM is quite -straight-forward. There are already several implementations that you can use -for examples, and the following information should help fill in any details. -For a examples, take a look at the <a href="#impls">various alias analysis -implementations</a> included with LLVM.</p> - -<!-- ======================================================================= --> -<h3> - <a name="passsubclasses">Different Pass styles</a> -</h3> - -<div> - -<p>The first step to determining what type of <a -href="WritingAnLLVMPass.html">LLVM pass</a> you need to use for your Alias -Analysis. As is the case with most other analyses and transformations, the -answer should be fairly obvious from what type of problem you are trying to -solve:</p> - -<ol> - <li>If you require interprocedural analysis, it should be a - <tt>Pass</tt>.</li> - <li>If you are a function-local analysis, subclass <tt>FunctionPass</tt>.</li> - <li>If you don't need to look at the program at all, subclass - <tt>ImmutablePass</tt>.</li> -</ol> - -<p>In addition to the pass that you subclass, you should also inherit from the -<tt>AliasAnalysis</tt> interface, of course, and use the -<tt>RegisterAnalysisGroup</tt> template to register as an implementation of -<tt>AliasAnalysis</tt>.</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="requiredcalls">Required initialization calls</a> -</h3> - -<div> - -<p>Your subclass of <tt>AliasAnalysis</tt> is required to invoke two methods on -the <tt>AliasAnalysis</tt> base class: <tt>getAnalysisUsage</tt> and -<tt>InitializeAliasAnalysis</tt>. In particular, your implementation of -<tt>getAnalysisUsage</tt> should explicitly call into the -<tt>AliasAnalysis::getAnalysisUsage</tt> method in addition to doing any -declaring any pass dependencies your pass has. Thus you should have something -like this:</p> - -<div class="doc_code"> -<pre> -void getAnalysisUsage(AnalysisUsage &AU) const { - AliasAnalysis::getAnalysisUsage(AU); - <i>// declare your dependencies here.</i> -} -</pre> -</div> - -<p>Additionally, your must invoke the <tt>InitializeAliasAnalysis</tt> method -from your analysis run method (<tt>run</tt> for a <tt>Pass</tt>, -<tt>runOnFunction</tt> for a <tt>FunctionPass</tt>, or <tt>InitializePass</tt> -for an <tt>ImmutablePass</tt>). For example (as part of a <tt>Pass</tt>):</p> - -<div class="doc_code"> -<pre> -bool run(Module &M) { - InitializeAliasAnalysis(this); - <i>// Perform analysis here...</i> - return false; -} -</pre> -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="interfaces">Interfaces which may be specified</a> -</h3> - -<div> - -<p>All of the <a -href="/doxygen/classllvm_1_1AliasAnalysis.html"><tt>AliasAnalysis</tt></a> -virtual methods default to providing <a href="#chaining">chaining</a> to another -alias analysis implementation, which ends up returning conservatively correct -information (returning "May" Alias and "Mod/Ref" for alias and mod/ref queries -respectively). Depending on the capabilities of the analysis you are -implementing, you just override the interfaces you can improve.</p> - -</div> - - - -<!-- ======================================================================= --> -<h3> - <a name="chaining"><tt>AliasAnalysis</tt> chaining behavior</a> -</h3> - -<div> - -<p>With only one special exception (the <a href="#no-aa"><tt>no-aa</tt></a> -pass) every alias analysis pass chains to another alias analysis -implementation (for example, the user can specify "<tt>-basicaa -ds-aa --licm</tt>" to get the maximum benefit from both alias -analyses). The alias analysis class automatically takes care of most of this -for methods that you don't override. For methods that you do override, in code -paths that return a conservative MayAlias or Mod/Ref result, simply return -whatever the superclass computes. For example:</p> - -<div class="doc_code"> -<pre> -AliasAnalysis::AliasResult alias(const Value *V1, unsigned V1Size, - const Value *V2, unsigned V2Size) { - if (...) - return NoAlias; - ... - - <i>// Couldn't determine a must or no-alias result.</i> - return AliasAnalysis::alias(V1, V1Size, V2, V2Size); -} -</pre> -</div> - -<p>In addition to analysis queries, you must make sure to unconditionally pass -LLVM <a href="#updating">update notification</a> methods to the superclass as -well if you override them, which allows all alias analyses in a change to be -updated.</p> - -</div> - - -<!-- ======================================================================= --> -<h3> - <a name="updating">Updating analysis results for transformations</a> -</h3> - -<div> -<p> -Alias analysis information is initially computed for a static snapshot of the -program, but clients will use this information to make transformations to the -code. All but the most trivial forms of alias analysis will need to have their -analysis results updated to reflect the changes made by these transformations. -</p> - -<p> -The <tt>AliasAnalysis</tt> interface exposes four methods which are used to -communicate program changes from the clients to the analysis implementations. -Various alias analysis implementations should use these methods to ensure that -their internal data structures are kept up-to-date as the program changes (for -example, when an instruction is deleted), and clients of alias analysis must be -sure to call these interfaces appropriately. -</p> - -<!-- _______________________________________________________________________ --> -<h4>The <tt>deleteValue</tt> method</h4> - -<div> -The <tt>deleteValue</tt> method is called by transformations when they remove an -instruction or any other value from the program (including values that do not -use pointers). Typically alias analyses keep data structures that have entries -for each value in the program. When this method is called, they should remove -any entries for the specified value, if they exist. -</div> - -<!-- _______________________________________________________________________ --> -<h4>The <tt>copyValue</tt> method</h4> - -<div> -The <tt>copyValue</tt> method is used when a new value is introduced into the -program. There is no way to introduce a value into the program that did not -exist before (this doesn't make sense for a safe compiler transformation), so -this is the only way to introduce a new value. This method indicates that the -new value has exactly the same properties as the value being copied. -</div> - -<!-- _______________________________________________________________________ --> -<h4>The <tt>replaceWithNewValue</tt> method</h4> - -<div> -This method is a simple helper method that is provided to make clients easier to -use. It is implemented by copying the old analysis information to the new -value, then deleting the old value. This method cannot be overridden by alias -analysis implementations. -</div> - -<!-- _______________________________________________________________________ --> -<h4>The <tt>addEscapingUse</tt> method</h4> - -<div> -<p>The <tt>addEscapingUse</tt> method is used when the uses of a pointer -value have changed in ways that may invalidate precomputed analysis information. -Implementations may either use this callback to provide conservative responses -for points whose uses have change since analysis time, or may recompute some -or all of their internal state to continue providing accurate responses.</p> - -<p>In general, any new use of a pointer value is considered an escaping use, -and must be reported through this callback, <em>except</em> for the -uses below:</p> - -<ul> - <li>A <tt>bitcast</tt> or <tt>getelementptr</tt> of the pointer</li> - <li>A <tt>store</tt> through the pointer (but not a <tt>store</tt> - <em>of</em> the pointer)</li> - <li>A <tt>load</tt> through the pointer</li> -</ul> -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="implefficiency">Efficiency Issues</a> -</h3> - -<div> - -<p>From the LLVM perspective, the only thing you need to do to provide an -efficient alias analysis is to make sure that alias analysis <b>queries</b> are -serviced quickly. The actual calculation of the alias analysis results (the -"run" method) is only performed once, but many (perhaps duplicate) queries may -be performed. Because of this, try to move as much computation to the run -method as possible (within reason).</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="limitations">Limitations</a> -</h3> - -<div> - -<p>The AliasAnalysis infrastructure has several limitations which make -writing a new <tt>AliasAnalysis</tt> implementation difficult.</p> - -<p>There is no way to override the default alias analysis. It would -be very useful to be able to do something like "opt -my-aa -O2" and -have it use -my-aa for all passes which need AliasAnalysis, but there -is currently no support for that, short of changing the source code -and recompiling. Similarly, there is also no way of setting a chain -of analyses as the default.</p> - -<p>There is no way for transform passes to declare that they preserve -<tt>AliasAnalysis</tt> implementations. The <tt>AliasAnalysis</tt> -interface includes <tt>deleteValue</tt> and <tt>copyValue</tt> methods -which are intended to allow a pass to keep an AliasAnalysis consistent, -however there's no way for a pass to declare in its -<tt>getAnalysisUsage</tt> that it does so. Some passes attempt to use -<tt>AU.addPreserved<AliasAnalysis></tt>, however this doesn't -actually have any effect.</p> - -<p><tt>AliasAnalysisCounter</tt> (<tt>-count-aa</tt>) and <tt>AliasDebugger</tt> -(<tt>-debug-aa</tt>) are implemented as <tt>ModulePass</tt> classes, so if your -alias analysis uses <tt>FunctionPass</tt>, it won't be able to use -these utilities. If you try to use them, the pass manager will -silently route alias analysis queries directly to -<tt>BasicAliasAnalysis</tt> instead.</p> - -<p>Similarly, the <tt>opt -p</tt> option introduces <tt>ModulePass</tt> -passes between each pass, which prevents the use of <tt>FunctionPass</tt> -alias analysis passes.</p> - -<p>The <tt>AliasAnalysis</tt> API does have functions for notifying -implementations when values are deleted or copied, however these -aren't sufficient. There are many other ways that LLVM IR can be -modified which could be relevant to <tt>AliasAnalysis</tt> -implementations which can not be expressed.</p> - -<p>The <tt>AliasAnalysisDebugger</tt> utility seems to suggest that -<tt>AliasAnalysis</tt> implementations can expect that they will be -informed of any relevant <tt>Value</tt> before it appears in an -alias query. However, popular clients such as <tt>GVN</tt> don't -support this, and are known to trigger errors when run with the -<tt>AliasAnalysisDebugger</tt>.</p> - -<p>Due to several of the above limitations, the most obvious use for -the <tt>AliasAnalysisCounter</tt> utility, collecting stats on all -alias queries in a compilation, doesn't work, even if the -<tt>AliasAnalysis</tt> implementations don't use <tt>FunctionPass</tt>. -There's no way to set a default, much less a default sequence, -and there's no way to preserve it.</p> - -<p>The <tt>AliasSetTracker</tt> class (which is used by <tt>LICM</tt> -makes a non-deterministic number of alias queries. This can cause stats -collected by <tt>AliasAnalysisCounter</tt> to have fluctuations among -identical runs, for example. Another consequence is that debugging -techniques involving pausing execution after a predetermined number -of queries can be unreliable.</p> - -<p>Many alias queries can be reformulated in terms of other alias -queries. When multiple <tt>AliasAnalysis</tt> queries are chained together, -it would make sense to start those queries from the beginning of the chain, -with care taken to avoid infinite looping, however currently an -implementation which wants to do this can only start such queries -from itself.</p> - -</div> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="using">Using alias analysis results</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>There are several different ways to use alias analysis results. In order of -preference, these are...</p> - -<!-- ======================================================================= --> -<h3> - <a name="memdep">Using the <tt>MemoryDependenceAnalysis</tt> Pass</a> -</h3> - -<div> - -<p>The <tt>memdep</tt> pass uses alias analysis to provide high-level dependence -information about memory-using instructions. This will tell you which store -feeds into a load, for example. It uses caching and other techniques to be -efficient, and is used by Dead Store Elimination, GVN, and memcpy optimizations. -</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="ast">Using the <tt>AliasSetTracker</tt> class</a> -</h3> - -<div> - -<p>Many transformations need information about alias <b>sets</b> that are active -in some scope, rather than information about pairwise aliasing. The <tt><a -href="/doxygen/classllvm_1_1AliasSetTracker.html">AliasSetTracker</a></tt> class -is used to efficiently build these Alias Sets from the pairwise alias analysis -information provided by the <tt>AliasAnalysis</tt> interface.</p> - -<p>First you initialize the AliasSetTracker by using the "<tt>add</tt>" methods -to add information about various potentially aliasing instructions in the scope -you are interested in. Once all of the alias sets are completed, your pass -should simply iterate through the constructed alias sets, using the -<tt>AliasSetTracker</tt> <tt>begin()</tt>/<tt>end()</tt> methods.</p> - -<p>The <tt>AliasSet</tt>s formed by the <tt>AliasSetTracker</tt> are guaranteed -to be disjoint, calculate mod/ref information and volatility for the set, and -keep track of whether or not all of the pointers in the set are Must aliases. -The AliasSetTracker also makes sure that sets are properly folded due to call -instructions, and can provide a list of pointers in each set.</p> - -<p>As an example user of this, the <a href="/doxygen/structLICM.html">Loop -Invariant Code Motion</a> pass uses <tt>AliasSetTracker</tt>s to calculate alias -sets for each loop nest. If an <tt>AliasSet</tt> in a loop is not modified, -then all load instructions from that set may be hoisted out of the loop. If any -alias sets are stored to <b>and</b> are must alias sets, then the stores may be -sunk to outside of the loop, promoting the memory location to a register for the -duration of the loop nest. Both of these transformations only apply if the -pointer argument is loop-invariant.</p> - -<!-- _______________________________________________________________________ --> -<h4> - The AliasSetTracker implementation -</h4> - -<div> - -<p>The AliasSetTracker class is implemented to be as efficient as possible. It -uses the union-find algorithm to efficiently merge AliasSets when a pointer is -inserted into the AliasSetTracker that aliases multiple sets. The primary data -structure is a hash table mapping pointers to the AliasSet they are in.</p> - -<p>The AliasSetTracker class must maintain a list of all of the LLVM Value*'s -that are in each AliasSet. Since the hash table already has entries for each -LLVM Value* of interest, the AliasesSets thread the linked list through these -hash-table nodes to avoid having to allocate memory unnecessarily, and to make -merging alias sets extremely efficient (the linked list merge is constant time). -</p> - -<p>You shouldn't need to understand these details if you are just a client of -the AliasSetTracker, but if you look at the code, hopefully this brief -description will help make sense of why things are designed the way they -are.</p> - -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="direct">Using the <tt>AliasAnalysis</tt> interface directly</a> -</h3> - -<div> - -<p>If neither of these utility class are what your pass needs, you should use -the interfaces exposed by the <tt>AliasAnalysis</tt> class directly. Try to use -the higher-level methods when possible (e.g., use mod/ref information instead of -the <a href="#alias"><tt>alias</tt></a> method directly if possible) to get the -best precision and efficiency.</p> - -</div> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="exist">Existing alias analysis implementations and clients</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>If you're going to be working with the LLVM alias analysis infrastructure, -you should know what clients and implementations of alias analysis are -available. In particular, if you are implementing an alias analysis, you should -be aware of the <a href="#aliasanalysis-debug">the clients</a> that are useful -for monitoring and evaluating different implementations.</p> - -<!-- ======================================================================= --> -<h3> - <a name="impls">Available <tt>AliasAnalysis</tt> implementations</a> -</h3> - -<div> - -<p>This section lists the various implementations of the <tt>AliasAnalysis</tt> -interface. With the exception of the <a href="#no-aa"><tt>-no-aa</tt></a> -implementation, all of these <a href="#chaining">chain</a> to other alias -analysis implementations.</p> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="no-aa">The <tt>-no-aa</tt> pass</a> -</h4> - -<div> - -<p>The <tt>-no-aa</tt> pass is just like what it sounds: an alias analysis that -never returns any useful information. This pass can be useful if you think that -alias analysis is doing something wrong and are trying to narrow down a -problem.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="basic-aa">The <tt>-basicaa</tt> pass</a> -</h4> - -<div> - -<p>The <tt>-basicaa</tt> pass is an aggressive local analysis that "knows" -many important facts:</p> - -<ul> -<li>Distinct globals, stack allocations, and heap allocations can never - alias.</li> -<li>Globals, stack allocations, and heap allocations never alias the null - pointer.</li> -<li>Different fields of a structure do not alias.</li> -<li>Indexes into arrays with statically differing subscripts cannot alias.</li> -<li>Many common standard C library functions <a - href="#simplemodref">never access memory or only read memory</a>.</li> -<li>Pointers that obviously point to constant globals - "<tt>pointToConstantMemory</tt>".</li> -<li>Function calls can not modify or references stack allocations if they never - escape from the function that allocates them (a common case for automatic - arrays).</li> -</ul> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="globalsmodref">The <tt>-globalsmodref-aa</tt> pass</a> -</h4> - -<div> - -<p>This pass implements a simple context-sensitive mod/ref and alias analysis -for internal global variables that don't "have their address taken". If a -global does not have its address taken, the pass knows that no pointers alias -the global. This pass also keeps track of functions that it knows never access -memory or never read memory. This allows certain optimizations (e.g. GVN) to -eliminate call instructions entirely. -</p> - -<p>The real power of this pass is that it provides context-sensitive mod/ref -information for call instructions. This allows the optimizer to know that -calls to a function do not clobber or read the value of the global, allowing -loads and stores to be eliminated.</p> - -<p>Note that this pass is somewhat limited in its scope (only support -non-address taken globals), but is very quick analysis.</p> -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="steens-aa">The <tt>-steens-aa</tt> pass</a> -</h4> - -<div> - -<p>The <tt>-steens-aa</tt> pass implements a variation on the well-known -"Steensgaard's algorithm" for interprocedural alias analysis. Steensgaard's -algorithm is a unification-based, flow-insensitive, context-insensitive, and -field-insensitive alias analysis that is also very scalable (effectively linear -time).</p> - -<p>The LLVM <tt>-steens-aa</tt> pass implements a "speculatively -field-<b>sensitive</b>" version of Steensgaard's algorithm using the Data -Structure Analysis framework. This gives it substantially more precision than -the standard algorithm while maintaining excellent analysis scalability.</p> - -<p>Note that <tt>-steens-aa</tt> is available in the optional "poolalloc" -module, it is not part of the LLVM core.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="ds-aa">The <tt>-ds-aa</tt> pass</a> -</h4> - -<div> - -<p>The <tt>-ds-aa</tt> pass implements the full Data Structure Analysis -algorithm. Data Structure Analysis is a modular unification-based, -flow-insensitive, context-<b>sensitive</b>, and speculatively -field-<b>sensitive</b> alias analysis that is also quite scalable, usually at -O(n*log(n)).</p> - -<p>This algorithm is capable of responding to a full variety of alias analysis -queries, and can provide context-sensitive mod/ref information as well. The -only major facility not implemented so far is support for must-alias -information.</p> - -<p>Note that <tt>-ds-aa</tt> is available in the optional "poolalloc" -module, it is not part of the LLVM core.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="scev-aa">The <tt>-scev-aa</tt> pass</a> -</h4> - -<div> - -<p>The <tt>-scev-aa</tt> pass implements AliasAnalysis queries by -translating them into ScalarEvolution queries. This gives it a -more complete understanding of <tt>getelementptr</tt> instructions -and loop induction variables than other alias analyses have.</p> - -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="aliasanalysis-xforms">Alias analysis driven transformations</a> -</h3> - -<div> -LLVM includes several alias-analysis driven transformations which can be used -with any of the implementations above. - -<!-- _______________________________________________________________________ --> -<h4> - <a name="adce">The <tt>-adce</tt> pass</a> -</h4> - -<div> - -<p>The <tt>-adce</tt> pass, which implements Aggressive Dead Code Elimination -uses the <tt>AliasAnalysis</tt> interface to delete calls to functions that do -not have side-effects and are not used.</p> - -</div> - - -<!-- _______________________________________________________________________ --> -<h4> - <a name="licm">The <tt>-licm</tt> pass</a> -</h4> - -<div> - -<p>The <tt>-licm</tt> pass implements various Loop Invariant Code Motion related -transformations. It uses the <tt>AliasAnalysis</tt> interface for several -different transformations:</p> - -<ul> -<li>It uses mod/ref information to hoist or sink load instructions out of loops -if there are no instructions in the loop that modifies the memory loaded.</li> - -<li>It uses mod/ref information to hoist function calls out of loops that do not -write to memory and are loop-invariant.</li> - -<li>If uses alias information to promote memory objects that are loaded and -stored to in loops to live in a register instead. It can do this if there are -no may aliases to the loaded/stored memory location.</li> -</ul> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="argpromotion">The <tt>-argpromotion</tt> pass</a> -</h4> - -<div> -<p> -The <tt>-argpromotion</tt> pass promotes by-reference arguments to be passed in -by-value instead. In particular, if pointer arguments are only loaded from it -passes in the value loaded instead of the address to the function. This pass -uses alias information to make sure that the value loaded from the argument -pointer is not modified between the entry of the function and any load of the -pointer.</p> -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="gvn">The <tt>-gvn</tt>, <tt>-memcpyopt</tt>, and <tt>-dse</tt> - passes</a> -</h4> - -<div> - -<p>These passes use AliasAnalysis information to reason about loads and stores. -</p> - -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="aliasanalysis-debug">Clients for debugging and evaluation of - implementations</a> -</h3> - -<div> - -<p>These passes are useful for evaluating the various alias analysis -implementations. You can use them with commands like '<tt>opt -ds-aa --aa-eval foo.bc -disable-output -stats</tt>'.</p> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="print-alias-sets">The <tt>-print-alias-sets</tt> pass</a> -</h4> - -<div> - -<p>The <tt>-print-alias-sets</tt> pass is exposed as part of the -<tt>opt</tt> tool to print out the Alias Sets formed by the <a -href="#ast"><tt>AliasSetTracker</tt></a> class. This is useful if you're using -the <tt>AliasSetTracker</tt> class. To use it, use something like:</p> - -<div class="doc_code"> -<pre> -% opt -ds-aa -print-alias-sets -disable-output -</pre> -</div> - -</div> - - -<!-- _______________________________________________________________________ --> -<h4> - <a name="count-aa">The <tt>-count-aa</tt> pass</a> -</h4> - -<div> - -<p>The <tt>-count-aa</tt> pass is useful to see how many queries a particular -pass is making and what responses are returned by the alias analysis. As an -example,</p> - -<div class="doc_code"> -<pre> -% opt -basicaa -count-aa -ds-aa -count-aa -licm -</pre> -</div> - -<p>will print out how many queries (and what responses are returned) by the -<tt>-licm</tt> pass (of the <tt>-ds-aa</tt> pass) and how many queries are made -of the <tt>-basicaa</tt> pass by the <tt>-ds-aa</tt> pass. This can be useful -when debugging a transformation or an alias analysis implementation.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="aa-eval">The <tt>-aa-eval</tt> pass</a> -</h4> - -<div> - -<p>The <tt>-aa-eval</tt> pass simply iterates through all pairs of pointers in a -function and asks an alias analysis whether or not the pointers alias. This -gives an indication of the precision of the alias analysis. Statistics are -printed indicating the percent of no/may/must aliases found (a more precise -algorithm will have a lower number of may aliases).</p> - -</div> - -</div> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="memdep">Memory Dependence Analysis</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>If you're just looking to be a client of alias analysis information, consider -using the Memory Dependence Analysis interface instead. MemDep is a lazy, -caching layer on top of alias analysis that is able to answer the question of -what preceding memory operations a given instruction depends on, either at an -intra- or inter-block level. Because of its laziness and caching -policy, using MemDep can be a significant performance win over accessing alias -analysis directly.</p> - -</div> - -<!-- *********************************************************************** --> - -<hr> -<address> - <a href="http://jigsaw.w3.org/css-validator/check/referer"><img - src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a> - <a href="http://validator.w3.org/check/referer"><img - src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a> - - <a href="mailto:sabre@nondot.org">Chris Lattner</a><br> - <a href="http://llvm.org/">LLVM Compiler Infrastructure</a><br> - Last modified: $Date$ -</address> - -</body> -</html> diff --git a/docs/AliasAnalysis.rst b/docs/AliasAnalysis.rst new file mode 100644 index 0000000..2d4f291 --- /dev/null +++ b/docs/AliasAnalysis.rst @@ -0,0 +1,702 @@ +.. _alias_analysis: + +================================== +LLVM Alias Analysis Infrastructure +================================== + +.. contents:: + :local: + +Introduction +============ + +Alias Analysis (aka Pointer Analysis) is a class of techniques which attempt to +determine whether or not two pointers ever can point to the same object in +memory. There are many different algorithms for alias analysis and many +different ways of classifying them: flow-sensitive vs. flow-insensitive, +context-sensitive vs. context-insensitive, field-sensitive +vs. field-insensitive, unification-based vs. subset-based, etc. Traditionally, +alias analyses respond to a query with a `Must, May, or No`_ alias response, +indicating that two pointers always point to the same object, might point to the +same object, or are known to never point to the same object. + +The LLVM `AliasAnalysis +<http://llvm.org/doxygen/classllvm_1_1AliasAnalysis.html>`__ class is the +primary interface used by clients and implementations of alias analyses in the +LLVM system. This class is the common interface between clients of alias +analysis information and the implementations providing it, and is designed to +support a wide range of implementations and clients (but currently all clients +are assumed to be flow-insensitive). In addition to simple alias analysis +information, this class exposes Mod/Ref information from those implementations +which can provide it, allowing for powerful analyses and transformations to work +well together. + +This document contains information necessary to successfully implement this +interface, use it, and to test both sides. It also explains some of the finer +points about what exactly results mean. If you feel that something is unclear +or should be added, please `let me know <mailto:sabre@nondot.org>`_. + +``AliasAnalysis`` Class Overview +================================ + +The `AliasAnalysis <http://llvm.org/doxygen/classllvm_1_1AliasAnalysis.html>`__ +class defines the interface that the various alias analysis implementations +should support. This class exports two important enums: ``AliasResult`` and +``ModRefResult`` which represent the result of an alias query or a mod/ref +query, respectively. + +The ``AliasAnalysis`` interface exposes information about memory, represented in +several different ways. In particular, memory objects are represented as a +starting address and size, and function calls are represented as the actual +``call`` or ``invoke`` instructions that performs the call. The +``AliasAnalysis`` interface also exposes some helper methods which allow you to +get mod/ref information for arbitrary instructions. + +All ``AliasAnalysis`` interfaces require that in queries involving multiple +values, values which are not `constants <LangRef.html#constants>`_ are all +defined within the same function. + +Representation of Pointers +-------------------------- + +Most importantly, the ``AliasAnalysis`` class provides several methods which are +used to query whether or not two memory objects alias, whether function calls +can modify or read a memory object, etc. For all of these queries, memory +objects are represented as a pair of their starting address (a symbolic LLVM +``Value*``) and a static size. + +Representing memory objects as a starting address and a size is critically +important for correct Alias Analyses. For example, consider this (silly, but +possible) C code: + +.. code-block:: c++ + + int i; + char C[2]; + char A[10]; + /* ... */ + for (i = 0; i != 10; ++i) { + C[0] = A[i]; /* One byte store */ + C[1] = A[9-i]; /* One byte store */ + } + +In this case, the ``basicaa`` pass will disambiguate the stores to ``C[0]`` and +``C[1]`` because they are accesses to two distinct locations one byte apart, and +the accesses are each one byte. In this case, the Loop Invariant Code Motion +(LICM) pass can use store motion to remove the stores from the loop. In +constrast, the following code: + +.. code-block:: c++ + + int i; + char C[2]; + char A[10]; + /* ... */ + for (i = 0; i != 10; ++i) { + ((short*)C)[0] = A[i]; /* Two byte store! */ + C[1] = A[9-i]; /* One byte store */ + } + +In this case, the two stores to C do alias each other, because the access to the +``&C[0]`` element is a two byte access. If size information wasn't available in +the query, even the first case would have to conservatively assume that the +accesses alias. + +.. _alias: + +The ``alias`` method +-------------------- + +The ``alias`` method is the primary interface used to determine whether or not +two memory objects alias each other. It takes two memory objects as input and +returns MustAlias, PartialAlias, MayAlias, or NoAlias as appropriate. + +Like all ``AliasAnalysis`` interfaces, the ``alias`` method requires that either +the two pointer values be defined within the same function, or at least one of +the values is a `constant <LangRef.html#constants>`_. + +.. _Must, May, or No: + +Must, May, and No Alias Responses +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The ``NoAlias`` response may be used when there is never an immediate dependence +between any memory reference *based* on one pointer and any memory reference +*based* the other. The most obvious example is when the two pointers point to +non-overlapping memory ranges. Another is when the two pointers are only ever +used for reading memory. Another is when the memory is freed and reallocated +between accesses through one pointer and accesses through the other --- in this +case, there is a dependence, but it's mediated by the free and reallocation. + +As an exception to this is with the `noalias <LangRef.html#noalias>`_ keyword; +the "irrelevant" dependencies are ignored. + +The ``MayAlias`` response is used whenever the two pointers might refer to the +same object. + +The ``PartialAlias`` response is used when the two memory objects are known to +be overlapping in some way, but do not start at the same address. + +The ``MustAlias`` response may only be returned if the two memory objects are +guaranteed to always start at exactly the same location. A ``MustAlias`` +response implies that the pointers compare equal. + +The ``getModRefInfo`` methods +----------------------------- + +The ``getModRefInfo`` methods return information about whether the execution of +an instruction can read or modify a memory location. Mod/Ref information is +always conservative: if an instruction **might** read or write a location, +``ModRef`` is returned. + +The ``AliasAnalysis`` class also provides a ``getModRefInfo`` method for testing +dependencies between function calls. This method takes two call sites (``CS1`` +& ``CS2``), returns ``NoModRef`` if neither call writes to memory read or +written by the other, ``Ref`` if ``CS1`` reads memory written by ``CS2``, +``Mod`` if ``CS1`` writes to memory read or written by ``CS2``, or ``ModRef`` if +``CS1`` might read or write memory written to by ``CS2``. Note that this +relation is not commutative. + +Other useful ``AliasAnalysis`` methods +-------------------------------------- + +Several other tidbits of information are often collected by various alias +analysis implementations and can be put to good use by various clients. + +The ``pointsToConstantMemory`` method +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The ``pointsToConstantMemory`` method returns true if and only if the analysis +can prove that the pointer only points to unchanging memory locations +(functions, constant global variables, and the null pointer). This information +can be used to refine mod/ref information: it is impossible for an unchanging +memory location to be modified. + +.. _never access memory or only read memory: + +The ``doesNotAccessMemory`` and ``onlyReadsMemory`` methods +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +These methods are used to provide very simple mod/ref information for function +calls. The ``doesNotAccessMemory`` method returns true for a function if the +analysis can prove that the function never reads or writes to memory, or if the +function only reads from constant memory. Functions with this property are +side-effect free and only depend on their input arguments, allowing them to be +eliminated if they form common subexpressions or be hoisted out of loops. Many +common functions behave this way (e.g., ``sin`` and ``cos``) but many others do +not (e.g., ``acos``, which modifies the ``errno`` variable). + +The ``onlyReadsMemory`` method returns true for a function if analysis can prove +that (at most) the function only reads from non-volatile memory. Functions with +this property are side-effect free, only depending on their input arguments and +the state of memory when they are called. This property allows calls to these +functions to be eliminated and moved around, as long as there is no store +instruction that changes the contents of memory. Note that all functions that +satisfy the ``doesNotAccessMemory`` method also satisfies ``onlyReadsMemory``. + +Writing a new ``AliasAnalysis`` Implementation +============================================== + +Writing a new alias analysis implementation for LLVM is quite straight-forward. +There are already several implementations that you can use for examples, and the +following information should help fill in any details. For a examples, take a +look at the `various alias analysis implementations`_ included with LLVM. + +Different Pass styles +--------------------- + +The first step to determining what type of `LLVM pass <WritingAnLLVMPass.html>`_ +you need to use for your Alias Analysis. As is the case with most other +analyses and transformations, the answer should be fairly obvious from what type +of problem you are trying to solve: + +#. If you require interprocedural analysis, it should be a ``Pass``. +#. If you are a function-local analysis, subclass ``FunctionPass``. +#. If you don't need to look at the program at all, subclass ``ImmutablePass``. + +In addition to the pass that you subclass, you should also inherit from the +``AliasAnalysis`` interface, of course, and use the ``RegisterAnalysisGroup`` +template to register as an implementation of ``AliasAnalysis``. + +Required initialization calls +----------------------------- + +Your subclass of ``AliasAnalysis`` is required to invoke two methods on the +``AliasAnalysis`` base class: ``getAnalysisUsage`` and +``InitializeAliasAnalysis``. In particular, your implementation of +``getAnalysisUsage`` should explicitly call into the +``AliasAnalysis::getAnalysisUsage`` method in addition to doing any declaring +any pass dependencies your pass has. Thus you should have something like this: + +.. code-block:: c++ + + void getAnalysisUsage(AnalysisUsage &AU) const { + AliasAnalysis::getAnalysisUsage(AU); + // declare your dependencies here. + } + +Additionally, your must invoke the ``InitializeAliasAnalysis`` method from your +analysis run method (``run`` for a ``Pass``, ``runOnFunction`` for a +``FunctionPass``, or ``InitializePass`` for an ``ImmutablePass``). For example +(as part of a ``Pass``): + +.. code-block:: c++ + + bool run(Module &M) { + InitializeAliasAnalysis(this); + // Perform analysis here... + return false; + } + +Interfaces which may be specified +--------------------------------- + +All of the `AliasAnalysis +<http://llvm.org/doxygen/classllvm_1_1AliasAnalysis.html>`__ virtual methods +default to providing `chaining`_ to another alias analysis implementation, which +ends up returning conservatively correct information (returning "May" Alias and +"Mod/Ref" for alias and mod/ref queries respectively). Depending on the +capabilities of the analysis you are implementing, you just override the +interfaces you can improve. + +.. _chaining: +.. _chain: + +``AliasAnalysis`` chaining behavior +----------------------------------- + +With only one special exception (the `no-aa`_ pass) every alias analysis pass +chains to another alias analysis implementation (for example, the user can +specify "``-basicaa -ds-aa -licm``" to get the maximum benefit from both alias +analyses). The alias analysis class automatically takes care of most of this +for methods that you don't override. For methods that you do override, in code +paths that return a conservative MayAlias or Mod/Ref result, simply return +whatever the superclass computes. For example: + +.. code-block:: c++ + + AliasAnalysis::AliasResult alias(const Value *V1, unsigned V1Size, + const Value *V2, unsigned V2Size) { + if (...) + return NoAlias; + ... + + // Couldn't determine a must or no-alias result. + return AliasAnalysis::alias(V1, V1Size, V2, V2Size); + } + +In addition to analysis queries, you must make sure to unconditionally pass LLVM +`update notification`_ methods to the superclass as well if you override them, +which allows all alias analyses in a change to be updated. + +.. _update notification: + +Updating analysis results for transformations +--------------------------------------------- + +Alias analysis information is initially computed for a static snapshot of the +program, but clients will use this information to make transformations to the +code. All but the most trivial forms of alias analysis will need to have their +analysis results updated to reflect the changes made by these transformations. + +The ``AliasAnalysis`` interface exposes four methods which are used to +communicate program changes from the clients to the analysis implementations. +Various alias analysis implementations should use these methods to ensure that +their internal data structures are kept up-to-date as the program changes (for +example, when an instruction is deleted), and clients of alias analysis must be +sure to call these interfaces appropriately. + +The ``deleteValue`` method +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The ``deleteValue`` method is called by transformations when they remove an +instruction or any other value from the program (including values that do not +use pointers). Typically alias analyses keep data structures that have entries +for each value in the program. When this method is called, they should remove +any entries for the specified value, if they exist. + +The ``copyValue`` method +^^^^^^^^^^^^^^^^^^^^^^^^ + +The ``copyValue`` method is used when a new value is introduced into the +program. There is no way to introduce a value into the program that did not +exist before (this doesn't make sense for a safe compiler transformation), so +this is the only way to introduce a new value. This method indicates that the +new value has exactly the same properties as the value being copied. + +The ``replaceWithNewValue`` method +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +This method is a simple helper method that is provided to make clients easier to +use. It is implemented by copying the old analysis information to the new +value, then deleting the old value. This method cannot be overridden by alias +analysis implementations. + +The ``addEscapingUse`` method +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The ``addEscapingUse`` method is used when the uses of a pointer value have +changed in ways that may invalidate precomputed analysis information. +Implementations may either use this callback to provide conservative responses +for points whose uses have change since analysis time, or may recompute some or +all of their internal state to continue providing accurate responses. + +In general, any new use of a pointer value is considered an escaping use, and +must be reported through this callback, *except* for the uses below: + +* A ``bitcast`` or ``getelementptr`` of the pointer +* A ``store`` through the pointer (but not a ``store`` *of* the pointer) +* A ``load`` through the pointer + +Efficiency Issues +----------------- + +From the LLVM perspective, the only thing you need to do to provide an efficient +alias analysis is to make sure that alias analysis **queries** are serviced +quickly. The actual calculation of the alias analysis results (the "run" +method) is only performed once, but many (perhaps duplicate) queries may be +performed. Because of this, try to move as much computation to the run method +as possible (within reason). + +Limitations +----------- + +The AliasAnalysis infrastructure has several limitations which make writing a +new ``AliasAnalysis`` implementation difficult. + +There is no way to override the default alias analysis. It would be very useful +to be able to do something like "``opt -my-aa -O2``" and have it use ``-my-aa`` +for all passes which need AliasAnalysis, but there is currently no support for +that, short of changing the source code and recompiling. Similarly, there is +also no way of setting a chain of analyses as the default. + +There is no way for transform passes to declare that they preserve +``AliasAnalysis`` implementations. The ``AliasAnalysis`` interface includes +``deleteValue`` and ``copyValue`` methods which are intended to allow a pass to +keep an AliasAnalysis consistent, however there's no way for a pass to declare +in its ``getAnalysisUsage`` that it does so. Some passes attempt to use +``AU.addPreserved<AliasAnalysis>``, however this doesn't actually have any +effect. + +``AliasAnalysisCounter`` (``-count-aa``) and ``AliasDebugger`` (``-debug-aa``) +are implemented as ``ModulePass`` classes, so if your alias analysis uses +``FunctionPass``, it won't be able to use these utilities. If you try to use +them, the pass manager will silently route alias analysis queries directly to +``BasicAliasAnalysis`` instead. + +Similarly, the ``opt -p`` option introduces ``ModulePass`` passes between each +pass, which prevents the use of ``FunctionPass`` alias analysis passes. + +The ``AliasAnalysis`` API does have functions for notifying implementations when +values are deleted or copied, however these aren't sufficient. There are many +other ways that LLVM IR can be modified which could be relevant to +``AliasAnalysis`` implementations which can not be expressed. + +The ``AliasAnalysisDebugger`` utility seems to suggest that ``AliasAnalysis`` +implementations can expect that they will be informed of any relevant ``Value`` +before it appears in an alias query. However, popular clients such as ``GVN`` +don't support this, and are known to trigger errors when run with the +``AliasAnalysisDebugger``. + +Due to several of the above limitations, the most obvious use for the +``AliasAnalysisCounter`` utility, collecting stats on all alias queries in a +compilation, doesn't work, even if the ``AliasAnalysis`` implementations don't +use ``FunctionPass``. There's no way to set a default, much less a default +sequence, and there's no way to preserve it. + +The ``AliasSetTracker`` class (which is used by ``LICM``) makes a +non-deterministic number of alias queries. This can cause stats collected by +``AliasAnalysisCounter`` to have fluctuations among identical runs, for +example. Another consequence is that debugging techniques involving pausing +execution after a predetermined number of queries can be unreliable. + +Many alias queries can be reformulated in terms of other alias queries. When +multiple ``AliasAnalysis`` queries are chained together, it would make sense to +start those queries from the beginning of the chain, with care taken to avoid +infinite looping, however currently an implementation which wants to do this can +only start such queries from itself. + +Using alias analysis results +============================ + +There are several different ways to use alias analysis results. In order of +preference, these are: + +Using the ``MemoryDependenceAnalysis`` Pass +------------------------------------------- + +The ``memdep`` pass uses alias analysis to provide high-level dependence +information about memory-using instructions. This will tell you which store +feeds into a load, for example. It uses caching and other techniques to be +efficient, and is used by Dead Store Elimination, GVN, and memcpy optimizations. + +.. _AliasSetTracker: + +Using the ``AliasSetTracker`` class +----------------------------------- + +Many transformations need information about alias **sets** that are active in +some scope, rather than information about pairwise aliasing. The +`AliasSetTracker <http://llvm.org/doxygen/classllvm_1_1AliasSetTracker.html>`__ +class is used to efficiently build these Alias Sets from the pairwise alias +analysis information provided by the ``AliasAnalysis`` interface. + +First you initialize the AliasSetTracker by using the "``add``" methods to add +information about various potentially aliasing instructions in the scope you are +interested in. Once all of the alias sets are completed, your pass should +simply iterate through the constructed alias sets, using the ``AliasSetTracker`` +``begin()``/``end()`` methods. + +The ``AliasSet``\s formed by the ``AliasSetTracker`` are guaranteed to be +disjoint, calculate mod/ref information and volatility for the set, and keep +track of whether or not all of the pointers in the set are Must aliases. The +AliasSetTracker also makes sure that sets are properly folded due to call +instructions, and can provide a list of pointers in each set. + +As an example user of this, the `Loop Invariant Code Motion +<doxygen/structLICM.html>`_ pass uses ``AliasSetTracker``\s to calculate alias +sets for each loop nest. If an ``AliasSet`` in a loop is not modified, then all +load instructions from that set may be hoisted out of the loop. If any alias +sets are stored to **and** are must alias sets, then the stores may be sunk +to outside of the loop, promoting the memory location to a register for the +duration of the loop nest. Both of these transformations only apply if the +pointer argument is loop-invariant. + +The AliasSetTracker implementation +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The AliasSetTracker class is implemented to be as efficient as possible. It +uses the union-find algorithm to efficiently merge AliasSets when a pointer is +inserted into the AliasSetTracker that aliases multiple sets. The primary data +structure is a hash table mapping pointers to the AliasSet they are in. + +The AliasSetTracker class must maintain a list of all of the LLVM ``Value*``\s +that are in each AliasSet. Since the hash table already has entries for each +LLVM ``Value*`` of interest, the AliasesSets thread the linked list through +these hash-table nodes to avoid having to allocate memory unnecessarily, and to +make merging alias sets extremely efficient (the linked list merge is constant +time). + +You shouldn't need to understand these details if you are just a client of the +AliasSetTracker, but if you look at the code, hopefully this brief description +will help make sense of why things are designed the way they are. + +Using the ``AliasAnalysis`` interface directly +---------------------------------------------- + +If neither of these utility class are what your pass needs, you should use the +interfaces exposed by the ``AliasAnalysis`` class directly. Try to use the +higher-level methods when possible (e.g., use mod/ref information instead of the +`alias`_ method directly if possible) to get the best precision and efficiency. + +Existing alias analysis implementations and clients +=================================================== + +If you're going to be working with the LLVM alias analysis infrastructure, you +should know what clients and implementations of alias analysis are available. +In particular, if you are implementing an alias analysis, you should be aware of +the `the clients`_ that are useful for monitoring and evaluating different +implementations. + +.. _various alias analysis implementations: + +Available ``AliasAnalysis`` implementations +------------------------------------------- + +This section lists the various implementations of the ``AliasAnalysis`` +interface. With the exception of the `-no-aa`_ implementation, all of these +`chain`_ to other alias analysis implementations. + +.. _no-aa: +.. _-no-aa: + +The ``-no-aa`` pass +^^^^^^^^^^^^^^^^^^^ + +The ``-no-aa`` pass is just like what it sounds: an alias analysis that never +returns any useful information. This pass can be useful if you think that alias +analysis is doing something wrong and are trying to narrow down a problem. + +The ``-basicaa`` pass +^^^^^^^^^^^^^^^^^^^^^ + +The ``-basicaa`` pass is an aggressive local analysis that *knows* many +important facts: + +* Distinct globals, stack allocations, and heap allocations can never alias. +* Globals, stack allocations, and heap allocations never alias the null pointer. +* Different fields of a structure do not alias. +* Indexes into arrays with statically differing subscripts cannot alias. +* Many common standard C library functions `never access memory or only read + memory`_. +* Pointers that obviously point to constant globals "``pointToConstantMemory``". +* Function calls can not modify or references stack allocations if they never + escape from the function that allocates them (a common case for automatic + arrays). + +The ``-globalsmodref-aa`` pass +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +This pass implements a simple context-sensitive mod/ref and alias analysis for +internal global variables that don't "have their address taken". If a global +does not have its address taken, the pass knows that no pointers alias the +global. This pass also keeps track of functions that it knows never access +memory or never read memory. This allows certain optimizations (e.g. GVN) to +eliminate call instructions entirely. + +The real power of this pass is that it provides context-sensitive mod/ref +information for call instructions. This allows the optimizer to know that calls +to a function do not clobber or read the value of the global, allowing loads and +stores to be eliminated. + +.. note:: + + This pass is somewhat limited in its scope (only support non-address taken + globals), but is very quick analysis. + +The ``-steens-aa`` pass +^^^^^^^^^^^^^^^^^^^^^^^ + +The ``-steens-aa`` pass implements a variation on the well-known "Steensgaard's +algorithm" for interprocedural alias analysis. Steensgaard's algorithm is a +unification-based, flow-insensitive, context-insensitive, and field-insensitive +alias analysis that is also very scalable (effectively linear time). + +The LLVM ``-steens-aa`` pass implements a "speculatively field-**sensitive**" +version of Steensgaard's algorithm using the Data Structure Analysis framework. +This gives it substantially more precision than the standard algorithm while +maintaining excellent analysis scalability. + +.. note:: + + ``-steens-aa`` is available in the optional "poolalloc" module. It is not part + of the LLVM core. + +The ``-ds-aa`` pass +^^^^^^^^^^^^^^^^^^^ + +The ``-ds-aa`` pass implements the full Data Structure Analysis algorithm. Data +Structure Analysis is a modular unification-based, flow-insensitive, +context-**sensitive**, and speculatively field-**sensitive** alias +analysis that is also quite scalable, usually at ``O(n * log(n))``. + +This algorithm is capable of responding to a full variety of alias analysis +queries, and can provide context-sensitive mod/ref information as well. The +only major facility not implemented so far is support for must-alias +information. + +.. note:: + + ``-ds-aa`` is available in the optional "poolalloc" module. It is not part of + the LLVM core. + +The ``-scev-aa`` pass +^^^^^^^^^^^^^^^^^^^^^ + +The ``-scev-aa`` pass implements AliasAnalysis queries by translating them into +ScalarEvolution queries. This gives it a more complete understanding of +``getelementptr`` instructions and loop induction variables than other alias +analyses have. + +Alias analysis driven transformations +------------------------------------- + +LLVM includes several alias-analysis driven transformations which can be used +with any of the implementations above. + +The ``-adce`` pass +^^^^^^^^^^^^^^^^^^ + +The ``-adce`` pass, which implements Aggressive Dead Code Elimination uses the +``AliasAnalysis`` interface to delete calls to functions that do not have +side-effects and are not used. + +The ``-licm`` pass +^^^^^^^^^^^^^^^^^^ + +The ``-licm`` pass implements various Loop Invariant Code Motion related +transformations. It uses the ``AliasAnalysis`` interface for several different +transformations: + +* It uses mod/ref information to hoist or sink load instructions out of loops if + there are no instructions in the loop that modifies the memory loaded. + +* It uses mod/ref information to hoist function calls out of loops that do not + write to memory and are loop-invariant. + +* If uses alias information to promote memory objects that are loaded and stored + to in loops to live in a register instead. It can do this if there are no may + aliases to the loaded/stored memory location. + +The ``-argpromotion`` pass +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The ``-argpromotion`` pass promotes by-reference arguments to be passed in +by-value instead. In particular, if pointer arguments are only loaded from it +passes in the value loaded instead of the address to the function. This pass +uses alias information to make sure that the value loaded from the argument +pointer is not modified between the entry of the function and any load of the +pointer. + +The ``-gvn``, ``-memcpyopt``, and ``-dse`` passes +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +These passes use AliasAnalysis information to reason about loads and stores. + +.. _the clients: + +Clients for debugging and evaluation of implementations +------------------------------------------------------- + +These passes are useful for evaluating the various alias analysis +implementations. You can use them with commands like: + +.. code-block:: bash + + % opt -ds-aa -aa-eval foo.bc -disable-output -stats + +The ``-print-alias-sets`` pass +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The ``-print-alias-sets`` pass is exposed as part of the ``opt`` tool to print +out the Alias Sets formed by the `AliasSetTracker`_ class. This is useful if +you're using the ``AliasSetTracker`` class. To use it, use something like: + +.. code-block:: bash + + % opt -ds-aa -print-alias-sets -disable-output + +The ``-count-aa`` pass +^^^^^^^^^^^^^^^^^^^^^^ + +The ``-count-aa`` pass is useful to see how many queries a particular pass is +making and what responses are returned by the alias analysis. As an example: + +.. code-block:: bash + + % opt -basicaa -count-aa -ds-aa -count-aa -licm + +will print out how many queries (and what responses are returned) by the +``-licm`` pass (of the ``-ds-aa`` pass) and how many queries are made of the +``-basicaa`` pass by the ``-ds-aa`` pass. This can be useful when debugging a +transformation or an alias analysis implementation. + +The ``-aa-eval`` pass +^^^^^^^^^^^^^^^^^^^^^ + +The ``-aa-eval`` pass simply iterates through all pairs of pointers in a +function and asks an alias analysis whether or not the pointers alias. This +gives an indication of the precision of the alias analysis. Statistics are +printed indicating the percent of no/may/must aliases found (a more precise +algorithm will have a lower number of may aliases). + +Memory Dependence Analysis +========================== + +If you're just looking to be a client of alias analysis information, consider +using the Memory Dependence Analysis interface instead. MemDep is a lazy, +caching layer on top of alias analysis that is able to answer the question of +what preceding memory operations a given instruction depends on, either at an +intra- or inter-block level. Because of its laziness and caching policy, using +MemDep can be a significant performance win over accessing alias analysis +directly. diff --git a/docs/subsystems.rst b/docs/subsystems.rst index 3a0db78..0a96368 100644 --- a/docs/subsystems.rst +++ b/docs/subsystems.rst @@ -3,6 +3,11 @@ Subsystem Documentation ======================= +.. toctree:: + :hidden: + + AliasAnalysis + * `Writing an LLVM Pass <WritingAnLLVMPass.html>`_ Information on how to write LLVM transformations and analyses. @@ -22,7 +27,7 @@ Subsystem Documentation Describes the TableGen tool, which is used heavily by the LLVM code generator. - * `Alias Analysis in LLVM <AliasAnalysis.html>`_ + * :ref:`alias_analysis` Information on how to write a new alias analysis implementation or how to use existing analyses. |