author | Eli Friedman <eli.friedman@gmail.com> | 2011-08-12 21:50:54 +0000 |
---|---|---|
committer | Eli Friedman <eli.friedman@gmail.com> | 2011-08-12 21:50:54 +0000 |
commit | 91a44dd9ccd8ec3a10fa35315c381cffade91d5b (patch) | |
tree | 6b8ddd50a4a0df4f1caef1db7562559a76fabb01 /docs/Atomics.html | |
parent | 53cae1362dca8aa312c3e36c10b106ea7d349f93 (diff) | |
Some reorganization of atomic docs. Added explicit section for NonAtomic. Added example for illegal non-atomic operation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137520 91177308-0d34-0410-b5e6-96231b3b80d8
Diffstat (limited to 'docs/Atomics.html')
-rw-r--r-- | docs/Atomics.html | 143 |
1 files changed, 111 insertions, 32 deletions
diff --git a/docs/Atomics.html b/docs/Atomics.html index 967ebdd..357f431 100644 --- a/docs/Atomics.html +++ b/docs/Atomics.html @@ -14,8 +14,8 @@ <ol> <li><a href="#introduction">Introduction</a></li> - <li><a href="#loadstore">Load and store</a></li> - <li><a href="#otherinst">Other atomic instructions</a></li> + <li><a href="#outsideatomic">Optimization outside atomic</a></li> + <li><a href="#atomicinst">Atomic instructions</a></li> <li><a href="#ordering">Atomic orderings</a></li> <li><a href="#iropt">Atomics and IR optimization</a></li> <li><a href="#codegen">Atomics and Codegen</a></li> @@ -75,51 +75,84 @@ instructions has been clarified in the IR.</p> <!-- *********************************************************************** --> <h2> - <a name="loadstore">Load and store</a> + <a name="outsideatomic">Optimization outside atomic</a> </h2> <!-- *********************************************************************** --> <div> <p>The basic <code>'load'</code> and <code>'store'</code> allow a variety of - optimizations, but can have unintuitive results in a concurrent environment. - For a frontend writer, the rule is essentially that all memory accessed - with basic loads and stores by multiple threads should be protected by a - lock or other synchronization; otherwise, you are likely to run into - undefined behavior. (Do not use volatile as a substitute for atomics; it - might work on some platforms, but does not provide the necessary guarantees - in general.)</p> + optimizations, but can lead to undefined results in a concurrent environment; + see <a href="#o_nonatomic">NonAtomic</a>. 
This section specifically goes + into the one optimizer restriction which applies in concurrent environments, + which gets a bit more of an extended description because any optimization + dealing with stores needs to be aware of it.</p> <p>From the optimizer's point of view, the rule is that if there are not any instructions with atomic ordering involved, concurrency does not matter, with one exception: if a variable might be visible to another thread or signal handler, a store cannot be inserted along a path where it - might not execute otherwise. For example, suppose LICM wants to take all the - loads and stores in a loop to and from a particular address and promote them - to registers. LICM is not allowed to insert an unconditional store after - the loop with the computed value unless a store unconditionally executes - within the loop. Note that speculative loads are allowed; a load which + might not execute otherwise. Take the following example:</p> + +<pre> +/* C code, for readability; run through clang -O2 -S -emit-llvm to get + equivalent IR */ +int x; +void f(int* a) { + for (int i = 0; i < 100; i++) { + if (a[i]) + x += 1; + } +} +</pre> + +<p>The following is equivalent in non-concurrent situations:</p> + +<pre> +int x; +void f(int* a) { + int xtemp = x; + for (int i = 0; i < 100; i++) { + if (a[i]) + xtemp += 1; + } + x = xtemp; +} +</pre> + +<p>However, LLVM is not allowed to transform the former to the latter: it could + introduce undefined behavior if another thread can access x at the same time. 
+ (This example is particularly of interest because before the concurrency model + was implemented, LLVM would perform this transformation.)</p> + +<p>Note that speculative loads are allowed; a load which is part of a race returns <code>undef</code>, but does not have undefined behavior.</p> -<p>For cases where simple loads and stores are not sufficient, LLVM provides - atomic loads and stores with varying levels of guarantees.</p> </div> <!-- *********************************************************************** --> <h2> - <a name="otherinst">Other atomic instructions</a> + <a name="atomicinst">Atomic instructions</a> </h2> <!-- *********************************************************************** --> <div> +<p>For cases where simple loads and stores are not sufficient, LLVM provides + various atomic instructions. The exact guarantees provided depend on the + ordering; see <a href="#ordering">Atomic orderings</a></p> + +<p><code>load atomic</code> and <code>store atomic</code> provide the same + basic functionality as non-atomic loads and stores, but provide additional + guarantees in situations where threads and signals are involved.</p> + <p><code>cmpxchg</code> and <code>atomicrmw</code> are essentially like an atomic load followed by an atomic store (where the store is conditional for - <code>cmpxchg</code>), but no other memory operation can happen between - the load and store. Note that our cmpxchg does not have quite as many - options for making cmpxchg weaker as the C++0x version.</p> + <code>cmpxchg</code>), but no other memory operation can happen on any thread + between the load and store. 
Note that LLVM's cmpxchg does not provide quite + as many options as the C++0x version.</p> <p>A <code>fence</code> provides Acquire and/or Release ordering which is not part of another operation; it is normally used along with Monotonic memory @@ -148,6 +181,54 @@ instructions has been clarified in the IR.</p> <!-- ======================================================================= --> <h3> + <a name="o_notatomic">NotAtomic</a> +</h3> + +<div> + +<p>NotAtomic is the obvious, a load or store which is not atomic. (This isn't + really a level of atomicity, but is listed here for comparison.) This is + essentially a regular load or store. If code accesses a memory location + from multiple threads at the same time, the resulting loads return + 'undef'.</p> + +<dl> + <dt>Relevant standard</dt> + <dd>This is intended to match shared variables in C/C++, and to be used + in any other context where memory access is necessary, and + a race is impossible. + <dt>Notes for frontends</dt> + <dd>The rule is essentially that all memory accessed with basic loads and + stores by multiple threads should be protected by a lock or other + synchronization; otherwise, you are likely to run into undefined + behavior. If your frontend is for a "safe" language like Java, + use Unordered to load and store any shared variable. Note that NotAtomic + volatile loads and stores are not properly atomic; do not try to use + them as a substitute. (Per the C/C++ standards, volatile does provide + some limited guarantees around asynchronous signals, but atomics are + generally a better solution.) + <dt>Notes for optimizers</dt> + <dd>Introducing loads to shared variables along a codepath where they would + not otherwise exist is allowed; introducing stores to shared variables + is not. 
See <a href="#outsideatomic">Optimization outside + atomic</a>.</dd> + <dt>Notes for code generation</dt> + <dd>The one interesting restriction here is that it is not allowed to write + to bytes outside of the bytes relevant to a store. This is mostly + relevant to unaligned stores: it is not allowed in general to convert + an unaligned store into two aligned stores of the same width as the + unaligned store. Backends are also expected to generate an i8 store + as an i8 store, and not an instruction which writes to surrounding + bytes. (If you are writing a backend for an architecture which cannot + satisfy these restrictions and cares about concurrency, please send an + email to llvmdev.)</dd> +</dl> + +</div> + + +<!-- ======================================================================= --> +<h3> <a name="o_unordered">Unordered</a> </h3> @@ -379,24 +460,22 @@ instructions has been clarified in the IR.</p> <ul> <li>isSimple(): A load or store which is not volatile or atomic. This is what, for example, memcpyopt would check for operations it might - transform. + transform.</li> <li>isUnordered(): A load or store which is not volatile and at most Unordered. This would be checked, for example, by LICM before hoisting - an operation. + an operation.</li> <li>mayReadFromMemory()/mayWriteToMemory(): Existing predicate, but note that they return true for any operation which is volatile or at least - Monotonic. + Monotonic.</li> <li>Alias analysis: Note that AA will return ModRef for anything Acquire or - Release, and for the address accessed by any Monotonic operation. + Release, and for the address accessed by any Monotonic operation.</li> </ul> -<p>There are essentially two components to supporting atomic operations. The - first is making sure to query isSimple() or isUnordered() instead - of isVolatile() before transforming an operation. 
The other piece is - making sure that a transform does not end up replacing, for example, an - Unordered operation with a non-atomic operation. Most of the other - necessary checks automatically fall out from existing predicates and - alias analysis queries.</p> +<p>To support optimizing around atomic operations, make sure you are using + the right predicates; everything should work if that is done. If your + pass should optimize some atomic operations (Unordered operations in + particular), make sure it doesn't replace an atomic load or store with + a non-atomic operation.</p> <p>Some examples of how optimizations interact with various kinds of atomic operations: |
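Reviewer note on the "Optimization outside atomic" hunk above: the restriction on introducing stores can be made concrete with a small single-threaded sketch. This is an illustration only and not part of the commit. The store counter `stores_to_x` and the helpers `store_x` and `count_stores` are hypothetical instrumentation added here; `f_original` and `f_promoted` mirror the two C functions in the diff, with the array length passed as a parameter instead of the fixed 100.

```c
#include <assert.h>
#include <stddef.h>

int x;            /* stands in for the shared global from the example */
int stores_to_x;  /* hypothetical instrumentation: number of writes to x */

/* Every write to x goes through this helper so it can be counted. */
void store_x(int v) {
    x = v;
    stores_to_x++;
}

/* Original form: x is written only on iterations where a[i] is nonzero. */
void f_original(const int *a, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (a[i])
            store_x(x + 1);
    }
}

/* Promoted form: x is always written once after the loop, even when
   no element of a is nonzero. */
void f_promoted(const int *a, size_t n) {
    int xtemp = x;
    for (size_t i = 0; i < n; i++) {
        if (a[i])
            xtemp += 1;
    }
    store_x(xtemp);
}

/* Resets x and the counter, runs fn, and returns the number of stores
   performed; the final value of x is written to *final_x if non-NULL. */
int count_stores(void (*fn)(const int *, size_t), const int *a, size_t n,
                 int *final_x) {
    assert(fn != NULL);
    x = 0;
    stores_to_x = 0;
    fn(a, n);
    if (final_x)
        *final_x = x;
    return stores_to_x;
}
```

With an all-zero input, `f_original` performs no store to `x` while `f_promoted` performs one, although both leave `x` with the same value. That extra store is the one that could race with another thread writing `x`, which is why the transformation is only valid when no other thread or signal handler can observe `x`.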