path: root/docs/Vectorizers.rst
author Nadav Rotem <nrotem@apple.com> 2013-01-03 01:47:02 +0000
committer Nadav Rotem <nrotem@apple.com> 2013-01-03 01:47:02 +0000
commit f574b88adbb81c7262e236f8fb5aa662eb544a27
tree 0d91762c59a47b9646f6eb84b8340901a0009b7d /docs/Vectorizers.rst
parent ded28aca6195d2d8d3bcbb9cd6b1c2c34c0d9702
LoopVectorizer: Document the unrolling feature.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171445 91177308-0d34-0410-b5e6-96231b3b80d8
Diffstat (limited to 'docs/Vectorizers.rst')
-rw-r--r-- docs/Vectorizers.rst 36
1 files changed, 34 insertions, 2 deletions
diff --git a/docs/Vectorizers.rst b/docs/Vectorizers.rst
index 3410f18..b4c5458 100644
--- a/docs/Vectorizers.rst
+++ b/docs/Vectorizers.rst
@@ -159,8 +159,8 @@ The Loop Vectorizer can vectorize loops that count backwards.
Scatter / Gather
^^^^^^^^^^^^^^^^
-The Loop Vectorizer can vectorize code that becomes scatter/gather
-memory accesses.
+The Loop Vectorizer can vectorize code that becomes a sequence of scalar instructions
+that scatter and gather memory.
.. code-block:: c++
@@ -203,6 +203,38 @@ See the table below for a list of these functions.
| | | fmuladd |
+-----+-----+---------+
+
+Partial unrolling during vectorization
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Modern processors feature multiple execution units, and only programs that contain a
+high degree of parallelism can fully utilize the entire width of the machine.
+
+The Loop Vectorizer increases the instruction-level parallelism (ILP) by
+performing partial unrolling of loops.
+
+In the example below the entire array is accumulated into the variable 'sum'.
+This is inefficient because each addition depends on the previous one, so the
+processor can use only a single 'adder'. By unrolling the code the Loop
+Vectorizer allows two or more execution ports to be used.
+
+.. code-block:: c++
+
+  int foo(int *A, int *B, int n) {
+    unsigned sum = 0;
+    for (int i = 0; i < n; ++i)
+      sum += A[i];
+    return sum;
+  }
+
+At the moment the unrolling feature is not enabled by default and needs to be enabled
+in opt or clang using the following flag:
+
+.. code-block:: console
+
+   -force-vector-unroll=2
+
+
Performance
-----------
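
Note on the hunk above (not part of the patch): the new section says that unrolling lets two or more execution ports be used, but does not show what the unrolled reduction looks like. The following is a minimal hand-written sketch of the idea at source level, assuming an unroll factor of 2. The function names `sum_scalar` and `sum_unrolled2` are hypothetical, the unused parameter `B` from the patch's example is dropped, and the code the Loop Vectorizer actually emits will differ (it operates on LLVM IR and would normally combine unrolling with vectorization).

```cpp
// Reduction as written in the patch's example: every addition depends on
// the previous value of 'sum', forming one long dependency chain.
unsigned sum_scalar(const int *A, int n) {
  unsigned sum = 0;
  for (int i = 0; i < n; ++i)
    sum += A[i];
  return sum;
}

// Hand-unrolled by 2 (illustrative only). The two accumulators do not
// depend on each other, so their additions can issue in parallel.
unsigned sum_unrolled2(const int *A, int n) {
  unsigned sum0 = 0, sum1 = 0;
  int i = 0;
  for (; i + 1 < n; i += 2) {
    sum0 += A[i];     // first dependency chain
    sum1 += A[i + 1]; // second, independent chain
  }
  for (; i < n; ++i)  // leftover element when n is odd
    sum0 += A[i];
  return sum0 + sum1; // combine the partial sums once at the end
}
```

Because the accumulators are unsigned, reordering the additions cannot change the result, so both functions return the same value for any input.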