author | Stephen Hines <srhines@google.com> | 2014-04-23 16:57:46 -0700 |
---|---|---|
committer | Stephen Hines <srhines@google.com> | 2014-04-24 15:53:16 -0700 |
commit | 36b56886974eae4f9c5ebc96befd3e7bfe5de338 | |
tree | e6cfb69fbbd937f450eeb83bfb83b9da3b01275a /docs/NVPTXUsage.rst | |
parent | 69a8640022b04415ae9fac62f8ab090601d8f889 | |
Update to LLVM 3.5a.
Change-Id: Ifadecab779f128e62e430c2b4f6ddd84953ed617
Diffstat (limited to 'docs/NVPTXUsage.rst')
-rw-r--r-- | docs/NVPTXUsage.rst | 6 |
1 files changed, 3 insertions, 3 deletions
diff --git a/docs/NVPTXUsage.rst b/docs/NVPTXUsage.rst
index a9065ce..e1c401d 100644
--- a/docs/NVPTXUsage.rst
+++ b/docs/NVPTXUsage.rst
@@ -273,7 +273,7 @@ there is a separate version for each compute architecture.
 For a list of all math functions implemented in libdevice, see
 `libdevice Users Guide <http://docs.nvidia.com/cuda/libdevice-users-guide/index.html>`_.
 
-To accomodate various math-related compiler flags that can affect code
+To accommodate various math-related compiler flags that can affect code
 generation of libdevice code, the library code depends on a special LLVM IR
 pass (``NVVMReflect``) to handle conditional compilation within LLVM IR. This
 pass looks for calls to the ``@__nvvm_reflect`` function and replaces them
@@ -839,7 +839,7 @@ Libdevice provides an ``__nv_powf`` function that we will use.
    %valB = load float addrspace(1)* %ptrB, align 4
 
    ; Compute C = pow(A, B)
-   %valC = call float @__nv_exp2f(float %valA, float %valB)
+   %valC = call float @__nv_powf(float %valA, float %valB)
 
    ; Store back to C
    store float %valC, float addrspace(1)* %ptrC, align 4
@@ -850,7 +850,7 @@ Libdevice provides an ``__nv_powf`` function that we will use.
 
  !nvvm.annotations = !{!0}
  !0 = metadata !{void (float addrspace(1)*, float addrspace(1)*,
-                 float addrspace(1)*)* @kernel, metadata !"kernel", i32 1}%
+                 float addrspace(1)*)* @kernel, metadata !"kernel", i32 1}
 
 
 To compile this kernel, we perform the following steps: