path: root/test/CodeGen/NVPTX
Commit message | Author | Age | Files | Lines
* [NVPTX] Use approximate FP ops when unsafe-fp-math is used, and append .ftz to instructions if the nvptx-f32ftz attribute is set to "true" | Justin Holewinski | 2013-07-22 | 1 | -0/+43
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186820 91177308-0d34-0410-b5e6-96231b3b80d8
* Convert CodeGen/*/*.ll tests to use the new CHECK-LABEL for easier debugging. | Stephen Lin | 2013-07-13 | 5 | -5/+5
  No functionality change and all tests pass after conversion.
  This was done with the following sed invocation to catch label lines demarking function boundaries:
    sed -i '' "s/^;\( *\)\([A-Z0-9_]*\):\( *\)test\([A-Za-z0-9_-]*\):\( *\)$/;\1\2-LABEL:\3test\4:\5/g" test/CodeGen/*/*.ll
  which was written conservatively to avoid false positives rather than false negatives. I scanned through all the changes and everything looks correct.
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186258 91177308-0d34-0410-b5e6-96231b3b80d8
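The sed substitution quoted in this commit can be sketched in Python to see which lines it rewrites. This is only an illustrative translation of the quoted pattern; the names `LABEL_RE` and `convert_to_check_label` are hypothetical and not part of the LLVM tree.

```python
import re

# Python translation of the sed pattern quoted in the commit message.
# It matches only comment lines of the form "; PREFIX: testNAME:" so that
# ordinary CHECK lines are left alone (the "conservative" behaviour).
LABEL_RE = re.compile(
    r"^;( *)([A-Z0-9_]*):( *)test([A-Za-z0-9_-]*):( *)$",
    re.MULTILINE,
)


def convert_to_check_label(text: str) -> str:
    """Append -LABEL to the check prefix on function-label lines."""
    return LABEL_RE.sub(r";\1\2-LABEL:\3test\4:\5", text)


if __name__ == "__main__":
    sample = "; CHECK: test_add:\n; CHECK: add.s32\n"
    print(convert_to_check_label(sample))
```

Because the pattern requires the label to begin with `test`, a label line for a function with any other name is left unchanged, which is the false-negative side of the trade-off the commit message mentions.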
* [NVPTX] Add support for module-scope inline asm | Justin Holewinski | 2013-07-01 | 1 | -0/+10
  Since we were explicitly not calling AsmPrinter::doInitialization, any module-scope inline asm was not being printed.
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185336 91177308-0d34-0410-b5e6-96231b3b80d8
* [NVPTX] 64-bit ADDC/ADDE are not legal | Justin Holewinski | 2013-07-01 | 1 | -0/+19
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185333 91177308-0d34-0410-b5e6-96231b3b80d8
* [NVPTX] Fix vector loads from parameters that span multiple loads, and fix some typos | Justin Holewinski | 2013-07-01 | 1 | -0/+13
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185332 91177308-0d34-0410-b5e6-96231b3b80d8
* [NVPTX] Handle signext/zeroext attributes properly | Justin Holewinski | 2013-07-01 | 1 | -0/+16
  Fix a case where we were incorrectly sign-extending a value when we should have been zero-extending the value. Also change some SIGN_EXTEND to ANY_EXTEND because we really don't care and may have more opportunity to fold subexpressions.
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185331 91177308-0d34-0410-b5e6-96231b3b80d8
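The difference between the two extensions named in this commit can be illustrated with plain integer arithmetic. The sketch below is not LLVM code; `zero_extend` and `sign_extend` are hypothetical helper names, and the 32-bit target width is an assumption chosen for the example.

```python
def zero_extend(value: int, from_bits: int) -> int:
    """Keep the low from_bits bits; all higher bits become zero."""
    return value & ((1 << from_bits) - 1)


def sign_extend(value: int, from_bits: int, to_bits: int = 32) -> int:
    """Copy the sign bit of a from_bits-wide value into the higher bits."""
    value &= (1 << from_bits) - 1
    sign_bit = 1 << (from_bits - 1)
    signed = (value ^ sign_bit) - sign_bit  # negative if the sign bit was set
    return signed & ((1 << to_bits) - 1)    # wrap to an unsigned to_bits pattern


if __name__ == "__main__":
    print(hex(zero_extend(0x80, 8)))  # high bits cleared
    print(hex(sign_extend(0x80, 8)))  # high bits filled with the sign bit
```

The two results differ only when the top bit of the narrow value is set, which is why picking the wrong extension produces wrong values for some inputs and correct values for others.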
* [NVPTX] Add support for native SIGN_EXTEND_INREG where available | Justin Holewinski | 2013-07-01 | 1 | -0/+111
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185330 91177308-0d34-0410-b5e6-96231b3b80d8
* [NVPTX] Add isel patterns for [reg+offset] form of ldg/ldu. | Justin Holewinski | 2013-07-01 | 1 | -0/+21
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185329 91177308-0d34-0410-b5e6-96231b3b80d8
* [NVPTX] Make sure we zero out high-order 24 bits for 8-bit load into 32-bit value | Justin Holewinski | 2013-07-01 | 1 | -0/+14
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185328 91177308-0d34-0410-b5e6-96231b3b80d8
* [NVPTX] Add (1.0 / sqrt(x)) => rsqrt(x) generation when allowable by FP flags | Justin Holewinski | 2013-06-28 | 1 | -0/+13
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185178 91177308-0d34-0410-b5e6-96231b3b80d8
* [NVPTX] Calling conventions fix | Justin Holewinski | 2013-06-28 | 5 | -40/+63
  Fix ABI handling for function returning bool -- use st.param.b32 to return the value and use ld.param.b32 in caller to load the return value.
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185177 91177308-0d34-0410-b5e6-96231b3b80d8
* [NVPTX] Add support for cttz/ctlz/ctpop | Justin Holewinski | 2013-06-28 | 3 | -0/+114
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185176 91177308-0d34-0410-b5e6-96231b3b80d8
* [NVPTX] Clean up comparison/select/convert patterns and factor out PTX instructions from their patterns | Justin Holewinski | 2013-06-28 | 1 | -4/+4
  Test case is no breakage
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185175 91177308-0d34-0410-b5e6-96231b3b80d8
* [NVPTX] Remove i8 register class. PTX support for i8 (.b8, .u8, .s8) is rather poor and we're better off just ignoring it and letting LLVM expand all i8 ops out to i16. | Justin Holewinski | 2013-06-28 | 6 | -44/+44
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185174 91177308-0d34-0410-b5e6-96231b3b80d8
* [NVPTX] Add support for vectorized function return values | Justin Holewinski | 2013-06-28 | 1 | -0/+10
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185173 91177308-0d34-0410-b5e6-96231b3b80d8
* [NVPTX] Clean up handling of formal arguments and enable generation of vector parameter loads | Justin Holewinski | 2013-06-28 | 1 | -4/+2
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185172 91177308-0d34-0410-b5e6-96231b3b80d8
* [NVPTX] Add support for selecting CUDA vs OCL mode based on triple | Justin Holewinski | 2013-06-21 | 5 | -7/+11
  IR for CUDA should use "nvptx[64]-nvidia-cuda", and IR for NV OpenCL should use "nvptx[64]-nvidia-nvcl"
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184579 91177308-0d34-0410-b5e6-96231b3b80d8
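The triple convention described in this commit can be modelled as a small lookup. The following is a rough sketch rather than the backend's actual logic: `driver_interface` is a hypothetical name, and returning `None` for any other triple is an assumption, since the commit message does not say what the backend does in that case.

```python
from typing import Optional


def driver_interface(triple: str) -> Optional[str]:
    """Map an nvptx[64]-nvidia-{cuda,nvcl} triple to a mode name.

    Returns None for triples the commit message does not describe
    (an assumption made for this sketch).
    """
    parts = triple.split("-")
    if len(parts) < 3:
        return None
    arch, vendor, os_name = parts[0], parts[1], parts[2]
    if arch not in ("nvptx", "nvptx64") or vendor != "nvidia":
        return None
    return {"cuda": "CUDA", "nvcl": "NVCL"}.get(os_name)


if __name__ == "__main__":
    for t in ("nvptx64-nvidia-cuda", "nvptx-nvidia-nvcl"):
        print(t, "->", driver_interface(t))
```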
* [NVPTX] Remove old CONST_NOT_GEN address space that is not being used anymore and causes constants to be emitted in the global address space | Justin Holewinski | 2013-06-10 | 1 | -0/+10
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183652 91177308-0d34-0410-b5e6-96231b3b80d8
* [NVPTX] Re-enable support for virtual registers in the final output | Justin Holewinski | 2013-05-31 | 2 | -35/+35
  Now that 3.3 is branched, we are re-enabling virtual registers to help iron out bugs before the next release.
  Some of the post-RA passes do not play well with virtual registers, so we disable them for now. The needed functionality of the PrologEpilogInserter pass is copied to a new backend-specific NVPTXPrologEpilog pass.
  The test for this commit is not breaking the existing tests.
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182998 91177308-0d34-0410-b5e6-96231b3b80d8
* [NVPTX] Fix case where a sext load of an i1 type may produce an ld.u1 instead of an ld.u8. | Justin Holewinski | 2013-05-30 | 1 | -0/+14
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182924 91177308-0d34-0410-b5e6-96231b3b80d8
* [NVPTX] Add @llvm.nvvm.sqrt.f() intrinsic | Justin Holewinski | 2013-05-21 | 1 | -0/+7
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182394 91177308-0d34-0410-b5e6-96231b3b80d8
* [NVPTX] Fix mis-use of CurrentFnSym in NVPTXAsmPrinter. This was causing a symbol name error in the output PTX. | Justin Holewinski | 2013-05-20 | 1 | -0/+37
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182298 91177308-0d34-0410-b5e6-96231b3b80d8
* [NVPTX] Add GenericToNVVM IR converter to better handle idiomatic LLVM IR inputs | Justin Holewinski | 2013-05-20 | 1 | -0/+25
  This converter currently only handles global variables in address space 0. For these variables, they are promoted to address space 1 (global memory), and all uses are updated to point to the result of a cvta.global instruction on the new variable.
  The motivation for this is address space 0 global variables are illegal since we cannot declare variables in the generic address space. Instead, we place the variables in address space 1 and explicitly convert the pointer to address space 0. This is primarily intended to help new users who expect to be able to place global variables in the default address space.
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182254 91177308-0d34-0410-b5e6-96231b3b80d8
* [NVPTX] Fix i1 kernel parameters and global variables. ABI rules say we need to use .u8 for i1 parameters for kernels. | Justin Holewinski | 2013-05-20 | 2 | -0/+37
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182253 91177308-0d34-0410-b5e6-96231b3b80d8
* [NVPTX] Remove support for SM < 2.0. This was never fully supported anyway. | Justin Holewinski | 2013-03-30 | 16 | -170/+1
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178417 91177308-0d34-0410-b5e6-96231b3b80d8
* [NVPTX] Add NVVMReflect pass to allow compile-time selection of specific code paths. | Justin Holewinski | 2013-03-30 | 1 | -0/+34
  This allows us to write code like:
    if (__nvvm_reflect("FOO"))
      // Do something
    else
      // Do something else
  and compile into a library, then give "FOO" a value at kernel compile-time so the check becomes a no-op.
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178416 91177308-0d34-0410-b5e6-96231b3b80d8
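The idea behind `__nvvm_reflect`, replacing the call with a constant once a value is known, can be mimicked on source text. This is a toy model, not the pass itself (which works on LLVM IR); `fold_reflect` is a hypothetical name, and treating names without a supplied value as 0 is an assumption of the sketch.

```python
import re

# Matches calls of the form __nvvm_reflect("NAME").
REFLECT_CALL = re.compile(r'__nvvm_reflect\("([^"]*)"\)')


def fold_reflect(source: str, values: dict) -> str:
    """Replace each __nvvm_reflect("NAME") call with its integer value.

    Names missing from `values` fold to 0 (assumed default for this sketch).
    """
    return REFLECT_CALL.sub(lambda m: str(values.get(m.group(1), 0)), source)


if __name__ == "__main__":
    code = 'if (__nvvm_reflect("FOO")) fast_path(); else slow_path();'
    print(fold_reflect(code, {"FOO": 1}))
```

Once the condition is a literal constant, ordinary constant folding and dead-code elimination can drop the untaken branch, which is what makes the check free at run time.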
* [NVPTX] Fix handling of vector arguments | Justin Holewinski | 2013-03-24 | 1 | -0/+27
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177847 91177308-0d34-0410-b5e6-96231b3b80d8
* Propagate DAG node ordering during type legalization and instruction selection | Justin Holewinski | 2013-03-20 | 3 | -8/+71
  A node's ordering is only propagated during legalization if (a) the new node does not have an ordering (is not a CSE'd node), or (b) the new node has an ordering that is higher than the node being legalized.
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177465 91177308-0d34-0410-b5e6-96231b3b80d8
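The propagation rule in this commit message, conditions (a) and (b), amounts to keeping the lower of the two orderings when both exist. A minimal sketch follows; `propagate_order` is a hypothetical name, and representing "no ordering" as `None` is an assumption rather than how SelectionDAG stores it.

```python
from typing import Optional


def propagate_order(old_order: int, new_order: Optional[int]) -> int:
    """Return the ordering the new node ends up with after legalization.

    old_order: ordering of the node being legalized.
    new_order: ordering already on the replacement node, or None if it has
               none (assumed encoding for this sketch).
    """
    # (a) no ordering yet, or (b) existing ordering is higher: propagate.
    if new_order is None or new_order > old_order:
        return old_order
    # Otherwise the replacement node (e.g. a CSE'd node) keeps its own.
    return new_order


if __name__ == "__main__":
    print(propagate_order(5, None), propagate_order(5, 9), propagate_order(5, 3))
```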
* [NVPTX] Disable vector registers | Justin Holewinski | 2013-02-12 | 1 | -0/+66
  Vectors were being manually scalarized by the backend. Instead, let the target-independent code do all of the work. The manual scalarization was from a time before good target-independent support for scalarization in LLVM. However, this forces us to specially-handle vector loads and stores, which we can turn into PTX instructions that produce/consume multiple operands.
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174968 91177308-0d34-0410-b5e6-96231b3b80d8
* [NVPTX] Remove NoCapture from address space conversion intrinsics. NoCapture is not valid in this case, and was causing incorrect optimizations. | Justin Holewinski | 2013-02-11 | 1 | -0/+21
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174896 91177308-0d34-0410-b5e6-96231b3b80d8
* [NVPTX] Fix crash with unnamed struct arguments | Justin Holewinski | 2012-12-05 | 1 | -0/+5
  Patch by Eric Holk
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169418 91177308-0d34-0410-b5e6-96231b3b80d8
* Teach the legalizer how to handle operands for VSELECT nodes | Justin Holewinski | 2012-11-29 | 1 | -0/+16
  If we need to split the operand of a VSELECT, it must be the mask operand. We split the entire VSELECT operand with EXTRACT_SUBVECTOR.
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168883 91177308-0d34-0410-b5e6-96231b3b80d8
* Allow targets to prefer TypeSplitVector over TypePromoteInteger when computing the legalization method for vectors | Justin Holewinski | 2012-11-29 | 1 | -0/+19
  For some targets, it is desirable to prefer scalarizing <N x i1> instead of promoting to a larger legal type, such as <N x i32>.
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168882 91177308-0d34-0410-b5e6-96231b3b80d8
* [NVPTX] Order global variables in def-use order before emitting them in the final assembly | Justin Holewinski | 2012-11-16 | 1 | -0/+20
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168198 91177308-0d34-0410-b5e6-96231b3b80d8
* [NVPTX] Implement custom lowering of loads/stores for i1 | Justin Holewinski | 2012-11-14 | 1 | -0/+26
  Loads from i1 become loads from i8 followed by trunc
  Stores to i1 become zext to i8 followed by store to i8
  Fixes PR13291
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167948 91177308-0d34-0410-b5e6-96231b3b80d8
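The two lowering rules in this commit can be pictured with a byte-addressed memory model. The sketch below only mirrors the described sequence (8-bit access plus trunc or zext); `store_i1` and `load_i1` are hypothetical names and the `bytearray` stands in for device memory.

```python
def store_i1(memory: bytearray, address: int, bit: bool) -> None:
    """Store an i1: zero-extend to 8 bits, then do an 8-bit store."""
    memory[address] = 1 if bit else 0  # zext i1 -> i8, then store i8


def load_i1(memory: bytearray, address: int) -> bool:
    """Load an i1: do an 8-bit load, then truncate to the low bit."""
    return bool(memory[address] & 1)   # load i8, then trunc i8 -> i1


if __name__ == "__main__":
    mem = bytearray(4)
    store_i1(mem, 2, True)
    print(mem[2], load_i1(mem, 2))
```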
* [NVPTX] Add more precise PTX/SM target attributes | Justin Holewinski | 2012-11-12 | 10 | -0/+60
  Each SM and PTX version is modeled as a subtarget feature/CPU. Additionally, PTX 3.1 is added as the default PTX version to be out-of-the-box compatible with CUDA 5.0.
  Available CPUs for this target:
    sm_10 - Select the sm_10 processor.
    sm_11 - Select the sm_11 processor.
    sm_12 - Select the sm_12 processor.
    sm_13 - Select the sm_13 processor.
    sm_20 - Select the sm_20 processor.
    sm_21 - Select the sm_21 processor.
    sm_30 - Select the sm_30 processor.
    sm_35 - Select the sm_35 processor.
  Available features for this target:
    ptx30 - Use PTX version 3.0.
    ptx31 - Use PTX version 3.1.
    sm_10 - Target SM 1.0.
    sm_11 - Target SM 1.1.
    sm_12 - Target SM 1.2.
    sm_13 - Target SM 1.3.
    sm_20 - Target SM 2.0.
    sm_21 - Target SM 2.1.
    sm_30 - Target SM 3.0.
    sm_35 - Target SM 3.5.
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167699 91177308-0d34-0410-b5e6-96231b3b80d8
* [NVPTX] Use ABI alignment for parameters when alignment is not specified. | Justin Holewinski | 2012-11-09 | 1 | -0/+25
  Affects SM 2.0+. Fixes bug 13324.
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167646 91177308-0d34-0410-b5e6-96231b3b80d8
* Add llvm.fabs intrinsic. | Peter Collingbourne | 2012-05-28 | 1 | -0/+21
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157594 91177308-0d34-0410-b5e6-96231b3b80d8
* [NVPTX] Add a new test case for the newly-enabled call handling | Justin Holewinski | 2012-05-25 | 1 | -0/+26
  NV_CONTRIB
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157485 91177308-0d34-0410-b5e6-96231b3b80d8
* This patch adds a new NVPTX back-end to LLVM which supports code generation for NVIDIA PTX 3.0. | Justin Holewinski | 2012-05-04 | 17 | -0/+1994
  This back-end will (eventually) replace the current PTX back-end, while maintaining compatibility with it.
  The new target machines are:
    nvptx (old ptx32) => 32-bit PTX
    nvptx64 (old ptx64) => 64-bit PTX
  The sources are based on the internal NVIDIA NVPTX back-end, and contain more functionality than the current PTX back-end currently provides.
  NV_CONTRIB
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156196 91177308-0d34-0410-b5e6-96231b3b80d8