aboutsummaryrefslogtreecommitdiffstats
path: root/lib/CodeGen/SelectionDAG
Commit message (Collapse)AuthorAgeFilesLines
* Update aosp/master LLVM with patches for fp16Pirama Arumuga Nainar2015-05-267-11/+533
| | | | | | | | | | Cherry-pick LLVM revisions r235191, r235215, r235220, r235341, r235363, r235530, r235609, r235610, r237004 r235191 has a required bug-fix and the rest are all related to fp16. Change-Id: I7fe8da5ffd8f2c06150885a54769abd18c3a04c6 (cherry picked from commit a18e6af1712fd41c4a705a19ad71f6e9ac7a4e68)
* Update aosp/master LLVM for rebase to r235153Pirama Arumuga Nainar2015-05-1817-315/+770
| | | | | Change-Id: I9bf53792f9fc30570e81a8d80d296c681d005ea7 (cherry picked from commit 0c7f116bb6950ef819323d855415b2f2b0aad987)
* Update aosp/master llvm for rebase to r233350Pirama Arumuga Nainar2015-04-0914-1165/+1040
| | | | Change-Id: I07d935f8793ee8ec6b7da003f6483046594bca49
* Update aosp/master LLVM for rebase to r230699.Stephen Hines2015-03-2322-861/+3190
| | | | Change-Id: I2b5be30509658cb8266be782de0ab24f9099f9b9
* Update aosp/master LLVM for rebase to r222494.Stephen Hines2014-12-0225-2598/+4109
| | | | Change-Id: Ic787f5e0124df789bd26f3f24680f45e678eef2d
* Legalizer: Use the scalar bit width when promoting bit counting instrs onBenjamin Kramer2014-09-121-5/+6
| | | | | | | | | vectors. e.g. when promoting ctlz from <2 x i32> to <2 x i64> we have to fixup the result by 32 bits, not 64. PR20917. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217671 91177308-0d34-0410-b5e6-96231b3b80d8
* Add support for scalarizing cttz_zero_undefPetar Jovanovic2014-08-261-0/+1
| | | | | | | | | Follow up to r214266. Add missing case in ScalarizeVectorResult() for cttz_zero_undef. Differential Revision: http://reviews.llvm.org/D4813 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215330 91177308-0d34-0410-b5e6-96231b3b80d8
* Add support for scalarizing ctlz_zero_undefPetar Jovanovic2014-08-261-0/+1
| | | | | | | | | Fix the missing case in ScalarizeVectorResult() that was exposed with libclcore.bc in Android. Differential Revision: http://reviews.llvm.org/D4645 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214266 91177308-0d34-0410-b5e6-96231b3b80d8
* Update LLVM for rebase to r212749.Stephen Hines2014-07-2119-587/+1616
| | | | | | | Includes a cherry-pick of: r212948 - fixes a small issue with atomic calls Change-Id: Ib97bd980b59f18142a69506400911a6009d9df18
* Update LLVM for 3.5 rebase (r209712).Stephen Hines2014-05-2926-1513/+1677
| | | | Change-Id: I149556c940fb7dc92d075273c87ff584f400941f
* Update to LLVM 3.5a.Stephen Hines2014-04-2427-899/+2046
| | | | Change-Id: Ifadecab779f128e62e430c2b4f6ddd84953ed617
* Merge remote-tracking branch 'upstream/release_34' into merge-20140211Stephen Hines2014-02-1120-938/+3178
|\ | | | | | | | | | | | | | | Conflicts: lib/Linker/LinkModules.cpp lib/Support/Unix/Signals.inc Change-Id: Ia54f291fa5dc828052d2412736e8495c1282aa64
| * Merging r197047:Bill Wendling2013-12-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | ------------------------------------------------------------------------ r197047 | d0k | 2013-12-11 08:36:09 -0800 (Wed, 11 Dec 2013) | 3 lines SelectionDAG: Fix a typo. Found by "cppcheck". PR18208. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@197355 91177308-0d34-0410-b5e6-96231b3b80d8
| * Merging r196858:Bill Wendling2013-12-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | ------------------------------------------------------------------------ r196858 | nadav | 2013-12-09 17:13:59 -0800 (Mon, 09 Dec 2013) | 1 line Fix PR18162 - Incorrect assertion assumed that the SDValue resno is zero. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@196886 91177308-0d34-0410-b5e6-96231b3b80d8
| * Revert r191049 and r191059. They were causing failures. See PR17975.Bill Wendling2013-12-051-49/+4
| | | | | | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@196521 91177308-0d34-0410-b5e6-96231b3b80d8
| * Merging r195670:Bill Wendling2013-11-251-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ------------------------------------------------------------------------ r195670 | void | 2013-11-25 10:05:22 -0800 (Mon, 25 Nov 2013) | 5 lines Unrevert r195599 with testcase fix. I'm not sure how it was checking for the wrong values... PR18023. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195672 91177308-0d34-0410-b5e6-96231b3b80d8
| * Merging r195636:Bill Wendling2013-11-251-5/+0
| | | | | | | | | | | | | | | | | | | | | | | | ------------------------------------------------------------------------ r195636 | aemerson | 2013-11-25 03:24:18 -0800 (Mon, 25 Nov 2013) | 2 lines Revert r195599 as it broke the builds. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195671 91177308-0d34-0410-b5e6-96231b3b80d8
| * Merging r195635:Daniel Sanders2013-11-252-15/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ------------------------------------------------------------------------ r195635 | dsanders | 2013-11-25 11:14:43 +0000 (Mon, 25 Nov 2013) | 19 lines Fixed tryFoldToZero() for vector types that need expansion. Summary: Moved the requirement for SelectionDAG::getConstant() to return legally typed nodes slightly earlier. There were two optional DAGCombine passes that were missed out and were required to produce type-legal DAGs. Simplified a code-path in tryFoldToZero() to use SelectionDAG::getConstant(). This provides support for both promoted and expanded vector types whereas the previous code only supported promoted vector types. Fixes a "Type for zero vector elements is not legal" assertion detected by an llvm-stress generated test. Reviewers: resistor CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D2251 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195651 91177308-0d34-0410-b5e6-96231b3b80d8
| * Merging r195491:Bill Wendling2013-11-251-0/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ------------------------------------------------------------------------ r195491 | probinson | 2013-11-22 11:11:24 -0800 (Fri, 22 Nov 2013) | 11 lines Teach ISel not to optimize 'optnone' functions (revised). Improvements over r195317: - Set/restore EnableFastISel flag instead of just running FastISel within SelectAllBasicBlocks; the flag is checked in various places, and FastISel won't run properly if those places don't do the right thing. - Test looks for normal ISel versus FastISel behavior, and not something more subtle that doesn't work everywhere. Based on work by Andrea Di Biagio. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195604 91177308-0d34-0410-b5e6-96231b3b80d8
| * Merging r195599:Bill Wendling2013-11-251-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | ------------------------------------------------------------------------ r195599 | void | 2013-11-24 21:01:21 -0800 (Sun, 24 Nov 2013) | 4 lines Don't look past volatile loads. A volatile load should block us from trying to coalesce stores. PR18023 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195600 91177308-0d34-0410-b5e6-96231b3b80d8
| * Merging r195398:Bill Wendling2013-11-222-10/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ------------------------------------------------------------------------ r195398 | tstellar | 2013-11-21 16:41:05 -0800 (Thu, 21 Nov 2013) | 7 lines SelectionDAG: Optimize expansion of vec_type = BITCAST scalar_type The legalizer can now do this type of expansion for more type combinations without loading and storing to and from the stack. NOTE: This is a candidate for the 3.4 branch. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195414 91177308-0d34-0410-b5e6-96231b3b80d8
| * Merging r195397:Bill Wendling2013-11-222-21/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ------------------------------------------------------------------------ r195397 | tstellar | 2013-11-21 16:39:23 -0800 (Thu, 21 Nov 2013) | 11 lines Split SETCC if VSELECT requires splitting too. This patch is a rewrite of the original patch commited in r194542. Instead of relying on the type legalizer to do the splitting for us, we now peform the splitting ourselves in the DAG combiner. This is necessary for the case where the vector mask is a legal type after promotion and still wouldn't require splitting. Patch by: Juergen Ributzka NOTE: This is a candidate for the 3.4 branch. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195413 91177308-0d34-0410-b5e6-96231b3b80d8
| * Merging r195156:Bill Wendling2013-11-226-114/+75
| | | | | | | | | | | | | | | | | | | | | | | | | | ------------------------------------------------------------------------ r195156 | ributzka | 2013-11-19 13:20:17 -0800 (Tue, 19 Nov 2013) | 3 lines [DAG] Refactor vector splitting code in SelectionDAG. No functional change intended. Reviewed by Tom ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195412 91177308-0d34-0410-b5e6-96231b3b80d8
| * Merging r195339:Bill Wendling2013-11-211-40/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ------------------------------------------------------------------------ r195339 | chapuni | 2013-11-21 02:55:15 -0800 (Thu, 21 Nov 2013) | 5 lines Revert r195317 (and r195333), "Teach ISel not to optimize 'optnone' functions." It broke, at least, i686 target. It is reproducible with "llc -mtriple=i686-unknown". FYI, it didn't appear to add either "-O0" or "-fast-isel". ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195375 91177308-0d34-0410-b5e6-96231b3b80d8
| * Merging r195355:Daniel Sanders2013-11-211-14/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ------------------------------------------------------------------------ r195355 | dsanders | 2013-11-21 13:24:49 +0000 (Thu, 21 Nov 2013) | 20 lines Add support for legalizing SETNE/SETEQ by inverting the condition code and the result of the comparison. Summary: LegalizeSetCCCondCode can now legalize SETEQ and SETNE by returning the inverse condition and requesting that the caller invert the result of the condition. The caller of LegalizeSetCCCondCode must handle the inverted CC, and they do so as follows: SETCC, BR_CC: Invert the result of the SETCC with SelectionDAG::getNOT() SELECT_CC: Swap the true/false operands. This is necessary for MSA which lacks an integer SETNE instruction. Reviewers: resistor CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D2229 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195363 91177308-0d34-0410-b5e6-96231b3b80d8
| * Merging r195317:Bill Wendling2013-11-211-1/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | ------------------------------------------------------------------------ r195317 | probinson | 2013-11-20 22:33:32 -0800 (Wed, 20 Nov 2013) | 4 lines Teach ISel not to optimize 'optnone' functions. Based on work by Andrea Di Biagio. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195321 91177308-0d34-0410-b5e6-96231b3b80d8
| * Merging r195103:Bill Wendling2013-11-191-1/+2
| | | | | | | | | | | | | | | | | | | | | | ------------------------------------------------------------------------ r195103 | atrick | 2013-11-18 21:05:43 -0800 (Mon, 18 Nov 2013) | 1 line Fix patchpoint comments. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195115 91177308-0d34-0410-b5e6-96231b3b80d8
| * DAGCombiner: Partially revert r192795, getNOT was fixed not to create ↵Benjamin Kramer2013-11-171-1/+1
| | | | | | | | | | | | illegal constants. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194959 91177308-0d34-0410-b5e6-96231b3b80d8
| * Use more getZExtOrTruncsMatt Arsenault2013-11-172-9/+2
| | | | | | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194945 91177308-0d34-0410-b5e6-96231b3b80d8
| * Use getZExtOrTrunc instead of repeating the same logic.Matt Arsenault2013-11-171-5/+1
| | | | | | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194944 91177308-0d34-0410-b5e6-96231b3b80d8
| * Use right address space pointer sizeMatt Arsenault2013-11-171-1/+2
| | | | | | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194940 91177308-0d34-0410-b5e6-96231b3b80d8
| * Fix assert on unaligned access to global with different address space size.Matt Arsenault2013-11-161-1/+1
| | | | | | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194934 91177308-0d34-0410-b5e6-96231b3b80d8
| * Fix codegen for null different sized pointer.Matt Arsenault2013-11-161-2/+4
| | | | | | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194932 91177308-0d34-0410-b5e6-96231b3b80d8
| * Avoid illegal integer promotion in fastiselBob Wilson2013-11-151-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Stop folding constant adds into GEP when the type size doesn't match. Otherwise, the adds' operands are effectively being promoted, changing the conditions of an overflow. Results are different when: sext(a) + sext(b) != sext(a + b) Problem originally found on x86-64, but also fixed issues with ARM and PPC, which used similar code. <rdar://problem/15292280> Patch by Duncan Exon Smith! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194840 91177308-0d34-0410-b5e6-96231b3b80d8
| * Fix illegal DAG produced by SelectionDAG::getConstant() for v2i64 typeDaniel Sanders2013-11-152-1/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: When getConstant() is called for an expanded vector type, it is split into multiple scalar constants which are then combined using appropriate build_vector and bitcast operations. In addition to the usual big/little endian differences, the case where the element-order of the vector does not have the same endianness as the elements themselves is also accounted for. For example, for v4i32 on big-endian MIPS, the byte-order of the vector is <3210,7654,BA98,FEDC>. For little-endian, it is <0123,4567,89AB,CDEF>. Handling this case turns out to be a nop since getConstant() returns a splatted vector (so reversing the element order doesn't change the value) This fixes a number of cases in MIPS MSA where calling getConstant() during operation legalization introduces illegal types (e.g. to legalize v2i64 UNDEF into a v2i64 BUILD_VECTOR of illegal i64 zeros). It should also handle bigger differences between illegal and legal types such as legalizing v2i64 into v8i16. lowerMSASplatImm() in the MIPS backend no longer needs to avoid calling getConstant() so this function has been updated in the same patch. For the sake of transparency, the steps I've taken since the review are: * Added 'virtual' to isVectorEltOrderLittleEndian() as requested. This revealed that the MIPS tests were falsely passing because a polymorphic function was not actually polymorphic in the reviewed patch. * Fixed the tests that were now failing. This involved deleting the code to handle the MIPS MSA element-order (which was previously doing an byte-order swap instead of an element-order swap). This left isVectorEltOrderLittleEndian() unused and it was deleted. * Fixed build failures caused by rebasing beyond r194467-r194472. These build failures involved the bset, bneg, and bclr instructions added in these commits using lowerMSASplatImm() in a way that was no longer valid after this patch. Some of these were fixed by calling SelectionDAG::getConstant() instead, others were fixed by a new function getBuildVectorSplat() that provided the removed functionality of lowerMSASplatImm() in a more sensible way. Reviewers: bkramer Reviewed By: bkramer CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D1973 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194811 91177308-0d34-0410-b5e6-96231b3b80d8
| * Add target hook to prevent folding some bitcasted loads.Matt Arsenault2013-11-151-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | This is to avoid this transformation in some cases: fold (conv (load x)) -> (load (conv*)x) On architectures that don't natively support some vector loads efficiently casting the load to a smaller vector of larger types and loading is more efficient. Patch by Micah Villmow. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194783 91177308-0d34-0410-b5e6-96231b3b80d8
| * Add addrspacecast instruction.Matt Arsenault2013-11-154-0/+51
| | | | | | | | | | | | Patch by Michele Scandale! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194760 91177308-0d34-0410-b5e6-96231b3b80d8
| * Minor extension to llvm.experimental.patchpoint: don't require a call.Andrew Trick2013-11-141-1/+1
| | | | | | | | | | | | | | | | If a null call target is provided, don't emit a dummy call. This allows the runtime to reserve as little nop space as it needs without the requirement of emitting a call. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194676 91177308-0d34-0410-b5e6-96231b3b80d8
| * SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too.Juergen Ributzka2013-11-132-8/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch reapplies r193676 with an additional fix for the Hexagon backend. The SystemZ backend has already been fixed by r194148. The Type Legalizer recognizes that VSELECT needs to be split, because the type is to wide for the given target. The same does not always apply to SETCC, because less space is required to encode the result of a comparison. As a result VSELECT is split and SETCC is unrolled into scalar comparisons. This commit fixes the issue by checking for VSELECT-SETCC patterns in the DAG Combiner. If a matching pattern is found, then the result mask of SETCC is promoted to the expected vector mask type for the given target. Now the type legalizer will split both VSELECT and SETCC. This allows the following X86 DAG Combine code to sucessfully detect the MIN/MAX pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>. Reviewed by Nadav git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194542 91177308-0d34-0410-b5e6-96231b3b80d8
| * Vector forms of SHL, SRA, and SRL can be constant folded using ↵Daniel Sanders2013-11-111-0/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | SimplifyVBinOp too Reviewers: dsanders Reviewed By: dsanders CC: llvm-commits, nadav Differential Revision: http://llvm-reviews.chandlerc.com/D1958 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194393 91177308-0d34-0410-b5e6-96231b3b80d8
| * [Stackmap] Materialize the jump address within the patchpoint noop slide.Juergen Ributzka2013-11-092-4/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch moves the jump address materialization inside the noop slide. This enables patching of the materialization itself or its complete removal. This patch also adds the ability to define scratch registers that can be used safely by the code called from the patchpoint intrinsic. At least one scratch register is required, because that one is used for the materialization of the jump address. This patch depends on D2009. Differential Revision: http://llvm-reviews.chandlerc.com/D2074 Reviewed by Andy git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194306 91177308-0d34-0410-b5e6-96231b3b80d8
| * [Stackmap] Add AnyReg calling convention support for patchpoint intrinsic.Juergen Ributzka2013-11-083-35/+88
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The idea of the AnyReg Calling Convention is to provide the call arguments in registers, but not to force them to be placed in a paticular order into a specified set of registers. Instead it is up tp the register allocator to assign any register as it sees fit. The same applies to the return value (if applicable). Differential Revision: http://llvm-reviews.chandlerc.com/D2009 Reviewed by Andy git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194293 91177308-0d34-0410-b5e6-96231b3b80d8
| * Slightly change the way stackmap and patchpoint intrinsics are lowered.Andrew Trick2013-11-051-9/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | MorphNodeTo is not safe to call during DAG building. It eagerly deletes dependent DAG nodes which invalidates the NodeMap. We could expose a safe interface for morphing nodes, but I don't think it's worth it. Just create a new MachineNode and replaceAllUsesWith. My understaning of the SD design has been that we want to support early target opcode selection. That isn't very well supported, but generally works. It seems reasonable to rely on this feature even if it isn't widely used. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194102 91177308-0d34-0410-b5e6-96231b3b80d8
| * [Stackmap] Remove erroneous assert.Juergen Ributzka2013-11-011-3/+0
| | | | | | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193871 91177308-0d34-0410-b5e6-96231b3b80d8
| * Commenting out this assert because it is causing the build bots to fail. ↵Aaron Ballman2013-11-011-2/+2
| | | | | | | | | | | | This effectively reverts r193861, but needs to be fixed as part of r193769. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193862 91177308-0d34-0410-b5e6-96231b3b80d8
| * Fixing an order of evaluation error in an assert.Aaron Ballman2013-11-011-1/+1
| | | | | | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193861 91177308-0d34-0410-b5e6-96231b3b80d8
| * Add support for stack map generation in the X86 backend.Andrew Trick2013-10-311-0/+3
| | | | | | | | | | | | Originally implemented by Lang Hames. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193811 91177308-0d34-0410-b5e6-96231b3b80d8
| * Lower stackmap intrinsics directly to their target opcode in the DAG builder.Andrew Trick2013-10-313-11/+216
| | | | | | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193769 91177308-0d34-0410-b5e6-96231b3b80d8
| * whitespaceAndrew Trick2013-10-311-7/+7
| | | | | | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193765 91177308-0d34-0410-b5e6-96231b3b80d8
| * Legalize: Improve legalization of long vector extends.Jim Grosbach2013-10-312-3/+63
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When an extend more than doubles the size of the elements (e.g., a zext from v16i8 to v16i32), the normal legalization method of splitting the vectors will run into problems as by the time the destination vector is legal, the source vector is illegal. The end result is the operation often becoming scalarized, with the typical horrible performance. For example, on x86_64, the simple input of: define void @bar(<16 x i8> %a, <16 x i32>* %p) nounwind { %tmp = zext <16 x i8> %a to <16 x i32> store <16 x i32> %tmp, <16 x i32>*%p ret void } Generates: .section __TEXT,__text,regular,pure_instructions .section __TEXT,__const .align 5 LCPI0_0: .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .section __TEXT,__text,regular,pure_instructions .globl _bar .align 4, 0x90 _bar: vpunpckhbw %xmm0, %xmm0, %xmm1 vpunpckhwd %xmm0, %xmm1, %xmm2 vpmovzxwd %xmm1, %xmm1 vinsertf128 $1, %xmm2, %ymm1, %ymm1 vmovaps LCPI0_0(%rip), %ymm2 vandps %ymm2, %ymm1, %ymm1 vpmovzxbw %xmm0, %xmm3 vpunpckhwd %xmm0, %xmm3, %xmm3 vpmovzxbd %xmm0, %xmm0 vinsertf128 $1, %xmm3, %ymm0, %ymm0 vandps %ymm2, %ymm0, %ymm0 vmovaps %ymm0, (%rdi) vmovaps %ymm1, 32(%rdi) vzeroupper ret So instead we can check if there are legal types that enable us to split more cleverly when the input vector is already legal such that we don't turn it into an illegal type. If the extend is such that it's more than doubling the size of the input we check if - the number of vector elements is even, - the source type is legal, - the type of a split source is illegal, - the type of an extended (by doubling element size) source is legal, and - the type of that extended source when split is legal. If the conditions are met, instead of just splitting both the destination and the source types, we create an extend that only goes up one "step" (doubling the element width), and the continue legalizing the rest of the operation normally. The result is that this operates as a new, more effecient, termination condition for the loop of "split the operation until the destination type is legal." With this change, the above example now compiles to: _bar: vpxor %xmm1, %xmm1, %xmm1 vpunpcklbw %xmm1, %xmm0, %xmm2 vpunpckhwd %xmm1, %xmm2, %xmm3 vpunpcklwd %xmm1, %xmm2, %xmm2 vinsertf128 $1, %xmm3, %ymm2, %ymm2 vpunpckhbw %xmm1, %xmm0, %xmm0 vpunpckhwd %xmm1, %xmm0, %xmm3 vpunpcklwd %xmm1, %xmm0, %xmm0 vinsertf128 $1, %xmm3, %ymm0, %ymm0 vmovaps %ymm0, 32(%rdi) vmovaps %ymm2, (%rdi) vzeroupper ret This generalizes a custom lowering that was added a while back to the ARM backend. That lowering is no longer necessary, and is removed. The testcases for it, however, provide excellent ARM tests for this change and so remain. rdar://14735100 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193727 91177308-0d34-0410-b5e6-96231b3b80d8