| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
| |
Cherry-pick LLVM revisions r235191, r235215, r235220, r235341, r235363,
r235530, r235609, r235610, r237004
r235191 has a required bug-fix and the rest are all related to fp16.
Change-Id: I7fe8da5ffd8f2c06150885a54769abd18c3a04c6
(cherry picked from commit a18e6af1712fd41c4a705a19ad71f6e9ac7a4e68)
|
|
|
|
|
| |
Change-Id: I9bf53792f9fc30570e81a8d80d296c681d005ea7
(cherry picked from commit 0c7f116bb6950ef819323d855415b2f2b0aad987)
|
|
|
|
| |
Change-Id: I07d935f8793ee8ec6b7da003f6483046594bca49
|
|
|
|
| |
Change-Id: I2b5be30509658cb8266be782de0ab24f9099f9b9
|
|
|
|
| |
Change-Id: Ic787f5e0124df789bd26f3f24680f45e678eef2d
|
|
|
|
|
|
|
|
|
| |
vectors.
e.g. when promoting ctlz from <2 x i32> to <2 x i64> we have to fixup
the result by 32 bits, not 64. PR20917.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217671 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
|
|
|
| |
Follow up to r214266. Add missing case in ScalarizeVectorResult() for
cttz_zero_undef.
Differential Revision: http://reviews.llvm.org/D4813
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215330 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
|
|
|
| |
Fix the missing case in ScalarizeVectorResult() that was exposed with
libclcore.bc in Android.
Differential Revision: http://reviews.llvm.org/D4645
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214266 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
|
| |
Includes a cherry-pick of:
r212948 - fixes a small issue with atomic calls
Change-Id: Ib97bd980b59f18142a69506400911a6009d9df18
|
|
|
|
| |
Change-Id: I149556c940fb7dc92d075273c87ff584f400941f
|
|
|
|
| |
Change-Id: Ifadecab779f128e62e430c2b4f6ddd84953ed617
|
|\
| |
| |
| |
| |
| |
| |
| | |
Conflicts:
lib/Linker/LinkModules.cpp
lib/Support/Unix/Signals.inc
Change-Id: Ia54f291fa5dc828052d2412736e8495c1282aa64
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
------------------------------------------------------------------------
r197047 | d0k | 2013-12-11 08:36:09 -0800 (Wed, 11 Dec 2013) | 3 lines
SelectionDAG: Fix a typo.
Found by "cppcheck". PR18208.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@197355 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
------------------------------------------------------------------------
r196858 | nadav | 2013-12-09 17:13:59 -0800 (Mon, 09 Dec 2013) | 1 line
Fix PR18162 - Incorrect assertion assumed that the SDValue resno is zero.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@196886 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| | |
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@196521 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
------------------------------------------------------------------------
r195670 | void | 2013-11-25 10:05:22 -0800 (Mon, 25 Nov 2013) | 5 lines
Unrevert r195599 with testcase fix.
I'm not sure how it was checking for the wrong values...
PR18023.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195672 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
------------------------------------------------------------------------
r195636 | aemerson | 2013-11-25 03:24:18 -0800 (Mon, 25 Nov 2013) | 2 lines
Revert r195599 as it broke the builds.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195671 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
------------------------------------------------------------------------
r195635 | dsanders | 2013-11-25 11:14:43 +0000 (Mon, 25 Nov 2013) | 19 lines
Fixed tryFoldToZero() for vector types that need expansion.
Summary:
Moved the requirement for SelectionDAG::getConstant() to return legally
typed nodes slightly earlier. There were two optional DAGCombine passes
that were missed out and were required to produce type-legal DAGs.
Simplified a code-path in tryFoldToZero() to use SelectionDAG::getConstant().
This provides support for both promoted and expanded vector types whereas the
previous code only supported promoted vector types.
Fixes a "Type for zero vector elements is not legal" assertion detected by
an llvm-stress generated test.
Reviewers: resistor
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D2251
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195651 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
------------------------------------------------------------------------
r195491 | probinson | 2013-11-22 11:11:24 -0800 (Fri, 22 Nov 2013) | 11 lines
Teach ISel not to optimize 'optnone' functions (revised).
Improvements over r195317:
- Set/restore EnableFastISel flag instead of just running FastISel within
SelectAllBasicBlocks; the flag is checked in various places, and
FastISel won't run properly if those places don't do the right thing.
- Test looks for normal ISel versus FastISel behavior, and not
something more subtle that doesn't work everywhere.
Based on work by Andrea Di Biagio.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195604 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
------------------------------------------------------------------------
r195599 | void | 2013-11-24 21:01:21 -0800 (Sun, 24 Nov 2013) | 4 lines
Don't look past volatile loads.
A volatile load should block us from trying to coalesce stores.
PR18023
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195600 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
------------------------------------------------------------------------
r195398 | tstellar | 2013-11-21 16:41:05 -0800 (Thu, 21 Nov 2013) | 7 lines
SelectionDAG: Optimize expansion of vec_type = BITCAST scalar_type
The legalizer can now do this type of expansion for more
type combinations without loading and storing to and
from the stack.
NOTE: This is a candidate for the 3.4 branch.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195414 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
------------------------------------------------------------------------
r195397 | tstellar | 2013-11-21 16:39:23 -0800 (Thu, 21 Nov 2013) | 11 lines
Split SETCC if VSELECT requires splitting too.
This patch is a rewrite of the original patch commited in r194542. Instead of
relying on the type legalizer to do the splitting for us, we now peform the
splitting ourselves in the DAG combiner. This is necessary for the case where
the vector mask is a legal type after promotion and still wouldn't require
splitting.
Patch by: Juergen Ributzka
NOTE: This is a candidate for the 3.4 branch.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195413 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
------------------------------------------------------------------------
r195156 | ributzka | 2013-11-19 13:20:17 -0800 (Tue, 19 Nov 2013) | 3 lines
[DAG] Refactor vector splitting code in SelectionDAG. No functional change intended.
Reviewed by Tom
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195412 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
------------------------------------------------------------------------
r195339 | chapuni | 2013-11-21 02:55:15 -0800 (Thu, 21 Nov 2013) | 5 lines
Revert r195317 (and r195333), "Teach ISel not to optimize 'optnone' functions."
It broke, at least, i686 target. It is reproducible with "llc -mtriple=i686-unknown".
FYI, it didn't appear to add either "-O0" or "-fast-isel".
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195375 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
------------------------------------------------------------------------
r195355 | dsanders | 2013-11-21 13:24:49 +0000 (Thu, 21 Nov 2013) | 20 lines
Add support for legalizing SETNE/SETEQ by inverting the condition code and the result of the comparison.
Summary:
LegalizeSetCCCondCode can now legalize SETEQ and SETNE by returning the inverse
condition and requesting that the caller invert the result of the condition.
The caller of LegalizeSetCCCondCode must handle the inverted CC, and they do
so as follows:
SETCC, BR_CC:
Invert the result of the SETCC with SelectionDAG::getNOT()
SELECT_CC:
Swap the true/false operands.
This is necessary for MSA which lacks an integer SETNE instruction.
Reviewers: resistor
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D2229
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195363 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
------------------------------------------------------------------------
r195317 | probinson | 2013-11-20 22:33:32 -0800 (Wed, 20 Nov 2013) | 4 lines
Teach ISel not to optimize 'optnone' functions.
Based on work by Andrea Di Biagio.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195321 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
------------------------------------------------------------------------
r195103 | atrick | 2013-11-18 21:05:43 -0800 (Mon, 18 Nov 2013) | 1 line
Fix patchpoint comments.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195115 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| | |
illegal constants.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194959 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| | |
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194945 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| | |
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194944 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| | |
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194940 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| | |
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194934 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| | |
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194932 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Stop folding constant adds into GEP when the type size doesn't match.
Otherwise, the adds' operands are effectively being promoted, changing the
conditions of an overflow. Results are different when:
sext(a) + sext(b) != sext(a + b)
Problem originally found on x86-64, but also fixed issues with ARM and PPC,
which used similar code.
<rdar://problem/15292280>
Patch by Duncan Exon Smith!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194840 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
When getConstant() is called for an expanded vector type, it is split into
multiple scalar constants which are then combined using appropriate build_vector
and bitcast operations.
In addition to the usual big/little endian differences, the case where the
element-order of the vector does not have the same endianness as the elements
themselves is also accounted for. For example, for v4i32 on big-endian MIPS,
the byte-order of the vector is <3210,7654,BA98,FEDC>. For little-endian, it is
<0123,4567,89AB,CDEF>.
Handling this case turns out to be a nop since getConstant() returns a splatted
vector (so reversing the element order doesn't change the value)
This fixes a number of cases in MIPS MSA where calling getConstant() during
operation legalization introduces illegal types (e.g. to legalize v2i64 UNDEF
into a v2i64 BUILD_VECTOR of illegal i64 zeros). It should also handle bigger
differences between illegal and legal types such as legalizing v2i64 into v8i16.
lowerMSASplatImm() in the MIPS backend no longer needs to avoid calling
getConstant() so this function has been updated in the same patch.
For the sake of transparency, the steps I've taken since the review are:
* Added 'virtual' to isVectorEltOrderLittleEndian() as requested. This revealed
that the MIPS tests were falsely passing because a polymorphic function was
not actually polymorphic in the reviewed patch.
* Fixed the tests that were now failing. This involved deleting the code to
handle the MIPS MSA element-order (which was previously doing an byte-order
swap instead of an element-order swap). This left
isVectorEltOrderLittleEndian() unused and it was deleted.
* Fixed build failures caused by rebasing beyond r194467-r194472. These build
failures involved the bset, bneg, and bclr instructions added in these commits
using lowerMSASplatImm() in a way that was no longer valid after this patch.
Some of these were fixed by calling SelectionDAG::getConstant() instead,
others were fixed by a new function getBuildVectorSplat() that provided the
removed functionality of lowerMSASplatImm() in a more sensible way.
Reviewers: bkramer
Reviewed By: bkramer
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1973
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194811 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This is to avoid this transformation in some cases:
fold (conv (load x)) -> (load (conv*)x)
On architectures that don't natively support some vector
loads efficiently casting the load to a smaller vector of
larger types and loading is more efficient.
Patch by Micah Villmow.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194783 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| | |
Patch by Michele Scandale!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194760 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| | |
If a null call target is provided, don't emit a dummy call. This
allows the runtime to reserve as little nop space as it needs without
the requirement of emitting a call.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194676 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This patch reapplies r193676 with an additional fix for the Hexagon backend. The
SystemZ backend has already been fixed by r194148.
The Type Legalizer recognizes that VSELECT needs to be split, because the type
is to wide for the given target. The same does not always apply to SETCC,
because less space is required to encode the result of a comparison. As a result
VSELECT is split and SETCC is unrolled into scalar comparisons.
This commit fixes the issue by checking for VSELECT-SETCC patterns in the DAG
Combiner. If a matching pattern is found, then the result mask of SETCC is
promoted to the expected vector mask type for the given target. Now the type
legalizer will split both VSELECT and SETCC.
This allows the following X86 DAG Combine code to sucessfully detect the MIN/MAX
pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>.
Reviewed by Nadav
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194542 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
SimplifyVBinOp too
Reviewers: dsanders
Reviewed By: dsanders
CC: llvm-commits, nadav
Differential Revision: http://llvm-reviews.chandlerc.com/D1958
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194393 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This patch moves the jump address materialization inside the noop slide. This
enables patching of the materialization itself or its complete removal. This
patch also adds the ability to define scratch registers that can be used safely
by the code called from the patchpoint intrinsic. At least one scratch register
is required, because that one is used for the materialization of the jump
address. This patch depends on D2009.
Differential Revision: http://llvm-reviews.chandlerc.com/D2074
Reviewed by Andy
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194306 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The idea of the AnyReg Calling Convention is to provide the call arguments in
registers, but not to force them to be placed in a paticular order into a
specified set of registers. Instead it is up tp the register allocator to assign
any register as it sees fit. The same applies to the return value (if
applicable).
Differential Revision: http://llvm-reviews.chandlerc.com/D2009
Reviewed by Andy
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194293 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
MorphNodeTo is not safe to call during DAG building. It eagerly
deletes dependent DAG nodes which invalidates the NodeMap. We could
expose a safe interface for morphing nodes, but I don't think it's
worth it. Just create a new MachineNode and replaceAllUsesWith.
My understaning of the SD design has been that we want to support
early target opcode selection. That isn't very well supported, but
generally works. It seems reasonable to rely on this feature even if
it isn't widely used.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194102 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| | |
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193871 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| | |
This effectively reverts r193861, but needs to be fixed as part of r193769.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193862 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| | |
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193861 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| | |
Originally implemented by Lang Hames.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193811 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| | |
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193769 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| | |
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193765 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When an extend more than doubles the size of the elements (e.g., a zext
from v16i8 to v16i32), the normal legalization method of splitting the
vectors will run into problems as by the time the destination vector is
legal, the source vector is illegal. The end result is the operation
often becoming scalarized, with the typical horrible performance. For
example, on x86_64, the simple input of:
define void @bar(<16 x i8> %a, <16 x i32>* %p) nounwind {
%tmp = zext <16 x i8> %a to <16 x i32>
store <16 x i32> %tmp, <16 x i32>*%p
ret void
}
Generates:
.section __TEXT,__text,regular,pure_instructions
.section __TEXT,__const
.align 5
LCPI0_0:
.long 255 ## 0xff
.long 255 ## 0xff
.long 255 ## 0xff
.long 255 ## 0xff
.long 255 ## 0xff
.long 255 ## 0xff
.long 255 ## 0xff
.long 255 ## 0xff
.section __TEXT,__text,regular,pure_instructions
.globl _bar
.align 4, 0x90
_bar:
vpunpckhbw %xmm0, %xmm0, %xmm1
vpunpckhwd %xmm0, %xmm1, %xmm2
vpmovzxwd %xmm1, %xmm1
vinsertf128 $1, %xmm2, %ymm1, %ymm1
vmovaps LCPI0_0(%rip), %ymm2
vandps %ymm2, %ymm1, %ymm1
vpmovzxbw %xmm0, %xmm3
vpunpckhwd %xmm0, %xmm3, %xmm3
vpmovzxbd %xmm0, %xmm0
vinsertf128 $1, %xmm3, %ymm0, %ymm0
vandps %ymm2, %ymm0, %ymm0
vmovaps %ymm0, (%rdi)
vmovaps %ymm1, 32(%rdi)
vzeroupper
ret
So instead we can check if there are legal types that enable us to split
more cleverly when the input vector is already legal such that we don't
turn it into an illegal type. If the extend is such that it's more than
doubling the size of the input we check if
- the number of vector elements is even,
- the source type is legal,
- the type of a split source is illegal,
- the type of an extended (by doubling element size) source is legal, and
- the type of that extended source when split is legal.
If the conditions are met, instead of just splitting both the
destination and the source types, we create an extend that only goes up
one "step" (doubling the element width), and the continue legalizing the
rest of the operation normally. The result is that this operates as a
new, more effecient, termination condition for the loop of "split the
operation until the destination type is legal."
With this change, the above example now compiles to:
_bar:
vpxor %xmm1, %xmm1, %xmm1
vpunpcklbw %xmm1, %xmm0, %xmm2
vpunpckhwd %xmm1, %xmm2, %xmm3
vpunpcklwd %xmm1, %xmm2, %xmm2
vinsertf128 $1, %xmm3, %ymm2, %ymm2
vpunpckhbw %xmm1, %xmm0, %xmm0
vpunpckhwd %xmm1, %xmm0, %xmm3
vpunpcklwd %xmm1, %xmm0, %xmm0
vinsertf128 $1, %xmm3, %ymm0, %ymm0
vmovaps %ymm0, 32(%rdi)
vmovaps %ymm2, (%rdi)
vzeroupper
ret
This generalizes a custom lowering that was added a while back to the
ARM backend. That lowering is no longer necessary, and is removed. The
testcases for it, however, provide excellent ARM tests for this change
and so remain.
rdar://14735100
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193727 91177308-0d34-0410-b5e6-96231b3b80d8
|