| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Change-Id: I2b5be30509658cb8266be782de0ab24f9099f9b9
|
|
|
|
| |
Change-Id: Ic787f5e0124df789bd26f3f24680f45e678eef2d
|
|
|
|
|
|
|
| |
Includes a cherry-pick of:
r212948 - fixes a small issue with atomic calls
Change-Id: Ib97bd980b59f18142a69506400911a6009d9df18
|
|
|
|
| |
Change-Id: I149556c940fb7dc92d075273c87ff584f400941f
|
|
|
|
| |
Change-Id: Ifadecab779f128e62e430c2b4f6ddd84953ed617
|
|\
| |
| |
| |
| |
| |
| |
| | |
Conflicts:
lib/Linker/LinkModules.cpp
lib/Support/Unix/Signals.inc
Change-Id: Ia54f291fa5dc828052d2412736e8495c1282aa64
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
------------------------------------------------------------------------
r197492 | dyatkovskiy | 2013-12-17 04:07:33 -0800 (Tue, 17 Dec 2013) | 26 lines
Fix for PR18045:
http://llvm.org/bugs/show_bug.cgi?id=18045
Short issue description:
For X86 machines with sse < sse4.1 we got failures for some
particular load/store vector sequences:
$ clang-trunk -m32 -O2 test-case.c
fatal error: error in backend: Cannot select: 0x4200920: v4i32,ch = load 0x41d6ab0, 0x4205850,
0x41dcb10<LD16[getelementptr inbounds ([4 x i32]* @e, i32 0, i32 0)](align=4)> [ORD=82]
[ID=58]
0x4205850: i32 = X86ISD::Wrapper 0x41d5490 [ORD=26] [ID=43]
0x41d5490: i32 = TargetGlobalAddress<[4 x i32]* @e> 0 [ORD=26] [ID=23]
0x41dcb10: i32 = undef [ID=2]
The reason is that EltsFromConsecutiveLoads could emit such load instruction
both before and after legalize stage. Though this instruction is not legal for
machines with SSSE3 and lower.
The fix: In EltsFromConsecutiveLoads, if we have passed legalize stage, we
check whether nodes it emits are legal.
P.S.: If you get failure in time from 12:00 and till 22:00 (UTC-8),
perhaps I'll slow with response, so you better reject this commit. Thanks!
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@197779 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
------------------------------------------------------------------------
r197228 | d0k | 2013-12-13 05:40:24 -0800 (Fri, 13 Dec 2013) | 8 lines
X86: When lowering shl_parts, don't emit shift amounts larger than the bit width.
While it's safe for the X86-specific shift nodes, dag combining will
kill generic nodes. Insert an AND to make it safe, isel will nuke it
as x86's shift instructions have an implicit AND.
Fixes PR16108, which contains a contraption to hit this case in between
constant folders.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@197321 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
------------------------------------------------------------------------
r195779 | hliao | 2013-11-26 12:31:31 -0800 (Tue, 26 Nov 2013) | 7 lines
Fix PR18054
- Fix bug in (vsext (vzext x)) -> (vsext x) in SIGN_EXTEND_IN_REG
lowering where we need to check whether x is a vector type (in-reg
type) of i8, i16 or i32; otherwise, that optimization is not valid.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195821 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
------------------------------------------------------------------------
r195476 | hliao | 2013-11-22 09:56:57 -0800 (Fri, 22 Nov 2013) | 6 lines
Fix PR18014
- When simplifying the mask generation for BLEND, check whether that mask is
also consumed by other non-BLEND insns. If true, skip that simplification.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195602 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
------------------------------------------------------------------------
r195439 | kcc | 2013-11-22 02:30:39 -0800 (Fri, 22 Nov 2013) | 3 lines
Revert r195318 as it causes miscompilation (PR18029)
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195478 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
------------------------------------------------------------------------
r195318 | void | 2013-11-20 23:04:30 -0800 (Wed, 20 Nov 2013) | 29 lines
The basic problem is that some mainstream programs cannot deal with the way
clang optimizes tail calls, as in this example:
int foo(void);
int bar(void) {
return foo();
}
where the call is transformed to:
calll .L0$pb
.L0$pb:
popl %eax
.Ltmp0:
addl $_GLOBAL_OFFSET_TABLE_+(.Ltmp0-.L0$pb), %eax
movl foo@GOT(%eax), %eax
popl %ebp
jmpl *%eax # TAILCALL
However, the GOT references must all be resolved at dlopen() time, and so this
approach cannot be used with lazy dynamic linking (e.g. using RTLD_LAZY), which
usually populates the PLT with stubs that perform the actual resolving.
This patch changes X86TargetLowering::LowerCall() to skip tail call
optimization, if the called function is a global or external symbol.
Patch by Dimitry Andric!
PR15086
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195319 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| | |
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194824 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| | |
Patch by Michele Scandale!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194760 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| | |
Added VMOSHDUP/VMOVSLDUP shuffle instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194691 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This patch reapplies r193676 with an additional fix for the Hexagon backend. The
SystemZ backend has already been fixed by r194148.
The Type Legalizer recognizes that VSELECT needs to be split, because the type
is to wide for the given target. The same does not always apply to SETCC,
because less space is required to encode the result of a comparison. As a result
VSELECT is split and SETCC is unrolled into scalar comparisons.
This commit fixes the issue by checking for VSELECT-SETCC patterns in the DAG
Combiner. If a matching pattern is found, then the result mask of SETCC is
promoted to the expected vector mask type for the given target. Now the type
legalizer will split both VSELECT and SETCC.
This allows the following X86 DAG Combine code to sucessfully detect the MIN/MAX
pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>.
Reviewed by Nadav
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194542 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This patch moves the jump address materialization inside the noop slide. This
enables patching of the materialization itself or its complete removal. This
patch also adds the ability to define scratch registers that can be used safely
by the code called from the patchpoint intrinsic. At least one scratch register
is required, because that one is used for the materialization of the jump
address. This patch depends on D2009.
Differential Revision: http://llvm-reviews.chandlerc.com/D2074
Reviewed by Andy
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194306 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The idea of the AnyReg Calling Convention is to provide the call arguments in
registers, but not to force them to be placed in a paticular order into a
specified set of registers. Instead it is up tp the register allocator to assign
any register as it sees fit. The same applies to the return value (if
applicable).
Differential Revision: http://llvm-reviews.chandlerc.com/D2009
Reviewed by Andy
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194293 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| | |
those produced by clang for the inline asm bswap conversion.
Modified from a patch by Chris Smowton.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194016 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| | |
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193747 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| | |
splitting too."
Now Hexagon and SystemZ are not happy with it :-(
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193677 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The Type Legalizer recognizes that VSELECT needs to be split, because the type
is to wide for the given target. The same does not always apply to SETCC,
because less space is required to encode the result of a comparison. As a result
VSELECT is split and SETCC is unrolled into scalar comparisons.
This commit fixes the issue by checking for VSELECT-SETCC patterns in the DAG
Combiner. If a matching pattern is found, then the result mask of SETCC is
promoted to the expected vector mask type for the given target. This mask has
usually the same size as the VSELECT return type (except for Intel KNL). Now the
type legalizer will split both VSELECT and SETCC.
This allows the following X86 DAG Combine code to sucessfully detect the MIN/MAX
pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>.
Reviewed by Nadav
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193676 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| | |
Patch by Cameron McInally <cameron.mcinally@nyu.edu>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193497 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This optimization is not SSE specific so I am moving it to DAGco.
The new scalar_to_vector dag node exposed a missing pattern in the AArch64 target that I needed to add.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193393 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Calling _chkstk is required on ELF as well as COFF on Windows. Without
_chkstk, functions requiring large stack crash in initialization code.
Previous code tested for COFF format but not Mach-O and this patch modifies
the code to test for Windows OS (both Windows target and MingW target)
but not Mach-O object format: Looks like macho environment was used to
build some EFI code.
Credits to Andrew MacPherson.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193289 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Without _chkstk functions requiring large stack crash in initialization code. Previous code tested for COFF format but not Mach-O and this patch modifies the code to test for Windows."
This reverts commit r193263.
It is causing CodeGen/X86/mingw-alloca.ll to fail.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193275 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| | |
Also update the cost model.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193270 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Without _chkstk functions requiring large stack crash in
initialization code. Previous code tested for COFF format but
not Mach-O and this patch modifies the code to test for Windows.
Credits to Andrew MacPherson.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193263 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
On sandy bridge (PR17654) we now get
vpxor %xmm1, %xmm1, %xmm1
vpunpckhbw %xmm1, %xmm0, %xmm2
vpunpcklbw %xmm1, %xmm0, %xmm0
vinsertf128 $1, %xmm2, %ymm0, %ymm0
On haswell it's a simple
vpmovzxbw %xmm0, %ymm0
There is a maze of duplicated and dead transforms and patterns in this
area. Remove the dead custom lowering of zext v8i16 to v8i32, that's
already handled by LowerAVXExtend.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193262 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| | |
Per Nadav's review comments for r192866.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193252 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
the instruction defenitions and ISEL reflect this.
Prior to this patch these instructions took an i32i8imm, and the high bits were
dropped during encoding. This led to incorrect behavior for shifts by
immediates higher than 255. This patch fixes that issue by detecting large
immediate shifts and returning constant zero (for logical shifts) or capping
the shift amount at an encodable value (for arithmetic shifts).
Fixes <rdar://problem/14968098>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193096 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| | |
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193083 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Consider the following:
typedef unsigned short ushort4U __attribute__((ext_vector_type(4),
aligned(2)));
typedef unsigned short ushort4 __attribute__((ext_vector_type(4)));
typedef unsigned short ushort8 __attribute__((ext_vector_type(8)));
typedef int int4 __attribute__((ext_vector_type(4)));
int4 __bbase_cvt_int(ushort4 v) {
ushort8 a;
a.lo = v;
return _mm_cvtepu16_epi32(a);
}
This generates the, not unreasonable, IR:
define <4 x i32> @foo0(double %v.coerce) nounwind ssp {
%tmp = bitcast double %v.coerce to <4 x i16>
%tmp1 = shufflevector <4 x i16> %tmp, <4 x i16> undef, <8 x i32> <i32
%0, i32 1, i32 2, i32 3, i32 undef, i32 undef, i32 undef, i32 undef>
%tmp2 = tail call <4 x i32> @llvm.x86.sse41.pmovzxwd(<8 x i16> %tmp1)
ret <4 x i32> %tmp2
}
The problem is when type legalization gets hold of the v4i16. It
legalizes that by spilling to the stack, then doing a zero-extending
load. Things go even more silly from there, ending up with something
like:
_foo0:
movsd %xmm0, -8(%rsp) <== Spill to the stack.
movq -8(%rsp), %xmm0 <== Reload it right back out.
pmovzxwd %xmm0, %xmm1 <== Here's what we actually asked for.
pblendw $1, %xmm1, %xmm0 <== We don't need this at all
pmovzxwd %xmm0, %xmm0 <== We already did this
ret
The v8i8 to v8i16 zext intrinsic gives even worse results, with two
table lookups via pshufb instructions(!!).
To avoid all that, we can move the bitcasting until after we've formed
the wider (legal) vector type. Then our normal codegen flows along
nicely and we get the expected:
_foo0:
pmovzxwd %xmm0, %xmm0
ret
rdar://15245794
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192866 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
- Type of index used in extract_vector_elt or insert_vector_elt supposes
to be TLI.getVectorIdxTy() which is pointer type on most targets. It'd
better to truncate (or zero-extend in case it's changed later) it to
mask element type to guarantee they are matching instead of asserting
that.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192722 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
- Lower signed division by constant powers-of-2 to target-independent
DAG operators instead of target-dependent ones to support them better
on targets where vector types are legal but shift operators on that
types are illegal. E.g., on AVX, PSRAW is only available on <8 x i16>
though <16 x i16> is a legal type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192721 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| | |
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192630 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The alignment of allocated space was wrong, see Bugzila 17345.
Done by Zvi Rackover <zvi.rackover@intel.com>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192573 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
it's i64.
Fixes PR17495, where an i24 triggered this code. It's intended to
optimize i64 loads on 32 bit x86.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192123 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| | |
Fixed load folding in VPERM2I instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192063 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| | |
in case of BLEND and added VSHUFPS patterns.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192055 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| | |
from Yunzhong Gao.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191871 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| | |
Patch by Alp Toker.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191757 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| | |
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191610 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| | |
splitting too."
This reverts commit r191130.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191138 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| | |
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191133 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
In AVX 256bit vectors are valid vectors and therefore the Type Legalizer doesn't
split the VSELECT and SETCC nodes. AVX only supports MIN/MAX on 128bit vectors
and this fix enables vector splitting for this special case in the X86 DAG
Combiner.
This fix is related to PR16695, PR17002, and <rdar://problem/14594431>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191131 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The Type Legalizer recognizes that VSELECT needs to be split, because the type
is to wide for the given target. The same does not always apply to SETCC,
because less space is required to encode the result of a comparison. As a result
VSELECT is split and SETCC is unrolled into scalar comparisons.
This commit fixes the issue by checking for VSELECT-SETCC patterns in the DAG
Combiner. If a matching pattern is found, then the result mask of SETCC is
promoted to the expected vector mask for the given target. This mask has usually
te same size as the VSELECT return type (except for Intel KNL). Now the type
legalizer will split both VSELECT and SETCC.
This allows the following X86 DAG Combine code to sucessfully detect the MIN/MAX
pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191130 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| | |
Added parsing of mask register and "zeroing" semantic, like {%k1} {z}.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190595 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
If the DAG already has only legal types, then the second round of DAG combines
is skipped. In this case VSELECT+SETCC patterns that match a more efficient
instruction (e.g. min/max) are never recognized.
This fix allows VSELECT+SETCC combines if the types are already legal before DAG
type legalization.
Reviewer: Nadav
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190105 91177308-0d34-0410-b5e6-96231b3b80d8
|
| |
| |
| |
| |
| |
| | |
Fixes PR17028.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@189742 91177308-0d34-0410-b5e6-96231b3b80d8
|