author | Juergen Ributzka <juergen@apple.com> | 2013-10-30 05:48:18 +0000 |
---|---|---|
committer | Juergen Ributzka <juergen@apple.com> | 2013-10-30 05:48:18 +0000 |
commit | 4eced19c505bb32dc210a18e87624f64d011894c (patch) | |
tree | 5da7c1301173763ed6b160ab2e50e90ec956a686 /test/CodeGen/X86/vec_split.ll | |
parent | 6a4860af7aedd1bec725d0e59f43f66335f9a5a5 (diff) | |
SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too.
The Type Legalizer recognizes that VSELECT needs to be split, because the type
is too wide for the given target. The same does not always apply to SETCC,
because less space is required to encode the result of a comparison. As a
result, VSELECT is split and SETCC is unrolled into scalar comparisons.
This commit fixes the issue by checking for VSELECT-SETCC patterns in the DAG
Combiner. If a matching pattern is found, then the result mask of SETCC is
promoted to the expected vector mask type for the given target. This mask
usually has the same size as the VSELECT return type (except for Intel KNL). Now the
type legalizer will split both VSELECT and SETCC.
This allows the subsequent X86 DAG Combine code to successfully detect the MIN/MAX
pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>.
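The mismatch described above can be illustrated with a toy model. This is only a sketch, not LLVM's legalizer logic: the names `pieces` and `plan` are hypothetical, and the model assumes a single legal vector width per target and a compare result of one bit per lane unless the mask is promoted to the lane width.

```python
def pieces(num_elts, elt_bits, legal_bits):
    """How many legal-width registers a vector of this shape occupies."""
    total_bits = num_elts * elt_bits
    return max(1, -(-total_bits // legal_bits))  # ceiling division


def plan(num_elts, elt_bits, legal_bits, promote_mask):
    """Toy legalization plan for select(setcc(a, b), a, b).

    Hypothetical helper: it only counts pieces and does not model
    how LLVM actually unrolls or splits nodes.
    """
    # The compare result is one bit per lane unless promoted to the lane width.
    mask_elt_bits = elt_bits if promote_mask else 1
    vselect_pieces = pieces(num_elts, elt_bits, legal_bits)
    setcc_pieces = pieces(num_elts, mask_elt_bits, legal_bits)
    return {
        "vselect_pieces": vselect_pieces,
        "setcc_pieces": setcc_pieces,
        # True when both nodes are cut into the same number of parts.
        "split_together": vselect_pieces == setcc_pieces,
    }


if __name__ == "__main__":
    for name, lanes in (("split16", 16), ("split32", 32)):
        for legal_bits in (128, 256):
            for promote in (False, True):
                print(name, legal_bits, promote,
                      plan(lanes, 16, legal_bits, promote))
```

Under these assumptions, a `<16 x i16>` select on a 128-bit target needs two pieces while its unpromoted mask fits in one, so the two nodes are not split together. With the promoted mask both need two pieces, which lines up with the two `pminuw` instructions the `split16` test expects for SSE4.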
Reviewed by Nadav
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193676 91177308-0d34-0410-b5e6-96231b3b80d8
Diffstat (limited to 'test/CodeGen/X86/vec_split.ll')
-rw-r--r-- | test/CodeGen/X86/vec_split.ll | 42 |
1 files changed, 42 insertions, 0 deletions
diff --git a/test/CodeGen/X86/vec_split.ll b/test/CodeGen/X86/vec_split.ll
new file mode 100644
index 0000000..f9e7c20
--- /dev/null
+++ b/test/CodeGen/X86/vec_split.ll
@@ -0,0 +1,42 @@
+; RUN: llc -march=x86-64 -mcpu=corei7 < %s | FileCheck %s -check-prefix=SSE4
+; RUN: llc -march=x86-64 -mcpu=corei7-avx < %s | FileCheck %s -check-prefix=AVX1
+; RUN: llc -march=x86-64 -mcpu=core-avx2 < %s | FileCheck %s -check-prefix=AVX2
+
+define <16 x i16> @split16(<16 x i16> %a, <16 x i16> %b, <16 x i8> %__mask) {
+; SSE4-LABEL: split16:
+; SSE4: pminuw
+; SSE4: pminuw
+; SSE4: ret
+; AVX1-LABEL: split16:
+; AVX1: vpminuw
+; AVX1: vpminuw
+; AVX1: ret
+; AVX2-LABEL: split16:
+; AVX2: vpminuw
+; AVX2: ret
+  %1 = icmp ult <16 x i16> %a, %b
+  %2 = select <16 x i1> %1, <16 x i16> %a, <16 x i16> %b
+  ret <16 x i16> %2
+}
+
+define <32 x i16> @split32(<32 x i16> %a, <32 x i16> %b, <32 x i8> %__mask) {
+; SSE4-LABEL: split32:
+; SSE4: pminuw
+; SSE4: pminuw
+; SSE4: pminuw
+; SSE4: pminuw
+; SSE4: ret
+; AVX1-LABEL: split32:
+; AVX1: vpminuw
+; AVX1: vpminuw
+; AVX1: vpminuw
+; AVX1: vpminuw
+; AVX1: ret
+; AVX2-LABEL: split32:
+; AVX2: vpminuw
+; AVX2: vpminuw
+; AVX2: ret
+  %1 = icmp ult <32 x i16> %a, %b
+  %2 = select <32 x i1> %1, <32 x i16> %a, <32 x i16> %b
+  ret <32 x i16> %2
+}