author | Evan Cheng <evan.cheng@apple.com> | 2010-05-19 01:08:17 +0000 |
---|---|---|
committer | Evan Cheng <evan.cheng@apple.com> | 2010-05-19 01:08:17 +0000 |
commit | 0a942dbb1e0f303191639498c35e742309f08a64 (patch) | |
tree | b1eedb00f314dd8d692314e88f8b0cbbca1eb075 /test/CodeGen/ARM/vcgt.ll | |
parent | 7c2e03916c22d9ad1d8596ad00dee04a9f1454ed (diff) | |
Intrinsics which do a vector compare (results are all zero or all ones) are modeled as icmp / fcmp + sext. This is turned into a vsetcc by dag combine (yes, not a good long-term solution). The targets can then isel the vsetcc to the appropriate instruction.
The trouble arises when the result of a vector cmp + sext is then and'ed with all ones. Instcombine will turn it into a vector cmp + zext, dag combiner will miss turning it into a vsetcc and hell breaks loose after that.
Teach dag combine to turn a vector cmp + zext into a vsetcc + and 1. This fixes rdar://7923010.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@104094 91177308-0d34-0410-b5e6-96231b3b80d8
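The rewrite described in the commit message rests on a lane-wise identity: zero-extending an i1 compare result gives the same value as taking the all-ones/all-zero vsetcc result and masking it with 1. The following is a minimal sketch of that identity in Python, not LLVM code. The function names (`vsetcc_ogt`, `cmp_zext`, `vsetcc_and_one`) and the 32-bit lane width are assumptions chosen to mirror the `vcgt_zext` test, and the "ordered greater than" compare is modeled with Python's `>`, which also returns false when either operand is NaN.

```python
# Illustrative model only: hypothetical helpers, not part of LLVM.
MASK32 = 0xFFFFFFFF  # an all-ones 32-bit lane


def vsetcc_ogt(a, b):
    """Per-lane compare yielding all ones or all zeros, like a vector setcc."""
    return [MASK32 if x > y else 0 for x, y in zip(a, b)]


def cmp_zext(a, b):
    """fcmp ogt to i1 lanes, then zext to i32: each lane is 0 or 1."""
    return [int(x > y) for x, y in zip(a, b)]


def vsetcc_and_one(a, b):
    """The rewritten form: vsetcc followed by an AND with a splat of 1."""
    return [lane & 1 for lane in vsetcc_ogt(a, b)]


a = [1.0, 2.0, 3.0, 4.0]
b = [0.5, 2.0, 5.0, -1.0]

# Both forms agree lane by lane.
assert cmp_zext(a, b) == vsetcc_and_one(a, b)
```

In the test's expected output, the `vcgt.f32` corresponds to the vsetcc step, and the `vmov.i32 q1, #0x1` plus `vand` pair corresponds to the AND with a splat of 1.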
Diffstat (limited to 'test/CodeGen/ARM/vcgt.ll')
-rw-r--r-- | test/CodeGen/ARM/vcgt.ll | 13 |
1 files changed, 13 insertions, 0 deletions
diff --git a/test/CodeGen/ARM/vcgt.ll b/test/CodeGen/ARM/vcgt.ll
index 6b11ba5..194093c 100644
--- a/test/CodeGen/ARM/vcgt.ll
+++ b/test/CodeGen/ARM/vcgt.ll
@@ -158,5 +158,18 @@ define <4 x i32> @vacgtQf32(<4 x float>* %A, <4 x float>* %B) nounwind {
 	ret <4 x i32> %tmp3
 }
 
+; rdar://7923010
+define <4 x i32> @vcgt_zext(<4 x float>* %A, <4 x float>* %B) nounwind {
+;CHECK: vcgt_zext:
+;CHECK: vcgt.f32 q0
+;CHECK: vmov.i32 q1, #0x1
+;CHECK: vand q0, q0, q1
+	%tmp1 = load <4 x float>* %A
+	%tmp2 = load <4 x float>* %B
+	%tmp3 = fcmp ogt <4 x float> %tmp1, %tmp2
+	%tmp4 = zext <4 x i1> %tmp3 to <4 x i32>
+	ret <4 x i32> %tmp4
+}
+
 declare <2 x i32> @llvm.arm.neon.vacgtd(<2 x float>, <2 x float>) nounwind readnone
 declare <4 x i32> @llvm.arm.neon.vacgtq(<4 x float>, <4 x float>) nounwind readnone