Fix for wrong instcombine on vector insert/extract

When trying to collapse sequences of insertelement/extractelement instructions into single shuffle instructions, there is one specific case where the Instruction Combiner wrongly updates the resulting Mask of shuffle indexes.

The problem is in function CollectShuffleElements.

If we have a sequence of insert/extract element instructions like the one below:

  %tmp1 = extractelement <4 x float> %LHS, i32 0
  %tmp2 = insertelement <4 x float> %RHS, float %tmp1, i32 1
  %tmp3 = extractelement <4 x float> %RHS, i32 2
  %tmp4 = insertelement <4 x float> %tmp2, float %tmp3, i32 3

Where:
 . %RHS will have a mask of [4,5,6,7]
 . %LHS will have a mask of [0,1,2,3]

The Mask of shuffle indexes is wrongly computed to [4,1,6,7] instead of [4,0,6,7].

When analyzing %tmp2 in order to compute the Mask for the resulting shuffle instruction, the algorithm forgets to update the mask index at position 1 with the index associated with the element extracted from %LHS by instruction %tmp1.

Patch by Andrea DiBiagio!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179291 91177308-0d34-0410-b5e6-96231b3b80d8
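
The mask update described in the message can be sketched as a small Python model. This is only an illustration of the intended rule, not the LLVM implementation: `collect_mask`, `insts`, and the tuple encoding of each insertelement are hypothetical names made up for this sketch. Mask indexes 0-3 select from %LHS and 4-7 select from %RHS, as stated in the message.

```python
# Simplified model of the intended mask computation.
# All names here are hypothetical; this is not LLVM's API.
NUM_ELTS = 4

def collect_mask(value, insts):
    """Return the shuffle mask for `value`, indexing into concat(LHS, RHS)."""
    if value == "LHS":
        return [0, 1, 2, 3]
    if value == "RHS":
        return [4, 5, 6, 7]
    # Otherwise `value` names an insertelement, encoded as:
    # (vector inserted into, vector extracted from, extracted index, inserted index)
    base, src, extracted_idx, inserted_idx = insts[value]
    mask = collect_mask(base, insts)
    # The inserted position must point at the extracted element:
    # LHS elements keep their index, RHS elements are offset by NUM_ELTS.
    offset = 0 if src == "LHS" else NUM_ELTS
    mask[inserted_idx] = extracted_idx + offset
    return mask

# The sequence from the commit message, with each extractelement
# folded into the insertelement that consumes it.
insts = {
    "tmp2": ("RHS", "LHS", 0, 1),   # insert LHS[0] into RHS at position 1
    "tmp4": ("tmp2", "RHS", 2, 3),  # insert RHS[2] into tmp2 at position 3
}

mask_tmp2 = collect_mask("tmp2", insts)
mask_tmp4 = collect_mask("tmp4", insts)
print("mask after %tmp2:", mask_tmp2)
print("mask after %tmp4:", mask_tmp4)
```

Under this model the mask for %tmp2 has index 0 at position 1, and applying the final insert at position 3 yields the mask that the @test14a CHECK line in the patch expects.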
author: Benjamin Kramer <benny.kra@googlemail.com> 2013-04-11 15:10:09 +0000
committer: Benjamin Kramer <benny.kra@googlemail.com> 2013-04-11 15:10:09 +0000
commit: c37cb66e6ee256bcb3ba138383e4cb9aab55ddb9 (patch)
tree: ca0f4e97518092abe898964097539a8ccc364b06 /test/Transforms/InstCombine
parent: 765afbc4ca8a936fe563d1970d6bde43bdfc2528 (diff)
download: external_llvm-c37cb66e6ee256bcb3ba138383e4cb9aab55ddb9.zip
external_llvm-c37cb66e6ee256bcb3ba138383e4cb9aab55ddb9.tar.gz
external_llvm-c37cb66e6ee256bcb3ba138383e4cb9aab55ddb9.tar.bz2
1 files changed, 27 insertions, 0 deletions
diff --git a/test/Transforms/InstCombine/vec_shuffle.ll b/test/Transforms/InstCombine/vec_shuffle.ll
index 14f5321..37d4d56 100644
--- a/test/Transforms/InstCombine/vec_shuffle.ll
+++ b/test/Transforms/InstCombine/vec_shuffle.ll
@@ -196,3 +196,30 @@ define <4 x i16> @test13e(<4 x i16> %lhs, <4 x i16> %rhs) {
            <4 x i16> %lhs, <4 x i16> %rhs
   ret <4 x i16> %A
 }
+
+; Check that sequences of insert/extract element are
+; collapsed into shuffle instruction with correct shuffle indexes.
+
+define <4 x float> @test14a(<4 x float> %LHS, <4 x float> %RHS) {
+; CHECK: @test14a
+; CHECK-NEXT: shufflevector <4 x float> %LHS, <4 x float> %RHS, <4 x i32> <i32 4, i32 0, i32 6, i32 6>
+; CHECK-NEXT: ret <4 x float> %tmp4
+        %tmp1 = extractelement <4 x float> %LHS, i32 0
+        %tmp2 = insertelement <4 x float> %RHS, float %tmp1, i32 1
+        %tmp3 = extractelement <4 x float> %RHS, i32 2
+        %tmp4 = insertelement <4 x float> %tmp2, float %tmp3, i32 3
+        ret <4 x float> %tmp4
+}
+
+define <4 x float> @test14b(<4 x float> %LHS, <4 x float> %RHS) {
+; CHECK: @test14b
+; CHECK-NEXT: shufflevector <4 x float> %LHS, <4 x float> %RHS, <4 x i32> <i32 4, i32 3, i32 6, i32 6>
+; CHECK-NEXT: ret <4 x float> %tmp5
+        %tmp0 = extractelement <4 x float> %LHS, i32 3
+        %tmp1 = insertelement <4 x float> %RHS, float %tmp0, i32 0
+        %tmp2 = extractelement <4 x float> %tmp1, i32 0
+        %tmp3 = insertelement <4 x float> %RHS, float %tmp2, i32 1
+        %tmp4 = extractelement <4 x float> %RHS, i32 2
+        %tmp5 = insertelement <4 x float> %tmp3, float %tmp4, i32 3
+        ret <4 x float> %tmp5
+}