BBVectorize: Choose pair ordering to minimize shuffles

Previously, except for loads and stores, BBVectorize always fused instructions so that the first instruction (in the current source order) represented the low part of the input vectors and the second instruction represented the high part. This led to too many shuffles being produced, because sometimes the opposite order produces fewer of them.

With this change, BBVectorize tracks the kind of pair connections that form the DAG of candidate pairs, and uses that information to reorder the pairs to avoid excess shuffles. Using this information, a future commit will be able to add VTTI-based shuffle costs to the pair selection procedure. Importantly, the number of remaining shuffles can now be estimated during pair selection.

There are some trivial instruction reorderings in the test cases, and one simple additional test where we certainly want to do a reordering to avoid an unnecessary shuffle.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167122 91177308-0d34-0410-b5e6-96231b3b80d8
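The idea of using connection kinds to pick a pair order can be illustrated with a small, simplified model. This is only a sketch and not the BBVectorize implementation: the names `choose_orientations` and `count_shuffles`, the edge encoding `(producer, consumer, swapped)`, and the breadth-first heuristic are all assumptions made for illustration. Each candidate pair is a node; an edge is marked `swapped` when the consumer's first instruction uses the producer's second instruction (and vice versa), which would need a shuffle if both pairs kept source order. Loads and stores, which the pass treats specially, are not modeled.

```python
from collections import deque


def choose_orientations(num_pairs, edges):
    """Pick, for each pair, whether to fuse it in reversed order.

    edges: list of (producer, consumer, swapped) tuples (hypothetical encoding).
    Returns a list of bools; True means "fuse this pair in the opposite order".
    This is a greedy breadth-first heuristic, not a guaranteed optimum.
    """
    adjacency = [[] for _ in range(num_pairs)]
    for producer, consumer, swapped in edges:
        adjacency[producer].append((consumer, swapped))
        adjacency[consumer].append((producer, swapped))

    flip = [None] * num_pairs
    for root in range(num_pairs):
        if flip[root] is not None:
            continue
        flip[root] = False  # keep source order for the first pair of each component
        queue = deque([root])
        while queue:
            node = queue.popleft()
            for other, swapped in adjacency[node]:
                if flip[other] is None:
                    # Flip the neighbor exactly when that makes the lanes line up.
                    flip[other] = flip[node] ^ swapped
                    queue.append(other)
    return flip


def count_shuffles(flip, edges):
    """Estimate remaining shuffles: edges whose lanes still do not line up."""
    return sum(1 for p, q, swapped in edges if (flip[p] ^ flip[q]) != swapped)


# Pair 1 consumes pair 0 with swapped lanes; pair 2 consumes pair 1 directly.
example_edges = [(0, 1, True), (1, 2, False)]
source_order = [False, False, False]
reordered = choose_orientations(3, example_edges)
```

In this model, `count_shuffles(source_order, example_edges)` counts the shuffles implied by always keeping source order, while `count_shuffles(reordered, example_edges)` gives the estimate after reordering. When the connections form an odd cycle of swapped edges, no orientation removes every shuffle, which is one reason an estimate of the remaining shuffles is useful during pair selection.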
author: Hal Finkel <hfinkel@anl.gov> 2012-10-31 15:17:07 +0000
committer: Hal Finkel <hfinkel@anl.gov> 2012-10-31 15:17:07 +0000
commit: 72465ea23d010507d3746adc126d719005981e05 (patch)
tree: d5e6b1ad3aad528df1c41d88c82db6a62ba61ca4 /test/Transforms/BBVectorize/X86
parent: ef026f1b5e4d52e11c67a1a5ad01eadffcfa4d8e (diff)
download: external_llvm-72465ea23d010507d3746adc126d719005981e05.zip
external_llvm-72465ea23d010507d3746adc126d719005981e05.tar.gz
external_llvm-72465ea23d010507d3746adc126d719005981e05.tar.bz2
2 files changed, 2 insertions, 2 deletions
diff --git a/test/Transforms/BBVectorize/X86/loop1.ll b/test/Transforms/BBVectorize/X86/loop1.ll
index 9d5d9fb..c1be622 100644
--- a/test/Transforms/BBVectorize/X86/loop1.ll
+++ b/test/Transforms/BBVectorize/X86/loop1.ll
@@ -42,8 +42,8 @@ for.body:                                         ; preds = %for.body, %entry
 ; CHECK: %mul = fmul double %0, %0
 ; CHECK: %mul3 = fmul double %0, %1
 ; CHECK: %add = fadd double %mul, %mul3
-; CHECK: %add4.v.i1.1 = insertelement <2 x double> undef, double %1, i32 0
 ; CHECK: %mul8 = fmul double %1, %1
+; CHECK: %add4.v.i1.1 = insertelement <2 x double> undef, double %1, i32 0
 ; CHECK: %add4.v.i1.2 = insertelement <2 x double> %add4.v.i1.1, double %0, i32 1
 ; CHECK: %add4 = fadd <2 x double> %add4.v.i1.2, %add4.v.i1.2
 ; CHECK: %add5.v.i1.1 = insertelement <2 x double> undef, double %0, i32 0
diff --git a/test/Transforms/BBVectorize/X86/simple.ll b/test/Transforms/BBVectorize/X86/simple.ll
index 6450f82..d11c9b9 100644
--- a/test/Transforms/BBVectorize/X86/simple.ll
+++ b/test/Transforms/BBVectorize/X86/simple.ll
@@ -5,8 +5,8 @@ target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f3
 define double @test1(double %A1, double %A2, double %B1, double %B2) {
 ; CHECK: @test1
 ; CHECK: %X1.v.i1.1 = insertelement <2 x double> undef, double %B1, i32 0
-; CHECK: %X1.v.i0.1 = insertelement <2 x double> undef, double %A1, i32 0
 ; CHECK: %X1.v.i1.2 = insertelement <2 x double> %X1.v.i1.1, double %B2, i32 1
+; CHECK: %X1.v.i0.1 = insertelement <2 x double> undef, double %A1, i32 0
 ; CHECK: %X1.v.i0.2 = insertelement <2 x double> %X1.v.i0.1, double %A2, i32 1
 	%X1 = fsub double %A1, %B1
 	%X2 = fsub double %A2, %B2