author | Evan Cheng <evan.cheng@apple.com> | 2008-05-12 23:04:07 +0000 |
---|---|---|
committer | Evan Cheng <evan.cheng@apple.com> | 2008-05-12 23:04:07 +0000 |
commit | b62904638976dad2609610ffc593e2db617f5476 (patch) | |
tree | 02ded03a6c36779787a2a0d20a980084bcde378b /lib/Target/X86/README-SSE.txt | |
parent | 7ee0252a6b866b04cd6392f0a5d1505ee68a7ddd (diff) | |
Xform bitconvert(build_pair(load a, load b)) to a single load if the load locations are at the right offset from each other.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51008 91177308-0d34-0410-b5e6-96231b3b80d8
Diffstat (limited to 'lib/Target/X86/README-SSE.txt')
-rw-r--r-- | lib/Target/X86/README-SSE.txt | 54 |
1 files changed, 0 insertions, 54 deletions
diff --git a/lib/Target/X86/README-SSE.txt b/lib/Target/X86/README-SSE.txt
index 1a5d904..34b949a 100644
--- a/lib/Target/X86/README-SSE.txt
+++ b/lib/Target/X86/README-SSE.txt
@@ -428,60 +428,6 @@ entry:
 
 //===---------------------------------------------------------------------===//
 
-Consider (PR2108):
-
-#include <xmmintrin.h>
-__m128i doload64(unsigned long long x) { return _mm_loadl_epi64(&x);}
-__m128i doload64_2(unsigned long long *x) { return _mm_loadl_epi64(x);}
-
-These are very similar routines, but we generate significantly worse code for
-the first one on x86-32:
-
-_doload64:
-	subl	$12, %esp
-	movl	20(%esp), %eax
-	movl	%eax, 4(%esp)
-	movl	16(%esp), %eax
-	movl	%eax, (%esp)
-	movsd	(%esp), %xmm0
-	addl	$12, %esp
-	ret
-_doload64_2:
-	movl	4(%esp), %eax
-	movsd	(%eax), %xmm0
-	ret
-
-The problem is that the argument lowering logic splits the i64 argument into
-2x i32 loads early, the f64 insert doesn't match. Here's a reduced testcase:
-
-define fastcc double @doload64(i64 %x) nounwind {
-entry:
-  %tmp717 = bitcast i64 %x to double ; <double> [#uses=1]
-  ret double %tmp717
-}
-
-compiles to:
-
-_doload64:
-	subl	$12, %esp
-	movl	20(%esp), %eax
-	movl	%eax, 4(%esp)
-	movl	16(%esp), %eax
-	movl	%eax, (%esp)
-	movsd	(%esp), %xmm0
-	addl	$12, %esp
-	ret
-
-instead of movsd from the stack. This is actually not too bad to implement. The
-best way to do this is to implement a dag combine that turns
-bitconvert(build_pair(load a, load b)) into one load of the right type. The
-only trick to this is writing the predicate that determines that a/b are at the
-right offset from each other. For the enterprising hacker, InferAlignment is a
-helpful place to start poking if interested.
-
-
-//===---------------------------------------------------------------------===//
-
 __m128d test1( __m128d A, __m128d B) {
   return _mm_shuffle_pd(A, B, 0x3);
 }
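The transform named in the commit message can be illustrated outside the compiler. The following is only a sketch in plain C, not the code from this commit: `Load32`, `loads_are_consecutive`, `pair_then_bitcast`, and `single_load` are hypothetical names invented for the illustration, and the equivalence it relies on assumes a little-endian target such as x86. It models each 32-bit load as a base pointer plus a byte offset, checks that the high half sits exactly 4 bytes after the low half, and shows that building the pair and bitcasting yields the same value as one 64-bit load.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 32-bit load node: a base object plus a byte
   offset into it. This is not an LLVM data structure. */
typedef struct {
    const unsigned char *base;
    size_t offset;
} Load32;

/* Predicate: does `hi` read the 4 bytes immediately following `lo` in the
   same object? Only then can the two loads be merged into one. */
static int loads_are_consecutive(Load32 lo, Load32 hi) {
    return lo.base == hi.base && hi.offset == lo.offset + 4;
}

/* Perform one 32-bit load (memcpy avoids alignment and aliasing issues). */
static uint32_t do_load32(Load32 l) {
    uint32_t v;
    memcpy(&v, l.base + l.offset, sizeof v);
    return v;
}

/* Model of bitconvert(build_pair(load lo, load hi)): glue the two halves
   into an i64, then reinterpret the bits as a double. */
static double pair_then_bitcast(Load32 lo, Load32 hi) {
    uint64_t bits = ((uint64_t)do_load32(hi) << 32) | do_load32(lo);
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Model of the combined form: one 64-bit load at the low half's address.
   Equivalent to pair_then_bitcast only on a little-endian target and only
   when loads_are_consecutive(lo, hi) holds. */
static double single_load(Load32 lo) {
    double d;
    memcpy(&d, lo.base + lo.offset, sizeof d);
    return d;
}
```

A caller would test `loads_are_consecutive(lo, hi)` first and use `single_load(lo)` only when it holds, falling back to `pair_then_bitcast(lo, hi)` otherwise. A real DAG combine also has to consider details this sketch ignores, such as volatile loads, other users of the original loads, and alignment.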