aboutsummaryrefslogtreecommitdiffstats
path: root/lib/Target/X86/README-SSE.txt
Commit message (Collapse)AuthorAgeFilesLines
* X86: Turn fp selects into mask operations.Benjamin Kramer2013-08-041-31/+0
| | | | | | | | | | | | | | | | | | | | | | | double test(double a, double b, double c, double d) { return a<b ? c : d; } before: _test: ucomisd %xmm0, %xmm1 ja LBB0_2 movaps %xmm3, %xmm2 LBB0_2: movaps %xmm2, %xmm0 after: _test: cmpltsd %xmm1, %xmm0 andpd %xmm0, %xmm2 andnpd %xmm3, %xmm0 orpd %xmm2, %xmm0 Small speedup on Benchmarks/SmallPT git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187706 91177308-0d34-0410-b5e6-96231b3b80d8
* X86: Add a note.Benjamin Kramer2013-02-171-0/+9
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175408 91177308-0d34-0410-b5e6-96231b3b80d8
* some peepholes that should match horizontal add/sub operations.Chris Lattner2012-09-031-0/+12
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163103 91177308-0d34-0410-b5e6-96231b3b80d8
* Add a note for -ffast-math optimization of vector norm.Benjamin Kramer2012-03-191-0/+19
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153031 91177308-0d34-0410-b5e6-96231b3b80d8
* This is now implemented.Benjamin Kramer2011-12-091-13/+0
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146258 91177308-0d34-0410-b5e6-96231b3b80d8
* Add a natural stack alignment field to TargetData, and prevent InstCombine fromLang Hames2011-10-101-1/+1
| | | | | | | | | | | | | | | | promoting allocas to preferred alignments that exceed the natural alignment. This avoids some potentially expensive dynamic stack realignments. The natural stack alignment is set in target data strings via the "S<size>" option. Size is in bits and must be a multiple of 8. The natural stack alignment defaults to "unspecified" (represented by a zero value), and the "unspecified" value does not prevent any alignment promotions. Target maintainers that care about avoiding promotions should explicitly add the "S<size>" option to their target data strings. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141599 91177308-0d34-0410-b5e6-96231b3b80d8
* Add a note about SSE4.1 roundss/roundsd.Benjamin Kramer2011-02-121-0/+11
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125438 91177308-0d34-0410-b5e6-96231b3b80d8
* update this.Chris Lattner2010-09-051-10/+29
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113116 91177308-0d34-0410-b5e6-96231b3b80d8
* we should pattern match the SSE complex arithmetic ops.Chris Lattner2010-08-251-0/+26
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112109 91177308-0d34-0410-b5e6-96231b3b80d8
* random improvement for variable shift codegen.Chris Lattner2010-08-231-2/+14
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111813 91177308-0d34-0410-b5e6-96231b3b80d8
* Remove obsolete README_SSE note.Jakob Stoklund Olesen2010-07-111-10/+0
| | | | | | | | | | | | | | | We are generating movaps for all XMM register copies, including scalar floating point values. This is known to be at least as good as movss and movsd for all known architectures up to and including Nehalem because it avoids a partial register stall. The SSEDomainFix pass will switch movaps to movdqa when appropriate (i.e., when operands come from the integer unit). We don't now that switching movaps to movapd has any benefit. The same applies to andps -> pand. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108096 91177308-0d34-0410-b5e6-96231b3b80d8
* some notes about suboptimal insertps'sChris Lattner2010-07-051-0/+31
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107613 91177308-0d34-0410-b5e6-96231b3b80d8
* Remove some already-fixed README entries.Eli Friedman2010-06-031-86/+1
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@105377 91177308-0d34-0410-b5e6-96231b3b80d8
* Remove README entry which no longer compiles to something sane.Eli Friedman2010-06-031-56/+0
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@105376 91177308-0d34-0410-b5e6-96231b3b80d8
* Floating-point add, sub, and mul are now spelled fadd, fsub, and fmul,Dan Gohman2010-03-021-4/+4
| | | | | | | respectively. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97531 91177308-0d34-0410-b5e6-96231b3b80d8
* Fix "the the" and similar typos.Dan Gohman2010-02-101-1/+1
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95781 91177308-0d34-0410-b5e6-96231b3b80d8
* add a note from PR6194Chris Lattner2010-02-091-0/+33
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95649 91177308-0d34-0410-b5e6-96231b3b80d8
* move the PR6214 microoptzn to this file.Chris Lattner2010-02-041-0/+18
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95299 91177308-0d34-0410-b5e6-96231b3b80d8
* this is an SSE-specific issue.Chris Lattner2010-01-131-0/+20
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93373 91177308-0d34-0410-b5e6-96231b3b80d8
* Bill implemented this.Chris Lattner2009-02-041-36/+0
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63752 91177308-0d34-0410-b5e6-96231b3b80d8
* add a note, this is why we're faster at SciMark-MonteCarlo withChris Lattner2009-02-041-0/+40
| | | | | | | SSE disabled. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63751 91177308-0d34-0410-b5e6-96231b3b80d8
* The memory alignment requirement on some of the mov{h|l}p{d|s} patterns are ↵Evan Cheng2009-01-281-0/+5
| | | | | | 16-byte. That is overly strict. These instructions read / write f64 memory locations without alignment requirement. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63195 91177308-0d34-0410-b5e6-96231b3b80d8
* add a noteChris Lattner2008-09-201-1/+32
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@56391 91177308-0d34-0410-b5e6-96231b3b80d8
* add a noteChris Lattner2008-08-191-0/+37
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@54964 91177308-0d34-0410-b5e6-96231b3b80d8
* - Fix a x86 vector isel bug: illegal transformation of a vector_shuffle into aEvan Cheng2008-06-251-0/+31
| | | | | | | | | shift. - Add a readme entry for a missing vector_shuffle optimization that results in awful codegen. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52740 91177308-0d34-0410-b5e6-96231b3b80d8
* This is done.Evan Cheng2008-05-241-46/+0
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51526 91177308-0d34-0410-b5e6-96231b3b80d8
* Use movlps / movhps to modify low / high half of 16-byet memory location.Evan Cheng2008-05-231-40/+0
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51501 91177308-0d34-0410-b5e6-96231b3b80d8
* Elaborate on the entry on integer vector multiplication by constants.Dan Gohman2008-05-231-1/+6
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51491 91177308-0d34-0410-b5e6-96231b3b80d8
* New entry.Evan Cheng2008-05-231-0/+44
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51487 91177308-0d34-0410-b5e6-96231b3b80d8
* we compile multiply-by-constant into horrible code. Doesn't sse4 have someChris Lattner2008-05-231-0/+38
| | | | | | | instruction for doing this? git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51473 91177308-0d34-0410-b5e6-96231b3b80d8
* add a noteChris Lattner2008-05-131-0/+18
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51062 91177308-0d34-0410-b5e6-96231b3b80d8
* add a noteChris Lattner2008-05-131-0/+24
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51060 91177308-0d34-0410-b5e6-96231b3b80d8
* Instead of a vector load, shuffle and then extract an element. Load the ↵Evan Cheng2008-05-131-29/+0
| | | | | | | | | | | | element from address with an offset. pshufd $1, (%rdi), %xmm0 movd %xmm0, %eax => movl 4(%rdi), %eax git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51026 91177308-0d34-0410-b5e6-96231b3b80d8
* On x86, it's safe to treat i32 load anyext as a normal i32 load. Ditto for ↵Evan Cheng2008-05-131-25/+0
| | | | | | i8 anyext load to i16. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51019 91177308-0d34-0410-b5e6-96231b3b80d8
* Xform bitconvert(build_pair(load a, load b)) to a single load if the load ↵Evan Cheng2008-05-121-54/+0
| | | | | | locations are at the right offset from each other. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51008 91177308-0d34-0410-b5e6-96231b3b80d8
* Add noteAnton Korobeynikov2008-05-111-0/+38
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50959 91177308-0d34-0410-b5e6-96231b3b80d8
* add a note, this is actually not too bad to implement.Chris Lattner2008-04-101-1/+7
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@49466 91177308-0d34-0410-b5e6-96231b3b80d8
* move the x86-32 part of PR2108 here.Chris Lattner2008-04-101-0/+48
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@49465 91177308-0d34-0410-b5e6-96231b3b80d8
* Finish implementing a readme entry: when inserting an i64 variableChris Lattner2008-03-091-38/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | into a vector of zeros or undef, and when the top part is obviously zero, we can just use movd + shuffle. This allows us to compile vec_set-B.ll into: _test3: movl $1234567, %eax andl 4(%esp), %eax movd %eax, %xmm0 ret instead of: _test3: subl $28, %esp movl $1234567, %eax andl 32(%esp), %eax movl %eax, (%esp) movl $0, 4(%esp) movq (%esp), %xmm0 addl $28, %esp ret git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48090 91177308-0d34-0410-b5e6-96231b3b80d8
* add a noteChris Lattner2008-03-091-0/+37
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48064 91177308-0d34-0410-b5e6-96231b3b80d8
* Implement a readme entry, compilingChris Lattner2008-03-091-20/+0
| | | | | | | | | | | | | | | | #include <xmmintrin.h> __m128i doload64(short x) {return _mm_set_epi16(0,0,0,0,0,0,0,1);} into: movl $1, %eax movd %eax, %xmm0 ret instead of a constant pool load. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48063 91177308-0d34-0410-b5e6-96231b3b80d8
* This one looks easy, add a note.Chris Lattner2008-03-081-1/+2
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48055 91177308-0d34-0410-b5e6-96231b3b80d8
* move these to the appropriate fileChris Lattner2008-03-081-8/+57
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48054 91177308-0d34-0410-b5e6-96231b3b80d8
* evan implemented this.Chris Lattner2008-03-051-26/+0
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47948 91177308-0d34-0410-b5e6-96231b3b80d8
* add a noteChris Lattner2008-03-051-0/+30
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47939 91177308-0d34-0410-b5e6-96231b3b80d8
* Evan implemented these.Chris Lattner2008-03-021-26/+0
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47828 91177308-0d34-0410-b5e6-96231b3b80d8
* upgrade some entries, remove stuff that is done.Chris Lattner2008-02-141-34/+16
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47109 91177308-0d34-0410-b5e6-96231b3b80d8
* readme updatesNate Begeman2008-02-131-0/+11
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47051 91177308-0d34-0410-b5e6-96231b3b80d8
* Enable SSE4 codegen and pattern matching.Nate Begeman2008-02-111-0/+20
| | | | | | | Add some notes to the README. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@46949 91177308-0d34-0410-b5e6-96231b3b80d8
* add a noteChris Lattner2008-01-271-0/+39
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@46413 91177308-0d34-0410-b5e6-96231b3b80d8