external_llvm.git - Unnamed repository; edit this file 'description' to name the repository.

	Commit message	Author	Age	Files	Lines
*	Recommit the fix for rdar://9289512 with a couple tweaks to	Chris Lattner	2011-04-22	2	-0/+57
\| \| \| \| \| \| \| \| \| \| \| \| \|	fix bugs exposed by the gcc dejagnu testsuite: 1. The load may actually be used by a dead instruction, which would cause an assert. 2. The load may not be used by the current chain of instructions, and we could move it past a side-effecting instruction. Change how we process uses to define the problem away. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130018 91177308-0d34-0410-b5e6-96231b3b80d8
*	Disassembly of A8.6.59 LDR (literal) Encoding T1 (16-bit thumb instruction) ↵	Johnny Chen	2011-04-22	2	-3/+6
\| \| \| \| \| \| \| \| \| \| \|	should print out ldr, not ldr.n. rdar://problem/9267772 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130008 91177308-0d34-0410-b5e6-96231b3b80d8
*	DAGCombine: fold "(zext x) == C" into "x == (trunc C)" if the trunc is lossless.	Benjamin Kramer	2011-04-22	1	-0/+36
\| \| \| \| \| \| \| \| \| \| \| \|	On x86 this allows folding a load into the cmp, greatly reducing register pressure: movzbl (%rdi), %eax; cmpl $47, %eax -> cmpb $47, (%rdi). This shaves 8k off gcc.o on i386. I'll leave applying the patch in README.txt to Chris :) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130005 91177308-0d34-0410-b5e6-96231b3b80d8
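The fold above is only sound when the constant survives truncation unchanged. The following is a minimal sketch of that condition for an 8-bit value zero-extended to 32 bits; the helper names are hypothetical and are not part of the LLVM API.

```cpp
#include <cstdint>

// Hypothetical helper: true when C is unchanged by truncating it to 8 bits
// and zero-extending it back, i.e. the trunc is lossless.
constexpr bool truncIsLossless(uint32_t c) {
    return static_cast<uint32_t>(static_cast<uint8_t>(c)) == c;
}

// Original form: zero-extend the byte, then compare in 32 bits.
constexpr bool wideCompare(uint8_t x, uint32_t c) {
    return static_cast<uint32_t>(x) == c;
}

// Folded form: compare the byte against the truncated constant.
constexpr bool narrowCompare(uint8_t x, uint32_t c) {
    return x == static_cast<uint8_t>(c);
}

// 47 fits in 8 bits, so both forms agree.
static_assert(truncIsLossless(47), "47 fits in a byte");
static_assert(wideCompare(47, 47) == narrowCompare(47, 47), "forms agree");

// 303 (0x12F) truncates to 47, so folding would change the result.
static_assert(!truncIsLossless(303), "303 does not fit in a byte");
static_assert(wideCompare(47, 303) != narrowCompare(47, 303), "forms differ");
```

The second pair of assertions shows why the lossless check matters: without it, a comparison that can never be true would start matching.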
*	X86: Try to use a smaller encoding by transforming (X << C1) & C2 into (X & ↵	Benjamin Kramer	2011-04-22	1	-0/+101
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(C2 >> C1)) << C1. (Part of PR5039) This tends to happen a lot with bitfield code generated by clang. A simple example for x86_64 is uint64_t foo(uint64_t x) { return (x&1) << 42; } which used to compile into bloated code: shlq $42, %rdi ## encoding: [0x48,0xc1,0xe7,0x2a] movabsq $4398046511104, %rax ## encoding: [0x48,0xb8,0x00,0x00,0x00,0x00,0x00,0x04,0x00,0x00] andq %rdi, %rax ## encoding: [0x48,0x21,0xf8] ret ## encoding: [0xc3] with this patch we can fold the immediate into the and: andq $1, %rdi ## encoding: [0x48,0x83,0xe7,0x01] movq %rdi, %rax ## encoding: [0x48,0x89,0xf8] shlq $42, %rax ## encoding: [0x48,0xc1,0xe0,0x2a] ret ## encoding: [0xc3] It's possible to save another byte by using 'andl' instead of 'andq' but I currently see no way of doing that without making this code even more complicated. See the TODOs in the code. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129990 91177308-0d34-0410-b5e6-96231b3b80d8
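The rewrite relies on the identity (X << C1) & C2 == (X & (C2 >> C1)) << C1 for unsigned values and shift amounts below the bit width. The sketch below checks it with the constants from the example above (C1 = 42, C2 = 4398046511104, which is 1 << 42); the function names are illustrative only.

```cpp
#include <cstdint>

// Form before the transform: shift first, then mask with the large constant.
constexpr uint64_t shiftThenMask(uint64_t x, unsigned c1, uint64_t c2) {
    return (x << c1) & c2;
}

// Form after the transform: mask with the small constant, then shift.
constexpr uint64_t maskThenShift(uint64_t x, unsigned c1, uint64_t c2) {
    return (x & (c2 >> c1)) << c1;
}

constexpr uint64_t kMask = 4398046511104ULL;  // 1 << 42, as in the example

static_assert(kMask == (1ULL << 42), "mask is bit 42");
static_assert(shiftThenMask(3, 42, kMask) == maskThenShift(3, 42, kMask),
              "forms agree when bit 0 is set");
static_assert(shiftThenMask(2, 42, kMask) == maskThenShift(2, 42, kMask),
              "forms agree when bit 0 is clear");
static_assert(maskThenShift(3, 42, kMask) == kMask, "bit 0 lands on bit 42");
static_assert(maskThenShift(2, 42, kMask) == 0, "bit 0 clear gives zero");
```

After the transform the mask is 1 instead of a 64-bit constant, which is what lets the immediate fit in the and instruction.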
*	In Thumb2 mode, lower frame index references to:	Evan Cheng	2011-04-22	1	-0/+23
\| \| \| \| \| \| \| \| \| \| \| \|	add <rd>, sp, #<imm8> ldr <rd>, [sp, #<imm8>] when the offset from sp is a multiple of 4 and in the range 0-1020. This saves code size by utilizing 16-bit instructions. rdar://9321541 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129971 91177308-0d34-0410-b5e6-96231b3b80d8
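The offset condition follows from the 16-bit encodings storing an 8-bit immediate scaled by 4, so the largest reachable offset is 255 * 4 = 1020. A minimal sketch of the check, with hypothetical helper names that do not come from the LLVM sources:

```cpp
// Hypothetical predicate: can this sp-relative offset use the 16-bit form?
constexpr bool fitsSPRelative16Bit(int offset) {
    return offset >= 0 && offset <= 1020 && offset % 4 == 0;
}

// Hypothetical encoder: the stored imm8 field is the offset divided by 4.
// Precondition: fitsSPRelative16Bit(offset).
constexpr unsigned encodeImm8(int offset) {
    return static_cast<unsigned>(offset / 4);
}

static_assert(fitsSPRelative16Bit(0), "zero offset fits");
static_assert(fitsSPRelative16Bit(1020), "upper bound fits");
static_assert(!fitsSPRelative16Bit(1024), "past the upper bound");
static_assert(!fitsSPRelative16Bit(6), "not a multiple of 4");
static_assert(!fitsSPRelative16Bit(-4), "negative offsets do not fit");
static_assert(encodeImm8(1020) == 255, "largest imm8 value");
```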
*	Fix DWARF description of Q registers.	Devang Patel	2011-04-21	1	-0/+94
\| \| \| \|	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129952 91177308-0d34-0410-b5e6-96231b3b80d8
*	Fix DWARF description of S registers.	Devang Patel	2011-04-21	1	-0/+116
\| \| \| \|	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129947 91177308-0d34-0410-b5e6-96231b3b80d8
*	Test case for r129922	Devang Patel	2011-04-21	1	-0/+105
\| \| \| \|	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129934 91177308-0d34-0410-b5e6-96231b3b80d8
*	Fix relative relocations. This is sufficient for running the rust testsuite with	Rafael Espindola	2011-04-21	1	-3/+21
\| \| \| \| \| \|	MC :-) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129923 91177308-0d34-0410-b5e6-96231b3b80d8
*	Revert r129656, "Fix rdar://9289512 - not folding load into compare at -O0...",	Daniel Dunbar	2011-04-21	1	-22/+0
\| \| \| \| \| \|	which broke a couple GCC test suite tests at -O0. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129914 91177308-0d34-0410-b5e6-96231b3b80d8
*	ptx: fix parameter ordering	Che-Liang Chiou	2011-04-21	1	-4/+4
\| \| \| \| \| \| \| \| \| \|	This patch depends on the prior fix r129908 that changes to use std::find, rather than std::binary_search, on unordered array. Patch by Dan Bailey git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129909 91177308-0d34-0410-b5e6-96231b3b80d8
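The distinction between the two algorithms matters because std::binary_search requires its range to be sorted (more precisely, partitioned with respect to the key), while std::find works on any range. The sketch below is a generic illustration with made-up data and function names; it is not the PTX backend code.

```cpp
#include <algorithm>
#include <vector>

// Linear search: correct on any range, sorted or not.
bool containsUnordered(const std::vector<int>& values, int key) {
    return std::find(values.begin(), values.end(), key) != values.end();
}

// Binary search: only valid once the range is sorted, so sort a copy first.
bool containsViaSortedCopy(std::vector<int> values, int key) {
    std::sort(values.begin(), values.end());
    return std::binary_search(values.begin(), values.end(), key);
}
```

Calling std::binary_search directly on the unsorted input would violate its precondition, so its answer could not be trusted even when it happens to be right.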
*	Remove -use-divmod-libcall. Let targets opt in when they are available.	Evan Cheng	2011-04-20	1	-1/+1
\| \| \| \|	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129884 91177308-0d34-0410-b5e6-96231b3b80d8
*	Fix another case of <rdar://problem/9184212> that only occurs with code	Cameron Zwarich	2011-04-20	1	-0/+15
\| \| \| \| \| \| \|	generated by llvm-gcc, since llvm-gcc uses 2 i64s for passing a 4 x float vector on ARM rather than an i64 array like Clang. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129878 91177308-0d34-0410-b5e6-96231b3b80d8
*	Un-XFAIL this test for ARM. <rdar://problem/7662569>	Stuart Hastings	2011-04-20	1	-1/+0
\| \| \| \|	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129875 91177308-0d34-0410-b5e6-96231b3b80d8
*	PTX: Add intrinsics to list of built-in intrinsics, which allows them to be	Justin Holewinski	2011-04-20	19	-24/+24
\| \| \| \| \| \| \| \| \| \|	used by Clang. To help Clang integration, the PTX target has been split into two targets: ptx32 and ptx64, depending on the desired pointer size. - Add GCCBuiltin class to all intrinsics - Split PTX target into ptx32 and ptx64 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129851 91177308-0d34-0410-b5e6-96231b3b80d8
*	Behave like gnu as when a relocation crosses sections.	Rafael Espindola	2011-04-20	1	-0/+28
\| \| \| \|	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129850 91177308-0d34-0410-b5e6-96231b3b80d8
*	Rewrite the expander for umulo/smulo to remember to sign extend the input	Eric Christopher	2011-04-20	1	-0/+27
\| \| \| \| \| \| \| \| \| \|	manually and pass all (now) 4 arguments to the mul libcall. Add a new ExpandLibCall for just this (copied gratuitously from type legalization). Fixes rdar://9292577 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129842 91177308-0d34-0410-b5e6-96231b3b80d8
*	llc: Eliminate a use of getDarwinMajorNumber().	Daniel Dunbar	2011-04-19	2	-3/+3
\| \| \| \| \| \| \| \| \|	- As before, there is a minor semantic change here (evidenced by the test change) for Darwin triples that have no version component. I debated changing the default behavior of isOSVersionLT, but decided it made more sense for triples to be explicit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129805 91177308-0d34-0410-b5e6-96231b3b80d8
*	CodeGen: Eliminate a use of getDarwinMajorNumber().	Daniel Dunbar	2011-04-19	1	-1/+1
\| \| \| \| \| \| \| \| \|	- There is a minor semantic change here (evidenced by the test change) for Darwin triples that have no version component. I debated changing the default behavior of isOSVersionLT, but decided it made more sense for triples to be explicit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129802 91177308-0d34-0410-b5e6-96231b3b80d8
*	This patch combines several changes from Evan Cheng for rdar://8659675.	Bob Wilson	2011-04-19	1	-0/+53
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Making use of VFP / NEON floating point multiply-accumulate / subtraction is difficult on current ARM implementations for a few reasons. 1. Even though a single vmla has latency that is one cycle shorter than a pair of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) can cause additional pipeline stall. So it's frequently better to simply codegen vmul + vadd. 2. A vmla followed by a vmul, vmadd, or vsub causes the second fp instruction to stall for 4 cycles. We need to schedule them apart. 3. A vmla followed by a vmla is a special case. Obviously issuing back to back RAW vmla + vmla is very bad. But this isn't ideal either: vmul vadd vmla Instead, we want to expand the second vmla: vmla vmul vadd Even with the 4 cycle vmul stall, the second sequence is still 2 cycles faster. Up to now, isel simply avoids codegen'ing fp vmla / vmls. This works well enough but it isn't the optimal solution. This patch attempts to make it possible to use vmla / vmls in cases where it is profitable. A. Add missing isel predicates which cause vmla to be codegen'ed. B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to compute a fmul and a fmla. C. Add additional isel checks for vmla, avoid cases where vmla is feeding into fp instructions (except for the #3 exceptional case). D. Add ARM hazard recognizer to model the vmla / vmls hazards. E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the vmla / vmls will trigger one of the special hazards. Enable these fp vmlx codegen changes for Cortex-A9. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129775 91177308-0d34-0410-b5e6-96231b3b80d8
*	Add -mcpu=cortex-a9-mp. It's cortex-a9 with MP extension. rdar://8648637.	Bob Wilson	2011-04-19	1	-8/+13
\| \| \| \|	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129774 91177308-0d34-0410-b5e6-96231b3b80d8
*	Avoid some 's' 16-bit instructions which partially update CPSR	Bob Wilson	2011-04-19	1	-0/+16
\| \| \| \| \| \| \|	(and add a false dependency) when they aren't dependent on the last CPSR-defining instruction. rdar://8928208 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129773 91177308-0d34-0410-b5e6-96231b3b80d8
*	Avoid write-after-write issue hazards for Cortex-A9.	Bob Wilson	2011-04-19	5	-8/+8
\| \| \| \| \| \| \| \| \| \| \|	Add an avoidWriteAfterWrite() target hook to identify register classes that suffer from write-after-write hazards. For those register classes, try to avoid writing the same register in two consecutive instructions. This is currently disabled by default. We should not spill to avoid hazards! The command line flag -avoid-waw-hazard can be used to enable waw avoidance. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129772 91177308-0d34-0410-b5e6-96231b3b80d8
*	Add support for FastISel'ing varargs calls.	Eli Friedman	2011-04-19	1	-0/+19
\| \| \| \|	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129765 91177308-0d34-0410-b5e6-96231b3b80d8
*	Tighten test case a bit.	Jakob Stoklund Olesen	2011-04-19	1	-1/+2
\| \| \| \| \| \| \|	Ideally, we would match an S-register to its containing D-register, but that requires arithmetic (divide by 2). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129756 91177308-0d34-0410-b5e6-96231b3b80d8
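The "divide by 2" refers to how ARM VFP registers overlap: each D register D<n> overlays the pair S<2n> and S<2n+1>. A minimal sketch of that arithmetic, using hypothetical helper names rather than anything from the test or from LLVM:

```cpp
// Hypothetical helper: index of the D register that contains S<sReg>.
// Only S0-S31 have a containing D register (D0-D15).
constexpr unsigned containingDReg(unsigned sReg) {
    return sReg / 2;
}

// Hypothetical helper: true when S<sReg> is the upper half of its D register.
constexpr bool isHighHalf(unsigned sReg) {
    return sReg % 2 == 1;
}

static_assert(containingDReg(0) == 0, "S0 lives in D0");
static_assert(containingDReg(1) == 0, "S1 lives in D0");
static_assert(containingDReg(5) == 2, "S5 lives in D2");
static_assert(!isHighHalf(4) && isHighHalf(5), "S4 is low, S5 is high");
```

A pattern-matching test tool can capture a register number but generally cannot compute on it, which is why the test cannot express this relation directly.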
*	Implement support for x86 fastisel of small fixed-sized memcpys, which are ↵	Chris Lattner	2011-04-19	1	-0/+11
\| \| \| \| \| \| \| \| \| \|	generated en masse for C++ PODs. On my c++ test file, this cuts the fast isel rejects by 10x and shrinks the generated .s file by 5% git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129755 91177308-0d34-0410-b5e6-96231b3b80d8
*	Implement support for fast isel of calls of i1 arguments, even though they ↵	Chris Lattner	2011-04-19	1	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \| \|	are illegal, when they are a truncate from something else. This eliminates fully half of all the fastisel rejections on a test c++ file I'm working with, which should make a substantial improvement for -O0 compile of c++ code. This fixed rdar://9297003 - fast isel bails out on all functions taking bools git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129752 91177308-0d34-0410-b5e6-96231b3b80d8
*	Handle i1/i8/i16 constant integer arguments to calls by prepromoting them.	Chris Lattner	2011-04-19	1	-1/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Before we would bail out on i1 arguments altogether, now we just bail on non-constant ones. Also, we used to emit extraneous code. e.g. test12 was: movb $0, %al movzbl %al, %edi callq _test12 and test13 was: movb $0, %al xorl %edi, %edi movb %al, 7(%rsp) callq _test13f Now we get: movl $0, %edi callq _test12 and: movl $0, %edi callq _test13f git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129751 91177308-0d34-0410-b5e6-96231b3b80d8
*	be layout aware, to produce:	Chris Lattner	2011-04-19	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	testb $1, %al je LBB0_2 ## BB#1: ## %if.then movb $0, %al instead of: testb $1, %al jne LBB0_1 jmp LBB0_2 LBB0_1: ## %if.then movb $0, %al how 'bout that. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129749 91177308-0d34-0410-b5e6-96231b3b80d8
*	fix rdar://9297006 - fast isel bails out on trunc to i1 -> bools cry,	Chris Lattner	2011-04-19	1	-0/+17
\| \| \| \| \| \| \|	a common cause of fast isel rejects on c++ code. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129748 91177308-0d34-0410-b5e6-96231b3b80d8
*	Make tests register allocation independent again.	Jakob Stoklund Olesen	2011-04-19	4	-14/+10
\| \| \| \|	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129739 91177308-0d34-0410-b5e6-96231b3b80d8
*	Do not lose mem_operands while lowering VLD / VST intrinsics.	Evan Cheng	2011-04-19	2	-5/+7
\| \| \| \|	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129738 91177308-0d34-0410-b5e6-96231b3b80d8
*	Remove test to check line numbers. There are other numerous tests in our ↵	Devang Patel	2011-04-18	1	-27/+0
\| \| \| \| \| \|	test harness to check line number information. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129725 91177308-0d34-0410-b5e6-96231b3b80d8
*	Fix a bug where we were counting the alias sets as completely used	Eric Christopher	2011-04-18	1	-0/+15
\| \| \| \| \| \| \| \| \| \| \|	registers for fast allocation a different way. This has us updating used registers only when we're using that exact register. Fixes rdar://9207598 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129711 91177308-0d34-0410-b5e6-96231b3b80d8
*	while we're at it, handle 'sdiv exact' of a power of 2 also,	Chris Lattner	2011-04-18	1	-0/+8
\| \| \| \| \| \| \|	this fixes a few rejects on c++ iterator loops. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129694 91177308-0d34-0410-b5e6-96231b3b80d8
*	fix rdar://9297011 - udiv by power of two causing fast-isel rejects	Chris Lattner	2011-04-18	1	-1/+9
\| \| \| \|	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129693 91177308-0d34-0410-b5e6-96231b3b80d8
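The log entry does not say how the fix works. A common way to handle an unsigned divide by a power of two is to select a logical right shift by log2 of the divisor, and the sketch below only illustrates that equivalence under that assumption; the helper names are hypothetical.

```cpp
#include <cstdint>

// True when v has exactly one bit set.
constexpr bool isPowerOfTwo(uint32_t v) {
    return v != 0 && (v & (v - 1)) == 0;
}

// Position of the single set bit. Precondition: isPowerOfTwo(v).
constexpr unsigned log2Exact(uint32_t v) {
    return v <= 1 ? 0 : 1 + log2Exact(v >> 1);
}

// Unsigned division by a power of two expressed as a right shift.
// Precondition: isPowerOfTwo(divisor).
constexpr uint32_t udivByPow2(uint32_t x, uint32_t divisor) {
    return x >> log2Exact(divisor);
}

static_assert(isPowerOfTwo(8) && !isPowerOfTwo(12), "power-of-two check");
static_assert(log2Exact(8) == 3, "8 is 1 << 3");
static_assert(udivByPow2(100, 8) == 100u / 8u, "shift matches division");
static_assert(udivByPow2(0xFFFFFFFFu, 16) == 0xFFFFFFFFu / 16u,
              "holds for the largest 32-bit value");
```

The equivalence holds only for unsigned values; signed division rounds toward zero and needs extra care unless the division is known to be exact.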
*	Implement major new fastisel functionality: the matcher can now handle ↵	Chris Lattner	2011-04-18	1	-0/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	immediates with value constraints on them (when defined as ImmLeaf's). This is particularly important for X86-64, where almost all reg/imm instructions take a i64immSExt32 immediate operand, which has a value constraint. Before this patch we ended up iseling the examples into such amazing code as: movabsq $7, %rax imulq %rax, %rdi movq %rdi, %rax ret now we produce: imulq $7, %rdi, %rax ret This dramatically shrinks the generated code at -O0 on x86-64. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129691 91177308-0d34-0410-b5e6-96231b3b80d8
*	relax this test to just check that the lock prefix is encoded properly,	Chris Lattner	2011-04-18	1	-2/+1
\| \| \| \| \| \| \|	and to not rely on the register allocator's arbitrary operand choices. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129690 91177308-0d34-0410-b5e6-96231b3b80d8
*	1. merge fast-isel-shift-imm.ll into fast-isel-x86-64.ll	Chris Lattner	2011-04-17	3	-12/+36
\| \| \| \| \| \| \| \| \| \| \| \|	2. implement rdar://9289501 - fast isel should fold trivial multiplies to shifts 3. teach tblgen to handle shift immediates that are different sizes than the shifted operands, eliminating some code from the X86 fast isel backend. 4. Have FastISel::SelectBinaryOp use (the poorly named) FastEmit_ri_ function instead of FastEmit_ri to simplify code. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129666 91177308-0d34-0410-b5e6-96231b3b80d8
*	fix an x86 fast isel issue where we'd completely give up on folding an address	Chris Lattner	2011-04-17	1	-4/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	when we have a global variable base and an index. Instead, just give up on folding the global variable. Before we'd generate: _test: ## @test ## BB#0: movq _rtx_length@GOTPCREL(%rip), %rax leaq (%rax), %rax addq %rdi, %rax movzbl (%rax), %eax ret now we generate: _test: ## @test ## BB#0: movq _rtx_length@GOTPCREL(%rip), %rax movzbl (%rax,%rdi), %eax ret The difference is even more significant when there is a scale involved. This fixes rdar://9289558 - total fail with addr mode formation at -O0/x86-64 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129664 91177308-0d34-0410-b5e6-96231b3b80d8
*	fix an oversight which caused us to compile the testcase (and other	Chris Lattner	2011-04-17	1	-0/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	less trivial things) into a dummy lea. Before we generated: _test: ## @test movq _G@GOTPCREL(%rip), %rax leaq (%rax), %rax ret now we produce: _test: ## @test movq _G@GOTPCREL(%rip), %rax ret This is part of rdar://9289558 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129662 91177308-0d34-0410-b5e6-96231b3b80d8
*	Fix rdar://9289512 - not folding load into compare at -O0	Chris Lattner	2011-04-17	1	-1/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The basic issue here is that bottom-up isel is matching the branch and compare, and was failing to fold the load into the branch/compare combo. Fixing this (by allowing folding into any instruction of a sequence that is selected) allows us to produce things like: cmpb $0, 52(%rax) je LBB4_2 instead of: movb 52(%rax), %cl cmpb $0, %cl je LBB4_2 This makes the generated -O0 code run a bit faster, but also speeds up compile time by putting less pressure on the register allocator and generating less code. This was one of the biggest classes of missing load folding. Implementing this shrinks 176.gcc's c-decl.s (as a random example) by about 4% in (verbose-asm) line count. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129656 91177308-0d34-0410-b5e6-96231b3b80d8
*	Remove working entry from README.	Eli Friedman	2011-04-17	1	-1/+1
\| \| \| \|	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129654 91177308-0d34-0410-b5e6-96231b3b80d8
*	fix rdar://9289583 - fast isel should handle non-canonical commutative binops	Chris Lattner	2011-04-17	1	-0/+14
\| \| \| \| \| \| \| \| \| \| \| \|	allowing us to fold the immediate into the 'and' in this case: int test1(int i) { return 8&i; } git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129653 91177308-0d34-0410-b5e6-96231b3b80d8
*	PR9055: extend the fix to PR4050 (r70179) to apply to zext and anyext.	Eli Friedman	2011-04-16	1	-0/+23
\| \| \| \| \| \| \| \| \|	Returning a new node makes the code try to replace the old node, which in the included testcase is killed by CSE. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129650 91177308-0d34-0410-b5e6-96231b3b80d8
*	Add test cases for Jay's r129641 and fix a 32-bit-centric testcase in a file ↵	Frits van Bommel	2011-04-16	1	-5/+81
\| \| \| \| \| \|	with a 64-bit datalayout. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129643 91177308-0d34-0410-b5e6-96231b3b80d8
*	Fix divmod libcall lowering. Convert to {S\|U}DIVREM first and then expand ↵	Evan Cheng	2011-04-16	1	-0/+31
\| \| \| \| \| \|	the node to a libcall. rdar://9280991 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129633 91177308-0d34-0410-b5e6-96231b3b80d8
*	Thumb2 BFC was insufficiently encoded.	Johnny Chen	2011-04-15	1	-0/+3
\| \| \| \| \| \| \|	rdar://problem/9292717 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129619 91177308-0d34-0410-b5e6-96231b3b80d8
*	A8.6.315 VLD3 (single 3-element structure to all lanes)	Johnny Chen	2011-04-15	1	-0/+11
\| \| \| \| \| \| \| \| \|	The a bit must be encoded as 0. rdar://problem/9292625 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129618 91177308-0d34-0410-b5e6-96231b3b80d8
*	Re-enable test o32_cc_vararg.ll.	Akira Hatanaka	2011-04-15	1	-3/+0
\| \| \| \|	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129616 91177308-0d34-0410-b5e6-96231b3b80d8