path: root/lib/Transforms
Commit message  Author  Age  Files  Lines
...
* Clean up some of this code a tiny bit, no functionality change.Nick Lewycky2013-07-181-13/+8
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186622 91177308-0d34-0410-b5e6-96231b3b80d8
* Revert "Remove DIBuilder cache of variable TheCU and change the few"Eric Christopher2013-07-181-3/+3
| | | | | | This reverts commit r186599 as I didn't want to commit this yet. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186601 91177308-0d34-0410-b5e6-96231b3b80d8
* Remove DIBuilder cache of variable TheCU and change the fewEric Christopher2013-07-181-3/+3
| | | | | | | uses that wanted it. Also change the interface for createCompileUnit to compensate. Fix comments that refer to TheCU as well. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186599 91177308-0d34-0410-b5e6-96231b3b80d8
* Handle constants without going through SCEV.Nadav Rotem2013-07-181-0/+6
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186593 91177308-0d34-0410-b5e6-96231b3b80d8
* SLPVectorizer: Speedup isConsecutive by manually checking GEPs with multiple ↵Nadav Rotem2013-07-181-4/+12
| | | | | | | | | | indices. This brings the compile time of the SLP-Vectorizer to about 2.5% of OPT for my testcase. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186592 91177308-0d34-0410-b5e6-96231b3b80d8
* Reapply r186316 with a fix for one bug where the code could walk off theChandler Carruth2013-07-181-1255/+976
| | | | | | | | | | | | end of a vector. This was found with ASan. I've had one other report of a crasher, but thus far been unable to reproduce the crash. It may well be fixed with this version, and if not I'd like to get more information from the build bots about what is happening. See r186316 for the full commit log for the new implementation of the SROA algorithm. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186565 91177308-0d34-0410-b5e6-96231b3b80d8
* SLPVectorizer: Speedup isConsecutive (that checks if two addresses are ↵Nadav Rotem2013-07-181-12/+31
| | | | | | consecutive in memory) by checking for additional patterns that don't need to go through SCEV. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186563 91177308-0d34-0410-b5e6-96231b3b80d8
* Add comparison operators for DIDescriptors to fix c++98 falloutEric Christopher2013-07-171-1/+1
| | | | | | | | of operator bool change. Also convert a variable in DebugIR. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186544 91177308-0d34-0410-b5e6-96231b3b80d8
* Fix a comment.Nadav Rotem2013-07-171-1/+1
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186541 91177308-0d34-0410-b5e6-96231b3b80d8
* Restore r181216, which was partially reverted in r182499.Stephen Lin2013-07-172-43/+29
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186533 91177308-0d34-0410-b5e6-96231b3b80d8
* Add a micro optimization to catch cases where the PtrA equals PtrB.Nadav Rotem2013-07-171-1/+1
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186531 91177308-0d34-0410-b5e6-96231b3b80d8
* Fix comparisons of alloca alignment in inliner mergingHal Finkel2013-07-171-3/+12
| | | | | | | | Duncan pointed out a mistake in my fix in r186425 when only one of the allocas being compared had the target-default alignment. This is essentially his suggested solution. Thanks! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186510 91177308-0d34-0410-b5e6-96231b3b80d8
* Mark a method 'const' and another 'static'.Craig Topper2013-07-171-2/+2
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186485 91177308-0d34-0410-b5e6-96231b3b80d8
* Make a few more static string pointers constant.Craig Topper2013-07-171-8/+8
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186484 91177308-0d34-0410-b5e6-96231b3b80d8
* SLPVectorizer: Accelerate the isConsecutive check by replacing the ↵Nadav Rotem2013-07-171-10/+5
| | | | | | subtraction of the two values with a simple SCEV expression that adds the offset to one of the pointers that we compare. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186479 91177308-0d34-0410-b5e6-96231b3b80d8
* flip the scev minus direction to simplify the code.Nadav Rotem2013-07-161-3/+3
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186466 91177308-0d34-0410-b5e6-96231b3b80d8
* SLPVectorizer: Improve the compile time of isConsecutive by adding a simple ↵Nadav Rotem2013-07-161-0/+18
| | | | | | | | | | constant-gep check before using SCEV. This check does not always work because not all of the GEPs use a constant offset, but it happens often enough to reduce the number of times we use SCEV. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186465 91177308-0d34-0410-b5e6-96231b3b80d8
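The constant-offset shortcut described in the isConsecutive commits can be sketched roughly as follows. This is a minimal standalone model, not the SLPVectorizer code: `Address`, `Consecutive` and `isConsecutiveFast` are hypothetical names, and the slower SCEV-based path is represented only by the `Unknown` result.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of an address: a base object plus a byte offset that is
// only sometimes a compile-time constant.
struct Address {
  int baseId;              // identifies the underlying base pointer
  bool hasConstOffset;     // true if the offset below is a known constant
  std::int64_t constOffset;
};

enum class Consecutive { Yes, No, Unknown };

// Cheap check: answers only when both offsets are constants off the same
// base. Unknown means the caller must fall back to the general analysis.
Consecutive isConsecutiveFast(const Address &a, const Address &b,
                              std::int64_t elemSize) {
  assert(elemSize > 0 && "element size must be positive");
  if (a.baseId != b.baseId || !a.hasConstOffset || !b.hasConstOffset)
    return Consecutive::Unknown;
  return (b.constOffset - a.constOffset == elemSize) ? Consecutive::Yes
                                                     : Consecutive::No;
}
```

The three-valued result mirrors the commit's remark that the check "does not always work": it only reduces how often the expensive path is taken.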
* Add a wrapper for open.Rafael Espindola2013-07-161-1/+1
| | | | | | | This centralizes the handling of O_BINARY and opens the way for hiding more differences (like how open behaves with directories). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186447 91177308-0d34-0410-b5e6-96231b3b80d8
* Make SpecialCaseList match full strings, as documented, using anchors.Peter Collingbourne2013-07-161-1/+1
| | | | | | Differential Revision: http://llvm-reviews.chandlerc.com/D1149 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186431 91177308-0d34-0410-b5e6-96231b3b80d8
* When the inliner merges allocas, it must keep the larger alignmentHal Finkel2013-07-161-2/+16
|
| For safety, the inliner cannot decrease the alignment on an alloca when
| merging it with another.
|
| I've included two variants of the test case for this: one with DataLayout
| available, and one without. When DataLayout is not available, if only one of
| the allocas uses the default alignment (getAlignment() == 0), then they cannot
| be safely merged.
|
| git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186425 91177308-0d34-0410-b5e6-96231b3b80d8
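The alignment rule in this commit (together with the follow-up fix in r186510) can be illustrated with a small standalone sketch. It is not the inliner's code: `MergeResult`, `mergeAllocaAlignment`, `haveDataLayout` and `defaultAlign` are hypothetical stand-ins, with an alignment of 0 meaning "target default" as in the commit message.

```cpp
#include <algorithm>
#include <cassert>

struct MergeResult {
  bool canMerge;
  unsigned alignment;  // alignment to keep; 0 still means "target default"
};

// a, b: alignments of the two allocas, 0 meaning "target default".
// haveDataLayout / defaultAlign: hypothetical stand-ins for a DataLayout query.
MergeResult mergeAllocaAlignment(unsigned a, unsigned b, bool haveDataLayout,
                                 unsigned defaultAlign) {
  if (a == b)
    return {true, a};
  if (a == 0 || b == 0) {
    // Exactly one side uses the default: without DataLayout we cannot tell
    // which alignment is larger, so merging is refused.
    if (!haveDataLayout)
      return {false, 0};
    assert(defaultAlign > 0 && "a known default alignment is required");
    if (a == 0) a = defaultAlign;
    if (b == 0) b = defaultAlign;
  }
  // Never decrease alignment: keep the larger of the two.
  return {true, std::max(a, b)};
}
```

For instance, `mergeAllocaAlignment(0, 16, false, 0)` refuses the merge, while the same pair with a known default of 8 keeps alignment 16.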
* SLPVectorizer: Reduce the compile time of the consecutive store lookup.Nadav Rotem2013-07-161-5/+13
| | | | | | | | Process groups of stores in chunks of 16. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186420 91177308-0d34-0410-b5e6-96231b3b80d8
* Add 'const' qualifiers to static const char* variables.Craig Topper2013-07-162-20/+21
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186371 91177308-0d34-0410-b5e6-96231b3b80d8
* PR16628: Fix a bug in the code that merges compares.Nadav Rotem2013-07-151-1/+3
| | | | | | | | Compares return i1 but they compare different types. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186359 91177308-0d34-0410-b5e6-96231b3b80d8
* Remove trailing whitespaceStephen Lin2013-07-151-36/+36
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186333 91177308-0d34-0410-b5e6-96231b3b80d8
* Revert r186316 while I track down an ASan failure and an assert fromChandler Carruth2013-07-151-972/+1255
| | | | | | | | | | | a bot. This reverts the commit which introduced a new implementation of the fancy SROA pass designed to reduce its overhead. I'll skip the huge commit log here, refer to r186316 if you're looking for how this all works and why it works that way. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186332 91177308-0d34-0410-b5e6-96231b3b80d8
* Reimplement SROA yet again. Same fundamental principle, but a totallyChandler Carruth2013-07-151-1255/+972
| different core implementation strategy.
|
| Previously, SROA would build a relatively elaborate partitioning of an
| alloca, associate uses with each partition, and then rewrite the uses of
| each partition in an attempt to break apart the alloca into chunks that
| could be promoted. This was very wasteful in terms of memory and compile
| time because regardless of how complex the alloca or how much we're able
| to do in breaking it up, all of the datastructure work to analyze the
| partitioning was done up front.
|
| The new implementation attempts to form partitions of the alloca lazily
| and on the fly, rewriting the uses that make up that partition as it
| goes. This has a few significant effects:
| 1) Much simpler data structures are used throughout.
| 2) No more double walk of the recursive use graph of the alloca, only
|    walk it once.
| 3) No more complex algorithms for associating a particular use with
|    a particular partition.
| 4) PHI and Select speculation is simplified and happens lazily.
| 5) More precise information is available about a specific use of the
|    alloca, removing the need for some side datastructures.
|
| Ultimately, I think this is a much better implementation. It removes
| about 300 lines of code, but arguably removes more like 500 considering
| that some code grew in the process of being factored apart and cleaned
| up for this all to work.
|
| I've re-used as much of the old implementation as possible, which
| includes the lion's share of code in the form of the rewriting logic.
| The interesting new logic centers around how the uses of a partition are
| sorted, and split into actual partitions.
|
| Each instruction using a pointer derived from the alloca gets
| a 'Partition' entry. This name is totally wrong, but I'll do a rename in
| a follow-up commit as there is already enough churn here. The entry
| describes the offset range accessed and the nature of the access. Once
| we have all of these entries we sort them in a very specific way:
| increasing order of begin offset, followed by whether they are
| splittable uses (memcpy, etc), followed by the end offset or whatever.
| Sorting by splittability is important as it simplifies the collection of
| uses into a partition.
|
| Once we have these uses sorted, we walk from the beginning to the end
| building up a range of uses that form a partition of the alloca.
| Overlapping unsplittable uses are merged into a single partition while
| splittable uses are broken apart and carried from one partition to the
| next. A partition is also introduced to bridge splittable uses between
| the unsplittable regions when necessary.
|
| I've looked at the performance PRs fairly closely. PR15471 no longer
| will even load (the module is invalid). Not sure what is up there.
| PR15412 improves by between 5% and 10%, however it is nearly impossible
| to know what is holding it up as SROA (the entire pass) takes less time
| than reading the IR for that test case. The analysis takes the same time
| as running mem2reg on the final allocas. I suspect (without much
| evidence) that the new implementation will scale much better however,
| and it is just the small nature of the test cases that makes the changes
| small and noisy. Either way, it is still simpler and cleaner I think.
|
| git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186316 91177308-0d34-0410-b5e6-96231b3b80d8
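The sort-then-walk scheme described in this message can be sketched in a simplified, standalone form. This is not the SROA implementation: `Use`, `Range`, `useLess` and `formPartitions` are hypothetical names, and the tie-break directions (unsplittable before splittable, longer range first) are assumptions, since the message only says "followed by the end offset or whatever". The sketch computes partition ranges only and does no rewriting.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Use { std::uint64_t begin, end; bool splittable; };  // [begin, end)
struct Range { std::uint64_t begin, end; };

// Begin offset first, then splittability, then end offset (assumed order).
bool useLess(const Use &l, const Use &r) {
  if (l.begin != r.begin) return l.begin < r.begin;
  if (l.splittable != r.splittable) return !l.splittable;
  return l.end > r.end;
}

std::vector<Range> formPartitions(std::vector<Use> uses) {
  for (const Use &u : uses) assert(u.begin < u.end && "empty use range");
  std::sort(uses.begin(), uses.end(), useLess);
  std::vector<Range> parts;
  std::uint64_t tailEnd = 0;  // how far carried splittable uses reach
  std::size_t i = 0;
  while (i < uses.size()) {
    std::uint64_t prevEnd = parts.empty() ? 0 : parts.back().end;
    // Bridge a gap that is covered only by carried splittable uses.
    if (tailEnd > prevEnd && uses[i].begin > prevEnd) {
      parts.push_back({prevEnd, std::min(tailEnd, uses[i].begin)});
      continue;
    }
    Range p{uses[i].begin, uses[i].end};
    if (!uses[i].splittable) {
      // Merge overlapping unsplittable uses; splittable ones are carried.
      for (++i; i < uses.size() && uses[i].begin < p.end; ++i) {
        if (uses[i].splittable) tailEnd = std::max(tailEnd, uses[i].end);
        else p.end = std::max(p.end, uses[i].end);
      }
    } else {
      // Splittable run: cut it off where the next unsplittable use begins.
      tailEnd = std::max(tailEnd, uses[i].end);
      for (++i; i < uses.size() && uses[i].begin < p.end; ++i) {
        if (!uses[i].splittable) { p.end = uses[i].begin; break; }
        p.end = std::max(p.end, uses[i].end);
        tailEnd = std::max(tailEnd, uses[i].end);
      }
    }
    parts.push_back(p);
  }
  // Whatever the carried splittable uses still cover forms a last partition.
  std::uint64_t prevEnd = parts.empty() ? 0 : parts.back().end;
  if (tailEnd > prevEnd) parts.push_back({prevEnd, tailEnd});
  return parts;
}
```

With a splittable use over [0,16) and unsplittable uses over [4,8) and [6,10), the sketch yields the ranges [0,4), [4,10) and [10,16): the two overlapping unsplittable uses merge, and the splittable use is split around them.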
* Add 'const' qualifier to some arrays.Craig Topper2013-07-151-1/+1
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186312 91177308-0d34-0410-b5e6-96231b3b80d8
* Use llvm::array_lengthof to replace sizeof(array)/sizeof(array[0]).Craig Topper2013-07-151-1/+2
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186301 91177308-0d34-0410-b5e6-96231b3b80d8
* SLPVectorizer: change the order in which we search for vectorization ↵Nadav Rotem2013-07-141-4/+4
| | | | | | candidates. Do stores first and PHIs second. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186277 91177308-0d34-0410-b5e6-96231b3b80d8
* Use SmallVectorImpl& instead of SmallVector to avoid repeating small vector ↵Craig Topper2013-07-149-52/+57
| | | | | | size. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186274 91177308-0d34-0410-b5e6-96231b3b80d8
* LoopVectorizer: Disallow reductions whose header phi is used outside the loopArnold Schwaighofer2013-07-131-2/+6
|
| If an outside loop user of the reduction value uses the header phi node we
| cannot just reduce the vectorized phi value in the vector code epilog because
| we would lose VF-1 reductions.
|
| lp:
|   p = phi (0, lv)
|   lv = lv + 1
|   ...
|   brcond , lp, outside
|
| outside:
|   usr = add 0, p
|
| (Say the loop iterates two times, the value of p coming out of the loop is one).
|
| We cannot just transform this to:
|
| vlp:
|   p = phi (<0,0>, lv)
|   lv = lv + <1,1>
|   ..
|   brcond , lp, outside
|
| outside:
|   p_reduced = p[0] + p[1];
|   usr = add 0, p_reduced
|
| (Because the original loop iterated two times the vectorized loop would iterate
| one time, but p_reduced ends up being zero instead of one).
|
| We would have to execute VF-1 iterations in the scalar remainder loop in such
| cases. For now, just disable vectorization.
|
| PR16522
|
| git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186256 91177308-0d34-0410-b5e6-96231b3b80d8
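The mismatch described in this message can be reproduced with a small simulation. It is a sketch under assumptions: the update is read as `lv = p + 1` (the value flows through the phi), the vector factor is fixed at 2, and `LoopExit`, `runScalar` and `runNaiveVector2` are hypothetical names, not LoopVectorizer code.

```cpp
#include <cassert>

struct LoopExit { int p; int lv; };  // values visible after the loop

// Scalar loop: p is the header phi, lv the updated reduction value.
LoopExit runScalar(int trips) {
  assert(trips > 0);
  int p = 0, lv = 0, next = 0;  // next: value flowing around the back edge
  for (int i = 0; i < trips; ++i) {
    p = next;
    lv = p + 1;
    next = lv;
  }
  return {p, lv};
}

// Naive VF=2 version: each lane runs half of the iterations, and both the
// phi and the updated value are summed across lanes after the loop.
LoopExit runNaiveVector2(int trips) {
  assert(trips > 0 && trips % 2 == 0);
  int p[2] = {0, 0}, lv[2] = {0, 0}, next[2] = {0, 0};
  for (int i = 0; i < trips; i += 2) {
    for (int lane = 0; lane < 2; ++lane) {
      p[lane] = next[lane];
      lv[lane] = p[lane] + 1;
      next[lane] = lv[lane];
    }
  }
  return {p[0] + p[1], lv[0] + lv[1]};
}
```

For two iterations, the scalar loop leaves `p == 1`, while the reduced vector phi gives 0; the reduced `lv` is 2 in both versions, which is why only outside uses of the header phi are a problem.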
* LoopVectorize fix: LoopInfo must be valid when invoking utils like SCEVExpander.Andrew Trick2013-07-131-18/+18
|
| In general, one should always complete CFG modifications first, update
| CFG-based analyses, like Dominators and LoopInfo, then generate
| instruction sequences.
|
| LoopVectorizer was creating a new loop, calling SCEVExpander to generate
| checks, then updating LoopInfo. I just changed the order.
|
| git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186241 91177308-0d34-0410-b5e6-96231b3b80d8
* Add a microoptimization for urem.Nick Lewycky2013-07-131-0/+7
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186235 91177308-0d34-0410-b5e6-96231b3b80d8
* Fix a crash in EvaluateInDifferentElementOrder where it would generate anJoey Gouly2013-07-121-1/+3
| | | | | | | | | undef vector of the wrong type. LGTM'd by Nick Lewycky on IRC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186224 91177308-0d34-0410-b5e6-96231b3b80d8
* LFTR improvement to avoid truncation.Andrew Trick2013-07-121-6/+32
|
| This is a reimplementation of the patch originally in r186107.
|
| git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186215 91177308-0d34-0410-b5e6-96231b3b80d8
* Cleanup LFTR logic.Andrew Trick2013-07-121-28/+9
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186214 91177308-0d34-0410-b5e6-96231b3b80d8
* Cleanup: rename a variable to make the logic easier to follow.Andrew Trick2013-07-121-7/+7
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186213 91177308-0d34-0410-b5e6-96231b3b80d8
* TargetTransformInfo: address calculation parameter for gather/scatterArnold Schwaighofer2013-07-121-1/+56
|
| Address calculation for gather/scatter in vectorized code can incur a
| significant cost making vectorization unbeneficial. Add infrastructure to add
| cost.
| Tests and cost model for targets will be in follow-up commits.
|
| radar://14351991
|
| git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186187 91177308-0d34-0410-b5e6-96231b3b80d8
* Revert "indvars: Improve LFTR by eliminating truncation when comparingChandler Carruth2013-07-121-23/+4
| against a constant."
|
| This reverts commit r186107. It didn't handle wrapping arithmetic in the
| loop correctly and thus caused the following C program to count from
| 0 to UINT64_MAX instead of from 0 to 255 as intended:
|
|   #include <stdio.h>
|   int main() {
|     unsigned char first = 0, last = 255;
|     do { printf("%d\n", first); } while (first++ != last);
|   }
|
| Full test case and instructions to reproduce with just the -indvars pass
| sent to the original review thread rather than to r186107's commit.
|
| git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186152 91177308-0d34-0410-b5e6-96231b3b80d8
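One hypothetical way to see why wrapping matters here: in the program above, the counter after the last iteration is 255 + 1, which is 0 in 8 bits. An exit test on a wide counter only behaves the same if the comparison still sees the wrapped low bits. The sketch below models that; it is an illustration only, not the logic of the reverted patch, and `tripsWithNarrowCompare`, `tripsWithWideCompare` and the `cap` safety bound are assumptions introduced for the example.

```cpp
#include <cassert>
#include <cstdint>

// Counts loop trips when the exit test looks at the low 8 bits of a wide
// counter. `cap` bounds the simulation so every variant terminates.
std::uint64_t tripsWithNarrowCompare(std::uint64_t cap) {
  assert(cap > 0);
  std::uint64_t iv = 0, trips = 0;
  do {
    ++trips;
    ++iv;
    // 255 + 1 wraps to 0 in 8 bits, so the loop exits after 256 trips.
  } while (static_cast<std::uint8_t>(iv) != 0 && trips < cap);
  return trips;
}

// Same loop, but the comparison is done on the full-width counter against
// the same constant 0. The wide counter does not wrap at 256.
std::uint64_t tripsWithWideCompare(std::uint64_t cap) {
  assert(cap > 0);
  std::uint64_t iv = 0, trips = 0;
  do {
    ++trips;
    ++iv;
  } while (iv != 0 && trips < cap);
  return trips;
}
```

With a cap of 1000, the narrow comparison stops after 256 trips while the wide comparison runs into the cap, which is consistent with the runaway count the revert message reports.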
* SLPVectorizer: Sink and enable CSE for ExtractElements.Nadav Rotem2013-07-121-11/+25
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186145 91177308-0d34-0410-b5e6-96231b3b80d8
* SLPVectorize: Replace the code that checks for vectorization candidates in ↵Nadav Rotem2013-07-121-25/+22
| | | | | | | | | | successor blocks with code that scans PHINodes. Before we could vectorize PHINodes scanning successors was a good way of finding candidates. Now we can vectorize the phinodes which is simpler. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186139 91177308-0d34-0410-b5e6-96231b3b80d8
* Remove an argument that we don't use anymore.Nadav Rotem2013-07-111-15/+12
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186116 91177308-0d34-0410-b5e6-96231b3b80d8
* indvars: Improve LFTR by eliminating truncation when comparing against a ↵Andrew Trick2013-07-111-4/+23
| constant.
|
| Patch by Michele Scandale!
|
| Adds special handling of the case where, during the loop exit
| condition rewriting, the exit value is a constant of bitwidth lower
| than the type of the induction variable: instead of introducing a
| trunc operation in order to match the operand types correctly, it
| converts the constant value to an equivalent constant, depending on
| the initial value of the induction variable and the trip count, in
| order to have an equivalent comparison between the induction variable
| and the new constant.
|
| git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186107 91177308-0d34-0410-b5e6-96231b3b80d8
* Don't use a potentially expensive shift if all we want is one set bit.Benjamin Kramer2013-07-112-2/+2
| | | | | | No functionality change. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186095 91177308-0d34-0410-b5e6-96231b3b80d8
* LoopVectorize: Vectorize all accesses in address space zero with unit strideArnold Schwaighofer2013-07-111-8/+16
| | | | | | | | | | | We can vectorize them because in the case where we wrap in the address space the unvectorized code would have had to access a pointer value of zero which is undefined behavior in address space zero according to the LLVM IR semantics. (Thank you Duncan, for pointing this out to me). Fixes PR16592. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186088 91177308-0d34-0410-b5e6-96231b3b80d8
* TryToSimplifyUncondBranchFromEmptyBlock was checking that any commonDuncan Sands2013-07-111-23/+147
| | | | | | | | | | | predecessors of the two blocks it is attempting to merge supply the same incoming values to any phi in the successor block. This change allows merging in the case where there is one or more incoming values that are undef. The undef values are rewritten to match the non-undef value that flows from the other edge. Patch by Mark Lacey. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186069 91177308-0d34-0410-b5e6-96231b3b80d8
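The relaxed compatibility check described above can be sketched as a small standalone function. `Incoming` and `mergeIncoming` are hypothetical names that model a phi's incoming value as either undef or an integer id; this is not the SimplifyCFG code.

```cpp
// Hypothetical model of a phi incoming value: undef, or a concrete value id.
struct Incoming {
  bool isUndef;
  int value;  // meaningful only when isUndef is false
};

// Two values arriving from a common predecessor are compatible if they are
// equal or if at least one is undef. On success, `out` holds the value both
// edges should use: the non-undef one when there is one.
bool mergeIncoming(const Incoming &a, const Incoming &b, Incoming &out) {
  if (a.isUndef) { out = b; return true; }
  if (b.isUndef) { out = a; return true; }
  if (a.value != b.value) return false;  // genuinely conflicting values
  out = a;
  return true;
}
```

Before the change, only the equal-values case would have allowed the merge; the undef cases are what the commit adds.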
* Fix a warning.Nadav Rotem2013-07-111-2/+1
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186064 91177308-0d34-0410-b5e6-96231b3b80d8
* SLPVectorizer: refactor the code that places extracts. Place the code that ↵Nadav Rotem2013-07-111-41/+131
| | | | | | decides where to put extracts in the build-tree phase. This allows us to take the cost of the extracts into account. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186058 91177308-0d34-0410-b5e6-96231b3b80d8
* Teach TailRecursionElimination to handle certain cases of nocapture escaping ↵Michael Gottesman2013-07-111-64/+85
| allocas.
|
| Without the changes introduced into this patch, if TRE saw any allocas at all,
| TRE would not perform TRE *or* mark callsites with the tail marker.
|
| Because TRE runs after mem2reg, this inadequacy is not a death sentence. But
| given a callsite A without escaping alloca argument, A may not be able to have
| the tail marker placed on it due to a separate callsite B having a write-back
| parameter passed in via an argument with the nocapture attribute.
|
| Assume that B is the only other callsite besides A and B only has nocapture
| escaping alloca arguments (*NOTE* B may have other arguments that are not passed
| allocas). In this case not marking A with the tail marker is unnecessarily
| conservative since:
|
|   1. By assumption A has no escaping alloca arguments itself so it can not
|      access the caller's stack via its arguments.
|
|   2. Since all of B's escaping alloca arguments are passed as parameters with
|      the nocapture attribute, we know that B does not stash said escaping
|      allocas in a manner that outlives B itself and thus could be accessed
|      indirectly by A.
|
| With the changes introduced by this patch:
|
|   1. If we see any escaping allocas passed as a capturing argument, we do
|      nothing and bail early.
|
|   2. If we do not see any escaping allocas passed as captured arguments but we
|      do see escaping allocas passed as nocapture arguments:
|
|        i. We do not perform TRE to avoid PR962 since the code generator produces
|           significantly worse code for the dynamic allocas that would be created
|           by the TRE algorithm.
|
|        ii. If we do not return twice, mark call sites without escaping allocas
|            with the tail marker. *NOTE* This excludes functions with escaping
|            nocapture allocas.
|
|   3. If we do not see any escaping allocas at all (whether captured or not):
|
|        i. If we do not have usage of setjmp, mark all callsites with the tail
|           marker.
|
|        ii. If there are no dynamic/variable sized allocas in the function,
|            attempt to perform TRE on all callsites in the function.
|
| Based off of a patch by Nick Lewycky.
|
| rdar://14324281.
|
| git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186057 91177308-0d34-0410-b5e6-96231b3b80d8
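The three cases listed in this message can be condensed into a small decision function. This is a sketch of the rules as stated, not the TailRecursionElimination pass: `FunctionFacts`, `TailMarking`, `Decision` and `decide` are hypothetical names, and "return twice" and "usage of setjmp" are treated as the same flag, which is an assumption.

```cpp
// Hypothetical summary of what the analysis saw in one function.
struct FunctionFacts {
  bool capturedEscapingAlloca;   // an alloca escapes via a capturing argument
  bool nocaptureEscapingAlloca;  // an alloca escapes via a nocapture argument
  bool callsReturnsTwice;        // setjmp-like call present (assumed one flag)
  bool hasDynamicAlloca;         // dynamic or variable sized alloca present
};

enum class TailMarking { None, CallsWithoutEscapingAllocas, AllCalls };

struct Decision {
  TailMarking marking;
  bool attemptTRE;
};

Decision decide(const FunctionFacts &f) {
  // Case 1: a captured escaping alloca means do nothing and bail early.
  if (f.capturedEscapingAlloca)
    return {TailMarking::None, false};
  // Case 2: only nocapture escapes. No TRE; mark the unaffected call sites
  // unless the function can return twice.
  if (f.nocaptureEscapingAlloca)
    return {f.callsReturnsTwice ? TailMarking::None
                                : TailMarking::CallsWithoutEscapingAllocas,
            false};
  // Case 3: no escaping allocas at all. The two sub-rules are applied
  // independently, as listed in the message.
  return {f.callsReturnsTwice ? TailMarking::None : TailMarking::AllCalls,
          !f.hasDynamicAlloca};
}
```

The sketch takes rule 3.ii literally, so a function that uses setjmp but has no dynamic allocas still reports `attemptTRE`; whether the real pass adds further conditions there is not stated in the message.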
* [objc-arc] Changed 'mode: c++' => 'C++' at Nick Lewycky's suggestion. Also ↵Michael Gottesman2013-07-107-7/+7
| | | | | | removed unnecessary mode: c++ lines from .cpp files. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186026 91177308-0d34-0410-b5e6-96231b3b80d8