| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
| |
Thanks to Chandler for pointing out the problem.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158928 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
| |
and were crashing this code on 64 bits machines
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158923 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
|
|
|
|
|
|
| |
- provide more extensive set of functions to detect library allocation functions (e.g., malloc, calloc, strdup, etc)
- provide an API to compute the size and offset of an object pointed by
Move a few clients (GVN, AA, instcombine, ...) to the new API.
This implementation is a lot more aggressive than each of the custom implementations being replaced.
Patch reviewed by Nick Lewycky and Chandler Carruth, thanks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158919 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
-stable-loops enables a new algorithm for generating the Loop
forest. It differs from the original algorithm in a few respects:
- Not determined by use-list order.
- Initially guarantees RPO order of block and subloops.
- Linear in the number of CFG edges.
- Nonrecursive.
I didn't want to change the LoopInfo API yet, so the block lists are
still inclusive. This seems strange to me, and it means that building
LoopInfo is not strictly linear, but it may not be a problem in
practice. At least the block lists start out in RPO order now. In the
future we may add an attribute or wrapper analysis that allows other
passes to assume RPO order.
The primary motivation of this work was not to optimize LoopInfo, but
to allow reproducing performance issues by decomposing the compilation
stages. I'm often unable to do this with the current LoopInfo, because
the loop tree order determines Loop pass order. Serializing the IR
tends to invert the order, which reverses the optimization order. This
makes it nearly impossible to debug interdependent loop optimizations
such as LSR.
I also believe this will provide more stable performance results across time.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158790 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
|
|
|
| |
The implementation only needs inclusion from LoopInfo.cpp and
MachineLoopInfo.cpp. Clients of the interface should only include the
interface. This makes the interface readable and speeds up rebuilds
after modifying the implementation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158787 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
|
|
| |
LLVM is now -Wunused-private-field clean except for
- lib/MC/MCDisassembler/Disassembler.h. Not sure why it keeps all those unaccessible fields.
- gtest.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158096 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
| |
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157885 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
| |
Part of rdar://11570854
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157786 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
| |
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157704 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This also required making recursive simplifications until
nothing changes or a hard limit (currently 3) is hit.
With the simplification in place indvars can canonicalize
loops of the form
for (unsigned i = 0; i < a-b; ++i)
into
for (unsigned i = 0; i != a-b; ++i)
which used to fail because SCEV created a weird umax expr
for the backedge taken count.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157701 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
|
|
|
| |
If integer overflow causes one of the terms to reach zero, that can
force the entire expression to zero.
Fixes PR12929: cast<Ty>() argument of incompatible type
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157673 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
| |
No functionality.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157672 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
| |
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157377 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
| |
Part of rdar://11496790
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157303 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
| |
Fixes PR12898: SCEVExpander crash.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157263 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
|
|
| |
to generate out of the front end.
rdar://11479676
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157094 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
|
|
|
|
| |
getUDivExpr attempts to simplify by checking for overflow.
isLoopEntryGuardedByCond then evaluates the loop predicate which
may lead to the same getUDivExpr causing endless recursion.
Fixes PR12868: clang 3.2 segmentation fault.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157092 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
| |
same switch instruction by doing union of ranges (which may still be conservative, but it's more aggressive than before)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157071 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
| |
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157033 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
| |
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157024 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
| |
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157011 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
|
|
| |
getZeroExtendExpr()
this gives a speedup of > 80 in a debug build in the test case of PR12825 (php_sha512_crypt_r)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156849 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
| |
getEffectiveSCEVType() twice
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156823 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
|
|
|
|
| |
so that it can be reused in MemCpyOptimizer. This analysis is needed to remove
an unnecessary memcpy when returning a struct into a local variable.
rdar://11341081
PR12686
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156776 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
| |
intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156687 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
| |
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156589 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
| |
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156558 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
|
| |
of recursion, to avoid excessive stack usage on deep expressions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156554 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
add a new Region::block_iterator which actually iterates over the basic
blocks of the region.
The old iterator, now call 'block_node_iterator' iterates over
RegionNodes which contain a single basic block. This works well with the
GraphTraits-based iterator design, however most users actually want an
iterator over the BasicBlocks inside these RegionNodes. Now the
'block_iterator' is a wrapper which exposes exactly this interface.
Internally it uses the block_node_iterator to walk all nodes which are
single basic blocks, but transparently unwraps the basic block to make
user code simpler.
While this patch is a bit of a wash, most of the updates are to internal
users, not external users of the RegionInfo. I have an accompanying
patch to Polly that is a strict simplification of every user of this
interface, and I'm working on a pass that also wants the same simplified
interface.
This patch alone should have no functional impact.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156202 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
minor behavior changes with this, but nothing I have seen evidence of in
the wild or expect to be meaningful. The real goal is unifying our logic
and simplifying the interfaces. A summary of the changes follows:
- Make 'callIsSmall' actually accept a callsite so it can handle
intrinsics, and simplify callers appropriately.
- Nuke a completely bogus declaration of 'callIsSmall' that was still
lurking in InlineCost.h... No idea how this got missed.
- Teach the 'isInstructionFree' about the various more intelligent
'free' heuristics that got added to the inline cost analysis during
review and testing. This mostly surrounds int->ptr and ptr->int casts.
- Switch most of the interesting parts of the inline cost analysis that
were essentially computing 'is this instruction free?' to use the code
metrics routine instead. This way we won't keep duplicating logic.
All of this is motivated by the desire to allow other passes to compute
a roughly equivalent 'cost' metric for a particular basic block as the
inline cost analysis. Sadly, re-using the same analysis for both is
really messy because only the actual inline cost analysis is ever going
to go to the contortions required for simplification, SROA analysis,
etc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156140 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
|
|
| |
being done for malloc)
fix a few typos found by Chad in my previous commit
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156110 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
| |
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156102 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
|
| |
known zero in the LHS. Fixes PR12541.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155818 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
|
| |
properly with how the code handles all-undef PHI nodes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155721 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
|
|
| |
vectors"
It broke stage2 build. stage1/clang sometimes crashed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155699 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
instead of getAggregateElement. This has the advantage of being
more consistent and allowing higher-level constant folding to
procede even if an inner extract element cannot be folded.
Make ConstantFoldInstruction call ConstantFoldConstantExpression
on the instruction's operands, making it more consistent with
ConstantFoldConstantExpression itself. This makes sure that
ConstantExprs get TargetData-aware folding before being handed
off as operands for further folding.
This causes more expressions to be folded, but due to a known
shortcoming in constant folding, this currently has the side effect
of stripping a few more nuw and inbounds flags in the non-targetdata
side of constant-fold-gep.ll. This is mostly harmless.
This fixes rdar://11324230.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155682 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
constants in C++11 mode. I have no idea why it required such particular
circumstances to get here, the code seems clearly to rely upon unchecked
assumptions.
Specifically, when we decide to form an index into a struct type, we may
have gone through (at least one) zero-length array indexing round, which
would have left the offset un-adjusted, and thus not necessarily valid
for use when indexing the struct type.
This is just an canonicalization step, so the correct thing is to refuse
to canonicalize nonsensical GEPs of this form. Implemented, and test
case added.
Fixes PR12642. Pair debugged and coded with Richard Smith. =] I credit
him with most of the debugging, and preventing me from writing the wrong
code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155466 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
|
|
|
| |
find forward declarations in the context that the actual definition
will occur.
rdar://11291658
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155380 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
|
|
| |
has NUW but not NSW."
This isn't right either, reverting for now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154910 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
| |
Yea, 'NumCallerCallersAnalyzed' isn't a great name, suggestions welcome.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154492 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
|
|
| |
Take this opportunity to generalize the indirectbr bailout logic for
loop transformations. CFG transformations will never get indirectbr
right, and there's no point trying.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154386 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
speculate. Without this, loop rotate (among many other places) would
suddenly stop working in the presence of debug info. I found this
looking at loop rotate, and have augmented its tests with a reduction
out of a very hot loop in yacr2 where failing to do this rotation costs
sometimes more than 10% in runtime performance, perturbing numerous
downstream optimizations.
This should have no impact on performance without debug info, but the
change in performance when debug info is enabled can be extreme. As
a consequence (and this how I got to this yak) any profiling of
performance problems should be treated with deep suspicion -- they may
have been wildly innacurate of debug info was enabled for profiling. =/
Just a heads up.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154263 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
|
|
| |
but not NSW.
Found by inspection.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154262 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
|
|
|
|
| |
parameter until we have a more sensible API for doing the same thing.
Reviewed by Chandler.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154180 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
|
|
| |
This allows us to keep passing reduced masks to SimplifyDemandedBits, but
know about all the bits if SimplifyDemandedBits fails. This allows instcombine
to simplify cases like the one in the included testcase.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154011 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
|
|
|
|
| |
brace) so that we get more accurate line number information about the
declaration of a given function and the line where the function
first starts.
Part of rdar://11026482
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153916 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
|
|
| |
This is the CodeGen equivalent of r153747. I tested that there is not noticeable
performance difference with any combination of -O0/-O2 /-g when compiling
gcc as a single compilation unit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153817 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
| |
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153815 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
|
|
|
|
| |
interfaces. These methods were used in the old inline cost system where
there was a persistent cache that had to be updated, invalidated, and
cleared. We're now doing more direct computations that don't require
this intricate dance. Even if we resume some level of caching, it would
almost certainly have a simpler and more narrow interface than this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153813 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
on a per-callsite walk of the called function's instructions, in
breadth-first order over the potentially reachable set of basic blocks.
This is a major shift in how inline cost analysis works to improve the
accuracy and rationality of inlining decisions. A brief outline of the
algorithm this moves to:
- Build a simplification mapping based on the callsite arguments to the
function arguments.
- Push the entry block onto a worklist of potentially-live basic blocks.
- Pop the first block off of the *front* of the worklist (for
breadth-first ordering) and walk its instructions using a custom
InstVisitor.
- For each instruction's operands, re-map them based on the
simplification mappings available for the given callsite.
- Compute any simplification possible of the instruction after
re-mapping, and store that back int othe simplification mapping.
- Compute any bonuses, costs, or other impacts of the instruction on the
cost metric.
- When the terminator is reached, replace any conditional value in the
terminator with any simplifications from the mapping we have, and add
any successors which are not proven to be dead from these
simplifications to the worklist.
- Pop the next block off of the front of the worklist, and repeat.
- As soon as the cost of inlining exceeds the threshold for the
callsite, stop analyzing the function in order to bound cost.
The primary goal of this algorithm is to perfectly handle dead code
paths. We do not want any code in trivially dead code paths to impact
inlining decisions. The previous metric was *extremely* flawed here, and
would always subtract the average cost of two successors of
a conditional branch when it was proven to become an unconditional
branch at the callsite. There was no handling of wildly different costs
between the two successors, which would cause inlining when the path
actually taken was too large, and no inlining when the path actually
taken was trivially simple. There was also no handling of the code
*path*, only the immediate successors. These problems vanish completely
now. See the added regression tests for the shiny new features -- we
skip recursive function calls, SROA-killing instructions, and high cost
complex CFG structures when dead at the callsite being analyzed.
Switching to this algorithm required refactoring the inline cost
interface to accept the actual threshold rather than simply returning
a single cost. The resulting interface is pretty bad, and I'm planning
to do lots of interface cleanup after this patch.
Several other refactorings fell out of this, but I've tried to minimize
them for this patch. =/ There is still more cleanup that can be done
here. Please point out anything that you see in review.
I've worked really hard to try to mirror at least the spirit of all of
the previous heuristics in the new model. It's not clear that they are
all correct any more, but I wanted to minimize the change in this single
patch, it's already a bit ridiculous. One heuristic that is *not* yet
mirrored is to allow inlining of functions with a dynamic alloca *if*
the caller has a dynamic alloca. I will add this back, but I think the
most reasonable way requires changes to the inliner itself rather than
just the cost metric, and so I've deferred this for a subsequent patch.
The test case is XFAIL-ed until then.
As mentioned in the review mail, this seems to make Clang run about 1%
to 2% faster in -O0, but makes its binary size grow by just under 4%.
I've looked into the 4% growth, and it can be fixed, but requires
changes to other parts of the inliner.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153812 91177308-0d34-0410-b5e6-96231b3b80d8
|