author | Chris Lattner <sabre@nondot.org> | 2010-04-21 06:42:24 +0000 |
---|---|---|
committer | Chris Lattner <sabre@nondot.org> | 2010-04-21 06:42:24 +0000 |
commit | a54c1f70b8bffa78316d1447756d5ba400bda895 (patch) | |
tree | dc066f47fe0b0c013f3278d1b0d12f92e9c10a01 /docs | |
parent | 450a31edde46234dff2a681006878a853efc1027 (diff) | |
final hacking for tonight, still more to go.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101995 91177308-0d34-0410-b5e6-96231b3b80d8
Diffstat (limited to 'docs')
-rw-r--r-- | docs/ReleaseNotes.html | 116 |
1 files changed, 58 insertions, 58 deletions
diff --git a/docs/ReleaseNotes.html b/docs/ReleaseNotes.html
index 9b65c6f..129a405 100644
--- a/docs/ReleaseNotes.html
+++ b/docs/ReleaseNotes.html
@@ -501,28 +501,48 @@ release includes a few major enhancements and additions to the optimizers:</p>
 <ul>
-<li>...</li>
-Inliner reuses arrays allocas when inlining multiple callers to reduce stack usage.
-Optimal Edge Profiling?
-Instcombine is now a library, has its own IRBuilder to simplify itself.
-Better code size analysis in loop unswitch, inliner code split out to a new
- CodeMetrics class for reuse.
-Many changes to the pass ordering for improved optimization effectiveness.
-BasicAA improved to be less dependent on "type safe" pointers, it can now look
- through bitcasts more aggressively.
-GVN PHI Translation improvements. blog post: http://blog.llvm.org/2009/12/advanced-topics-in-redundant-load.html
-New SCEV AA pass: -scev-aa
-Target data now has notion of 'native' integer data types which optimizations can use.
-Opt now works conservatively if no target data is set (is this fully working?)
-New Analysis/InstructionSimplify.h interface for simplifying instructions that don't exist.
-Jump threading is now much more aggressive at simplifying correlated
+<li>Inliner reuses arrays allocas when inlining multiple callers to reduce stack usage.</li>
+<li>Instcombine is now a library, has its own IRBuilder to simplify itself.</li>
+<li>Better code size analysis in loop unswitch, inliner code split out to a new
+ CodeMetrics class for reuse.</li>
+<li>Many changes to the pass ordering for improved optimization
+ effectiveness.</li>
+<li>BasicAA improved to be less dependent on "type safe" pointers, it can now look
+ through bitcasts more aggressively.</li>
+<li>GVN PHI Translation improvements. blog post: http://blog.llvm.org/2009/12/advanced-topics-in-redundant-load.html</li>
+<li>New SCEV AA pass: -scev-aa</li>
+<li>Target data now has notion of 'native' integer data types which optimizations can use.</li>
+<li>Opt now works conservatively if no target data is set (is this fully working?)</li>
+<li>New Analysis/InstructionSimplify.h interface for simplifying instructions that don't exist.</li>
+<li>Jump threading is now much more aggressive at simplifying correlated
 conditionals and threading blocks with otherwise complex logic. CondProp pass
- removed (functionality merged into jump threading).
-New SSAUpdater and MachineSSAUpdater classes for unstructured ssa updating,
+ removed (functionality merged into jump threading).</li>
+<li>New SSAUpdater and MachineSSAUpdater classes for unstructured ssa updating,
 changed jump threading, GVN, etc to use it which simplified them and speed
- them up.
+ them up.</li>
+<li>
+The Optimal Edge Profiling implementation in 2.6 was more a proof of
+concept. The current implementation (the one that will go into 2.7) is
+now stable and (as far as my tests go) bug free.
+
+The profiling with instrumentation via "opt" and analysis via the tool
+"llvm-prof" should Work As Expected (TM).
+
+Two things are missing:
+
+*) Still missing is the modification of all -std-compile-opt passes to
+update the profiling information according to the changes made to the
+CFG, I'm planning to do this after my master thesis is finished. This
+will enable all passes to use the ProfileInfo if available and base
+decisions on that information.
+
+*) GCC has the options "-pg", "-fprofile-arcs" and "--coverage" that
+insert profiling code and "-fprofile-use" to use them the next time
+during compilation. I guess this options should also work properly in
+llvm-gcc and clang?</li>
+
 </ul>
 </div>
@@ -568,25 +588,20 @@ it run faster:</p>
 <ul>
 <li>New instruction selector [blog post?].</li>
-
-Code generator MC'ized except for debug info and EH.
-
-New CodeGen Level CSE
-Combiner-AA improvements, why not on by default?
-Pre-regalloc tail duplication
-New LSR with "full strength reduction" mode. Description?
-Codegen level OptimizeExtsPass pass, takes advantage of x86 subregs.
-Support for the GCC option -fno-schedule-insns
-non-temporal load/store
-MachineSSAUpdater.h
-X86 and XCore supports returning arbitrary return values, returning too many values is
- supported by returning through a hidden pointer.
-verbose-asm now produces information about spill slots and loop nests
-GHC Haskell ABI / calling conv support.
-Many improvements to debug info
-
-
-<li>...</li>
+<li>New LSR with "full strength reduction" mode. Description?</li>
+<li>Code generator MC'ized except for debug info and EH.</li>
+<li>New CodeGen Level CSE</li>
+<li>Combiner-AA improvements, why not on by default?</li>
+<li>Pre-regalloc tail duplication</li>
+<li>Codegen level OptimizeExtsPass pass, takes advantage of x86 subregs. </li>
+<li>Support for the GCC option -fno-schedule-insns</li>
+<li>Non-temporal load/store, only implemented on X86, see LangRef.html#i_load.</li>
+<li>MachineSSAUpdater.h</li>
+<li>X86 and XCore supports returning arbitrary return values, returning too many values is
+ supported by returning through a hidden pointer.</li>
+<li>verbose-asm now produces information about spill slots and loop nests</li>
+<li>GHC Haskell ABI / calling conv support.</li>
+<li>Many improvements to debug info</li>
 </ul>
 </div>
@@ -600,10 +615,13 @@ Many improvements to debug info
 </p>
 <ul>
+<li>The X86 backend now optimizes tails calls much more aggressively for
+ functions that use the standard C calling convention.</li>
+<li>The X86 backend now models scalar SSE registers as subregs of the SSE vector
+ registers, making the code generator more aggressive in cases where scalars
+ and vector types are mixed.</li>
-<li>PostRA scheduler for X86?</li>
-<li>x86 sibcall / tailcall optimization in CCC mode.</li>
-<li>X86: XMM subreg modeling for extraction of the low element.</li>
+<li>PostRA scheduler for X86? FIXME: is this on by default in 2.7?</li>
 </ul>
@@ -642,21 +660,6 @@ href="http://blog.llvm.org/2010/04/arm-advanced-simd-neon-intrinsics-and.html">
 <!--=========================================================================-->
 <div class="doc_subsection">
-<a name="OtherTarget">Other Target Specific Improvements</a>
-</div>
-
-<div class="doc_text">
-<p>New features of other targets include:
-</p>
-
-<ul>
-<li>...</li>
-</ul>
-
-</div>
-
-<!--=========================================================================-->
-<div class="doc_subsection">
 <a name="newapis">New Useful APIs</a>
 </div>
@@ -917,9 +920,6 @@ compilation, and lacks support for debug information.</li>
 <div class="doc_text">
 <ul>
-<li>Support for the Advanced SIMD (Neon) instruction set is still incomplete
-and not well tested. Some features may not work at all, and the code quality
-may be poor in some cases.</li>
 <li>Thumb mode works only on ARMv6 or higher processors. On sub-ARMv6
 processors, thumb programs can crash or produce wrong results
 (<a href="http://llvm.org/PR1388">PR1388</a>).</li>