-rw-r--r-- | docs/Bugpoint.html | 316 |
-rw-r--r-- | docs/Bugpoint.rst | 218 |
-rw-r--r-- | docs/subsystems.rst | 3 |
3 files changed, 220 insertions, 317 deletions
diff --git a/docs/Bugpoint.html b/docs/Bugpoint.html deleted file mode 100644 index 31c35f0..0000000 --- a/docs/Bugpoint.html +++ /dev/null @@ -1,316 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" - "http://www.w3.org/TR/html4/strict.dtd"> -<html> -<head> - <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> - <title>LLVM bugpoint tool: design and usage</title> - <link rel="stylesheet" href="_static/llvm.css" type="text/css"> -</head> - -<h1> - LLVM bugpoint tool: design and usage -</h1> - -<ul> - <li><a href="#desc">Description</a></li> - <li><a href="#design">Design Philosophy</a> - <ul> - <li><a href="#autoselect">Automatic Debugger Selection</a></li> - <li><a href="#crashdebug">Crash debugger</a></li> - <li><a href="#codegendebug">Code generator debugger</a></li> - <li><a href="#miscompilationdebug">Miscompilation debugger</a></li> - </ul></li> - <li><a href="#advice">Advice for using <tt>bugpoint</tt></a></li> - <li><a href="#notEnough">What to do when <tt>bugpoint</tt> isn't enough</a></li> -</ul> - -<div class="doc_author"> -<p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p> -</div> - -<!-- *********************************************************************** --> -<h2> -<a name="desc">Description</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p><tt>bugpoint</tt> narrows down the source of problems in LLVM tools and -passes. It can be used to debug three types of failures: optimizer crashes, -miscompilations by optimizers, or bad native code generation (including problems -in the static and JIT compilers). It aims to reduce large test cases to small, -useful ones. 
For example, if <tt>opt</tt> crashes while optimizing a -file, it will identify the optimization (or combination of optimizations) that -causes the crash, and reduce the file down to a small example which triggers the -crash.</p> - -<p>For detailed case scenarios, such as debugging <tt>opt</tt>, or one of the -LLVM code generators, see <a href="HowToSubmitABug.html">How To Submit a Bug -Report document</a>.</p> - -</div> - -<!-- *********************************************************************** --> -<h2> -<a name="design">Design Philosophy</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p><tt>bugpoint</tt> is designed to be a useful tool without requiring any -hooks into the LLVM infrastructure at all. It works with any and all LLVM -passes and code generators, and does not need to "know" how they work. Because -of this, it may appear to do stupid things or miss obvious -simplifications. <tt>bugpoint</tt> is also designed to trade off programmer -time for computer time in the compiler-debugging process; consequently, it may -take a long period of (unattended) time to reduce a test case, but we feel it -is still worth it. Note that <tt>bugpoint</tt> is generally very quick unless -debugging a miscompilation where each test of the program (which requires -executing it) takes a long time.</p> - -<!-- ======================================================================= --> -<h3> - <a name="autoselect">Automatic Debugger Selection</a> -</h3> - -<div> - -<p><tt>bugpoint</tt> reads each <tt>.bc</tt> or <tt>.ll</tt> file specified on -the command line and links them together into a single module, called the test -program. If any LLVM passes are specified on the command line, it runs these -passes on the test program. 
If any of the passes crash, or if they produce -malformed output (which causes the verifier to abort), <tt>bugpoint</tt> starts -the <a href="#crashdebug">crash debugger</a>.</p> - -<p>Otherwise, if the <tt>-output</tt> option was not specified, -<tt>bugpoint</tt> runs the test program with the "safe" backend (which is assumed to -generate good code) to generate a reference output. Once <tt>bugpoint</tt> has -a reference output for the test program, it tries executing it with the -selected code generator. If the selected code generator crashes, -<tt>bugpoint</tt> starts the <a href="#crashdebug">crash debugger</a> on the -code generator. Otherwise, if the resulting output differs from the reference -output, it assumes the difference resulted from a code generator failure, and -starts the <a href="#codegendebug">code generator debugger</a>.</p> - -<p>Finally, if the output of the selected code generator matches the reference -output, <tt>bugpoint</tt> runs the test program after all of the LLVM passes -have been applied to it. If its output differs from the reference output, it -assumes the difference resulted from a failure in one of the LLVM passes, and -enters the <a href="#miscompilationdebug">miscompilation debugger</a>. -Otherwise, there is no problem <tt>bugpoint</tt> can debug.</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="crashdebug">Crash debugger</a> -</h3> - -<div> - -<p>If an optimizer or code generator crashes, <tt>bugpoint</tt> will try as hard -as it can to reduce the list of passes (for optimizer crashes) and the size of -the test program. First, <tt>bugpoint</tt> figures out which combination of -optimizer passes triggers the bug. This is useful when debugging a problem -exposed by <tt>opt</tt>, for example, because it runs over 38 passes.</p> - -<p>Next, <tt>bugpoint</tt> tries removing functions from the test program, to -reduce its size. 
Usually it is able to reduce a test program to a single -function, when debugging intraprocedural optimizations. Once the number of -functions has been reduced, it attempts to delete various edges in the control -flow graph, to reduce the size of the function as much as possible. Finally, -<tt>bugpoint</tt> deletes any individual LLVM instructions whose absence does -not eliminate the failure. At the end, <tt>bugpoint</tt> should tell you what -passes crash, give you a bitcode file, and give you instructions on how to -reproduce the failure with <tt>opt</tt> or <tt>llc</tt>.</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="codegendebug">Code generator debugger</a> -</h3> - -<div> - -<p>The code generator debugger attempts to narrow down the amount of code that -is being miscompiled by the selected code generator. To do this, it takes the -test program and partitions it into two pieces: one piece which it compiles -with the "safe" backend (into a shared object), and one piece which it runs with -either the JIT or the static LLC compiler. It uses several techniques to -reduce the amount of code pushed through the LLVM code generator, to reduce the -potential scope of the problem. After it is finished, it emits two bitcode -files (called "test" [to be compiled with the code generator] and "safe" [to be -compiled with the "safe" backend], respectively), and instructions for reproducing -the problem. The code generator debugger assumes that the "safe" backend produces -good code.</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="miscompilationdebug">Miscompilation debugger</a> -</h3> - -<div> - -<p>The miscompilation debugger works similarly to the code generator debugger. 
-It works by splitting the test program into two pieces, running the -optimizations specified on one piece, linking the two pieces back together, and -then executing the result. It attempts to narrow down the list of passes to -the one (or few) which are causing the miscompilation, then reduce the portion -of the test program which is being miscompiled. The miscompilation debugger -assumes that the selected code generator is working properly.</p> - -</div> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="advice">Advice for using bugpoint</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<tt>bugpoint</tt> can be a remarkably useful tool, but it sometimes works in -non-obvious ways. Here are some hints and tips:<p> - -<ol> -<li>In the code generator and miscompilation debuggers, <tt>bugpoint</tt> only - works with programs that have deterministic output. Thus, if the program - outputs <tt>argv[0]</tt>, the date, time, or any other "random" data, - <tt>bugpoint</tt> may misinterpret differences in these data, when output, - as the result of a miscompilation. Programs should be temporarily modified - to disable outputs that are likely to vary from run to run. - -<li>In the code generator and miscompilation debuggers, debugging will go - faster if you manually modify the program or its inputs to reduce the - runtime, but still exhibit the problem. - -<li><tt>bugpoint</tt> is extremely useful when working on a new optimization: - it helps track down regressions quickly. To avoid having to relink - <tt>bugpoint</tt> every time you change your optimization however, have - <tt>bugpoint</tt> dynamically load your optimization with the - <tt>-load</tt> option. - -<li><p><tt>bugpoint</tt> can generate a lot of output and run for a long period - of time. It is often useful to capture the output of the program to file. 
- For example, in the C shell, you can run:</p> - -<div class="doc_code"> -<p><tt>bugpoint ... |& tee bugpoint.log</tt></p> -</div> - - <p>to get a copy of <tt>bugpoint</tt>'s output in the file - <tt>bugpoint.log</tt>, as well as on your terminal.</p> - -<li><tt>bugpoint</tt> cannot debug problems with the LLVM linker. If - <tt>bugpoint</tt> crashes before you see its "All input ok" message, - you might try <tt>llvm-link -v</tt> on the same set of input files. If - that also crashes, you may be experiencing a linker bug. - -<li><tt>bugpoint</tt> is useful for proactively finding bugs in LLVM. - Invoking <tt>bugpoint</tt> with the <tt>-find-bugs</tt> option will cause - the list of specified optimizations to be randomized and applied to the - program. This process will repeat until a bug is found or the user - kills <tt>bugpoint</tt>. -</ol> - -</div> -<!-- *********************************************************************** --> -<h2> - <a name="notEnough">What to do when bugpoint isn't enough</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>Sometimes, <tt>bugpoint</tt> is not enough. In particular, InstCombine and -TargetLowering both have visitor structured code with lots of potential -transformations. If the process of using bugpoint has left you with -still too much code to figure out and the problem seems -to be in instcombine, the following steps may help. These same techniques -are useful with TargetLowering as well.</p> - -<p>Turn on <tt>-debug-only=instcombine</tt> and see which transformations -within instcombine are firing by selecting out lines with "<tt>IC</tt>" -in them.</p> - -<p>At this point, you have a decision to make. Is the number -of transformations small enough to step through them using a debugger? -If so, then try that.</p> - -<p>If there are too many transformations, then a source modification -approach may be helpful. 
-In this approach, you can modify the source code of instcombine -to disable just those transformations that are being performed on your -test input and perform a binary search over the set of transformations. -One set of places to modify are the "<tt>visit*</tt>" methods of -<tt>InstCombiner</tt> (<I>e.g.</I> <tt>visitICmpInst</tt>) by adding a -"<tt>return false</tt>" as the first line of the method.</p> - -<p>If that still doesn't remove enough, then change the caller of -<tt>InstCombiner::DoOneIteration</tt>, <tt>InstCombiner::runOnFunction</tt> -to limit the number of iterations.</p> - -<p>You may also find it useful to use "<tt>-stats</tt>" now to see what parts -of instcombine are firing. This can guide where to put additional reporting -code.</p> - -<p>At this point, if the amount of transformations is still too large, then -inserting code to limit whether or not to execute the body of the code -in the visit function can be helpful. Add a static counter which is -incremented on every invocation of the function. Then add code which -simply returns false on desired ranges. For example:</p> - -<div class="doc_code"> -<p><tt>static int calledCount = 0;</tt></p> -<p><tt>calledCount++;</tt></p> -<p><tt>DEBUG(if (calledCount < 212) return false);</tt></p> -<p><tt>DEBUG(if (calledCount > 217) return false);</tt></p> -<p><tt>DEBUG(if (calledCount == 213) return false);</tt></p> -<p><tt>DEBUG(if (calledCount == 214) return false);</tt></p> -<p><tt>DEBUG(if (calledCount == 215) return false);</tt></p> -<p><tt>DEBUG(if (calledCount == 216) return false);</tt></p> -<p><tt>DEBUG(dbgs() << "visitXOR calledCount: " << calledCount - << "\n");</tt></p> -<p><tt>DEBUG(dbgs() << "I: "; I->dump());</tt></p> -</div> - -<p>could be added to <tt>visitXOR</tt> to limit <tt>visitXor</tt> to being -applied only to calls 212 and 217. 
This is from an actual test case and raises -an important point---a simple binary search may not be sufficient, as -transformations that interact may require isolating more than one call. -In TargetLowering, use <tt>return SDNode();</tt> instead of -<tt>return false;</tt>.</p> - -<p>Now that that the number of transformations is down to a manageable -number, try examining the output to see if you can figure out which -transformations are being done. If that can be figured out, then -do the usual debugging. If which code corresponds to the transformation -being performed isn't obvious, set a breakpoint after the call count -based disabling and step through the code. Alternatively, you can use -"printf" style debugging to report waypoints.</p> - -</div> - -<!-- *********************************************************************** --> - -<hr> -<address> - <a href="http://jigsaw.w3.org/css-validator/check/referer"><img - src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a> - <a href="http://validator.w3.org/check/referer"><img - src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a> - - <a href="mailto:sabre@nondot.org">Chris Lattner</a><br> - <a href="http://llvm.org/">LLVM Compiler Infrastructure</a><br> - Last modified: $Date$ -</address> - -</body> -</html> diff --git a/docs/Bugpoint.rst b/docs/Bugpoint.rst new file mode 100644 index 0000000..9ccf0cc --- /dev/null +++ b/docs/Bugpoint.rst @@ -0,0 +1,218 @@ +.. _bugpoint: + +==================================== +LLVM bugpoint tool: design and usage +==================================== + +.. contents:: + :local: + +Description +=========== + +``bugpoint`` narrows down the source of problems in LLVM tools and passes. It +can be used to debug three types of failures: optimizer crashes, miscompilations +by optimizers, or bad native code generation (including problems in the static +and JIT compilers). It aims to reduce large test cases to small, useful ones. 
+For example, if ``opt`` crashes while optimizing a file, it will identify the +optimization (or combination of optimizations) that causes the crash, and reduce +the file down to a small example which triggers the crash. + +For detailed case scenarios, such as debugging ``opt``, or one of the LLVM code +generators, see `How To Submit a Bug Report document <HowToSubmitABug.html>`_. + +Design Philosophy +================= + +``bugpoint`` is designed to be a useful tool without requiring any hooks into +the LLVM infrastructure at all. It works with any and all LLVM passes and code +generators, and does not need to "know" how they work. Because of this, it may +appear to do stupid things or miss obvious simplifications. ``bugpoint`` is +also designed to trade off programmer time for computer time in the +compiler-debugging process; consequently, it may take a long period of +(unattended) time to reduce a test case, but we feel it is still worth it. Note +that ``bugpoint`` is generally very quick unless debugging a miscompilation +where each test of the program (which requires executing it) takes a long time. + +Automatic Debugger Selection +---------------------------- + +``bugpoint`` reads each ``.bc`` or ``.ll`` file specified on the command line +and links them together into a single module, called the test program. If any +LLVM passes are specified on the command line, it runs these passes on the test +program. If any of the passes crash, or if they produce malformed output (which +causes the verifier to abort), ``bugpoint`` starts the `crash debugger`_. + +Otherwise, if the ``-output`` option was not specified, ``bugpoint`` runs the +test program with the "safe" backend (which is assumed to generate good code) to +generate a reference output. Once ``bugpoint`` has a reference output for the +test program, it tries executing it with the selected code generator. 
If the +selected code generator crashes, ``bugpoint`` starts the `crash debugger`_ on +the code generator. Otherwise, if the resulting output differs from the +reference output, it assumes the difference resulted from a code generator +failure, and starts the `code generator debugger`_. + +Finally, if the output of the selected code generator matches the reference +output, ``bugpoint`` runs the test program after all of the LLVM passes have +been applied to it. If its output differs from the reference output, it assumes +the difference resulted from a failure in one of the LLVM passes, and enters the +`miscompilation debugger`_. Otherwise, there is no problem ``bugpoint`` can +debug. + +.. _crash debugger: + +Crash debugger +-------------- + +If an optimizer or code generator crashes, ``bugpoint`` will try as hard as it +can to reduce the list of passes (for optimizer crashes) and the size of the +test program. First, ``bugpoint`` figures out which combination of optimizer +passes triggers the bug. This is useful when debugging a problem exposed by +``opt``, for example, because it runs over 38 passes. + +Next, ``bugpoint`` tries removing functions from the test program, to reduce its +size. Usually it is able to reduce a test program to a single function, when +debugging intraprocedural optimizations. Once the number of functions has been +reduced, it attempts to delete various edges in the control flow graph, to +reduce the size of the function as much as possible. Finally, ``bugpoint`` +deletes any individual LLVM instructions whose absence does not eliminate the +failure. At the end, ``bugpoint`` should tell you what passes crash, give you a +bitcode file, and give you instructions on how to reproduce the failure with +``opt`` or ``llc``. + +.. _code generator debugger: + +Code generator debugger +----------------------- + +The code generator debugger attempts to narrow down the amount of code that is +being miscompiled by the selected code generator. 
To do this, it takes the test +program and partitions it into two pieces: one piece which it compiles with the +"safe" backend (into a shared object), and one piece which it runs with either +the JIT or the static LLC compiler. It uses several techniques to reduce the +amount of code pushed through the LLVM code generator, to reduce the potential +scope of the problem. After it is finished, it emits two bitcode files (called +"test" [to be compiled with the code generator] and "safe" [to be compiled with +the "safe" backend], respectively), and instructions for reproducing the +problem. The code generator debugger assumes that the "safe" backend produces +good code. + +.. _miscompilation debugger: + +Miscompilation debugger +----------------------- + +The miscompilation debugger works similarly to the code generator debugger. It +works by splitting the test program into two pieces, running the optimizations +specified on one piece, linking the two pieces back together, and then executing +the result. It attempts to narrow down the list of passes to the one (or few) +which are causing the miscompilation, then reduce the portion of the test +program which is being miscompiled. The miscompilation debugger assumes that +the selected code generator is working properly. + +Advice for using bugpoint +========================= + +``bugpoint`` can be a remarkably useful tool, but it sometimes works in +non-obvious ways. Here are some hints and tips: + +* In the code generator and miscompilation debuggers, ``bugpoint`` only works + with programs that have deterministic output. Thus, if the program outputs + ``argv[0]``, the date, time, or any other "random" data, ``bugpoint`` may + misinterpret differences in these data, when output, as the result of a + miscompilation. Programs should be temporarily modified to disable outputs + that are likely to vary from run to run. 
+ +* In the code generator and miscompilation debuggers, debugging will go faster + if you manually modify the program or its inputs to reduce the runtime, but + still exhibit the problem. + +* ``bugpoint`` is extremely useful when working on a new optimization: it helps + track down regressions quickly. To avoid having to relink ``bugpoint`` every + time you change your optimization however, have ``bugpoint`` dynamically load + your optimization with the ``-load`` option. + +* ``bugpoint`` can generate a lot of output and run for a long period of time. + It is often useful to capture the output of the program to file. For example, + in the C shell, you can run: + + .. code-block:: bash + + bugpoint ... |& tee bugpoint.log + + to get a copy of ``bugpoint``'s output in the file ``bugpoint.log``, as well + as on your terminal. + +* ``bugpoint`` cannot debug problems with the LLVM linker. If ``bugpoint`` + crashes before you see its "All input ok" message, you might try ``llvm-link + -v`` on the same set of input files. If that also crashes, you may be + experiencing a linker bug. + +* ``bugpoint`` is useful for proactively finding bugs in LLVM. Invoking + ``bugpoint`` with the ``-find-bugs`` option will cause the list of specified + optimizations to be randomized and applied to the program. This process will + repeat until a bug is found or the user kills ``bugpoint``. + +What to do when bugpoint isn't enough +===================================== + +Sometimes, ``bugpoint`` is not enough. In particular, InstCombine and +TargetLowering both have visitor structured code with lots of potential +transformations. If the process of using bugpoint has left you with still too +much code to figure out and the problem seems to be in instcombine, the +following steps may help. These same techniques are useful with TargetLowering +as well. 
+ +Turn on ``-debug-only=instcombine`` and see which transformations within +instcombine are firing by selecting out lines with "``IC``" in them. + +At this point, you have a decision to make. Is the number of transformations +small enough to step through them using a debugger? If so, then try that. + +If there are too many transformations, then a source modification approach may +be helpful. In this approach, you can modify the source code of instcombine to +disable just those transformations that are being performed on your test input +and perform a binary search over the set of transformations. One set of places +to modify are the "``visit*``" methods of ``InstCombiner`` (*e.g.* +``visitICmpInst``) by adding a "``return false``" as the first line of the +method. + +If that still doesn't remove enough, then change the caller of +``InstCombiner::DoOneIteration``, ``InstCombiner::runOnFunction``, to limit the +number of iterations. + +You may also find it useful to use "``-stats``" now to see what parts of +instcombine are firing. This can guide where to put additional reporting code. + +At this point, if the number of transformations is still too large, then +inserting code to limit whether or not to execute the body of the code in the +visit function can be helpful. Add a static counter which is incremented on +every invocation of the function. Then add code which simply returns false on +desired ranges. For example: + +.. 
code-block:: c++ + + + static int calledCount = 0; + calledCount++; + DEBUG(if (calledCount < 212) return false); + DEBUG(if (calledCount > 217) return false); + DEBUG(if (calledCount == 213) return false); + DEBUG(if (calledCount == 214) return false); + DEBUG(if (calledCount == 215) return false); + DEBUG(if (calledCount == 216) return false); + DEBUG(dbgs() << "visitXor calledCount: " << calledCount << "\n"); + DEBUG(dbgs() << "I: "; I->dump()); + +could be added to ``visitXor`` to limit it to being applied only to +calls 212 and 217. This is from an actual test case and raises an important +point---a simple binary search may not be sufficient, as transformations that +interact may require isolating more than one call. In TargetLowering, use +``return SDNode();`` instead of ``return false;``. + +Now that the number of transformations is down to a manageable number, try +examining the output to see if you can figure out which transformations are +being done. If that can be figured out, then do the usual debugging. If it +isn't obvious which code corresponds to the transformation being performed, set +a breakpoint after the call-count-based disabling and step through the code. +Alternatively, you can use "``printf``" style debugging to report waypoints. diff --git a/docs/subsystems.rst b/docs/subsystems.rst index e643e7d..28ad020 100644 --- a/docs/subsystems.rst +++ b/docs/subsystems.rst @@ -8,6 +8,7 @@ Subsystem Documentation AliasAnalysis BranchWeightMetadata + Bugpoint LinkTimeOptimization SegmentedStacks TableGenFundamentals @@ -51,7 +52,7 @@ Subsystem Documentation This document describes the design and implementation of exception handling in LLVM. -* `Bugpoint <Bugpoint.html>`_ +* :ref:`bugpoint` Automatic bug finder and test-case reducer description and usage information.
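The call-count gating that the added Bugpoint.rst describes for the ``visit*`` methods can be tried outside LLVM. The following is a minimal standalone sketch, not LLVM code: `CallGate`, `shouldRun`, and `firedCalls` are hypothetical names introduced here for illustration, and a set of enabled call numbers stands in for the chain of `DEBUG(if ...)` range checks shown in the document.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical helper (not part of LLVM): decides, per invocation, whether
// the body of a visit-style function should be allowed to run.
struct CallGate {
  std::set<int> Enabled; // 1-based call numbers whose body may execute
  int CalledCount = 0;   // incremented on every invocation

  // Counts this invocation and reports whether it is in the enabled set.
  bool shouldRun() {
    ++CalledCount;
    return Enabled.count(CalledCount) != 0;
  }
};

// Simulates NumCalls invocations of a gated function and returns the call
// numbers whose body was allowed to execute.
std::vector<int> firedCalls(CallGate &Gate, int NumCalls) {
  assert(NumCalls >= 0 && "number of calls must be non-negative");
  std::vector<int> Fired;
  for (int I = 0; I < NumCalls; ++I)
    if (Gate.shouldRun())
      Fired.push_back(Gate.CalledCount);
  return Fired;
}
```

With `Gate.Enabled = {212, 217}`, `firedCalls(Gate, 300)` lets only calls 212 and 217 through, mirroring the example in the document. Keeping the enabled calls in a set makes it easy to isolate several interacting calls at once, which is the case the document says a simple binary search can miss.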