author | Mikhail Glushenkov <foldr@codedgers.com> | 2008-11-25 21:38:12 +0000 |
---|---|---|
committer | Mikhail Glushenkov <foldr@codedgers.com> | 2008-11-25 21:38:12 +0000 |
commit | 113ec35f7f69bd66c0fbab7b42e2b9d59eddb946 (patch) | |
tree | 3b134d1c1e4f3a77e35efe15051cb40295f4e801 /tools/llvmc/doc/LLVMC-Reference.rst | |
parent | d91487785f641af7f5c6c32b04cb28cfe94518a9 (diff) | |
Since the old llvmc was removed, rename llvmc2 to llvmc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60048 91177308-0d34-0410-b5e6-96231b3b80d8
Diffstat (limited to 'tools/llvmc/doc/LLVMC-Reference.rst')
-rw-r--r-- | tools/llvmc/doc/LLVMC-Reference.rst | 517 |
1 files changed, 517 insertions, 0 deletions
diff --git a/tools/llvmc/doc/LLVMC-Reference.rst b/tools/llvmc/doc/LLVMC-Reference.rst
new file mode 100644
index 0000000..77d9d2b
--- /dev/null
+++ b/tools/llvmc/doc/LLVMC-Reference.rst
@@ -0,0 +1,517 @@
+===================================
+Customizing LLVMC: Reference Manual
+===================================
+:Author: Mikhail Glushenkov <foldr@codedgers.com>
+
+LLVMC is a generic compiler driver, designed to be customizable and
+extensible. It plays the same role for LLVM as the ``gcc`` program
+does for GCC - LLVMC's job is essentially to transform a set of input
+files into a set of targets depending on configuration rules and user
+options. What makes LLVMC different is that these transformation rules
+are completely customizable - in fact, LLVMC knows nothing about the
+specifics of transformation (even the command-line options are mostly
+not hard-coded) and regards the transformation structure as an
+abstract graph. The structure of this graph is completely determined
+by plugins, which can be either statically or dynamically linked. This
+makes it possible to easily adapt LLVMC for other purposes - for
+example, as a build tool for game resources.
+
+Because LLVMC employs TableGen [1]_ as its configuration language, you
+need to be familiar with it to customize LLVMC.
+
+
+.. contents::
+
+
+Compiling with LLVMC
+====================
+
+LLVMC tries hard to be as compatible with ``gcc`` as possible,
+although there are some small differences. Most of the time, however,
+you shouldn't be able to notice them::
+
+    $ # This works as expected:
+    $ llvmc -O3 -Wall hello.cpp
+    $ ./a.out
+    hello
+
+One nice feature of LLVMC is that one doesn't have to distinguish
+between different compilers for different languages (think ``g++`` and
+``gcc``) - the right toolchain is chosen automatically based on input
+language names (which are, in turn, determined from file
+extensions).
+If you want to force files ending with ".c" to compile as
+C++, use the ``-x`` option, just as you would with ``gcc``::
+
+    $ # hello.c is really a C++ file
+    $ llvmc -x c++ hello.c
+    $ ./a.out
+    hello
+
+On the other hand, when using LLVMC as a linker to combine several C++
+object files you should provide the ``--linker`` option since it's
+impossible for LLVMC to choose the right linker in that case::
+
+    $ llvmc -c hello.cpp
+    $ llvmc hello.o
+    [A lot of link-time errors skipped]
+    $ llvmc --linker=c++ hello.o
+    $ ./a.out
+    hello
+
+
+Predefined options
+==================
+
+LLVMC has some built-in options that can't be overridden in the
+configuration files:
+
+* ``-o FILE`` - Output file name.
+
+* ``-x LANGUAGE`` - Specify the language of the following input files
+  until the next ``-x`` option.
+
+* ``-load PLUGIN_NAME`` - Load the specified plugin DLL. Example:
+  ``-load $LLVM_DIR/Release/lib/LLVMCSimple.so``.
+
+* ``-v`` - Enable verbose mode, i.e. print out all executed commands.
+
+* ``--view-graph`` - Show a graphical representation of the compilation
+  graph. Requires that you have the ``dot`` and ``gv`` programs
+  installed. Hidden option, useful for debugging.
+
+* ``--write-graph`` - Write a ``compilation-graph.dot`` file in the
+  current directory with the compilation graph description in the
+  Graphviz format. Hidden option, useful for debugging.
+
+* ``--save-temps`` - Write temporary files to the current directory
+  and do not delete them on exit. Hidden option, useful for debugging.
+
+* ``--help``, ``--help-hidden``, ``--version`` - These options have
+  their standard meaning.
+
+
+Compiling LLVMC plugins
+=======================
+
+It's easiest to start working on your own LLVMC plugin by copying the
+skeleton project which lives under ``$LLVMC_DIR/plugins/Simple``::
+
+    $ cd $LLVMC_DIR/plugins
+    $ cp -r Simple MyPlugin
+    $ cd MyPlugin
+    $ ls
+    Makefile PluginMain.cpp Simple.td
+
+As you can see, our basic plugin consists of only two files (not
+counting the build script). ``Simple.td`` contains a TableGen
+description of the compilation graph; its format is documented in the
+following sections. ``PluginMain.cpp`` is just a helper file used to
+compile the auto-generated C++ code produced from TableGen source. It
+can also contain hook definitions (see `below`__).
+
+__ hooks_
+
+The first thing that you should do is to change the ``LLVMC_PLUGIN``
+variable in the ``Makefile`` to avoid conflicts (since this variable
+is used to name the resulting library)::
+
+    LLVMC_PLUGIN=MyPlugin
+
+It is also a good idea to rename ``Simple.td`` to something less
+generic::
+
+    $ mv Simple.td MyPlugin.td
+
+Note that the plugin source directory must be placed under
+``$LLVMC_DIR/plugins`` to make use of the existing build
+infrastructure. To build a version of the LLVMC executable called
+``mydriver`` with your plugin compiled in, use the following command::
+
+    $ cd $LLVMC_DIR
+    $ make BUILTIN_PLUGINS=MyPlugin DRIVER_NAME=mydriver
+
+To build your plugin as a dynamic library, just ``cd`` to its source
+directory and run ``make``. The resulting file will be called
+``LLVMC$(LLVMC_PLUGIN).$(DLL_EXTENSION)`` (in our case,
+``LLVMCMyPlugin.so``). This library can then be loaded with the
+``-load`` option. Example::
+
+    $ cd $LLVMC_DIR/plugins/Simple
+    $ make
+    $ llvmc -load $LLVM_DIR/Release/lib/LLVMCSimple.so
+
+Sometimes, you will want a 'bare-bones' version of LLVMC that has no
+built-in plugins.
+It can be compiled with the following command::
+
+    $ cd $LLVMC_DIR
+    $ make BUILTIN_PLUGINS=""
+
+How plugins are loaded
+======================
+
+It is possible for LLVMC plugins to depend on each other. For example,
+one can create edges between nodes defined in some other plugin. To
+make this work, however, that plugin should be loaded first. To
+achieve this, the concept of plugin priority was introduced. By
+default, every plugin has priority zero; to specify the priority
+explicitly, put the following line in your ``.td`` file::
+
+    def Priority : PluginPriority<$PRIORITY_VALUE>;
+    // Where PRIORITY_VALUE is some integer > 0
+
+Plugins are loaded in order of their (increasing) priority, starting
+with 0. Therefore, the plugin with the highest priority value will be
+loaded last.
+
+
+Customizing LLVMC: the compilation graph
+========================================
+
+Each TableGen configuration file should include the common
+definitions::
+
+    include "llvm/CompilerDriver/Common.td"
+    // And optionally:
+    // include "llvm/CompilerDriver/Tools.td"
+    // which contains some useful tool definitions.
+
+Internally, LLVMC stores information about possible source
+transformations in the form of a graph. Nodes in this graph represent
+tools, and edges between two nodes represent a transformation path. A
+special "root" node is used to mark entry points for the
+transformations. LLVMC also assigns a weight to each edge (more on
+this later) to choose between several alternative edges.
+
+The definition of the compilation graph (see file
+``plugins/Base/Base.td`` for an example) is just a list of edges::
+
+    def CompilationGraph : CompilationGraph<[
+        Edge<"root", "llvm_gcc_c">,
+        Edge<"root", "llvm_gcc_assembler">,
+        ...
+
+        Edge<"llvm_gcc_c", "llc">,
+        Edge<"llvm_gcc_cpp", "llc">,
+        ...
+
+        OptionalEdge<"llvm_gcc_c", "opt", (case (switch_on "opt"),
+                                          (inc_weight))>,
+        OptionalEdge<"llvm_gcc_cpp", "opt", (case (switch_on "opt"),
+                                            (inc_weight))>,
+        ...
+
+        OptionalEdge<"llvm_gcc_assembler", "llvm_gcc_cpp_linker",
+            (case (input_languages_contain "c++"), (inc_weight),
+                  (or (parameter_equals "linker", "g++"),
+                      (parameter_equals "linker", "c++")), (inc_weight))>,
+        ...
+
+    ]>;
+
+As you can see, the edges can be either default or optional, where
+optional edges are differentiated by an additional ``case`` expression
+used to calculate the weight of this edge. Notice also that we refer
+to tools via their names (as strings). This makes it possible to add
+edges to an existing compilation graph in plugins without having to
+know about all tool definitions used in the graph.
+
+The default edges are assigned a weight of 1, and optional edges get a
+weight of 0 + 2*N where N is the number of tests that evaluated to
+true in the ``case`` expression. It is also possible to provide an
+integer parameter to ``inc_weight`` and ``dec_weight`` - in this case,
+the weight is increased (or decreased) by the provided value instead
+of the default 2.
+
+When passing an input file through the graph, LLVMC picks the edge
+with the maximum weight. To avoid ambiguity, there should be only one
+default edge between two nodes (with the exception of the root node,
+which gets a special treatment - there you are allowed to specify one
+default edge *per language*).
+
+To get a visual representation of the compilation graph (useful for
+debugging), run ``llvmc --view-graph``. You will need ``dot`` and
+``gv`` installed for this to work properly.
+
+
+Writing a tool description
+==========================
+
+As was said earlier, nodes in the compilation graph represent tools,
+which are described separately.
+A tool definition looks like this
+(taken from the ``include/llvm/CompilerDriver/Tools.td`` file)::
+
+    def llvm_gcc_cpp : Tool<[
+        (in_language "c++"),
+        (out_language "llvm-assembler"),
+        (output_suffix "bc"),
+        (cmd_line "llvm-g++ -c $INFILE -o $OUTFILE -emit-llvm"),
+        (sink)
+        ]>;
+
+This defines a new tool called ``llvm_gcc_cpp``, which is an alias for
+``llvm-g++``. As you can see, a tool definition is just a list of
+properties; most of them should be self-explanatory. The ``sink``
+property means that this tool should be passed all command-line
+options that lack explicit descriptions.
+
+The complete list of the currently implemented tool properties follows:
+
+* Possible tool properties:
+
+  - ``in_language`` - input language name. Can be either a string or a
+    list, in case the tool supports multiple input languages.
+
+  - ``out_language`` - output language name.
+
+  - ``output_suffix`` - output file suffix.
+
+  - ``cmd_line`` - the actual command used to run the tool. You can
+    use ``$INFILE`` and ``$OUTFILE`` variables, output redirection
+    with ``>``, hook invocations (``$CALL``), environment variables
+    (via ``$ENV``) and the ``case`` construct (more on this below).
+
+  - ``join`` - this tool is a "join node" in the graph, i.e. it gets a
+    list of input files and joins them together. Used for linkers.
+
+  - ``sink`` - all command-line options that are not handled by other
+    tools are passed to this tool.
+
+The next tool definition is slightly more complex::
+
+    def llvm_gcc_linker : Tool<[
+        (in_language "object-code"),
+        (out_language "executable"),
+        (output_suffix "out"),
+        (cmd_line "llvm-gcc $INFILE -o $OUTFILE"),
+        (join),
+        (prefix_list_option "L", (forward),
+                            (help "add a directory to link path")),
+        (prefix_list_option "l", (forward),
+                            (help "search a library when linking")),
+        (prefix_list_option "Wl", (unpack_values),
+                            (help "pass options to linker"))
+        ]>;
+
+This tool has a "join" property, which means that it behaves like a
+linker.
+This tool also defines several command-line options: ``-l``,
+``-L`` and ``-Wl`` which have their usual meaning. An option has two
+attributes: a name and a (possibly empty) list of properties. All
+currently implemented option types and properties are described below:
+
+* Possible option types:
+
+  - ``switch_option`` - a simple boolean switch, for example ``-time``.
+
+  - ``parameter_option`` - option that takes an argument, for example
+    ``-std=c99``.
+
+  - ``parameter_list_option`` - same as the above, but more than one
+    occurrence of the option is allowed.
+
+  - ``prefix_option`` - same as ``parameter_option``, but the option name
+    and parameter value are not separated.
+
+  - ``prefix_list_option`` - same as the above, but more than one
+    occurrence of the option is allowed; example: ``-lm -lpthread``.
+
+  - ``alias_option`` - a special option type for creating
+    aliases. Unlike other option types, aliases are not allowed to
+    have any properties besides the aliased option name. Usage
+    example: ``(alias_option "preprocess", "E")``
+
+
+* Possible option properties:
+
+  - ``append_cmd`` - append a string to the tool invocation command.
+
+  - ``forward`` - forward this option unchanged.
+
+  - ``forward_as`` - Change the name of this option, but forward the
+    argument unchanged. Example: ``(forward_as "--disable-optimize")``.
+
+  - ``output_suffix`` - modify the output suffix of this
+    tool. Example: ``(switch_option "E", (output_suffix "i"))``.
+
+  - ``stop_compilation`` - stop compilation after this phase.
+
+  - ``unpack_values`` - used for splitting and forwarding
+    comma-separated lists of options, e.g. ``-Wa,-foo=bar,-baz`` is
+    converted to ``-foo=bar -baz`` and appended to the tool invocation
+    command.
+
+  - ``help`` - help string associated with this option. Used for
+    ``--help`` output.
+
+  - ``required`` - this option is obligatory.
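
The ``unpack_values`` property in the list above is the one option property whose effect is a small algorithm rather than a flag. The following is a minimal C++ sketch of that splitting step, given only as an illustration: ``UnpackValues`` and ``JoinWithSpaces`` are hypothetical names, not functions from the LLVMC sources, and the sketch assumes the option prefix (``-Wa,`` in the example) has already been stripped.

```cpp
#include <string>
#include <vector>

// Toy model of the unpack_values property; NOT LLVMC's implementation.
// Both function names are hypothetical and exist only for this sketch.
//
// Splits the comma-separated text that follows the option prefix,
// e.g. "-foo=bar,-baz" for the command-line argument "-Wa,-foo=bar,-baz".
std::vector<std::string> UnpackValues(const std::string &values) {
  std::vector<std::string> out;
  std::string::size_type start = 0;
  while (start <= values.size()) {
    std::string::size_type comma = values.find(',', start);
    if (comma == std::string::npos)
      comma = values.size();
    if (comma > start)  // skip empty pieces such as ",," or a trailing ","
      out.push_back(values.substr(start, comma - start));
    start = comma + 1;
  }
  return out;
}

// Joins the pieces with single spaces, the form in which they would be
// appended to the tool invocation command.
std::string JoinWithSpaces(const std::vector<std::string> &parts) {
  std::string joined;
  for (const std::string &part : parts) {
    if (!joined.empty())
      joined += ' ';
    joined += part;
  }
  return joined;
}
```

With the example from the list, ``JoinWithSpaces(UnpackValues("-foo=bar,-baz"))`` yields ``-foo=bar -baz``.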
+
+
+Option list - specifying all options in a single place
+======================================================
+
+It can be handy to have all information about options gathered in a
+single place to provide an overview. This can be achieved by using a
+so-called ``OptionList``::
+
+    def Options : OptionList<[
+    (switch_option "E", (help "Help string")),
+    (alias_option "quiet", "q")
+    ...
+    ]>;
+
+``OptionList`` is also a good place to specify option aliases.
+
+Tool-specific option properties like ``append_cmd`` have (obviously)
+no meaning in the context of ``OptionList``, so the only properties
+allowed there are ``help`` and ``required``.
+
+Option lists are used at file scope. See the file
+``plugins/Clang/Clang.td`` for an example of ``OptionList`` usage.
+
+.. _hooks:
+
+Using hooks and environment variables in the ``cmd_line`` property
+==================================================================
+
+Normally, LLVMC executes programs from the system ``PATH``. Sometimes,
+this is not sufficient: for example, we may want to specify tool names
+in the configuration file. This can be achieved via the mechanism of
+hooks - to write your own hooks, just add their definitions to the
+``PluginMain.cpp`` or drop a ``.cpp`` file into the
+``$LLVMC_DIR/driver`` directory. Hooks should live in the ``hooks``
+namespace and have the signature ``std::string hooks::MyHookName
+(void)``.
+They can be used from the ``cmd_line`` tool property::
+
+    (cmd_line "$CALL(MyHook)/path/to/file -o $CALL(AnotherHook)")
+
+It is also possible to use environment variables in the same manner::
+
+    (cmd_line "$ENV(VAR1)/path/to/file -o $ENV(VAR2)")
+
+To change the command line string based on user-provided options use
+the ``case`` expression (documented below)::
+
+    (cmd_line
+      (case
+        (switch_on "E"),
+           "llvm-g++ -E -x c $INFILE -o $OUTFILE",
+        (default),
+           "llvm-g++ -c -x c $INFILE -o $OUTFILE -emit-llvm"))
+
+Conditional evaluation: the ``case`` expression
+===============================================
+
+The ``case`` construct can be used to calculate weights of the optional
+edges and to choose between several alternative command line strings
+in the ``cmd_line`` tool property. It is modelled on the
+similarly-named construct in functional languages and takes the form
+``(case (test_1), statement_1, (test_2), statement_2, ... (test_N),
+statement_N)``. The statements are evaluated only if the corresponding
+tests evaluate to true.
+
+Examples::
+
+    // Increases edge weight by 5 if "-A" is provided on the
+    // command-line, and by 5 more if "-B" is also provided.
+    (case
+        (switch_on "A"), (inc_weight 5),
+        (switch_on "B"), (inc_weight 5))
+
+    // Evaluates to "cmdline1" if option "-A" is provided on the
+    // command line, to "cmdline2" if only "-B" is provided, and to
+    // "cmdline3" otherwise.
+    (case
+        (switch_on "A"), "cmdline1",
+        (switch_on "B"), "cmdline2",
+        (default), "cmdline3")
+
+Note the slight difference in ``case`` expression handling in contexts
+of edge weights and command line specification - in the second example
+the value of the ``"B"`` switch is never checked when switch ``"A"`` is
+enabled, and the whole expression always evaluates to ``"cmdline1"`` in
+that case.
+
+Case expressions can also be nested, i.e. the following is legal::
+
+    (case (switch_on "E"), (case (switch_on "o"), ..., (default), ...),
+          (default), ...)
+
+You should, however, try to avoid doing that because it hurts
+readability. It is usually better to split tool descriptions and/or
+use TableGen inheritance instead.
+
+* Possible tests are:
+
+  - ``switch_on`` - Returns true if a given command-line switch is
+    provided by the user. Example: ``(switch_on "opt")``. Note that
+    you have to define all possible command-line options separately in
+    the tool descriptions. See the section "Writing a tool
+    description" for a discussion of the different kinds of
+    command-line options.
+
+  - ``parameter_equals`` - Returns true if a command-line parameter equals
+    a given value. Example: ``(parameter_equals "W", "all")``.
+
+  - ``element_in_list`` - Returns true if a command-line parameter list
+    includes a given value. Example: ``(element_in_list "l", "pthread")``.
+
+  - ``input_languages_contain`` - Returns true if a given language
+    belongs to the current input language set. Example:
+    ``(input_languages_contain "c++")``.
+
+  - ``in_language`` - Evaluates to true if the language of the input
+    file equals the argument. At the moment works only with the
+    ``cmd_line`` property on non-join nodes. Example: ``(in_language
+    "c++")``.
+
+  - ``not_empty`` - Returns true if a given option (which should be
+    either a parameter or a parameter list) is set by the
+    user. Example: ``(not_empty "o")``.
+
+  - ``default`` - Always evaluates to true. Should always be the last
+    test in the ``case`` expression.
+
+  - ``and`` - A standard logical combinator that returns true iff all
+    of its arguments return true. Used like this: ``(and (test1),
+    (test2), ... (testN))``. Nesting of ``and`` and ``or`` is allowed,
+    but not encouraged.
+
+  - ``or`` - Another logical combinator that returns true if at least
+    one of its arguments returns true. Example: ``(or (test1),
+    (test2), ... (testN))``.
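
The two evaluation modes of ``case`` described in this section (every true test contributes in an edge-weight context, while only the first true test is used in a ``cmd_line`` context) can be modelled with the short C++ sketch below. It is a toy model under stated assumptions, not LLVMC's generated code: only ``switch_on`` and ``default`` tests are represented, and ``WeightClause``, ``CmdClause``, ``EdgeWeight`` and ``SelectCmdLine`` are hypothetical names.

```cpp
#include <set>
#include <string>
#include <vector>

// Switches given on the command line, without the leading dash: {"A", "B"}.
using Switches = std::set<std::string>;

// One "(switch_on NAME), (inc_weight N)" clause. Hypothetical type.
struct WeightClause {
  std::string switchName;
  int increment;
};

// One "(switch_on NAME), CMDLINE" clause. Hypothetical type; an empty
// switchName stands for the (default) test, which is always true.
struct CmdClause {
  std::string switchName;
  std::string cmdLine;
};

// Edge-weight context: EVERY clause whose test is true is applied,
// starting from the weight 0 of an optional edge.
int EdgeWeight(const std::vector<WeightClause> &clauses, const Switches &on) {
  int weight = 0;
  for (const WeightClause &c : clauses)
    if (on.count(c.switchName))
      weight += c.increment;
  return weight;
}

// cmd_line context: only the FIRST clause whose test is true is used;
// later tests are never checked.
std::string SelectCmdLine(const std::vector<CmdClause> &clauses,
                          const Switches &on) {
  for (const CmdClause &c : clauses)
    if (c.switchName.empty() || on.count(c.switchName))
      return c.cmdLine;
  return std::string();  // no clause matched and no (default) was given
}
```

Fed with the two examples from this section, the weight clauses give 10 when both ``-A`` and ``-B`` are present, whereas the command-line clauses give ``cmdline1`` for the same input because the ``"B"`` test is never reached.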
+
+
+Language map
+============
+
+One last thing that you will need to modify when adding support for a
+new language to LLVMC is the language map, which defines mappings from
+file extensions to language names. It is used to choose the proper
+toolchain(s) for a given input file set. A language map definition looks
+like this::
+
+    def LanguageMap : LanguageMap<
+        [LangToSuffixes<"c++", ["cc", "cp", "cxx", "cpp", "CPP", "c++", "C"]>,
+         LangToSuffixes<"c", ["c"]>,
+         ...
+        ]>;
+
+Debugging
+=========
+
+When writing LLVMC plugins, it can be useful to get a visual view of
+the resulting compilation graph. This can be achieved via the command
+line option ``--view-graph``. This command assumes that Graphviz [2]_ and
+Ghostview [3]_ are installed. There is also a ``--write-graph`` option that
+creates a Graphviz source file (``compilation-graph.dot``) in the
+current directory.
+
+
+References
+==========
+
+.. [1] TableGen Fundamentals
+       http://llvm.cs.uiuc.edu/docs/TableGenFundamentals.html
+
+.. [2] Graphviz
+       http://www.graphviz.org/
+
+.. [3] Ghostview
+       http://pages.cs.wisc.edu/~ghost/
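
As a closing illustration of the "Language map" section, the sketch below shows one way a suffix-to-language lookup could work. It is an assumption-laden model in C++, not code from LLVMC: ``LanguageForFile`` is a hypothetical helper, the table holds only the two ``LangToSuffixes`` entries shown in the manual (the elided entries are left out), and an explicit ``-x`` override is not modelled.

```cpp
#include <map>
#include <string>

// Hypothetical helper, not part of LLVMC: returns the language name for a
// file based on its suffix, or an empty string if the suffix is unknown.
std::string LanguageForFile(const std::string &fileName) {
  // Suffix table copied from the two LangToSuffixes entries in the manual.
  static const std::map<std::string, std::string> suffixToLang = {
      {"cc", "c++"},  {"cp", "c++"},  {"cxx", "c++"}, {"cpp", "c++"},
      {"CPP", "c++"}, {"c++", "c++"}, {"C", "c++"},   {"c", "c"}};

  const std::string::size_type dot = fileName.rfind('.');
  if (dot == std::string::npos)
    return std::string();  // no extension, so the language is unknown

  // Lookup is case-sensitive: "C" maps to C++ while "c" maps to C.
  const auto it = suffixToLang.find(fileName.substr(dot + 1));
  return it == suffixToLang.end() ? std::string() : it->second;
}
```

The case-sensitive lookup matters here because the manual's map distinguishes ``C`` (C++) from ``c`` (C).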