Diffstat (limited to 'tools/llvmc2/doc/LLVMC-Reference.rst')
| -rw-r--r-- | tools/llvmc2/doc/LLVMC-Reference.rst | 258 |
1 files changed, 258 insertions, 0 deletions
diff --git a/tools/llvmc2/doc/LLVMC-Reference.rst b/tools/llvmc2/doc/LLVMC-Reference.rst
new file mode 100644
index 0000000..b8a364f
--- /dev/null
+++ b/tools/llvmc2/doc/LLVMC-Reference.rst
@@ -0,0 +1,258 @@
+Tutorial - Writing LLVMC Configuration files
+=============================================
+
+LLVMC is a generic compiler driver, designed to be customizable and
+extensible. It plays the same role for LLVM as the ``gcc`` program
+does for GCC - LLVMC's job is essentially to transform a set of input
+files into a set of targets depending on configuration rules and user
+options. What makes LLVMC different is that these transformation rules
+are completely customizable - in fact, LLVMC knows nothing about the
+specifics of transformation (even the command-line options are mostly
+not hard-coded) and regards the transformation structure as an
+abstract graph. This makes it possible to adapt LLVMC for other
+purposes - for example, as a build tool for game resources. This
+tutorial describes the basic usage and configuration of LLVMC.
+
+Because LLVMC employs TableGen [1]_ as its configuration language, you
+need to be familiar with it to customize LLVMC.
+
+Compiling with LLVMC
+--------------------
+
+In general, LLVMC tries to be command-line compatible with ``gcc`` as
+much as possible, so most of the familiar options work::
+
+    $ llvmc2 -O3 -Wall hello.cpp
+    $ ./a.out
+    hello
+
+One nice feature of LLVMC is that you don't have to distinguish
+between different compilers for different languages (think ``g++`` and
+``gcc``) - the right toolchain is chosen automatically based on input
+language names (which are, in turn, determined from file extension).
+If you want to force a file ending with ".cpp" to compile as C, use the
+``-x`` option, just as you would with ``gcc``::
+
+    $ llvmc2 -x c hello.cpp
+    $ # hello.cpp is really a C file
+    $ ./a.out
+    hello
+
+On the other hand, when using LLVMC as a linker to combine several C++
+object files, you should provide the ``--linker`` option, since it's
+impossible for LLVMC to choose the right linker in that case::
+
+    $ llvmc2 -c hello.cpp
+    $ llvmc2 hello.o
+    [A lot of link-time errors skipped]
+    $ llvmc2 --linker=c++ hello.o
+    $ ./a.out
+    hello
+
+For further help on command-line LLVMC usage, refer to the output of
+``llvmc2 --help``.
+
+Customizing LLVMC: the compilation graph
+----------------------------------------
+
+At the time of writing, LLVMC does not support on-the-fly reloading of
+configuration, so to customize LLVMC you'll have to edit and recompile
+the source code (which lives under ``$LLVM_DIR/tools/llvmc2``). The
+relevant files are ``Common.td``, ``Tools.td`` and ``Example.td``.
+
+Internally, LLVMC stores information about possible transformations in
+the form of a graph. Nodes in this graph represent tools, and edges
+between two nodes represent a transformation path. A special "root"
+node represents entry points for the transformations. LLVMC also
+assigns a weight to each edge (more on that below) to choose between
+several alternative edges.
+
+The definition of the compilation graph (see file ``Example.td``) is
+just a list of edges::
+
+    def CompilationGraph : CompilationGraph<[
+        Edge<root, llvm_gcc_c>,
+        Edge<root, llvm_gcc_assembler>,
+        ...
+
+        Edge<llvm_gcc_c, llc>,
+        Edge<llvm_gcc_cpp, llc>,
+        ...
+
+        OptionalEdge<llvm_gcc_c, opt, [(switch_on "opt")]>,
+        OptionalEdge<llvm_gcc_cpp, opt, [(switch_on "opt")]>,
+        ...
+
+        OptionalEdge<llvm_gcc_assembler, llvm_gcc_cpp_linker,
+            [(if_input_languages_contain "c++"),
+             (or (parameter_equals "linker", "g++"),
+                 (parameter_equals "linker", "c++"))]>,
+        ...
+
+    ]>;
+
+As you can see, the edges can be either default or optional, where
+optional edges are differentiated by sporting a list of patterns (or
+edge properties) which are used to calculate the edge's weight. The
+default edges are assigned a weight of 1, and optional edges get a
+weight of 0 + 2*N where N is the number of successful edge property
+matches. When passing an input file through the graph, LLVMC picks the
+edge with the maximum weight. To avoid ambiguity, there should be only
+one default edge between two nodes (with the exception of the root
+node, which gets special treatment - there you are allowed to
+specify one default edge *per language*).
+
+* Possible edge properties are:
+
+  - ``switch_on`` - Returns true if a given command-line option is
+    provided by the user. Example: ``(switch_on "opt")``. Note that
+    you have to define all possible command-line options separately in
+    the tool descriptions. See the next section for the discussion of
+    different kinds of command-line options.
+
+  - ``parameter_equals`` - Returns true if a command-line parameter equals
+    a given value. Example: ``(parameter_equals "W", "all")``.
+
+  - ``element_in_list`` - Returns true if a command-line parameter list
+    includes a given value. Example: ``(element_in_list "l", "pthread")``.
+
+  - ``if_input_languages_contain`` - Returns true if a given input
+    language belongs to the current input language set.
+
+  - ``and`` - Edge property combinator. Returns true if all of its
+    arguments return true. Used like this: ``(and (prop1), (prop2),
+    ... (propN))``. Nesting is allowed, but not encouraged.
+
+  - ``or`` - Edge property combinator that returns true if any one of its
+    arguments returns true. Example: ``(or (prop1), (prop2), ... (propN))``.
+
+  - ``weight`` - Makes it possible to explicitly specify the quantity
+    added to the edge weight if this edge property matches. Used like
+    this: ``(weight N, (prop))``.
+    The inner property can include ``and`` and ``or`` combinators.
+    When N is equal to 2, this is equivalent to ``(prop)``.
+
+    Example: ``(weight 8, (and (switch_on "a"), (switch_on "b")))``.
+
+
+To get a visual representation of the compilation graph (useful for
+debugging), run ``llvmc2 --view-graph``. You will need ``dot`` and
+``gsview`` installed for this to work properly.
+
+
+Writing a tool description
+--------------------------
+
+As was said earlier, nodes in the compilation graph represent tools. A
+tool definition looks like this (taken from the ``Tools.td`` file)::
+
+    def llvm_gcc_cpp : Tool<[
+        (in_language "c++"),
+        (out_language "llvm-assembler"),
+        (output_suffix "bc"),
+        (cmd_line "llvm-g++ -c $INFILE -o $OUTFILE -emit-llvm"),
+        (sink)
+    ]>;
+
+This defines a new tool called ``llvm_gcc_cpp``, which is an alias for
+``llvm-g++``. As you can see, a tool definition is just a list of
+properties; most of them should be self-evident. The ``sink`` property
+means that this tool should be passed all command-line options that
+aren't handled by the other tools.
+
+The complete list of the currently implemented tool properties follows:
+
+* Possible tool properties:
+
+  - ``in_language`` - input language name.
+
+  - ``out_language`` - output language name.
+
+  - ``output_suffix`` - output file suffix.
+
+  - ``cmd_line`` - the actual command used to run the tool. You can use
+    ``$INFILE`` and ``$OUTFILE`` variables, as well as output
+    redirection with ``>``.
+
+  - ``join`` - this tool is a "join node" in the graph, i.e. it gets a
+    list of input files and joins them together. Used for linkers.
+
+  - ``sink`` - all command-line options that are not handled by other
+    tools are passed to this tool.
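
The ``$INFILE`` and ``$OUTFILE`` placeholders in ``cmd_line`` can be pictured as simple template variables. The Python sketch below is only an illustration of that substitution and is not how LLVMC implements it: the helper name ``expand_cmd_line`` and the file names are hypothetical, and output redirection with ``>`` is not handled.

```python
import shlex
from string import Template


def expand_cmd_line(cmd_line, infile, outfile):
    """Fill in $INFILE and $OUTFILE and split the result into arguments.

    Hypothetical helper for illustration only; it is not part of LLVMC
    and does not handle output redirection with '>'.
    """
    expanded = Template(cmd_line).substitute(INFILE=infile, OUTFILE=outfile)
    return shlex.split(expanded)


# The cmd_line string is taken from the llvm_gcc_cpp tool definition;
# the input and output file names are made up for the example.
argv = expand_cmd_line(
    "llvm-g++ -c $INFILE -o $OUTFILE -emit-llvm", "hello.cpp", "hello.bc"
)
print(" ".join(argv))
```

With these assumed file names, the expanded command reads ``llvm-g++ -c hello.cpp -o hello.bc -emit-llvm``.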
+
+The next tool definition is slightly more complex::
+
+    def llvm_gcc_linker : Tool<[
+        (in_language "object-code"),
+        (out_language "executable"),
+        (output_suffix "out"),
+        (cmd_line "llvm-gcc $INFILE -o $OUTFILE"),
+        (join),
+        (prefix_list_option "L", (forward), (help "add a directory to link path")),
+        (prefix_list_option "l", (forward), (help "search a library when linking")),
+        (prefix_list_option "Wl", (unpack_values), (help "pass options to linker"))
+    ]>;
+
+This tool has a "join" property, which means that it behaves like a
+linker (for that reason, it should be the last tool in the
+toolchain). This tool also defines several command-line options: ``-l``,
+``-L`` and ``-Wl``, which have their usual meaning. An option has two
+attributes: a name and a (possibly empty) list of properties. All
+currently implemented option types and properties are described below:
+
+* Possible option types:
+
+  - ``switch_option`` - a simple boolean switch, for example ``-time``.
+
+  - ``parameter_option`` - option that takes an argument, for example
+    ``-std=c99``.
+
+  - ``parameter_list_option`` - same as the above, but more than one
+    occurrence of the option is allowed.
+
+  - ``prefix_option`` - same as the parameter_option, but the option name
+    and parameter value are not separated.
+
+  - ``prefix_list_option`` - same as the above, but more than one
+    occurrence of the option is allowed; example: ``-lm -lpthread``.
+
+
+* Possible option properties:
+
+  - ``append_cmd`` - append a string to the tool invocation command.
+
+  - ``forward`` - forward this option unchanged.
+
+  - ``stop_compilation`` - stop compilation after this phase.
+
+  - ``unpack_values`` - used for splitting and forwarding
+    comma-separated lists of options, e.g. ``-Wa,-foo=bar,-baz`` is
+    converted to ``-foo=bar -baz`` and appended to the tool invocation
+    command.
+
+  - ``help`` - help string associated with this option.
+
+  - ``required`` - this option is obligatory.
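
The effect of ``unpack_values`` on the ``-Wa,-foo=bar,-baz`` example above can be sketched in a few lines of Python. This is a rough illustration of the described behaviour, not LLVMC's actual implementation; the function name and its ``prefix`` parameter are assumptions made for the example.

```python
def unpack_values(option, prefix):
    """Split a comma-separated option into the values to be forwarded.

    Hypothetical helper for illustration only; 'prefix' is the option
    name as typed on the command line, e.g. "-Wa".
    """
    if not option.startswith(prefix + ","):
        raise ValueError("option %r does not start with %r" % (option, prefix))
    # Drop the prefix and its comma, then split on the remaining commas,
    # ignoring empty pieces.
    return [value for value in option[len(prefix) + 1:].split(",") if value]


print(" ".join(unpack_values("-Wa,-foo=bar,-baz", "-Wa")))
```

The returned values (``-foo=bar`` and ``-baz`` here) are what would be appended to the tool invocation command.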
+
+
+Language map
+------------
+
+One last thing that you need to modify when adding support for a new
+language to LLVMC is the language map, which defines mappings from
+file extensions to language names. It is used to choose the proper
+toolchain based on the input. Language map definition is located in
+the file ``Tools.td`` and looks like this::
+
+    def LanguageMap : LanguageMap<
+        [LangToSuffixes<"c++", ["cc", "cp", "cxx", "cpp", "CPP", "c++", "C"]>,
+         LangToSuffixes<"c", ["c"]>,
+         ...
+        ]>;
+
+
+References
+==========
+
+.. [1] TableGen Fundamentals
+       http://llvm.cs.uiuc.edu/docs/TableGenFundamentals.html
