aboutsummaryrefslogtreecommitdiffstats
path: root/tools/llvmc2/doc/LLVMC-Reference.rst
diff options
context:
space:
mode:
Diffstat (limited to 'tools/llvmc2/doc/LLVMC-Reference.rst')
-rw-r--r--tools/llvmc2/doc/LLVMC-Reference.rst258
1 files changed, 258 insertions, 0 deletions
diff --git a/tools/llvmc2/doc/LLVMC-Reference.rst b/tools/llvmc2/doc/LLVMC-Reference.rst
new file mode 100644
index 0000000..b8a364f
--- /dev/null
+++ b/tools/llvmc2/doc/LLVMC-Reference.rst
@@ -0,0 +1,258 @@
+Tutorial - Writing LLVMC Configuration files
+=============================================
+
+LLVMC is a generic compiler driver, designed to be customizable and
+extensible. It plays the same role for LLVM as the ``gcc`` program
+does for GCC - LLVMC's job is essentially to transform a set of input
+files into a set of targets depending on configuration rules and user
+options. What makes LLVMC different is that these transformation rules
+are completely customizable - in fact, LLVMC knows nothing about the
+specifics of transformation (even the command-line options are mostly
+not hard-coded) and regards the transformation structure as an
+abstract graph. This makes it possible to adapt LLVMC for other
+purposes - for example, as a build tool for game resources. This
+tutorial describes the basic usage and configuration of LLVMC.
+
+Because LLVMC employs TableGen [1]_ as its configuration language, you
+need to be familiar with it to customize LLVMC.
+
+Compiling with LLVMC
+--------------------
+
+In general, LLVMC tries to be command-line compatible with ``gcc`` as
+much as possible, so most of the familiar options work::
+
+ $ llvmc2 -O3 -Wall hello.cpp
+ $ ./a.out
+ hello
+
+One nice feature of LLVMC is that you don't have to distinguish
+between different compilers for different languages (think ``g++`` and
+``gcc``) - the right toolchain is chosen automatically based on input
+language names (which are, in turn, determined from file extension). If
+you want to force files ending with ".c" compile as C++, use the
+``-x`` option, just like you would do it with ``gcc``::
+
+ $ llvmc2 -x c hello.cpp
+ $ # hello.cpp is really a C file
+ $ ./a.out
+ hello
+
+On the other hand, when using LLVMC as a linker to combine several C++
+object files you should provide the ``--linker`` option since it's
+impossible for LLVMC to choose the right linker in that case::
+
+ $ llvmc2 -c hello.cpp
+ $ llvmc2 hello.o
+ [A lot of link-time errors skipped]
+ $ llvmc2 --linker=c++ hello.o
+ $ ./a.out
+ hello
+
+For further help on command-line LLVMC usage, refer to the ``llvmc
+--help`` output.
+
+Customizing LLVMC: the compilation graph
+----------------------------------------
+
+At the time of writing LLVMC does not support on-the-fly reloading of
+configuration, so to customize LLVMC you'll have to edit and recompile
+the source code (which lives under ``$LLVM_DIR/tools/llvmc2``). The
+relevant files are ``Common.td``, ``Tools.td`` and ``Example.td``.
+
+Internally, LLVMC stores information about possible transformations in
+form of a graph. Nodes in this graph represent tools, and edges
+between two nodes represent a transformation path. A special "root"
+node represents entry points for the transformations. LLVMC also
+assigns a weight to each edge (more on that below) to choose between
+several alternative edges.
+
+The definition of the compilation graph (see file ``Example.td``) is
+just a list of edges::
+
+ def CompilationGraph : CompilationGraph<[
+ Edge<root, llvm_gcc_c>,
+ Edge<root, llvm_gcc_assembler>,
+ ...
+
+ Edge<llvm_gcc_c, llc>,
+ Edge<llvm_gcc_cpp, llc>,
+ ...
+
+ OptionalEdge<llvm_gcc_c, opt, [(switch_on "opt")]>,
+ OptionalEdge<llvm_gcc_cpp, opt, [(switch_on "opt")]>,
+ ...
+
+ OptionalEdge<llvm_gcc_assembler, llvm_gcc_cpp_linker,
+ [(if_input_languages_contain "c++"),
+ (or (parameter_equals "linker", "g++"),
+ (parameter_equals "linker", "c++"))]>,
+ ...
+
+ ]>;
+
+As you can see, the edges can be either default or optional, where
+optional edges are differentiated by sporting a list of patterns (or
+edge properties) which are used to calculate the edge's weight. The
+default edges are assigned a weight of 1, and optional edges get a
+weight of 0 + 2*N where N is the number of succesful edge property
+matches. When passing an input file through the graph, LLVMC picks the
+edge with the maximum weight. To avoid ambiguity, there should be only
+one default edge between two nodes (with the exception of the root
+node, which gets a special treatment - there you are allowed to
+specify one default edge *per language*).
+
+* Possible edge properties are:
+
+ - ``switch_on`` - Returns true if a given command-line option is
+ provided by the user. Example: ``(switch_on "opt")``. Note that
+ you have to define all possible command-line options separately in
+ the tool descriptions. See the next section for the discussion of
+ different kinds of command-line options.
+
+ - ``parameter_equals`` - Returns true if a command-line parameter equals
+ a given value. Example: ``(parameter_equals "W", "all")``.
+
+ - ``element_in_list`` - Returns true if a command-line parameter list
+ includes a given value. Example: ``(parameter_in_list "l", "pthread")``.
+
+ - ``if_input_languages_contain`` - Returns true if a given input
+ language belongs to the current input language set.
+
+ - ``and`` - Edge property combinator. Returns true if all of its
+ arguments return true. Used like this: ``(and (prop1), (prop2),
+ ... (propN))``. Nesting is allowed, but not encouraged.
+
+ - ``or`` - Edge property combinator that returns true if any one of its
+ arguments returns true. Example: ``(or (prop1), (prop2), ... (propN))``.
+
+ - ``weight`` - Makes it possible to explicitly specify the quantity
+ added to the edge weight if this edge property matches. Used like
+ this: ``(weight N, (prop))``. The inner property can include
+ ``and`` and ``or`` combinators. When N is equal to 2, equivalent
+ to ``(prop)``.
+
+ Example: ``(weight 8, (and (switch_on "a"), (switch_on "b")))``.
+
+
+To get a visual representation of the compilation graph (useful for
+debugging), run ``llvmc2 --view-graph``. You will need ``dot`` and
+``gsview`` installed for this to work properly.
+
+
+Writing a tool description
+--------------------------
+
+As was said earlier, nodes in the compilation graph represent tools. A
+tool definition looks like this (taken from the ``Tools.td`` file)::
+
+ def llvm_gcc_cpp : Tool<[
+ (in_language "c++"),
+ (out_language "llvm-assembler"),
+ (output_suffix "bc"),
+ (cmd_line "llvm-g++ -c $INFILE -o $OUTFILE -emit-llvm"),
+ (sink)
+ ]>;
+
+This defines a new tool called ``llvm_gcc_cpp``, which is an alias for
+``llvm-g++``. As you can see, a tool definition is just a list of
+properties; most of them should be self-evident. The ``sink`` property
+means that this tool should be passed all command-line options that
+aren't handled by the other tools.
+
+The complete list of the currently implemented tool properties follows:
+
+* Possible tool properties:
+
+ - ``in_language`` - input language name.
+
+ - ``out_language`` - output language name.
+
+ - ``output_suffix`` - output file suffix.
+
+ - ``cmd_line`` - the actual command used to run the tool. You can use
+ ``$INFILE`` and ``$OUTFILE`` variables, as well as output
+ redirection with ``>``.
+
+ - ``join`` - this tool is a "join node" in the graph, i.e. it gets a
+ list of input files and joins them together. Used for linkers.
+
+ - ``sink`` - all command-line options that are not handled by other
+ tools are passed to this tool.
+
+The next tool definition is slightly more complex::
+
+ def llvm_gcc_linker : Tool<[
+ (in_language "object-code"),
+ (out_language "executable"),
+ (output_suffix "out"),
+ (cmd_line "llvm-gcc $INFILE -o $OUTFILE"),
+ (join),
+ (prefix_list_option "L", (forward), (help "add a directory to link path")),
+ (prefix_list_option "l", (forward), (help "search a library when linking")),
+ (prefix_list_option "Wl", (unpack_values), (help "pass options to linker"))
+ ]>;
+
+This tool has a "join" property, which means that it behaves like a
+linker (because of that this tool should be the last in the
+toolchain). This tool also defines several command-line options: ``-l``,
+``-L`` and ``-Wl`` which have their usual meaning. An option has two
+attributes: a name and a (possibly empty) list of properties. All
+currently implemented option types and properties are described below:
+
+* Possible option types:
+
+ - ``switch_option`` - a simple boolean switch, for example ``-time``.
+
+ - ``parameter_option`` - option that takes an argument, for example
+ ``-std=c99``;
+
+ - ``parameter_list_option`` - same as the above, but more than one
+ occurence of the option is allowed.
+
+ - ``prefix_option`` - same as the parameter_option, but the option name
+ and parameter value are not separated.
+
+ - ``prefix_list_option`` - same as the above, but more than one
+ occurence of the option is allowed; example: ``-lm -lpthread``.
+
+
+* Possible option properties:
+
+ - ``append_cmd`` - append a string to the tool invocation command.
+
+ - ``forward`` - forward this option unchanged.
+
+ - ``stop_compilation`` - stop compilation after this phase.
+
+ - ``unpack_values`` - used for for splitting and forwarding
+ comma-separated lists of options, e.g. ``-Wa,-foo=bar,-baz`` is
+ converted to ``-foo=bar -baz`` and appended to the tool invocation
+ command.
+
+ - ``help`` - help string associated with this option.
+
+ - ``required`` - this option is obligatory.
+
+
+Language map
+------------
+
+One last thing that you need to modify when adding support for a new
+language to LLVMC is the language map, which defines mappings from
+file extensions to language names. It is used to choose the proper
+toolchain based on the input. Language map definition is located in
+the file ``Tools.td`` and looks like this::
+
+ def LanguageMap : LanguageMap<
+ [LangToSuffixes<"c++", ["cc", "cp", "cxx", "cpp", "CPP", "c++", "C"]>,
+ LangToSuffixes<"c", ["c"]>,
+ ...
+ ]>;
+
+
+References
+==========
+
+.. [1] TableGen Fundamentals
+ http://llvm.cs.uiuc.edu/docs/TableGenFundamentals.html