From b54c99c26b9d2cf68e0a129cee2c7c1ce8ac9032 Mon Sep 17 00:00:00 2001
From: Chris Lattner
Date: Fri, 6 Feb 2004 05:42:53 +0000
Subject: Add a new document describing TableGen

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@11145 91177308-0d34-0410-b5e6-96231b3b80d8
---
 docs/TableGenFundamentals.html | 562 +++++++++++++++++++++++++++++++++++++++++
 docs/index.html                |   7 +
 2 files changed, 569 insertions(+)
 create mode 100644 docs/TableGenFundamentals.html

diff --git a/docs/TableGenFundamentals.html b/docs/TableGenFundamentals.html
new file mode 100644
index 0000000..f402361
--- /dev/null
+++ b/docs/TableGenFundamentals.html
@@ -0,0 +1,562 @@
TableGen Fundamentals
+ + + + +
Introduction
+ + +
+ +

TableGen's purpose is to help a human develop and maintain records of +domain-specific information. Because there may be a large number of these +records, it is specifically designed to allow flexible descriptions to be written and +common features of these records to be factored out. This reduces the +amount of duplication in the description, reduces the chance of error, and +makes it easier to structure domain-specific information.

+ +

The core part of TableGen parses a file, instantiates +the declarations, and hands the result off to a domain-specific "TableGen backend" for processing. The current major user +of TableGen is the LLVM code generator. +

+ +
+ + +
+ Basic concepts +
+ +
+ +

+TableGen files consist of two key parts: 'classes' and 'definitions', both of +which are considered 'records'. +

+ +

+TableGen records have a unique name, a list of values, and a list of +superclasses. The list of values is the main data that TableGen builds for each +record; it is this that holds the domain-specific information for the +application. The interpretation of this data is left to a specific TableGen backend, but the structure and format rules are +taken care of and fixed by TableGen. +

+ +

+TableGen definitions are the concrete form of 'records'. These generally +do not have any undefined values, and are marked with the 'def' +keyword. +

+ +

+TableGen classes are abstract records that are used to build and describe +other records. These 'classes' allow the end-user to build abstractions for +the domain they are targeting (such as "Register", "RegisterClass", and +"Instruction" in the LLVM code generator), and allow the implementor to factor +out common properties of records (such as "FPInst", which is used to represent +floating point instructions in the X86 backend). TableGen keeps track of all of +the classes that are used to build up a definition, so the backend can find all +definitions of a particular class, such as "Instruction". +

+ +
+ + +
+ An example record +
+ +
+ +

+With no other arguments, TableGen parses the specified file and prints out all +of the classes, then all of the definitions. This is a good way to see what the +various definitions expand to fully. Running this on the X86.td file +prints this (at the time of this writing): +

+ +

+

+...
+def ADDrr8 {    // Instruction X86Inst I2A8 Pattern
+  string Name = "add";
+  string Namespace = "X86";
+  list<Register> Uses = [];
+  list<Register> Defs = [];
+  bit isReturn = 0;
+  bit isBranch = 0;
+  bit isCall = 0;
+  bit isTwoAddress = 1;
+  bit isTerminator = 0;
+  dag Pattern = (set R8, (plus R8, R8));
+  bits<8> Opcode = { 0, 0, 0, 0, 0, 0, 0, 0 };
+  Format Form = MRMDestReg;
+  bits<5> FormBits = { 0, 0, 0, 1, 1 };
+  ArgType Type = Arg8;
+  bits<3> TypeBits = { 0, 0, 1 };
+  bit hasOpSizePrefix = 0;
+  bit printImplicitUses = 0;
+  bits<4> Prefix = { 0, 0, 0, 0 };
+  FPFormat FPForm = ?;
+  bits<3> FPFormBits = { 0, 0, 0 };
+}
+...
+

+ +

+This definition corresponds to an 8-bit register-register add instruction in the +X86. The identifier after the 'def' keyword indicates the name of the +record ("ADDrr8" in this case), and the comment at the end of the line +indicates the superclasses of the definition. The body of the record contains +all of the data that TableGen assembled for the record, indicating that the +instruction is part of the "X86" namespace, should be printed as "add" +in the assembly file, is a two-address instruction, has a particular +encoding, etc. The contents and semantics of the information in the record are +specific to the needs of the X86 backend, and are only shown as an example. +

+ +

+As you can see, a lot of information is needed for every instruction supported +by the code generator, and specifying it all manually would be unmaintainable, +prone to bugs, and tiring to do in the first place. Because we are using +TableGen, all of the information was derived from the following definition: +

+ +

+def ADDrr8   : I2A8<"add", 0x00, MRMDestReg>,
+               Pattern<(set R8, (plus R8, R8))>;
+

+ +

+This definition makes use of the custom I2A8 (two address instruction with 8-bit +operand) class, which is defined in the X86-specific TableGen file to factor out +the common features that instructions of its class share. A key feature of +TableGen is that it allows the end-user to define the abstractions they prefer +to use when describing their information. +
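To make the factoring concrete, here is a minimal sketch of how a class in the spirit of I2A8 could be layered on a more general instruction class. Every name ending in "Sketch", and the fields chosen, are hypothetical stand-ins for illustration; this is not the actual contents of the X86 TableGen file.

```tablegen
// Hypothetical, simplified classes; NOT the real X86.td definitions.
class FormatSketch<bits<5> val> {
  bits<5> Value = val;
}
def MRMDestRegSketch : FormatSketch<3>;

// General instruction class: one template argument per varying property.
class InstSketch<string name, bits<8> opcode, FormatSketch form> {
  string       Name         = name;
  bits<8>      Opcode       = opcode;
  FormatSketch Form         = form;
  bits<5>      FormBits     = form.Value;  // field access on a template argument
  bit          isTwoAddress = 0;
}

// Two-address variant: fixes what all such instructions have in common.
class I2A8Sketch<string name, bits<8> opcode, FormatSketch form>
  : InstSketch<name, opcode, form> {
  let isTwoAddress = 1;
}

// Each instruction now only states what is unique to it.
def ADDrr8Sketch : I2A8Sketch<"add", 0x00, MRMDestRegSketch>;
```

Running tblgen on a file like this with no backend selected should print the fully expanded ADDrr8Sketch record, in the same style as the ADDrr8 dump above.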

+ +
+ + +
+ Running TableGen +
+ +
+ +

+TableGen runs just like any other LLVM tool. The first (optional) argument +specifies the file to read. If a filename is not specified, tblgen +reads from standard input. +

+ +

+To be useful, one of the TableGen backends must be used. +These backends are selectable on the command line (type 'tblgen --help' +for a list). For example, to get a list of all of the definitions that subclass +a particular type (which can be useful for building up an enum list of these +records), use the -print-enums option: +

+ +

+$ tblgen X86.td -print-enums -class=Register
+AH, AL, AX, BH, BL, BP, BX, CH, CL, CX, DH, DI, DL, DX,
+EAX, EBP, EBX, ECX, EDI, EDX, ESI, ESP, FP0, FP1, FP2, FP3, FP4, FP5, FP6,
+SI, SP, ST0, ST1, ST2, ST3, ST4, ST5, ST6, ST7, 
+
+$ tblgen X86.td -print-enums -class=Instruction 
+ADCrr32, ADDri16, ADDri16b, ADDri32, ADDri32b, ADDri8, ADDrr16, ADDrr32,
+ADDrr8, ADJCALLSTACKDOWN, ADJCALLSTACKUP, ANDri16, ANDri16b, ANDri32, ANDri32b,
+ANDri8, ANDrr16, ANDrr32, ANDrr8, BSWAPr32, CALLm32, CALLpcrel32, ...
+

+ +

+The default backend prints out all of the records, as described above. +

+ +

+If you plan to use TableGen for some purpose, you will most likely have to write a backend that extracts the information specific to +what you need and formats it in the appropriate way. +

+ +
+ + + +
TableGen syntax
+ + +
+ +

+TableGen doesn't care about the meaning of data (that is up to the backend to +define), but it does care about syntax, and it enforces a simple type system. +This section describes the syntax and the constructs allowed in a TableGen file. +

+ +
+ + +
+ TableGen primitives +
+ + +
+ TableGen comments +
+ +
+ +

TableGen supports BCPL-style "//" comments, which run to the end of +the line, and it also supports nestable "/* */" comments.
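A short sketch of both comment forms follows. The records C and X are borrowed from the small example later in this document; the comment text itself is only illustrative.

```tablegen
// A BCPL-style comment runs to the end of the line.
class C { bit V = 1; }   // It may also trail a declaration.

/* A block comment may span several lines
   /* and, because block comments nest, may contain another one */
   without being terminated early by the inner close marker. */
def X : C;
```

Nesting is convenient for temporarily commenting out a region of a file that already contains block comments.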

+ +
+ + + +
+ The TableGen type system +
+ +
+

+TableGen files are strongly typed, using a simple (but complete) type system. +These types are used to perform automatic conversions, check for errors, and +help interface designers constrain the input that they allow. Every value definition is required to have an associated type. +

+ +

+TableGen supports a mixture of very low-level types (such as bit) and +very high-level types (such as dag). This flexibility is what allows +it to describe a wide range of information conveniently and compactly. The +TableGen types are: +

+ +

+

+

+ +

+To date, these types have been sufficient for describing things that TableGen +has been used for, but it is straightforward to extend this list if needed. +

+ +
+ + +
+ TableGen values and expressions +
+ +
+

+TableGen allows for a pretty reasonable number of different expression forms +when building up values. These forms allow the TableGen file to be written in a +natural syntax and flavor for the application. The current expression forms +supported include: +

+ +

+ +

+Note that all of the values have rules specifying how they convert to values +for different types. These rules allow you to assign a value like "7" to a +"bits<4>" value, for example. +
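The conversion mentioned above can be tried with a record as small as the following sketch. The record and field names are hypothetical.

```tablegen
// Hypothetical record used only to show an integer converting to bits.
def ConversionSketch {
  bits<4> Nibble = 7;   // the integer literal 7 becomes a four-bit value
  bit     Flag   = 1;   // an integer literal also converts to a single bit
}
```

When printed with the default backend, Nibble should appear as { 0, 1, 1, 1 }, most significant bit first, which is the same layout as FormBits and TypeBits in the ADDrr8 record shown earlier.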

+ + + +
+ + + +
+ Classes and definitions +
+ +
+

+As mentioned in the intro, classes and definitions +(collectively known as 'records') in TableGen are the main high-level unit of +information that TableGen collects. Records are defined with a def or +class keyword, the record name, and an optional list of "template arguments". If the record has superclasses, +they are specified as a comma-separated list that starts with a colon character +(":"). If value definitions or let +expressions are needed for the record, they are enclosed in curly braces +("{}"); otherwise the record ends with a semicolon. Here is a simple TableGen +file: +

+ +

+class C { bit V = 1; }
+def X : C;
+def Y : C {
+  string Greeting = "hello";
+}
+

+ +

+This example creates two definitions, X and Y, both of which +derive from the C class. Because of this, they both get the V +bit value. The Y definition also gets the Greeting member. +

+ +
+ + +
+ Value definitions +
+ +
+

+Value definitions define named entries in records. A value must be defined +before it can be referred to as the operand for another value definition, or +before the value is reset with a let expression. A +value is defined by specifying a TableGen type and a name. +If an initial value is available, it may be specified after the name with an +equal sign. Value definitions require terminating semicolons. +
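As an illustration of these rules, the sketch below declares values with and without initial values, and one value that refers to another defined before it. The class and field names are hypothetical.

```tablegen
// Hypothetical class illustrating value definitions.
class RegSketch {
  string  Name;              // type and name only: the value is left undefined
  bit     isAllocatable = 1; // type, name, '=', initial value, semicolon
  int     Size          = 8;
  int     SpillSize     = Size;  // refers to Size, which is defined above
}
```

An undefined value such as Name is expected to print as "?" in the default backend's output, as FPForm does in the ADDrr8 record shown earlier.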

+ + +
+ 'let' expressions +
+ +
+

+A record-level let expression is used to change the value of a value definition +in a record. This is primarily useful when a superclass defines a value that a +derived class or definition wants to override. Let expressions consist of the +'let' keyword, followed by a value name, an equal sign ("="), and a new +value. For example, a new class could be added to the example above, redefining +the V field for all of its subclasses:

+ +

+class D : C { let V = 0; }
+def Z : D;
+

+ +

+In this case, the Z definition will have a zero value for its "V" +value, despite the fact that it derives (indirectly) from the C class, +because the D class overrode its value. +
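The same override can also be written directly in a single definition rather than in an intermediate class. The following sketch repeats the C class so that it stands alone; the definition name W is hypothetical.

```tablegen
class C { bit V = 1; }

// Override V for this one definition only; other users of C keep V = 1.
def W : C {
  let V = 0;
}
```

Putting the let in a class, as D does, is preferable when several definitions need the same override; putting it in the definition keeps a one-off exception local.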

+ +
+ + +
+ Class template arguments +
+ +
+and default values... +
+ + + + +
+ File scope entities +
+ + +
+ File inclusion +
+ +
+

+TableGen supports the 'include' token, which textually substitutes the +specified file in place of the include directive. The filename should be +specified as a double quoted string immediately after the 'include' +keyword. Example: + +

+  include "foo.td"
+

+ +
+ + +
+ 'let' expressions +
+ +
+

+"let" expressions at file scope are similar to "let" +expressions within a record, except they can specify a value binding for +multiple records at a time, and may be useful in certain other cases. +File-scope let expressions are really just another way that TableGen allows the +end-user to factor out commonality from the records. +

+ +

+File-scope "let" expressions take a comma-separated list of bindings to apply, +and one or more records to bind the values in. Here are some examples: +

+ +

+let isTerminator = 1, isReturn = 1 in
+  def RET : X86Inst<"ret", 0xC3, RawFrm, NoArg>;
+
+let isCall = 1 in
+  // All calls clobber the non-callee saved registers...
+  let Defs = [EAX, ECX, EDX, FP0, FP1, FP2, FP3, FP4, FP5, FP6] in {
+    def CALLpcrel32 : X86Inst<"call", 0xE8, RawFrm, NoArg>;
+    def CALLr32     : X86Inst<"call", 0xFF, MRMS2r, Arg32>;
+    def CALLm32     : X86Inst<"call", 0xFF, MRMS2m, Arg32>;
+  }
+

+ +

+File-scope "let" expressions are often useful when a couple of definitions need +to be added to several records, and the records do not otherwise need to be +opened, as is the case with the CALL* instructions above. +
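To show what "opening" a record means here, the sketch below writes the same binding twice: once as a file-scope let, and once as record-level lets inside each definition. The class and definition names are hypothetical and much simpler than the real X86 records.

```tablegen
// Hypothetical instruction class with a single flag.
class InstSketch { bit isCall = 0; }

// File-scope form: one binding applied to every record in the braces.
let isCall = 1 in {
  def CALLa : InstSketch;
  def CALLb : InstSketch;
}

// Record-level form: each definition must be opened to hold its own let.
def CALLc : InstSketch { let isCall = 1; }
def CALLd : InstSketch { let isCall = 1; }
```

All four definitions should end up with isCall set to 1; the file-scope form simply states the shared binding once.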

+
+ + + +
TableGen backends
+ + +
+ +

+How they work, how to write one. This section should not contain details about +any particular backend, except maybe -print-enums as an example. This should +highlight the APIs in TableGen/Record.h. +

+ +
+ + + +
The LLVM code generator
+ + +
+ +

+This is just a temporary, convenient, place to put stuff about the code +generator before it gets its own document. This should describe all of the +tablegen backends used by the code generator and the classes/definitions they +expect. +

+ +
+ + + + +
+ + + +
diff --git a/docs/index.html b/docs/index.html
index b9770e9..114dd3a 100644
--- a/docs/index.html
+++ b/docs/index.html
@@ -183,6 +183,13 @@ LLVM Programming Documentation:

+ TableGen Fundamentals: +
+ llvm/docs/TableGenFundamentals.html +

+ + +

The Stacker Cronicles
The Stacker Cronicles
-- 
cgit v1.1