From 7aa940de943c9adc128ba42cab6c5aa5cf99fa9c Mon Sep 17 00:00:00 2001
From: Reid Spencer
Date: Tue, 25 May 2004 15:47:57 +0000
Subject: Added a bit on slot numbers.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13756 91177308-0d34-0410-b5e6-96231b3b80d8
---
 docs/BytecodeFormat.html | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/docs/BytecodeFormat.html b/docs/BytecodeFormat.html
index 81a7eb2..d77653b 100644
--- a/docs/BytecodeFormat.html
+++ b/docs/BytecodeFormat.html
@@ -19,6 +19,7 @@
  • Blocks
  • Lists
  • Fields
  • Slots
  • Encoding Rules
  • Alignment
@@ -120,6 +121,43 @@
 sections that follow will provide the details on how these fields are written and how the bits are to be interpreted.

    +
    Slots
    +
    +

    The bytecode format uses the notion of a "slot" to reference Types and Values. Since the bytecode file is a direct representation of LLVM's intermediate representation, the file needs a way to represent pointers. Slots are used for this purpose. For example, if one has the following assembly:

    +
    
    +  %MyType = type { int, sbyte };
    +  %MyVar = external global %MyType ;
    +
    +

    there are two definitions, and the definition of %MyVar uses %MyType. In the C++ IR this link between %MyVar and %MyType is made explicitly with C++ pointers. In bytecode, however, there is no way to store memory addresses. Instead, we compute and write out slot numbers for every Type and Value written to the file.

    +

    A slot number is simply an unsigned 32-bit integer encoded in the variable bit rate scheme (see encoding below). This ensures that low slot numbers are encoded in one byte. LLVM uses several techniques to keep slot numbers low. The first is to associate slot numbers with their "type plane". That is, Values of the same type are written to the bytecode file sequentially in a list, and their order in that list determines their slot number. This means that slot #1 doesn't mean anything unless you also specify the type for which you want slot #1. Types are handled specially: they are always written to the file first (in the Global Type Pool), in such a way that both forward and backward references between types can be resolved with a single pass through the type pool.
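As an illustration of the two ideas above, the following Python sketch numbers Values per type plane and encodes a slot number with a variable bit rate scheme. It is not taken from the LLVM sources: the names `encode_vbr` and `assign_slots` are hypothetical, and the sketch assumes 7 payload bits per byte (low-order bits first, high bit set when another byte follows) and slot numbering that starts at 0 within each plane.

```python
from collections import defaultdict


def encode_vbr(value: int) -> bytes:
    """Encode an unsigned 32-bit integer in 7-bit groups, low-order bits first.

    Assumed layout: the high bit of a byte is set when another byte follows.
    """
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("slot numbers are unsigned 32-bit integers")
    out = bytearray()
    while value > 0x7F:
        out.append(0x80 | (value & 0x7F))  # 7 payload bits plus continuation flag
        value >>= 7
    out.append(value)  # final byte, high bit clear
    return bytes(out)


def assign_slots(values):
    """Number Values separately within each type plane.

    `values` is an iterable of (type_name, value_name) pairs in write order.
    Returns {(type_name, value_name): slot}; starting at 0 is an assumption.
    """
    next_slot = defaultdict(int)
    slots = {}
    for type_name, value_name in values:
        slots[(type_name, value_name)] = next_slot[type_name]
        next_slot[type_name] += 1
    return slots
```

Under these assumptions any slot number below 128 fits in one byte, while the largest 32-bit value takes five. Two Values in different planes can share the same slot number, which is why a slot number is only meaningful together with its type.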

    +

    Slot numbers are also kept small by rearranging their order. Because of the structure of LLVM, certain values are much more likely to be used frequently in the body of a function. For this reason, a compaction table is provided in the body of a function if its use would make the function body smaller. Suppose you have a function body that uses just the types "int*" and "{double}" but uses them thousands of times. It's worthwhile to ensure that the slot numbers for these types are low so they can be encoded in a single byte (via vbr). This is exactly what the compaction table does.
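The trade-off described above can be sketched in Python as follows. This is a deliberately simplified model, not the actual bytecode writer: it ignores type planes, treats a function body as a flat list of global slot numbers, assumes the 7-bits-per-byte vbr layout, and counts only the bytes of the slot references and of the table entries. The names `vbr_len`, `build_compaction_table`, and `worth_compacting` are hypothetical.

```python
from collections import Counter


def vbr_len(value: int) -> int:
    """Bytes needed for `value`, assuming 7 payload bits per byte."""
    n = 1
    while value > 0x7F:
        value >>= 7
        n += 1
    return n


def build_compaction_table(uses):
    """Return a list where table[local_slot] == global_slot.

    The most frequently used global slots get the lowest local numbers.
    """
    counts = Counter(uses)
    return [slot for slot, _ in counts.most_common()]


def worth_compacting(uses) -> bool:
    """True if remapping through a table makes this simplified body smaller."""
    table = build_compaction_table(uses)
    local = {g: i for i, g in enumerate(table)}
    plain = sum(vbr_len(u) for u in uses)
    # The compacted body pays for the remapped references plus the table itself.
    compact = sum(vbr_len(local[u]) for u in uses) + sum(vbr_len(g) for g in table)
    return compact < plain
```

For a body that references two high-numbered slots thousands of times, the table costs only a few bytes and each reference shrinks to a single byte, so compaction pays off. For a body with a single reference, the table overhead outweighs the saving and no table would be written.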

    +
    +
    Encoding Primitives

    Each field that can be put out is encoded into the file using a small set