Diffstat (limited to 'docs/BytecodeFormat.html')
-rw-r--r-- | docs/BytecodeFormat.html | 228 |
1 files changed, 123 insertions, 105 deletions
diff --git a/docs/BytecodeFormat.html b/docs/BytecodeFormat.html index 67ed8ba..5c22530 100644 --- a/docs/BytecodeFormat.html +++ b/docs/BytecodeFormat.html @@ -39,8 +39,8 @@ <li><a href="#constantpool">Global Constant Pool</a></li> <li><a href="#functiondefs">Function Definition</a></li> <li><a href="#compactiontable">Compaction Table</a></li> - <li><a href="#instructionlist">Instruction List</a></li> - <li><a href="#opcodes">Instruction Opcodes</a></li> + <li><a href="#instructionlist">Instructions List</a></li> + <li><a href="#instructions">Instructions</a></li> <li><a href="#symtab">Symbol Table</a></li> </ol> </li> @@ -1363,8 +1363,125 @@ of formats. See <a href="#instruction">Instructions</a> for details.</td> </tbody> </table> </div> + +<!-- _______________________________________________________________________ --> +<div class="doc_subsection"><a name="instructions">Instructions</a></div> + +<div class="doc_text"> +<p>Instructions are written out one at a time as distinct units. Each +instruction +record contains at least an <a href="#opcodes">opcode</a> and a type field, +and may contain a list of operands (whose interpretation depends on the opcode). +Based on the number of operands, the +<a href="#instencode">instruction is encoded</a> in a +dense format that tries to encoded each instruction into 32-bits if +possible. </p> +</div> + +<!-- _______________________________________________________________________ --> +<div class="doc_subsubsection"><a name="opcodes">Instruction Opcodes</a></div> +<div class="doc_text"> + <p>Instructions encode an opcode that identifies the kind of instruction. + Opcodes are an enumerated integer value. The specific values used depend on + the version of LLVM you're using. The opcode values are defined in the + <a href="http://llvm.cs.uiuc.edu/cvsweb/cvsweb.cgi/llvm/include/llvm/Instruction.def"> + <tt>include/llvm/Instruction.def</tt></a> file. You should check there for the + most recent definitions. 
The table below provides the opcodes defined as of + the writing of this document. The table associates each opcode mnemonic with + its enumeration value and the bytecode and LLVM version numbers in which the + opcode was introduced.</p> + <table> + <tbody> + <tr> + <th>Opcode</th> + <th>Number</th> + <th>Bytecode Version</th> + <th>LLVM Version</th> + </tr> + <tr><td colspan="4"><b>Terminator Instructions</b></td></tr> + <tr><td>Ret</td><td>1</td><td>1</td><td>1.0</td></tr> + <tr><td>Br</td><td>2</td><td>1</td><td>1.0</td></tr> + <tr><td>Switch</td><td>3</td><td>1</td><td>1.0</td></tr> + <tr><td>Invoke</td><td>4</td><td>1</td><td>1.0</td></tr> + <tr><td>Unwind</td><td>5</td><td>1</td><td>1.0</td></tr> + <tr><td>Unreachable</td><td>6</td><td>1</td><td>1.4</td></tr> + <tr><td colspan="4"><b>Binary Operators</b></td></tr> + <tr><td>Add</td><td>7</td><td>1</td><td>1.0</td></tr> + <tr><td>Sub</td><td>8</td><td>1</td><td>1.0</td></tr> + <tr><td>Mul</td><td>9</td><td>1</td><td>1.0</td></tr> + <tr><td>Div</td><td>10</td><td>1</td><td>1.0</td></tr> + <tr><td>Rem</td><td>11</td><td>1</td><td>1.0</td></tr> + <tr><td colspan="4"><b>Logical Operators</b></td></tr> + <tr><td>And</td><td>12</td><td>1</td><td>1.0</td></tr> + <tr><td>Or</td><td>13</td><td>1</td><td>1.0</td></tr> + <tr><td>Xor</td><td>14</td><td>1</td><td>1.0</td></tr> + <tr><td colspan="4"><b>Binary Comparison Operators</b></td></tr> + <tr><td>SetEQ</td><td>15</td><td>1</td><td>1.0</td></tr> + <tr><td>SetNE</td><td>16</td><td>1</td><td>1.0</td></tr> + <tr><td>SetLE</td><td>17</td><td>1</td><td>1.0</td></tr> + <tr><td>SetGE</td><td>18</td><td>1</td><td>1.0</td></tr> + <tr><td>SetLT</td><td>19</td><td>1</td><td>1.0</td></tr> + <tr><td>SetGT</td><td>20</td><td>1</td><td>1.0</td></tr> + <tr><td colspan="4"><b>Memory Operators</b></td></tr> + <tr><td>Malloc</td><td>21</td><td>1</td><td>1.0</td></tr> + <tr><td>Free</td><td>22</td><td>1</td><td>1.0</td></tr> + <tr><td>Alloca</td><td>23</td><td>1</td><td>1.0</td></tr> + 
<tr><td>Load</td><td>24</td><td>1</td><td>1.0</td></tr> + <tr><td>Store</td><td>25</td><td>1</td><td>1.0</td></tr> + <tr><td>GetElementPtr</td><td>26</td><td>1</td><td>1.0</td></tr> + <tr><td colspan="4"><b>Other Operators</b></td></tr> + <tr><td>PHI</td><td>27</td><td>1</td><td>1.0</td></tr> + <tr><td>Cast</td><td>28</td><td>1</td><td>1.0</td></tr> + <tr><td>Call</td><td>29</td><td>1</td><td>1.0</td></tr> + <tr><td>Shl</td><td>30</td><td>1</td><td>1.0</td></tr> + <tr><td>Shr</td><td>31</td><td>1</td><td>1.0</td></tr> + <tr><td>VANext</td><td>32</td><td>1</td><td>1.0</td></tr> + <tr><td>VAArg</td><td>33</td><td>1</td><td>1.0</td></tr> + <tr><td>Select</td><td>34</td><td>2</td><td>1.2</td></tr> + <tr><td colspan="4"> + <b>Pseudo Instructions<a href="#pi_note">*</a></b> + </td></tr> + <tr><td>Invoke+CC </td><td>56</td><td>5</td><td>1.5</td></tr> + <tr><td>Invoke+FastCC</td><td>57</td><td>5</td><td>1.5</td></tr> + <tr><td>Call+CC</td><td>58</td><td>5</td><td>1.5</td></tr> + <tr><td>Call+FastCC+TailCall</td><td>59</td><td>5</td><td>1.5</td></tr> + <tr><td>Call+FastCC</td><td>60</td><td>5</td><td>1.5</td></tr> + <tr><td>Call+CCC+TailCall</td><td>61</td><td>5</td><td>1.5</td></tr> + <tr><td>Load+Volatile</td><td>62</td><td>3</td><td>1.3</td></tr> + <tr><td>Store+Volatile</td><td>63</td><td>3</td><td>1.3</td></tr> + </tbody> + </table> + +<p><b><a name="pi_note">* Note: </a></b> +These aren't really opcodes from an LLVM language perspective. They encode +information into other opcodes without reserving space for that information. +For example, opcode=63 is a Volatile Store. The opcode for this +instruction is 25 (Store) but we encode it as 63 to indicate that is a Volatile +Store. The same is done for the calling conventions and tail calls. 
+In each of these entries in range 56-63, the opcode is documented as the base +opcode (Invoke, Call, Store) plus some set of modifiers, as follows:</p> +<dl> + <dt>CC</dt> + <dd>This means an arbitrary calling convention is specified + in a VBR that follows the opcode. This is used when the instruction cannot + be encoded with one of the more compact forms. + </dd> + <dt>FastCC</dt> + <dd>This indicates that the Call or Invoke is using the FastCC calling + convention.</dd> + <dt>CCC</dt> + <dd>This indicates that the Call or Invoke is using the native "C" calling + convention.</dd> + <dt>TailCall</dt> + <dd>This indicates that the Call has the 'tail' modifier.</dd> +</dl> +</div> + + <!-- _______________________________________________________________________ --> -<div class="doc_subsubsection"><a name="instruction">Instructions</a></div> +<div class="doc_subsubsection"><a name="instencode">Instruction +Encoding</a></div> + <div class="doc_text"> <p>For brevity, instructions are written in one of four formats, depending on the number of operands to the instruction. Each @@ -1430,7 +1547,7 @@ single <a href="#uint32_vbr">uint32_vbr</a> as follows:</p> </tr> <tr> <td>2-7</td> - <td><a href="#opcode">opcode</a></td> + <td><a href="#instructions">opcode</a></td> <td class="td_left">Specifies the opcode of the instruction. Note that the maximum opcode value is 63.</td> </tr> @@ -1467,7 +1584,7 @@ single <a href="#uint32_vbr">uint32_vbr</a> as follows:</p> </tr> <tr> <td>2-7</td> - <td><a href="#opcodes">opcode</a></td> + <td><a href="#instructions">opcode</a></td> <td class="td_left">Specifies the opcode of the instruction. Note that the maximum opcode value is 63.</td> </tr> @@ -1509,7 +1626,7 @@ single <a href="#uint32_vbr">uint32_vbr</a> as follows:</p> </tr> <tr> <td>2-7</td> - <td><a href="#opcodes">opcode</a></td> + <td><a href="#instructions">opcode</a></td> <td class="td_left">Specifies the opcode of the instruction. 
Note that the maximum opcode value is 63.</td> </tr> @@ -1542,105 +1659,6 @@ single <a href="#uint32_vbr">uint32_vbr</a> as follows:</p> </div> <!-- _______________________________________________________________________ --> -<div class="doc_subsection"><a name="opcodes">Instruction Opcodes</a></div> -<div class="doc_text"> - <p>Instructions encode an opcode that identifies the kind of instruction. - Opcodes are an enumerated integer value. The specific values used depend on - the version of LLVM you're using. The opcode values are defined in the - <a href="http://llvm.cs.uiuc.edu/cvsweb/cvsweb.cgi/llvm/include/llvm/Instruction.def"> - <tt>include/llvm/Instruction.def</tt></a> file. You should check there for the - most recent definitions. The table below provides the opcodes defined as of - the writing of this document. The table associates each opcode mnemonic with - its enumeration value and the bytecode and LLVM version numbers in which the - opcode was introduced.</p> - <table> - <tbody> - <tr> - <th>Opcode</th> - <th>Number</th> - <th>Bytecode Version</th> - <th>LLVM Version</th> - </tr> - <tr><td colspan="4"><b>Terminator Instructions</b></td></tr> - <tr><td>Ret</td><td>1</td><td>1</td><td>1.0</td></tr> - <tr><td>Br</td><td>2</td><td>1</td><td>1.0</td></tr> - <tr><td>Switch</td><td>3</td><td>1</td><td>1.0</td></tr> - <tr><td>Invoke</td><td>4</td><td>1</td><td>1.0</td></tr> - <tr><td>Unwind</td><td>5</td><td>1</td><td>1.0</td></tr> - <tr><td>Unreachable</td><td>6</td><td>1</td><td>1.4</td></tr> - <tr><td colspan="4"><b>Binary Operators</b></td></tr> - <tr><td>Add</td><td>7</td><td>1</td><td>1.0</td></tr> - <tr><td>Sub</td><td>8</td><td>1</td><td>1.0</td></tr> - <tr><td>Mul</td><td>9</td><td>1</td><td>1.0</td></tr> - <tr><td>Div</td><td>10</td><td>1</td><td>1.0</td></tr> - <tr><td>Rem</td><td>11</td><td>1</td><td>1.0</td></tr> - <tr><td colspan="4"><b>Logical Operators</b></td></tr> - <tr><td>And</td><td>12</td><td>1</td><td>1.0</td></tr> - 
<tr><td>Or</td><td>13</td><td>1</td><td>1.0</td></tr> - <tr><td>Xor</td><td>14</td><td>1</td><td>1.0</td></tr> - <tr><td colspan="4"><b>Binary Comparison Operators</b></td></tr> - <tr><td>SetEQ</td><td>15</td><td>1</td><td>1.0</td></tr> - <tr><td>SetNE</td><td>16</td><td>1</td><td>1.0</td></tr> - <tr><td>SetLE</td><td>17</td><td>1</td><td>1.0</td></tr> - <tr><td>SetGE</td><td>18</td><td>1</td><td>1.0</td></tr> - <tr><td>SetLT</td><td>19</td><td>1</td><td>1.0</td></tr> - <tr><td>SetGT</td><td>20</td><td>1</td><td>1.0</td></tr> - <tr><td colspan="4"><b>Memory Operators</b></td></tr> - <tr><td>Malloc</td><td>21</td><td>1</td><td>1.0</td></tr> - <tr><td>Free</td><td>22</td><td>1</td><td>1.0</td></tr> - <tr><td>Alloca</td><td>23</td><td>1</td><td>1.0</td></tr> - <tr><td>Load</td><td>24</td><td>1</td><td>1.0</td></tr> - <tr><td>Store</td><td>25</td><td>1</td><td>1.0</td></tr> - <tr><td>GetElementPtr</td><td>26</td><td>1</td><td>1.0</td></tr> - <tr><td colspan="4"><b>Other Operators</b></td></tr> - <tr><td>PHI</td><td>27</td><td>1</td><td>1.0</td></tr> - <tr><td>Cast</td><td>28</td><td>1</td><td>1.0</td></tr> - <tr><td>Call</td><td>29</td><td>1</td><td>1.0</td></tr> - <tr><td>Shl</td><td>30</td><td>1</td><td>1.0</td></tr> - <tr><td>Shr</td><td>31</td><td>1</td><td>1.0</td></tr> - <tr><td>VANext</td><td>32</td><td>1</td><td>1.0</td></tr> - <tr><td>VAArg</td><td>33</td><td>1</td><td>1.0</td></tr> - <tr><td>Select</td><td>34</td><td>2</td><td>1.2</td></tr> - <tr><td colspan="4"> - <b>Pseudo Instructions<a href="#pi_note">*</a></b> - </td></tr> - <tr><td>Invoke+CC </td><td>56</td><td>5</td><td>1.5</td></tr> - <tr><td>Invoke+FastCC</td><td>57</td><td>5</td><td>1.5</td></tr> - <tr><td>Call+CC</td><td>58</td><td>5</td><td>1.5</td></tr> - <tr><td>Call+FastCC+TailCall</td><td>59</td><td>5</td><td>1.5</td></tr> - <tr><td>Call+FastCC</td><td>60</td><td>5</td><td>1.5</td></tr> - <tr><td>Call+CCC+TailCall</td><td>61</td><td>5</td><td>1.5</td></tr> - 
<tr><td>Load+Volatile</td><td>62</td><td>3</td><td>1.3</td></tr> - <tr><td>Store+Volatile</td><td>63</td><td>3</td><td>1.3</td></tr> - </tbody> - </table> -</div> - -<p><b><a name="pi_note">* Note: </a></b> -These aren't really opcodes from an LLVM language prespeective. They encode -information into other opcodes without reserving space for that information. -For example, opcode=63 is a Volatile Store. The opcode for this -instruction is 25 (Store) but we encode it as 63 to indicate that is a Volatile -Store. The same is done for the calling conventions and tail calls. -In each of these entries in range 56-63, the opcode is documented as the base -opcode (Invoke, Call, Store) plus some set of modifiers, as follows:</p> -<dl> - <dt>CC</dt> - <dd>This means an arbitrary calling convention is specified - in a VBR that follows the opcode. This is used when the instruction cannot - be encoded with one of the more compact forms. - </dd> - <dt>FastCC</dt> - <dd>This indicates that the Call or Invoke is using the FastCC calling - convention.</dd> - <dt>CCC</dt> - <dd>This indicates that the Call or Invoke is using the native "C" calling - convention.</dd> - <dt>TailCall</dt> - <dd>This indicates that the Call has the 'tail' modifier.</dd> -</dl> - -<!-- _______________________________________________________________________ --> <div class="doc_subsection"><a name="symtab">Symbol Table</a> </div> <div class="doc_text"> <p>A symbol table can be put out in conjunction with a module or a function. A |
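The pseudo-instruction note in this diff says that opcodes 56-63 stand for a base opcode (Invoke, Call, Load, Store) plus modifiers. The following is a minimal sketch of that mapping, built only from the opcode table shown in the diff. The names `BASE_OPCODES`, `PSEUDO_OPCODES`, and `expand_pseudo_opcode` are hypothetical and are not part of LLVM; the actual bytecode reader may be organized quite differently. For the `CC` modifier, the calling convention itself is carried in a VBR that follows the opcode, which this sketch does not decode.

```python
# Base opcode numbers, taken from the opcode table in the diff.
BASE_OPCODES = {"Invoke": 4, "Load": 24, "Store": 25, "Call": 29}

# Pseudo opcodes 56-63: base mnemonic plus the modifiers listed in the table.
# (Hypothetical lookup table; the real reader is not shown in this diff.)
PSEUDO_OPCODES = {
    56: ("Invoke", ("CC",)),
    57: ("Invoke", ("FastCC",)),
    58: ("Call", ("CC",)),
    59: ("Call", ("FastCC", "TailCall")),
    60: ("Call", ("FastCC",)),
    61: ("Call", ("CCC", "TailCall")),
    62: ("Load", ("Volatile",)),
    63: ("Store", ("Volatile",)),
}


def expand_pseudo_opcode(opcode):
    """Return (base_opcode_number, modifiers) for an encoded opcode.

    Ordinary opcodes are returned unchanged with an empty modifier tuple.
    """
    # The encoding tables note that the maximum opcode value is 63.
    if not 0 <= opcode <= 63:
        raise ValueError("opcode out of range: %d" % opcode)
    if opcode in PSEUDO_OPCODES:
        base_name, modifiers = PSEUDO_OPCODES[opcode]
        return BASE_OPCODES[base_name], modifiers
    return opcode, ()
```

With this table, `expand_pseudo_opcode(63)` yields the Store opcode 25 together with the `Volatile` modifier, matching the example given in the note, while an ordinary opcode such as 29 (Call) passes through unchanged.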