aboutsummaryrefslogtreecommitdiffstats
path: root/docs/tutorial/JITTutorial2.html
blob: 17ff78c2f0e8692f8a08e7d78239bf03816357ec (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
                      "http://www.w3.org/TR/html4/strict.dtd">

<html>
<head>
  <title>LLVM Tutorial 2: A More Complicated Function</title>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  <meta name="author" content="Owen Anderson">
  <meta name="description" 
  content="LLVM Tutorial 2: A More Complicated Function.">
  <link rel="stylesheet" href="../llvm.css" type="text/css">
</head>

<body>

<div class="doc_title"> LLVM Tutorial 2: A More Complicated Function </div>

<div class="doc_author">
  <p>Written by <a href="mailto:owen@apple.com">Owen Anderson</a></p>
</div>

<!-- *********************************************************************** -->
<div class="doc_section"><a name="intro">A First Function</a></div>
<!-- *********************************************************************** -->

<div class="doc_text">

<p>Now that we understand the basics of creating functions in LLVM, let's move on to a more complicated example: something with control flow.  As an example, let's consider Euclid's Greatest Common Denominator (GCD) algorithm:</p>

<div class="doc_code">
<pre>
unsigned gcd(unsigned x, unsigned y) {
  if(x == y) {
    return x;
  } else if(x &lt; y) {
    return gcd(x, y - x);
  } else {
    return gcd(x - y, y);
  }
}
</pre>
</div>

<p>With this example, we'll learn how to create functions with multiple blocks and control flow, and how to make function calls within your LLVM code.  For starters, consider the diagram below.</p>

<div style="text-align: center;"><img src="JITTutorial2-1.png" alt="GCD CFG" width="60%"></div>

<p>This is a graphical representation of a program in LLVM IR.  It places each basic block on a node of a graph and uses directed edges to indicate flow control.  These blocks will be serialized when written to a text or bitcode file, but it is often useful conceptually to think of them as a graph.  Again, if you are unsure about the code in the diagram, you should skim through the <a href="../LangRef.html">LLVM Language Reference Manual</a> and convince yourself that it is, in fact, the GCD algorithm.</p>

<p>The first part of our code is practically the same as from the first tutorial.  The same basic setup is required: creating a module, verifying it, and running the <code>PrintModulePass</code> on it.  Even the first segment of  <code>makeLLVMModule()</code> looks essentially the same, except that <code>gcd</code> takes one fewer parameter than <code>mul_add</code>.</p>

<div class="doc_code">
<pre>
#include &lt;llvm/Module.h&gt;
#include &lt;llvm/Function.h&gt;
#include &lt;llvm/PassManager.h&gt;
#include &lt;llvm/Analysis/Verifier.h&gt;
#include &lt;llvm/Assembly/PrintModulePass.h&gt;
#include &lt;llvm/Support/IRBuilder.h&gt;

using namespace llvm;

Module* makeLLVMModule();

int main(int argc, char**argv) {
  Module* Mod = makeLLVMModule();
  
  verifyModule(*Mod, PrintMessageAction);
  
  PassManager PM;
  PM.add(new PrintModulePass(&amp;llvm::cout));
  PM.run(*Mod);

  delete Mod;  
  return 0;
}

Module* makeLLVMModule() {
  Module* mod = new Module(&quot;tut2&quot;);
  
  Constant* c = mod-&gt;getOrInsertFunction(&quot;gcd&quot;,
                                         IntegerType::get(32),
                                         IntegerType::get(32),
                                         IntegerType::get(32),
                                         NULL);
  Function* gcd = cast&lt;Function&gt;(c);
  
  Function::arg_iterator args = gcd-&gt;arg_begin();
  Value* x = args++;
  x-&gt;setName(&quot;x&quot;);
  Value* y = args++;
  y-&gt;setName(&quot;y&quot;);
</pre>
</div>

<p>Here, however, is where our code begins to diverge from the first tutorial.  Because <code>gcd</code> has control flow, it is composed of multiple blocks interconnected by branching (<code>br</code>) instructions.  For those familiar with assembly language, a block is similar to a labeled set of instructions.  For those not familiar with assembly language, a block is basically a set of instructions that can be branched to and is executed linearly until the block is terminated by one of a small number of control flow instructions, such as <code>br</code> or <code>ret</code>.</p>

<p>Blocks correspond to the nodes in the diagram we looked at in the beginning of this tutorial.  From the diagram, we can see that this function contains five blocks, so we'll go ahead and create them.  Note that we're making use of LLVM's automatic name uniquing in this code sample, since we're giving two blocks the same name.</p>

<div class="doc_code">
<pre>
  BasicBlock* entry = BasicBlock::Create(&quot;entry&quot;, gcd);
  BasicBlock* ret = BasicBlock::Create(&quot;return&quot;, gcd);
  BasicBlock* cond_false = BasicBlock::Create(&quot;cond_false&quot;, gcd);
  BasicBlock* cond_true = BasicBlock::Create(&quot;cond_true&quot;, gcd);
  BasicBlock* cond_false_2 = BasicBlock::Create(&quot;cond_false&quot;, gcd);
</pre>
</div>

<p>Now we're ready to begin generating code!  We'll start with the <code>entry</code> block.  This block corresponds to the top-level if-statement in the original C code, so we need to compare <code>x</code> and <code>y</code>.  To achieve this, we perform an explicit comparison using <code>ICmpEQ</code>.  <code>ICmpEQ</code> stands for an <em>integer comparison for equality</em> and returns a 1-bit integer result.  This 1-bit result is then used as the input to a conditional branch, with <code>ret</code> as the <code>true</code> and <code>cond_false</code> as the <code>false</code> case.</p>

<div class="doc_code">
<pre>
  IRBuilder builder(entry);
  Value* xEqualsY = builder.CreateICmpEQ(x, y, &quot;tmp&quot;);
  builder.CreateCondBr(xEqualsY, ret, cond_false);
</pre>
</div>

<p>Our next block, <code>ret</code>, is pretty simple: it just returns the value of <code>x</code>.  Recall that this block is only reached if <code>x == y</code>, so this is the correct behavior.  Notice that instead of creating a new <code>IRBuilder</code> for each block, we can use <code>SetInsertPoint</code> to retarget our existing one.  This saves on construction and memory allocation costs.</p>

<div class="doc_code">
<pre>
  builder.SetInsertPoint(ret);
  builder.CreateRet(x);
</pre>
</div>

<p><code>cond_false</code> is a more interesting block: we now know that <code>x != y</code>, so we must branch again to determine which of <code>x</code> and <code>y</code> is larger.  This is achieved using the <code>ICmpULT</code> instruction, which stands for <em>integer comparison for unsigned less-than</em>.  In LLVM, integer types do not carry sign; a 32-bit integer pseudo-register can interpreted as signed or unsigned without casting.  Whether a signed or unsigned interpretation is desired is specified in the instruction.  This is why several instructions in the LLVM IR, such as integer less-than, include a specifier for signed or unsigned.</p>

<p>Also note that we're again making use of LLVM's automatic name uniquing, this time at a register level.  We've deliberately chosen to name every instruction "tmp" to illustrate that LLVM will give them all unique names without getting confused.</p>

<div class="doc_code">
<pre>
  builder.SetInsertPoint(cond_false);
  Value* xLessThanY = builder.CreateICmpULT(x, y, &quot;tmp&quot;);
  builder.CreateCondBr(xLessThanY, cond_true, cond_false_2);
</pre>
</div>

<p>Our last two blocks are quite similar; they're both recursive calls to <code>gcd</code> with different parameters.  To create a call instruction, we have to create a <code>vector</code> (or any other container with <code>InputInterator</code>s) to hold the arguments.  We then pass in the beginning and ending iterators for this vector.</p>

<div class="doc_code">
<pre>
  builder.SetInsertPoint(cond_true);
  Value* yMinusX = builder.CreateSub(y, x, &quot;tmp&quot;);
  std::vector&lt;Value*&gt; args1;
  args1.push_back(x);
  args1.push_back(yMinusX);
  Value* recur_1 = builder.CreateCall(gcd, args1.begin(), args1.end(), &quot;tmp&quot;);
  builder.CreateRet(recur_1);
  
  builder.SetInsertPoint(cond_false_2);
  Value* xMinusY = builder.CreateSub(x, y, &quot;tmp&quot;);
  std::vector&lt;Value*&gt; args2;
  args2.push_back(xMinusY);
  args2.push_back(y);
  Value* recur_2 = builder.CreateCall(gcd, args2.begin(), args2.end(), &quot;tmp&quot;);
  builder.CreateRet(recur_2);
  
  return mod;
}
</pre>
</div>

<p>And that's it!  You can compile and execute your code in the same way as before, by doing:</p>

<div class="doc_code">
<pre>
# c++ -g tut2.cpp `llvm-config --cxxflags --ldflags --libs core` -o tut2
# ./tut2
</pre>
</div>

</div>

<!-- *********************************************************************** -->
<hr>
<address>
  <a href="http://jigsaw.w3.org/css-validator/check/referer"><img
  src="http://jigsaw.w3.org/css-validator/images/vcss" alt="Valid CSS!"></a>
  <a href="http://validator.w3.org/check/referer"><img
  src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a>

  <a href="mailto:owen@apple.com">Owen Anderson</a><br>
  <a href="http://llvm.org">The LLVM Compiler Infrastructure</a><br>
  Last modified: $Date: 2007-10-17 11:05:13 -0700 (Wed, 17 Oct 2007) $
</address>

</body>
</html>