author | Chris Lattner <sabre@nondot.org> | 2008-02-10 19:11:04 +0000 |
---|---|---|
committer | Chris Lattner <sabre@nondot.org> | 2008-02-10 19:11:04 +0000 |
commit | 729eb14ae8b1a1002a212a99cdc411659670fbd4 (patch) | |
tree | 162dd4ce76756c74af1f9479a6628cec087774bd /docs | |
parent | 916c954bf20f48c3269542fc919ccb92a99496ee (diff) | |
Various updates from Sam Bishop:
"I have been working my way through the JIT and Kaleidoscope tutorials in my
(minuscule) spare time. Thanks again for writing them! I have attached a
patch containing some minor changes, ranging from spelling and grammar fixes
to adding a "Next: <next tutorial section>" hyperlink to the bottom of each
page.
Every page has been given the "next link" treatment, but otherwise I'm only
half way through the Kaleidoscope tutorial. I will send a follow-on patch
if time permits."
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@46933 91177308-0d34-0410-b5e6-96231b3b80d8
Diffstat (limited to 'docs')
-rw-r--r-- | docs/tutorial/JITTutorial1.html | 31 |
-rw-r--r-- | docs/tutorial/JITTutorial2.html | 12 |
-rw-r--r-- | docs/tutorial/LangImpl1.html | 8 |
-rw-r--r-- | docs/tutorial/LangImpl2.html | 65 |
-rw-r--r-- | docs/tutorial/LangImpl3.html | 42 |
-rw-r--r-- | docs/tutorial/LangImpl4.html | 1 |
-rw-r--r-- | docs/tutorial/LangImpl5.html | 1 |
-rw-r--r-- | docs/tutorial/LangImpl6.html | 1 |
-rw-r--r-- | docs/tutorial/LangImpl7.html | 1 |
9 files changed, 87 insertions, 75 deletions
diff --git a/docs/tutorial/JITTutorial1.html b/docs/tutorial/JITTutorial1.html index fb906cd..ef026c0 100644 --- a/docs/tutorial/JITTutorial1.html +++ b/docs/tutorial/JITTutorial1.html @@ -25,7 +25,7 @@ <div class="doc_text"> -<p>For starters, lets consider a relatively straightforward function that takes three integer parameters and returns an arithmetic combination of them. This is nice and simple, especially since it involves no control flow:</p> +<p>For starters, let's consider a relatively straightforward function that takes three integer parameters and returns an arithmetic combination of them. This is nice and simple, especially since it involves no control flow:</p> <div class="doc_code"> <pre> @@ -86,7 +86,7 @@ int main(int argc, char**argv) { </pre> </div> -<p>The first segment is pretty simple: it creates an LLVM “module.” In LLVM, a module represents a single unit of code that is to be processed together. A module contains things like global variables and function declarations and implementations. Here, we’ve declared a <code>makeLLVMModule()</code> function to do the real work of creating the module. Don’t worry, we’ll be looking at that one next!</p> +<p>The first segment is pretty simple: it creates an LLVM “module.” In LLVM, a module represents a single unit of code that is to be processed together. A module contains things like global variables, function declarations, and implementations. Here we’ve declared a <code>makeLLVMModule()</code> function to do the real work of creating the module. Don’t worry, we’ll be looking at that one next!</p> <p>The second segment runs the LLVM module verifier on our newly created module. While this probably isn’t really necessary for a simple module like this one, it’s always a good idea, especially if you’re generating LLVM IR based on some input. 
The verifier will print an error message if your LLVM module is malformed in any way.</p> @@ -106,7 +106,7 @@ Module* makeLLVMModule() { <div class="doc_code"> <pre> - Constant* c = mod->getOrInsertFunction("mul_add", + Constant* c = mod->getOrInsertFunction("mul_add", /*ret type*/ IntegerType::get(32), /*args*/ IntegerType::get(32), IntegerType::get(32), @@ -114,31 +114,31 @@ Module* makeLLVMModule() { /*varargs terminated with null*/ NULL); Function* mul_add = cast<Function>(c); - mul_add->setCallingConv(CallingConv::C); + mul_add->setCallingConv(CallingConv::C); </pre> </div> -<p>We construct our <code>Function</code> by calling <code>getOrInsertFunction()</code> on our module, passing in the name, return type, and argument types of the function. In the case of our <code>mul_add</code> function, that means one 32-bit integer for the return value, and three 32-bit integers for the arguments.</p> +<p>We construct our <code>Function</code> by calling <code>getOrInsertFunction()</code> on our module, passing in the name, return type, and argument types of the function. In the case of our <code>mul_add</code> function, that means one 32-bit integer for the return value and three 32-bit integers for the arguments.</p> -<p>You'll notice that <code>getOrInsertFunction</code> doesn't actually return a <code>Function*</code>. This is because, if the function already existed, but with a different prototype, <code>getOrInsertFunction</code> will return a cast of the existing function to the desired prototype. Since we know that there's not already a <code>mul_add</code> function, we can safely just cast <code>c</code> to a <code>Function*</code>. +<p>You'll notice that <code>getOrInsertFunction()</code> doesn't actually return a <code>Function*</code>. This is because <code>getOrInsertFunction()</code> will return a cast of the existing function if the function already existed with a different prototype. 
Since we know that there's not already a <code>mul_add</code> function, we can safely just cast <code>c</code> to a <code>Function*</code>. <p>In addition, we set the calling convention for our new function to be the C calling convention. This isn’t strictly necessary, but it insures that our new function will interoperate properly with C code, which is a good thing.</p> <div class="doc_code"> <pre> - Function::arg_iterator args = mul_add->arg_begin(); + Function::arg_iterator args = mul_add->arg_begin(); Value* x = args++; - x->setName("x"); + x->setName("x"); Value* y = args++; - y->setName("y"); + y->setName("y"); Value* z = args++; - z->setName("z"); + z->setName("z"); </pre> </div> -<p>While we’re setting up our function, let’s also give names to the parameters. This also isn’t strictly necessary (LLVM will generate names for them if you don’t specify them), but it’ll make looking at our output somewhat more pleasant. To name the parameters, we iterator over the arguments of our function, and call <code>setName()</code> on them. We’ll also keep the pointer to <code>x</code>, <code>y</code>, and <code>z</code> around, since we’ll need them when we get around to creating instructions.</p> +<p>While we’re setting up our function, let’s also give names to the parameters. This also isn’t strictly necessary (LLVM will generate names for them if you don’t specify them), but it’ll make looking at our output somewhat more pleasant. To name the parameters, we iterate over the arguments of our function and call <code>setName()</code> on them. We’ll also keep the pointer to <code>x</code>, <code>y</code>, and <code>z</code> around, since we’ll need them when we get around to creating instructions.</p> -<p>Great! We have a function now. But what good is a function if it has no body? Before we start working on a body for our new function, we need to recall some details of the LLVM IR. 
The IR, being an abstract assembly language, represents control flow using jumps (we call them branches), both conditional and unconditional. The straight-line sequences of code between branches are called basic blocks, or just blocks. To create a body for our function, we fill it with blocks!</p> +<p>Great! We have a function now. But what good is a function if it has no body? Before we start working on a body for our new function, we need to recall some details of the LLVM IR. The IR, being an abstract assembly language, represents control flow using jumps (we call them branches), both conditional and unconditional. The straight-line sequences of code between branches are called basic blocks, or just blocks. To create a body for our function, we fill it with blocks:</p> <div class="doc_code"> <pre> @@ -165,17 +165,18 @@ Module* makeLLVMModule() { <p>The final step in creating our function is to create the instructions that make it up. Our <code>mul_add</code> function is composed of just three instructions: a multiply, an add, and a return. <code>LLVMBuilder</code> gives us a simple interface for constructing these instructions and appending them to the “entry” block. Each of the calls to <code>LLVMBuilder</code> returns a <code>Value*</code> that represents the value yielded by the instruction. You’ll also notice that, above, <code>x</code>, <code>y</code>, and <code>z</code> are also <code>Value*</code>’s, so it’s clear that instructions operate on <code>Value*</code>’s.</p> -<p>And that’s it! Now you can compile and run your code, and get a wonderful textual print out of the LLVM IR we saw at the beginning. To compile, use the following commandline as a guide:</p> +<p>And that’s it! Now you can compile and run your code, and get a wonderful textual print out of the LLVM IR we saw at the beginning. 
To compile, use the following command line as a guide:</p> <div class="doc_code"> <pre> -# c++ -g tut2.cpp `llvm-config --cppflags --ldflags --libs core` -o tut2 -# ./tut2 +# c++ -g tut1.cpp `llvm-config --cppflags --ldflags --libs core` -o tut1 +# ./tut1 </pre> </div> <p>The <code>llvm-config</code> utility is used to obtain the necessary GCC-compatible compiler flags for linking with LLVM. For this example, we only need the 'core' library. We'll use others once we start adding optimizers and the JIT engine.</p> +<a href="JITTutorial2.html">Next: A More Complicated Function</a> </div> <!-- *********************************************************************** --> diff --git a/docs/tutorial/JITTutorial2.html b/docs/tutorial/JITTutorial2.html index 261a794..ba72ea2 100644 --- a/docs/tutorial/JITTutorial2.html +++ b/docs/tutorial/JITTutorial2.html @@ -32,7 +32,7 @@ unsigned gcd(unsigned x, unsigned y) { if(x == y) { return x; - } else if(x < y) { + } else if(x < y) { return gcd(x, y - x); } else { return gcd(x - y, y); @@ -45,7 +45,7 @@ unsigned gcd(unsigned x, unsigned y) { <div style="text-align: center;"><img src="JITTutorial2-1.png" alt="GCD CFG" width="60%"></div> -<p>The above is a graphical representation of a program in LLVM IR. It places each basic block on a node of a graph, and uses directed edges to indicate flow control. These blocks will be serialized when written to a text or bitcode file, but it is often useful conceptually to think of them as a graph. Again, if you are unsure about the code in the diagram, you should skim through the <a href="../LangRef.html">LLVM Language Reference Manual</a> and convince yourself that it is, in fact, the GCD algorithm.</p> +<p>This is a graphical representation of a program in LLVM IR. It places each basic block on a node of a graph and uses directed edges to indicate flow control. 
These blocks will be serialized when written to a text or bitcode file, but it is often useful conceptually to think of them as a graph. Again, if you are unsure about the code in the diagram, you should skim through the <a href="../LangRef.html">LLVM Language Reference Manual</a> and convince yourself that it is, in fact, the GCD algorithm.</p> <p>The first part of our code is practically the same as from the first tutorial. The same basic setup is required: creating a module, verifying it, and running the <code>PrintModulePass</code> on it. Even the first segment of <code>makeLLVMModule()</code> looks essentially the same, except that <code>gcd</code> takes one fewer parameter than <code>mul_add</code>.</p> @@ -94,7 +94,7 @@ Module* makeLLVMModule() { <p>Here, however, is where our code begins to diverge from the first tutorial. Because <code>gcd</code> has control flow, it is composed of multiple blocks interconnected by branching (<code>br</code>) instructions. For those familiar with assembly language, a block is similar to a labeled set of instructions. For those not familiar with assembly language, a block is basically a set of instructions that can be branched to and is executed linearly until the block is terminated by one of a small number of control flow instructions, such as <code>br</code> or <code>ret</code>.</p> -<p>Blocks corresponds to the nodes in the diagram we looked at in the beginning of this tutorial. From the diagram, we can see that this function contains five blocks, so we'll go ahead and create them. Note that, in this code sample, we're making use of LLVM's automatic name uniquing, since we're giving two blocks the same name.</p> +<p>Blocks correspond to the nodes in the diagram we looked at in the beginning of this tutorial. From the diagram, we can see that this function contains five blocks, so we'll go ahead and create them. 
Note that we're making use of LLVM's automatic name uniquing in this code sample, since we're giving two blocks the same name.</p> <div class="doc_code"> <pre> @@ -106,7 +106,7 @@ Module* makeLLVMModule() { </pre> </div> -<p>Now, we're ready to begin generate code! We'll start with the <code>entry</code> block. This block corresponds to the top-level if-statement in the original C code, so we need to compare <code>x == y</code> To achieve this, we perform an explicity comparison using <code>ICmpEQ</code>. <code>ICmpEQ</code> stands for an <em>integer comparison for equality</em> and returns a 1-bit integer result. This 1-bit result is then used as the input to a conditional branch, with <code>ret</code> as the <code>true</code> and <code>cond_false</code> as the <code>false</code> case.</p> +<p>Now we're ready to begin generating code! We'll start with the <code>entry</code> block. This block corresponds to the top-level if-statement in the original C code, so we need to compare <code>x</code> and <code>y</code>. To achieve this, we perform an explicit comparison using <code>ICmpEQ</code>. <code>ICmpEQ</code> stands for an <em>integer comparison for equality</em> and returns a 1-bit integer result. This 1-bit result is then used as the input to a conditional branch, with <code>ret</code> as the <code>true</code> and <code>cond_false</code> as the <code>false</code> case.</p> <div class="doc_code"> <pre> @@ -116,7 +116,7 @@ Module* makeLLVMModule() { </pre> </div> -<p>Our next block, <code>ret</code>, is pretty simple: it just returns the value of <code>x</code>. Recall that this block is only reached if <code>x == y</code>, so this is the correct behavior. Notice that, instead of creating a new <code>LLVMBuilder</code> for each block, we can use <code>SetInsertPoint</code> to retarget our existing one. This saves on construction and memory allocation costs.</p> +<p>Our next block, <code>ret</code>, is pretty simple: it just returns the value of <code>x</code>. 
Recall that this block is only reached if <code>x == y</code>, so this is the correct behavior. Notice that instead of creating a new <code>LLVMBuilder</code> for each block, we can use <code>SetInsertPoint</code> to retarget our existing one. This saves on construction and memory allocation costs.</p> <div class="doc_code"> <pre> @@ -127,7 +127,7 @@ Module* makeLLVMModule() { <p><code>cond_false</code> is a more interesting block: we now know that <code>x != y</code>, so we must branch again to determine which of <code>x</code> and <code>y</code> is larger. This is achieved using the <code>ICmpULT</code> instruction, which stands for <em>integer comparison for unsigned less-than</em>. In LLVM, integer types do not carry sign; a 32-bit integer pseudo-register can interpreted as signed or unsigned without casting. Whether a signed or unsigned interpretation is desired is specified in the instruction. This is why several instructions in the LLVM IR, such as integer less-than, include a specifier for signed or unsigned.</p> -<p>Also, note that we're again making use of LLVM's automatic name uniquing, this time at a register level. We've deliberately chosen to name every instruction "tmp", to illustrate that LLVM will give them all unique names without getting confused.</p> +<p>Also note that we're again making use of LLVM's automatic name uniquing, this time at a register level. We've deliberately chosen to name every instruction "tmp" to illustrate that LLVM will give them all unique names without getting confused.</p> <div class="doc_code"> <pre> diff --git a/docs/tutorial/LangImpl1.html b/docs/tutorial/LangImpl1.html index 3cd26fa..487ddd9 100644 --- a/docs/tutorial/LangImpl1.html +++ b/docs/tutorial/LangImpl1.html @@ -54,9 +54,10 @@ teaching compiler techniques and LLVM specifically, <em>not</em> about teaching modern and sane software engineering principles. In practice, this means that we'll take a number of shortcuts to simplify the exposition. 
For example, the code leaks memory, uses global variables all over the place, doesn't use nice -design patterns like visitors, etc... but it is very simple. If you dig in and -use the code as a basis for future projects, fixing these deficiencies shouldn't -be hard.</p> +design patterns like <a +href="http://en.wikipedia.org/wiki/Visitor_pattern">visitors</a>, etc... but it +is very simple. If you dig in and use the code as a basis for future projects, +fixing these deficiencies shouldn't be hard.</p> <p>I've tried to put this tutorial together in a way that makes chapters easy to skip over if you are already familiar with or are uninterested in the various @@ -328,6 +329,7 @@ build an Abstract Syntax Tree</a>. When we have that, we'll include a driver so that you can use the lexer and parser together. </p> +<a href="LangImpl2.html">Next: Implementing a Parser and AST</a> </div> <!-- *********************************************************************** --> diff --git a/docs/tutorial/LangImpl2.html b/docs/tutorial/LangImpl2.html index dbd617c..b7f968b 100644 --- a/docs/tutorial/LangImpl2.html +++ b/docs/tutorial/LangImpl2.html @@ -98,7 +98,7 @@ know what the stored numeric value is.</p> <p>Right now we only create the AST, so there are no useful accessor methods on them. It would be very easy to add a virtual method to pretty print the code, for example. Here are the other expression AST node definitions that we'll use -in the basic form of the Kaleidoscope language. +in the basic form of the Kaleidoscope language: </p> <div class="doc_code"> @@ -130,7 +130,7 @@ public: </pre> </div> -<p>This is all (intentially) rather straight-forward: variables capture the +<p>This is all (intentionally) rather straight-forward: variables capture the variable name, binary operators capture their opcode (e.g. '+'), and calls capture a function name as well as a list of any argument expressions. 
One thing that is nice about our AST is that it captures the language features without @@ -201,7 +201,7 @@ calls like this:</p> <div class="doc_code"> <pre> /// CurTok/getNextToken - Provide a simple token buffer. CurTok is the current -/// token the parser it looking at. getNextToken reads another token from the +/// token the parser is looking at. getNextToken reads another token from the /// lexer and updates CurTok with its results. static int CurTok; static int getNextToken() { @@ -263,11 +263,11 @@ static ExprAST *ParseNumberExpr() { <p>This routine is very simple: it expects to be called when the current token is a <tt>tok_number</tt> token. It takes the current number value, creates -a <tt>NumberExprAST</tt> node, advances the lexer to the next token and finally +a <tt>NumberExprAST</tt> node, advances the lexer to the next token, and finally returns.</p> <p>There are some interesting aspects to this. The most important one is that -this routine eats all of the tokens that correspond to the production, and +this routine eats all of the tokens that correspond to the production and returns the lexer buffer with the next token (which is not part of the grammar production) ready to go. This is a fairly standard way to go for recursive descent parsers. For a better example, the parenthesis operator is defined like @@ -293,7 +293,7 @@ static ExprAST *ParseParenExpr() { parser:</p> <p> -1) it shows how we use the Error routines. When called, this function expects +1) It shows how we use the Error routines. When called, this function expects that the current token is a '(' token, but after parsing the subexpression, it is possible that there is no ')' waiting. For example, if the user types in "(4 x" instead of "(4)", the parser should emit an error. Because errors can @@ -305,8 +305,8 @@ calling <tt>ParseExpression</tt> (we will soon see that <tt>ParseExpression</tt> <tt>ParseParenExpr</tt>). 
This is powerful because it allows us to handle recursive grammars, and keeps each production very simple. Note that parentheses do not cause construction of AST nodes themselves. While we could -do it this way, the most important role of parens are to guide the parser and -provide grouping. Once the parser constructs the AST, parens are not +do it this way, the most important role of parentheses are to guide the parser +and provide grouping. Once the parser constructs the AST, parentheses are not needed.</p> <p>The next simple production is for handling variable references and function @@ -350,21 +350,21 @@ static ExprAST *ParseIdentifierExpr() { </pre> </div> -<p>This routine follows the same style as the other routines (it expects to be +<p>This routine follows the same style as the other routines. (It expects to be called if the current token is a <tt>tok_identifier</tt> token). It also has recursion and error handling. One interesting aspect of this is that it uses <em>look-ahead</em> to determine if the current identifier is a stand alone variable reference or if it is a function call expression. It handles this by -checking to see if the token after the identifier is a '(' token, and constructs +checking to see if the token after the identifier is a '(' token, constructing either a <tt>VariableExprAST</tt> or <tt>CallExprAST</tt> node as appropriate. </p> -<p>Now that we have all of our simple expression parsing logic in place, we can -define a helper function to wrap it together into one entry-point. We call this +<p>Now that we have all of our simple expression-parsing logic in place, we can +define a helper function to wrap it together into one entry point. We call this class of expressions "primary" expressions, for reasons that will become more clear <a href="LangImpl6.html#unary">later in the tutorial</a>. 
In order to parse an arbitrary primary expression, we need to determine what sort of -specific expression it is:</p> +expression it is:</p> <div class="doc_code"> <pre> @@ -383,13 +383,13 @@ static ExprAST *ParsePrimary() { </pre> </div> -<p>Now that you see the definition of this function, it makes it more obvious -why we can assume the state of CurTok in the various functions. This uses -look-ahead to determine which sort of expression is being inspected, and parses -it with a function call.</p> +<p>Now that you see the definition of this function, it is more obvious why we +can assume the state of CurTok in the various functions. This uses look-ahead +to determine which sort of expression is being inspected, and then parses it +with a function call.</p> -<p>Now that basic expressions are handled, we need to handle binary expressions, -which are a bit more complex.</p> +<p>Now that basic expressions are handled, we need to handle binary expressions. +They are a bit more complex.</p> </div> @@ -447,12 +447,12 @@ int main() { or -1 if the token is not a binary operator. Having a map makes it easy to add new operators and makes it clear that the algorithm doesn't depend on the specific operators involved, but it would be easy enough to eliminate the map -and do the comparisons in the <tt>GetTokPrecedence</tt> function (or just use +and do the comparisons in the <tt>GetTokPrecedence</tt> function. (Or just use a fixed-size array).</p> <p>With the helper above defined, we can now start parsing binary expressions. The basic idea of operator precedence parsing is to break down an expression -with potentially ambiguous binary operators into pieces. Consider for example +with potentially ambiguous binary operators into pieces. Consider ,for example, the expression "a+b+(c+d)*e*f+g". Operator precedence parsing considers this as a stream of primary expressions separated by binary operators. 
As such, it will first parse the leading primary expression "a", then it will see the @@ -708,7 +708,7 @@ static FunctionAST *ParseTopLevelExpr() { </pre> </div> -<p>Now that we have all the pieces, lets build a little driver that will let us +<p>Now that we have all the pieces, let's build a little driver that will let us actually <em>execute</em> this code we've built!</p> </div> @@ -732,7 +732,7 @@ static void MainLoop() { fprintf(stderr, "ready> "); switch (CurTok) { case tok_eof: return; - case ';': getNextToken(); break; // ignore top level semicolons. + case ';': getNextToken(); break; // ignore top-level semicolons. case tok_def: HandleDefinition(); break; case tok_extern: HandleExtern(); break; default: HandleTopLevelExpression(); break; @@ -742,13 +742,13 @@ static void MainLoop() { </pre> </div> -<p>The most interesting part of this is that we ignore top-level semi colons. +<p>The most interesting part of this is that we ignore top-level semicolons. Why is this, you ask? The basic reason is that if you type "4 + 5" at the command line, the parser doesn't know whether that is the end of what you will type or not. For example, on the next line you could type "def foo..." in which case 4+5 is the end of a top-level expression. Alternatively you could type "* 6", which would continue the expression. Having top-level semicolons allows you to -type "4+5;" and the parser will know you are done.</p> +type "4+5;", and the parser will know you are done.</p> </div> @@ -760,8 +760,8 @@ type "4+5;" and the parser will know you are done.</p> <p>With just under 400 lines of commented code (240 lines of non-comment, non-blank code), we fully defined our minimal language, including a lexer, -parser and AST builder. With this done, the executable will validate -Kaleidoscope code and tell us if it is gramatically invalid. For +parser, and AST builder. With this done, the executable will validate +Kaleidoscope code and tell us if it is grammatically invalid. 
For example, here is a sample interaction:</p> <div class="doc_code"> @@ -798,8 +798,8 @@ Representation (IR) from the AST.</p> <p> Here is the complete code listing for this and the previous chapter. Note that it is fully self-contained: you don't need LLVM or any external -libraries at all for this (other than the C and C++ standard libraries of -course). To build this, just compile with:</p> +libraries at all for this. (Besides the C and C++ standard libraries, of +course.) To build this, just compile with:</p> <div class="doc_code"> <pre> @@ -955,7 +955,7 @@ public: //===----------------------------------------------------------------------===// /// CurTok/getNextToken - Provide a simple token buffer. CurTok is the current -/// token the parser it looking at. getNextToken reads another token from the +/// token the parser is looking at. getNextToken reads another token from the /// lexer and updates CurTok with its results. static int CurTok; static int getNextToken() { @@ -1167,7 +1167,7 @@ static void HandleExtern() { } static void HandleTopLevelExpression() { - // Evaluate a top level expression into an anonymous function. + // Evaluate a top-level expression into an anonymous function. if (FunctionAST *F = ParseTopLevelExpr()) { fprintf(stderr, "Parsed a top-level expr\n"); } else { @@ -1182,7 +1182,7 @@ static void MainLoop() { fprintf(stderr, "ready> "); switch (CurTok) { case tok_eof: return; - case ';': getNextToken(); break; // ignore top level semicolons. + case ';': getNextToken(); break; // ignore top-level semicolons. 
case tok_def: HandleDefinition(); break; case tok_extern: HandleExtern(); break; default: HandleTopLevelExpression(); break; @@ -1211,6 +1211,7 @@ int main() { } </pre> </div> +<a href="LangImpl3.html">Next: Implementing Code Generation to LLVM IR</a> </div> <!-- *********************************************************************** --> diff --git a/docs/tutorial/LangImpl3.html b/docs/tutorial/LangImpl3.html index 768a783..60ad5f1 100644 --- a/docs/tutorial/LangImpl3.html +++ b/docs/tutorial/LangImpl3.html @@ -59,8 +59,8 @@ LLVM SVN to work. LLVM 2.1 and before will not work with it.</p> <div class="doc_text"> <p> -In order to generate LLVM IR, we want some simple setup to get started. First, -we define virtual codegen methods in each AST class:</p> +In order to generate LLVM IR, we want some simple setup to get started. First +we define virtual code generation (codegen) methods in each AST class:</p> <div class="doc_code"> <pre> @@ -95,9 +95,11 @@ href="http://en.wikipedia.org/wiki/Static_single_assignment_form">Static Single Assignment</a> - the concepts are really quite natural once you grok them.</p> <p>Note that instead of adding virtual methods to the ExprAST class hierarchy, -it could also make sense to use a visitor pattern or some other way to model -this. Again, this tutorial won't dwell on good software engineering practices: -for our purposes, adding a virtual method is simplest.</p> +it could also make sense to use a <a +href="http://en.wikipedia.org/wiki/Visitor_pattern">visitor pattern</a> or some +other way to model this. Again, this tutorial won't dwell on good software +engineering practices: for our purposes, adding a virtual method is +simplest.</p> <p>The second thing we want is an "Error" method like we used for the parser, which will @@ -121,16 +123,15 @@ uses to contain code.</p> <p>The <tt>Builder</tt> object is a helper object that makes it easy to generate LLVM instructions. 
Instances of the <a -href="http://llvm.org/doxygen/LLVMBuilder_8h-source.html"><tt>LLVMBuilder</tt> -class</a> keep track of the current place to -insert instructions and has methods to create new instructions.</p> +href="http://llvm.org/doxygen/LLVMBuilder_8h-source.html"><tt>LLVMBuilder</tt></a> +class keep track of the current place to insert instructions and has methods to +create new instructions.</p> <p>The <tt>NamedValues</tt> map keeps track of which values are defined in the -current scope and what their LLVM representation is (in other words, it is a -symbol table for the code). In this form of -Kaleidoscope, the only things that can be referenced are function parameters. -As such, function parameters will be in this map when generating code for their -function body.</p> +current scope and what their LLVM representation is. (In other words, it is a +symbol table for the code). In this form of Kaleidoscope, the only things that +can be referenced are function parameters. As such, function parameters will +be in this map when generating code for their function body.</p> <p> With these basics in place, we can start talking about how to generate code for @@ -148,7 +149,7 @@ has already been done, and we'll just use it to emit code. <div class="doc_text"> <p>Generating LLVM code for expression nodes is very straightforward: less -than 45 lines of commented code for all four of our expression nodes. First, +than 45 lines of commented code for all four of our expression nodes. First we'll do numeric literals:</p> <div class="doc_code"> @@ -218,11 +219,13 @@ code, we do a simple switch on the opcode to create the right LLVM instruction. LLVMBuilder knows where to insert the newly created instruction, all you have to do is specify what instruction to create (e.g. with <tt>CreateAdd</tt>), which operands to use (<tt>L</tt> and <tt>R</tt> here) and optionally provide a name -for the generated instruction. 
One nice thing about LLVM is that the name is -just a hint: if there are multiple additions in a single function, the first -will be named "addtmp" and the second will be "autorenamed" by adding a suffix, -giving it a name like "addtmp42". Local value names for instructions are purely -optional, but it makes it much easier to read the IR dumps.</p> +for the generated instruction.</p> + +<p>One nice thing about LLVM is that the name is just a hint. For instance, if +the code above emits multiple "addtmp" variables, LLVM will automatically +provide each one with an increasing, unique numeric suffix. Local value names +for instructions are purely optional, but it makes it much easier to read the +IR dumps.</p> <p><a href="../LangRef.html#instref">LLVM instructions</a> are constrained by strict rules: for example, the Left and Right operators of @@ -1228,6 +1231,7 @@ int main() { } </pre> </div> +<a href="LangImpl4.html">Next: Adding JIT and Optimizer Support</a> </div> <!-- *********************************************************************** --> diff --git a/docs/tutorial/LangImpl4.html b/docs/tutorial/LangImpl4.html index db76cb3..4b9b8c5 100644 --- a/docs/tutorial/LangImpl4.html +++ b/docs/tutorial/LangImpl4.html @@ -1119,6 +1119,7 @@ int main() { </pre> </div> +<a href="LangImpl5.html">Next: Extending the language: control flow</a> </div> <!-- *********************************************************************** --> diff --git a/docs/tutorial/LangImpl5.html b/docs/tutorial/LangImpl5.html index 44b0ccf..f40efb2 100644 --- a/docs/tutorial/LangImpl5.html +++ b/docs/tutorial/LangImpl5.html @@ -1745,6 +1745,7 @@ int main() { </pre> </div> +<a href="LangImpl6.html">Next: Extending the language: user-defined operators</a> </div> <!-- *********************************************************************** --> diff --git a/docs/tutorial/LangImpl6.html b/docs/tutorial/LangImpl6.html index 022d6fa..62ce43e 100644 --- a/docs/tutorial/LangImpl6.html +++ 
b/docs/tutorial/LangImpl6.html @@ -1784,6 +1784,7 @@ int main() { </pre> </div> +<a href="LangImpl7.html">Next: Extending the language: mutable variables / SSA construction</a> </div> <!-- *********************************************************************** --> diff --git a/docs/tutorial/LangImpl7.html b/docs/tutorial/LangImpl7.html index 3cd02a7..7138cbd 100644 --- a/docs/tutorial/LangImpl7.html +++ b/docs/tutorial/LangImpl7.html @@ -2140,6 +2140,7 @@ int main() { </pre> </div> +<a href="LangImpl8.html">Next: Conclusion and other useful LLVM tidbits</a> </div> <!-- *********************************************************************** --> |