This document contains the release notes for the LLVM compiler
infrastructure, release 1.2. Here we describe the status of LLVM, including any
known problems and bug fixes from the previous release. The most up-to-date
version of this document can be found on the LLVM 1.2 web site. If you are
not reading this on the LLVM web pages, you should probably go there because
this document may be updated after the release.
For more information about LLVM, including information about potentially more
current releases, please check out the main
web site. If you have questions or comments, the LLVM developer's mailing
list is a good place to send them.
Note that if you are reading this file from CVS, this document applies
to the next release, not the current one. To see the release notes for
the current or previous releases, see the releases page.
This is the third public release of the LLVM compiler infrastructure. This
release incorporates several new features (including
exception handling support for the native code generators, the start of a
source-level debugger, and profile guided optimizer components), many speedups and code quality
improvements, documentation improvements, and a small collection of important bug fixes. Overall, this is our highest quality release to
date, and we encourage you to upgrade if you are using LLVM 1.0 or 1.1.
At this time, LLVM is known to correctly compile and run all C & C++ SPEC
CPU2000 benchmarks, the Olden benchmarks, and the Ptrdist benchmarks. It has
also been used to compile many other programs. LLVM now also works with
a broad variety of C++ programs, though it has still received less testing than
the C front-end.
- A new LLVM source-level debugger has been started.
- LLVM 1.2 encodes bytecode files for large programs in 10-30% less space.
- LLVM can now feed profile information back into optimizers for Profile Guided Optimization, includes a simple basic block reordering pass, and supports edge profiling as well as function and block-level profiling.
- The LLVM JIT lazily initializes global variables, reducing startup time for programs with lots of globals (like C++ programs).
- The build and installation infrastructure in this release is dramatically
improved. There is now an autoconf/AutoRegen.sh script
that you can run to rebuild the configure script and its associated
files, as well as beta support for "make install" and RPM package generation.
- The "tblgen" tool is now documented.
- The target-independent code generator got several improvements:
  - It can now fold spill code into instructions (on targets that support it).
  - A generic machine code spiller/rewriter was added. It provides an API for
    global register allocators to eliminate virtual registers and add the
    appropriate spill code.
  - The representation of machine code basic blocks is more efficient and has
    an easier-to-use interface.
- LLVM no longer depends on the boost library.
- The X86 backend now generates substantially better native code, and is faster.
- The C backend has been moved from the "llvm-dis" tool to the "llc"
tool. You can activate it with "llc -march=c foo.bc -o foo.c".
- LLVM includes a new interprocedural optimization that marks global variables
"constant" when they are provably never written to.
- LLVM now includes a new interprocedural optimization that converts small "by reference" arguments to "by value" arguments, which often improves the performance of C++ programs substantially.
- Bugpoint can now do a better job reducing miscompilation problems by
reducing programs down to a particular loop nest, instead of just the function
being miscompiled.
- The GCSE and LICM passes can now operate on side-effect-free function calls, for example hoisting calls to "strlen" and folding "cos" common subexpressions.
- LLVM has early support for a new select instruction, though it is
currently only supported by the C backend.
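As a rough source-level illustration of the GCSE/LICM item above, the two functions below compute the same result. The first calls strlen in the loop condition; the second is what hoisting that call out of the loop amounts to when written by hand. The function names are hypothetical, and this sketches the effect of the transformation, not the pass itself. It assumes the loop body performs no stores, so the call is loop-invariant.

```c
#include <stddef.h>
#include <string.h>

/* As written: strlen(s) sits in the loop condition, so a straightforward
   compilation re-evaluates it on every iteration. */
size_t count_spaces_naive(const char *s) {
    size_t n = 0;
    size_t i;
    for (i = 0; i < strlen(s); ++i)
        if (s[i] == ' ')
            ++n;
    return n;
}

/* Hand-hoisted equivalent: the loop never writes to memory, so the
   side-effect-free call can be evaluated once before the loop. */
size_t count_spaces_hoisted(const char *s) {
    size_t n = 0;
    size_t len = strlen(s);
    size_t i;
    for (i = 0; i < len; ++i)
        if (s[i] == ' ')
            ++n;
    return n;
}
```

With the call hoisted, the loop does work proportional to the string length instead of to its square.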
In this release, the following missing features were implemented:
- Exception handling in the X86 & Sparc native code generators is now supported
- The C/C++ front-end now supports the GCC __builtin_return_address and __builtin_frame_address extensions. These are also supported by the X86 backend and by the C backend.
- [X86] Missing cast from ULong -> Double, cast FP -> bool and support for -9223372036854775808
- The C/C++ front-end now supports
the "labels as values" GCC extension, often used to build "threaded interpreters".
- JIT should lazily initialize global variables
- [X86] X86 Backend never releases memory for machine code structures
- [vmcore] OpaqueType objects memory leak
- [llvmgcc] C front-end does not compile "extern inline" into linkonce
- Bytecode format inconsistent
- [loadvn/inline/scalarrepl] Slow optimizations with extremely large basic blocks
- [asmparser] Really slow parsing of types with complex upreferences
- [llvmgcc] C front-end does not emit 'zeroinitializer' when possible
- [llvmgcc] Structure copies result in a LOT of code
- LLVM is now much more memory efficient when handling large zero initialized arrays
- [llvmgcc] Local array initializers are expanded into large amounts of code
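For readers unfamiliar with the "labels as values" extension mentioned above, here is a minimal sketch of the threaded-interpreter pattern it enables. The opcode set and the run function are hypothetical and exist only for illustration; the &&label and goto * syntax is the GCC extension itself.

```c
/* Opcodes for a tiny, hypothetical accumulator machine. */
enum { OP_INC, OP_DOUBLE, OP_DEC, OP_HALT };

/* Execute a program and return the final accumulator value. */
int run(const unsigned char *code) {
    /* Table of label addresses, indexed by opcode. */
    static void *dispatch[] = { &&do_inc, &&do_double, &&do_dec, &&do_halt };
    int acc = 0;

    /* Each handler ends by jumping directly to the next handler. */
    goto *dispatch[*code++];

do_inc:
    acc += 1;
    goto *dispatch[*code++];
do_double:
    acc *= 2;
    goto *dispatch[*code++];
do_dec:
    acc -= 1;
    goto *dispatch[*code++];
do_halt:
    return acc;
}
```

Compared with a switch inside a loop, each handler dispatches to the next one itself, so there is no shared dispatch point to return to.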
In this release, the following build problems were fixed:
- [build] Makefiles break if C frontend target string has unexpected value
- [build] hard-wired assumption that shared-library extension is ".so"
- make tools-only doesn't make lib/Support
- [loopsimplify] Many pointless phi nodes are created
- [x86] weird stack/frame pointer manipulation
- The X86 backend now generates fchs to negate floating point numbers,
compiles memcpy() into the rep movs instruction, and makes much better
use of powerful addressing modes and instructions.
Bugs in the LLVM Core:
- [licm] LICM promotes volatile memory locations to registers
- [licm] Memory read after free causes infrequent crash
- [indvars] Induction variable canonicalization always makes 32-bit indvars
- [constantmerge] Merging globals can cause use of invalid pointers!
- [bcreader] Bytecode reader misreads 'long -9223372036854775808'!
- Tail duplication does not update SSA form correctly.
- VMCore mishandles double -0.0
- [X86] X86 backend code generates -0.0 as +0.0
- [loopsimplify] Loopsimplify incorrectly updates dominator information
- [pruneeh] -pruneeh pass removes invoke instructions it shouldn't
- [sparc] Boolean constants are emitted as true and false
- [interpreter] va_list values silently corrupted by function calls
- Tablegen aborts on errors
- [inliner] Error inlining intrinsic calls into invoke instructions
- Linking weak and strong global variables is dependent on link order
- Variables used to define non-printable FP constants are externally visible
- CBE gives linkonce functions wrong linkage semantics
- [JIT] Programs cannot resolve the fstat function
- [indvars] Induction variable analysis violates LLVM invariants
- [execution engines] Unhandled cast constant expression
Bugs in the C/C++ front-end:
- Need weak linkage on memory management functions in libc runtime to allow them to be overridden
- [llvm-gcc] asserts when an extern inline function is redefined
- [llvmg++] Dynamically initialized constants cannot be marked 'constant'
- [llvmgcc] floating-point unary minus is incorrect for +0.0
- [llvm-gcc] miscompilation of 'X = Y = Z' with aggregate values
- [llvmgcc] Invalid code created for complex division operation
- [llvmgcc] Incorrect code generation for pointer subtraction
- [llvmg++] Crash assigning pointers-to-members with casted types
- [llvm-g++] Cleanups and gotos don't mix properly
- [llvmgcc] Crash on auto register variable with specific register specified
LLVM has been extensively tested on Intel and AMD machines running Red
Hat Linux and FreeBSD. It has also been tested on Sun UltraSPARC workstations running Solaris 8.
Additionally, LLVM works on Mac OS X 10.3 and above, but only with the C backend or
interpreter (no native backend for the PowerPC is available yet).
The core LLVM infrastructure uses "autoconf" for portability, so hopefully we
work on more platforms than that. However, it is likely that we
missed something and that minor porting is required to get LLVM to work on
new platforms. We welcome portability patches and error messages.
This section contains all known problems with the LLVM system, listed by
component. As new problems are discovered, they will be added to these
sections. If you run into a problem, please check the LLVM bug database and submit a bug if
there isn't already one.
The following components of this LLVM release are either untested, known to be
broken or unreliable, or are in early development. These components should not
be relied on, and bugs should not be filed against them, but they may be useful
to some people. In particular, if you would like to work on one of these
components, please contact us on the llvmdev list.
- The following passes are incomplete or buggy: -pgmdep, -memdep,
-ipmodref, -sortstructs, -swapstructs, -cee
- The -pre pass is incomplete (there are cases it doesn't handle that
it should) and not thoroughly tested.
- The llvm-ar tool is incomplete and probably buggy.
- The llvm-db tool is in a very early stage of development.
- Inline assembly is not yet supported.
- "long double" is transformed by the front-end into "double". There is no
support for floating point data types of any size other than 32 and 64
bits.
- The following Unix system functionality has not been tested and may not
work:
  - sigsetjmp, siglongjmp - These are not turned into the
    appropriate invoke/unwind instructions. Note that
    setjmp and longjmp are compiled correctly.
  - getcontext, setcontext, makecontext - These functions have not been tested.
- Although many GCC extensions are supported, some are not. In particular,
the following extensions are known to not be supported:
  - Local Labels: Labels local to a block.
  - Nested Functions: As in Algol and Pascal, lexical scoping of functions.
  - Constructing Calls: Dispatching a call to another function.
  - Extended Asm: Assembler instructions with C expressions as operands.
  - Constraints: Constraints for asm operands.
  - Asm Labels: Specifying the assembler name to use for a C symbol.
  - Explicit Reg Vars: Defining variables residing in specified registers.
  - Vector Extensions: Using vector instructions through built-in functions.
  - Target Builtins: Built-in functions specific to particular targets.
  - Thread-Local: Per-thread variables.
  - Pragmas: Pragmas accepted by GCC.
The following GCC extensions are partially supported. An ignored
attribute means that the LLVM compiler ignores the presence of the attribute,
but the code should still work. An unsupported attribute is one which is
ignored by the LLVM compiler and will cause a different interpretation of
the program.
- Variable Length:
Arrays whose length is computed at run time.
Supported, but allocated stack space is not freed until the function returns (noted above).
- Function Attributes:
Declaring that functions have no side effects or that they can never
return.
Supported: format, format_arg, nonnull,
constructor, destructor, unused,
deprecated, warn_unused_result, weak
Ignored: noreturn, noinline,
always_inline, pure, const, nothrow,
malloc, no_instrument_function, cdecl
Unsupported: used, section, alias,
visibility, regparm, stdcall,
fastcall, all other target specific attributes
- Variable Attributes:
Specifying attributes of variables.
Supported: cleanup, common, nocommon,
deprecated, transparent_union,
unused, weak
Unsupported: aligned, mode, packed,
section, shared, tls_model,
vector_size, dllimport,
dllexport, all target specific attributes.
- Type Attributes: Specifying attributes of types.
Supported: transparent_union, unused,
deprecated, may_alias
Unsupported: aligned, packed,
all target specific attributes.
- Other Builtins:
Other built-in functions.
We support all builtins which have a C language equivalent (e.g.,
__builtin_cos), __builtin_alloca,
__builtin_types_compatible_p, __builtin_choose_expr,
__builtin_constant_p, and __builtin_expect (ignored).
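The builtins named in the last item can be exercised with a small sketch like the one below. The macro and function names (IS_INT, TYPE_NAME, checked_div) are hypothetical; only the __builtin_* calls and __typeof__ are GCC extensions. Note that __builtin_choose_expr is available in C only, not C++.

```c
/* __builtin_types_compatible_p yields 1 when the two types are compatible
   (ignoring top-level qualifiers) and 0 otherwise. */
#define IS_INT(x) __builtin_types_compatible_p(__typeof__(x), int)

/* __builtin_choose_expr picks one of two expressions at compile time;
   the expression that is not chosen is never evaluated. */
#define TYPE_NAME(x) __builtin_choose_expr(IS_INT(x), "int", "other")

/* __builtin_expect is purely a hint: its value is always its first
   argument, so a compiler that ignores the hint still produces correct
   code. */
int checked_div(int a, int b) {
    if (__builtin_expect(b == 0, 0))
        return 0;   /* unlikely path: avoid dividing by zero */
    return a / b;
}
```

Because __builtin_expect only affects code layout, treating it as "ignored" changes performance at most, never behavior.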
The following extensions are known to be supported:
- Labels as Values: Getting pointers to labels and computed gotos.
- Statement Exprs: Putting statements and declarations inside expressions.
- Typeof: typeof: referring to the type of an expression.
- Lvalues: Using ?:, "," and casts in lvalues.
- Conditionals: Omitting the middle operand of a ?: expression.
- Long Long: Double-word integers.
- Complex: Data types for complex numbers.
- Hex Floats: Hexadecimal floating-point constants.
- Zero Length: Zero-length arrays.
- Empty Structures: Structures with no members.
- Variadic Macros: Macros with a variable number of arguments.
- Escaped Newlines: Slightly looser rules for escaped newlines.
- Subscripting: Any array can be subscripted, even if not an lvalue.
- Pointer Arith: Arithmetic on void-pointers and function pointers.
- Initializers: Non-constant initializers.
- Compound Literals: Compound literals give structures, unions,
or arrays as values.
- Designated Inits: Labeling elements of initializers.
- Cast to Union: Casting to union type from any member of the union.
- Case Ranges: `case 1 ... 9' and such.
- Mixed Declarations: Mixing declarations and code.
- Function Prototypes: Prototype declarations and old-style definitions.
- C++ Comments: C++ comments are recognized.
- Dollar Signs: Dollar sign is allowed in identifiers.
- Character Escapes: \e stands for the character <ESC>.
- Alignment: Inquiring about the alignment of a type or variable.
- Inline: Defining inline functions (as fast as macros).
- Alternate Keywords: __const__, __asm__, etc., for header files.
- Incomplete Enums: enum foo;, with details to follow.
- Function Names: Printable strings which are the name of the current function.
- Return Address: Getting the return or frame address of a function.
- Unnamed Fields: Unnamed struct/union fields within structs/unions.
- Attribute Syntax: Formal syntax for attributes.
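A few of the supported extensions listed above are easiest to understand by example. The sketch below combines statement expressions, typeof, case ranges, and the omitted middle operand of ?:. The names MAX, classify, and or_default are hypothetical and chosen only for illustration.

```c
/* Statement expression + typeof: a MAX macro that evaluates each
   argument exactly once, unlike the classic ((a) > (b) ? (a) : (b)). */
#define MAX(a, b) \
    ({ __typeof__(a) a_ = (a); __typeof__(b) b_ = (b); a_ > b_ ? a_ : b_; })

/* Case ranges: classify a character without listing every value. */
int classify(char c) {
    switch (c) {
    case '0' ... '9': return 1;   /* digit */
    case 'a' ... 'z': return 2;   /* lower-case letter */
    default:          return 0;   /* anything else */
    }
}

/* Omitted middle operand: x ?: y means x ? x : y, with x evaluated once. */
int or_default(int value, int fallback) {
    return value ?: fallback;
}
```

For instance, MAX(i++, 2) increments i only once, which the plain conditional-expression form of the macro cannot guarantee.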
If you run into GCC extensions which have not been included in any of these
lists, please let us know (also including whether or not they work).
For this release, the C++ front-end is considered to be fully functional but
has not been tested as thoroughly as the C front-end. It has been tested and
works for a number of non-trivial programs, but there may be lurking bugs.
Please report any bugs or problems.
A wide variety of additional information is available on the LLVM web page,
including mailing lists and publications describing algorithms and components
implemented in LLVM. The web page also contains versions of the API
documentation which are up-to-date with the CVS version of the source code. You
can access versions of these documents specific to this release by going into
the "llvm/doc/" directory in the LLVM tree.
If you have any questions or comments about LLVM, please feel free to contact
us via the mailing lists.