From e264f62ca09a8f65c87a46d562a4d0f9ec5d457e Mon Sep 17 00:00:00 2001
From: Shih-wei Liao
Date: Wed, 10 Feb 2010 11:10:31 -0800
Subject: Check in LLVM r95781.

---
 docs/ExtendedIntegerResults.txt | 133 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 133 insertions(+)
 create mode 100644 docs/ExtendedIntegerResults.txt

diff --git a/docs/ExtendedIntegerResults.txt b/docs/ExtendedIntegerResults.txt
new file mode 100644
index 0000000..44e9fbf
--- /dev/null
+++ b/docs/ExtendedIntegerResults.txt
@@ -0,0 +1,133 @@
+//===----------------------------------------------------------------------===//
+// Representing sign/zero extension of function results
+//===----------------------------------------------------------------------===//
+
+Mar 25, 2009 - Initial Revision
+
+Most ABIs specify that functions which return small integers do so in a
+specific integer GPR. This is an efficient way to go, but raises the question:
+if the returned value is smaller than the register, what do the high bits hold?
+
+There are three (interesting) possible answers: undefined, zero extended, or
+sign extended. The number of bits in question depends on the data-type that
+the front-end is referencing (typically i1/i8/i16/i32).
+
+Knowing the answer to this is important for two reasons: 1) we want to be able
+to implement the ABI correctly. If we need to sign extend the result according
+to the ABI, we really do need to do this to preserve correctness. 2)
+this information is often useful for optimization purposes, and we want the
+mid-level optimizers to be able to process this (e.g. eliminate redundant
+extensions).
+
+For example, let's pretend that X86 requires the callee to properly extend the
+result of a return (I'm not sure this is the case, but the argument doesn't
+depend on this). Given this, we should compile this:
+
+int a();
+short b() { return a(); }
+
+into:
+
+_b:
+        subl    $12, %esp
+        call    L_a$stub
+        addl    $12, %esp
+        cwtl
+        ret
+
+An optimization example is that we should be able to eliminate the explicit
+sign extension in this example:
+
+short y();
+int z() {
+  return ((int)y() << 16) >> 16;
+}
+
+_z:
+        subl    $12, %esp
+        call    _y
+        ;; movswl %ax, %eax -> not needed because eax is already sext'd
+        addl    $12, %esp
+        ret
+
+//===----------------------------------------------------------------------===//
+// What we have right now.
+//===----------------------------------------------------------------------===//
+
+Currently, these sorts of things are modelled by compiling a function to return
+the small type and attaching a signext/zeroext marker. For example, we compile
+z into:
+
+define i32 @z() nounwind {
+entry:
+  %0 = tail call signext i16 (...)* @y() nounwind
+  %1 = sext i16 %0 to i32
+  ret i32 %1
+}
+
+and b into:
+
+define signext i16 @b() nounwind {
+entry:
+  %0 = tail call i32 (...)* @a() nounwind ; [#uses=1]
+  %retval12 = trunc i32 %0 to i16 ; [#uses=1]
+  ret i16 %retval12
+}
+
+This has some problems: 1) the actual precise semantics are really poorly
+defined (see PR3779). 2) some targets might want the caller to extend, some
+might want the callee to extend. 3) the mid-level optimizer doesn't know the
+size of the GPR, so it doesn't know that %0 is sign extended up to 32-bits
+here, and even if it did, it could not eliminate the sext. 4) the code
+generator has historically assumed that the result is extended to i32, which is
+a problem on PIC16 (and is also probably wrong on alpha and other 64-bit
+targets).
+
+//===----------------------------------------------------------------------===//
+// The proposal
+//===----------------------------------------------------------------------===//
+
+I suggest that we have the front-end fully lower out the ABI issues here to
+LLVM IR. This makes it 100% explicit what is going on and means that there is
+no cause for confusion. For example, the cases above should compile into:
+
+define i32 @z() nounwind {
+entry:
+  %0 = tail call i32 (...)* @y() nounwind
+  %1 = trunc i32 %0 to i16
+  %2 = sext i16 %1 to i32
+  ret i32 %2
+}
+define i32 @b() nounwind {
+entry:
+  %0 = tail call i32 (...)* @a() nounwind
+  %retval12 = trunc i32 %0 to i16
+  %tmp = sext i16 %retval12 to i32
+  ret i32 %tmp
+}
+
+In this model, no functions will return an i1/i8/i16 (and on an x86-64 target
+that extends results to i64, no i32). This solves the ambiguity issue, allows us
+to fully describe all possible ABIs, and now allows the optimizers to reason
+about and eliminate these extensions.
+
+The one thing that is missing is the ability for the front-end and optimizer to
+specify/infer the guarantees provided by the ABI to allow other optimizations.
+For example, in the y/z case, since y is known to return a sign extended value,
+the trunc/sext in z should be eliminable.
+
+This can be done by introducing new sext/zext attributes which mean "I know
+that the result of the function is sign extended at least N bits". Given this,
+and given that it is stuck on the y function, the mid-level optimizer could
+easily eliminate the extensions etc with existing functionality.
+
+The major disadvantage of doing this sort of thing is that it makes the ABI
+lowering stuff even more explicit in the front-end, and that we would like to
+eventually move to having the code generator do more of this work. However,
+the sad truth of the matter is that this is a) unlikely to happen anytime in
+the near future, and b) no worse than what we have now with the existing
+attributes.
+
+C compilers fundamentally have to reason about the target in many ways.
+This is ugly and horrible, but a fact of life.
+
-- 
cgit v1.1
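
The trunc/sext pair that the proposal makes explicit, and the condition under which it is redundant, can be sketched in plain C. This is only an illustration under stated assumptions: a 32-bit return register is assumed, and the helper names `sext16`, `zext16`, and `already_sign_extended16` are hypothetical and do not come from LLVM or from the note itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers; a 32-bit return register is assumed throughout. */

/* Sign extend the low 16 bits of reg, ignoring whatever the high bits hold.
   This mirrors "trunc i32 to i16" followed by "sext i16 to i32". */
static int32_t sext16(uint32_t reg) {
    uint32_t low = reg & 0xFFFFu;              /* the trunc step */
    return (int32_t)(low ^ 0x8000u) - 0x8000;  /* portable sign extension */
}

/* Zero extend the low 16 bits of reg (trunc followed by zext). */
static uint32_t zext16(uint32_t reg) {
    return reg & 0xFFFFu;
}

/* True when the register already holds a sign extended 16-bit value,
   i.e. when applying sext16 again would not change any bit. */
static int already_sign_extended16(uint32_t reg) {
    return (uint32_t)sext16(reg) == reg;
}

/* A few spot checks of the helpers above. */
void check_extension_helpers(void) {
    assert(sext16(0x1234FFFFu) == -1);
    assert(zext16(0x1234FFFFu) == 0xFFFFu);
    assert(already_sign_extended16(0xFFFF8000u));
    assert(!already_sign_extended16(0x00008000u));
}
```

If the callee is known to return a value for which `already_sign_extended16` holds, the caller's own `sext16` is a no-op. That is the fact the proposed sext/zext attributes would let the mid-level optimizer rely on.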