author | Dan Gohman <djg@cray.com> | 2007-07-18 16:29:46 +0000 |
---|---|---|
committer | Dan Gohman <djg@cray.com> | 2007-07-18 16:29:46 +0000 |
commit | f17a25c88b892d30c2b41ba7ecdfbdfb2b4be9cc (patch) | |
tree | ebb79ea1ee5e3bc1fdf38541a811a8b804f0679a /lib/Target/X86/README-FPStack.txt | |
It's not necessary to do rounding for alloca operations when the requested
alignment is equal to the stack alignment.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@40004 91177308-0d34-0410-b5e6-96231b3b80d8
Diffstat (limited to 'lib/Target/X86/README-FPStack.txt')
-rw-r--r-- | lib/Target/X86/README-FPStack.txt | 99 |
1 files changed, 99 insertions, 0 deletions
diff --git a/lib/Target/X86/README-FPStack.txt b/lib/Target/X86/README-FPStack.txt
new file mode 100644
index 0000000..d94fa02
--- /dev/null
+++ b/lib/Target/X86/README-FPStack.txt
@@ -0,0 +1,99 @@
+//===---------------------------------------------------------------------===//
+// Random ideas for the X86 backend: FP stack related stuff
+//===---------------------------------------------------------------------===//
+
+//===---------------------------------------------------------------------===//
+
+Some targets (e.g. athlons) prefer freep to fstp ST(0):
+http://gcc.gnu.org/ml/gcc-patches/2004-04/msg00659.html
+
+//===---------------------------------------------------------------------===//
+
+On darwin/x86, we should codegen:
+
+        ret double 0.000000e+00
+
+as fld0/ret, not as:
+
+        movl $0, 4(%esp)
+        movl $0, (%esp)
+        fldl (%esp)
+        ...
+        ret
+
+//===---------------------------------------------------------------------===//
+
+This should use fiadd on chips where it is profitable:
+double foo(double P, int *I) { return P+*I; }
+
+We have fiadd patterns now but the followings have the same cost and
+complexity. We need a way to specify the later is more profitable.
+
+def FpADD32m  : FpI<(ops RFP:$dst, RFP:$src1, f32mem:$src2), OneArgFPRW,
+                    [(set RFP:$dst, (fadd RFP:$src1,
+                                     (extloadf64f32 addr:$src2)))]>;
+                // ST(0) = ST(0) + [mem32]
+
+def FpIADD32m : FpI<(ops RFP:$dst, RFP:$src1, i32mem:$src2), OneArgFPRW,
+                    [(set RFP:$dst, (fadd RFP:$src1,
+                                     (X86fild addr:$src2, i32)))]>;
+                // ST(0) = ST(0) + [mem32int]
+
+//===---------------------------------------------------------------------===//
+
+The FP stackifier needs to be global. Also, it should handle simple permutates
+to reduce number of shuffle instructions, e.g. turning:
+
+fld P   ->      fld Q
+fld Q           fld P
+fxch
+
+or:
+
+fxch    ->      fucomi
+fucomi          jl X
+jg X
+
+Ideas:
+http://gcc.gnu.org/ml/gcc-patches/2004-11/msg02410.html
+
+
+//===---------------------------------------------------------------------===//
+
+Add a target specific hook to DAG combiner to handle SINT_TO_FP and
+FP_TO_SINT when the source operand is already in memory.
+
+//===---------------------------------------------------------------------===//
+
+Open code rint,floor,ceil,trunc:
+http://gcc.gnu.org/ml/gcc-patches/2004-08/msg02006.html
+http://gcc.gnu.org/ml/gcc-patches/2004-08/msg02011.html
+
+Opencode the sincos[f] libcall.
+
+//===---------------------------------------------------------------------===//
+
+None of the FPStack instructions are handled in
+X86RegisterInfo::foldMemoryOperand, which prevents the spiller from
+folding spill code into the instructions.
+
+//===---------------------------------------------------------------------===//
+
+Currently the x86 codegen isn't very good at mixing SSE and FPStack
+code:
+
+unsigned int foo(double x) { return x; }
+
+foo:
+        subl $20, %esp
+        movsd 24(%esp), %xmm0
+        movsd %xmm0, 8(%esp)
+        fldl 8(%esp)
+        fisttpll (%esp)
+        movl (%esp), %eax
+        addl $20, %esp
+        ret
+
+This will be solved when we go to a dynamic programming based isel.
+
+//===---------------------------------------------------------------------===//
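
The two C snippets quoted in the README are both named `foo`, so they cannot share one translation unit as written. A minimal sketch of a combined reproduction file follows, for anyone who wants to inspect the generated x87 code for the fiadd note and the SSE/FPStack mixing note. The names `add_double_int` and `double_to_uint` are hypothetical and chosen only to avoid the clash; the function bodies are taken from the README unchanged.

```c
/* Reproduction sketch for two notes in README-FPStack.txt.
 * The function names are hypothetical: the README calls both
 * functions "foo". The bodies are copied from the README. */

/* fiadd note: adds a double to an int loaded from memory, so the
 * integer operand is already in memory when it is converted. */
double add_double_int(double P, int *I) { return P + *I; }

/* SSE/FPStack mixing note: implicit double to unsigned int
 * conversion, which truncates toward zero for in-range values. */
unsigned int double_to_uint(double x) { return x; }
```

Compiling this file to assembly for a 32-bit x86 target (for example with a compiler's `-m32 -S` options, assuming such a toolchain is installed) should allow a comparison with the listings in the README. The exact output depends on the compiler version and on which floating-point unit is selected.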