Fix for the following bug in AVX codegen for double-to-int conversions:

- The "fptosi" and "fptoui" IR instructions are defined with round-to-zero rounding mode.
- Currently, in AVX mode, AVX codegen selects the "VCVTPD2DQ.128" and "VCVTPD2DQ.256" instructions for <4xdouble> and <8xdouble> (for the "fp_to_sint" DAG node operation). However, these instructions use round-to-nearest-even rounding mode.
- Consequently, the conversion produces incorrect numbers.

The fix is to replace selection of the VCVTPD2DQ instructions with the VCVTTPD2DQ instructions. The latter use truncating (i.e. round-to-zero) rounding mode.

As the "fp_to_sint" DAG node operation is used only for lowering of the "fptosi" and "fptoui" IR instructions, the fix in the X86InstrSSE.td definition file doesn't have an impact on other LLVM flows.

The patch includes changes in the .td file, a LIT test for the changes, and a fix in a legacy LIT test (which produced asm code conflicting with the LLVM IR spec).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149056 91177308-0d34-0410-b5e6-96231b3b80d8
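
To illustrate why the choice of rounding mode matters, here is a minimal sketch in Python (not part of the patch). It models the two behaviours with scalar stand-ins: Python's built-in round(), which rounds halves to even, approximates what the commit message describes for VCVTPD2DQ, and math.trunc() approximates the round-to-zero semantics that "fptosi" requires and that VCVTTPD2DQ provides. The function names convert_nearest_even and convert_truncate are hypothetical, chosen only for this illustration.

```python
import math


def convert_nearest_even(x: float) -> int:
    """Scalar model of a round-to-nearest-even double-to-int conversion.

    Assumption: this stands in for the behaviour the commit message
    attributes to VCVTPD2DQ. Python's round() rounds ties to even.
    """
    return round(x)


def convert_truncate(x: float) -> int:
    """Scalar model of a round-to-zero (truncating) conversion.

    Assumption: this stands in for VCVTTPD2DQ, and matches the
    round-to-zero semantics of the "fptosi" IR instruction.
    """
    return math.trunc(x)


if __name__ == "__main__":
    # Values chosen so that the two modes disagree on most of them.
    for value in (1.7, -1.7, 2.5, 3.5, -0.9):
        print(value, convert_nearest_even(value), convert_truncate(value))
```

For an input such as 1.7, the nearest-even model yields 2 while the truncating model yields 1. This is the kind of mismatch with the IR semantics that the patch removes.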
author: Victor Umansky <victor.umansky@intel.com> 2012-01-26 08:51:39 +0000
committer: Victor Umansky <victor.umansky@intel.com> 2012-01-26 08:51:39 +0000
commit: 668f7ac9e4642e1b4af4f5b047d569c68bc2c55f (patch)
tree: e2865732f4a8dc4baeb255150480f00513e51281 /test
parent: a3937416e43b485633035e8184755d9a94efb7b3 (diff)
2 files changed, 20 insertions, 1 deletions
diff --git a/test/CodeGen/X86/avx-cvt.ll b/test/CodeGen/X86/avx-cvt.ll
index 6c0bd58..d0a7fe0 100644
--- a/test/CodeGen/X86/avx-cvt.ll
+++ b/test/CodeGen/X86/avx-cvt.ll
@@ -18,7 +18,7 @@ define <4 x double> @sitofp01(<4 x i32> %a) {
   ret <4 x double> %b
 }
 
-; CHECK: vcvtpd2dqy %ymm
+; CHECK: vcvttpd2dqy %ymm
 define <4 x i32> @fptosi01(<4 x double> %a) {
   %b = fptosi <4 x double> %a to <4 x i32>
   ret <4 x i32> %b
diff --git a/test/CodeGen/X86/avx-fp2int.ll b/test/CodeGen/X86/avx-fp2int.ll
new file mode 100755
index 0000000..9e505bd
--- /dev/null
+++ b/test/CodeGen/X86/avx-fp2int.ll
@@ -0,0 +1,19 @@
+; RUN: llc < %s -mtriple=i386-apple-darwin10 -mcpu=corei7-avx -mattr=+avx | FileCheck %s
+
+;; Check that FP_TO_SINT and FP_TO_UINT generate convert with truncate
+
+; CHECK: test1:
+; CHECK: vcvttpd2dqy
+; CHECK: ret
+; CHECK: test2:
+; CHECK: vcvttpd2dqy
+; CHECK: ret
+
+define <4 x i8> @test1(<4 x double> %d) {
+  %c = fptoui <4 x double> %d to <4 x i8>
+  ret <4 x i8> %c
+}
+define <4 x i8> @test2(<4 x double> %d) {
+  %c = fptosi <4 x double> %d to <4 x i8>
+  ret <4 x i8> %c
+}