author Samuel Pitoiset <samuel.pitoiset@gmail.com> 2016-09-14 23:02:38 +0200
committer Samuel Pitoiset <samuel.pitoiset@gmail.com> 2016-09-29 21:20:50 +0200
commit 3abe68b8282496688186157b51da5600ac540906
tree c2878ded35434e911322838d0340cf76cc4d6a7e
parent 115c79be10bf3712a1e1bc25a563c90388c1bcaa
nv50/ir: teach insnCanLoad() about SHLADD
Commutativity is not allowed with SHLADD, but src2 can accept loads.
To allow the load propagation pass to do its job, add a special case
like for SUCLAMP because src1 is always an immediate.

This IMAD to SHLADD optimization helps a bunch of shaders from
Tomb Raider, Victor Vran, UE4 demos (+15% perf with Elemental)
and Shadow Warrior.

GF100/GK104:
total instructions in shared programs :2838045 -> 2834712 (-0.12%)
total gprs used in shared programs    :396684 -> 396386 (-0.08%)
total local used in shared programs   :34416 -> 34416 (0.00%)

          local    gpr   inst  bytes
helped        0    326   1105   1105
hurt          0     55      3      3

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
-rw-r--r-- src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp 2
1 files changed, 2 insertions, 0 deletions
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
index 8606065..2d1f1b4 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
@@ -334,6 +334,8 @@ TargetNVC0::insnCanLoad(const Instruction *i, int s,
       if (i->src(k).getFile() == FILE_IMMEDIATE) {
          if (k == 2 && i->op == OP_SUCLAMP) // special case
             continue;
+         if (k == 1 && i->op == OP_SHLADD) // special case
+            continue;
          if (i->getSrc(k)->reg.data.u64 != 0)
             return false;
       } else
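
For readers without the surrounding file at hand, the immediate-operand check touched by this hunk can be sketched as a standalone model. This is not the Mesa code: Insn, Src, immediatesAllowLoad() and the shladdSpecialCase flag are hypothetical names introduced for illustration, and the else branch that the hunk cuts off is not modeled.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-ins for the nv50_ir types (not the real classes).
enum Op { OP_ADD, OP_SUCLAMP, OP_SHLADD };
enum File { FILE_GPR, FILE_IMMEDIATE };

struct Src {
   File file;
   uint64_t imm; // only meaningful when file == FILE_IMMEDIATE
};

struct Insn {
   Op op;
   std::vector<Src> srcs;
};

// Models only the FILE_IMMEDIATE branch shown in the hunk: a non-zero
// immediate in any source vetoes the load, unless it sits in a slot that
// the opcode always fills with an immediate.
// shladdSpecialCase == false models the code before the patch.
static bool immediatesAllowLoad(const Insn &i, bool shladdSpecialCase)
{
   for (std::size_t k = 0; k < i.srcs.size(); ++k) {
      if (i.srcs[k].file != FILE_IMMEDIATE)
         continue; // non-immediate sources are handled elsewhere (not modeled)
      if (k == 2 && i.op == OP_SUCLAMP) // special case
         continue;
      if (shladdSpecialCase && k == 1 && i.op == OP_SHLADD) // special case
         continue;
      if (i.srcs[k].imm != 0)
         return false;
   }
   return true;
}
```

In this model, a SHLADD whose src1 holds a non-zero shift amount is rejected when the flag is off and accepted when it is on, which is the behavior change the commit message describes for load propagation into src2.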