author: Kristian Høgsberg Kristensen <krh@bitplanet.net> 2015-12-14 17:44:23 -0800
committer: Kristian Høgsberg Kristensen <krh@bitplanet.net> 2015-12-29 10:39:25 -0800
commit: f9283f2668bb1a64303d73b663464a8556fe3f8f
tree: 78aa7dc727e68567966a8e65b19916cc5af04bca /src/glsl/nir
parent: cddfc2cefa93b884c40329dcb193fe4fb22143ab
nir: Teach nir_opt_algebraic about adding and subtracting the same thing

This optimizes a + b - b to just a. Modest shader-db results (BDW):

  total instructions in shared programs: 7842452 -> 7841862 (-0.01%)
  instructions in affected programs:     61938 -> 61348 (-0.95%)
  total loops in shared programs:        2131 -> 2131 (0.00%)
  helped:                                263
  HURT:                                  0
  GAINED:                                0
  LOST:                                  0

but the optimization turns gl_VertexID - gl_BaseVertexARB into just a
reference to SYSTEM_VALUE_VERTEX_ID_ZERO_BASE, which the i965 hardware
supports natively. That means we can avoid using the internal vertex
buffer for gl_BaseVertexARB in this case.

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>

Diffstat (limited to 'src/glsl/nir')
-rw-r--r-- src/glsl/nir/nir_opt_algebraic.py | 4
1 files changed, 4 insertions, 0 deletions
diff --git a/src/glsl/nir/nir_opt_algebraic.py b/src/glsl/nir/nir_opt_algebraic.py
index cb715c0..1fdad3d 100644
--- a/src/glsl/nir/nir_opt_algebraic.py
+++ b/src/glsl/nir/nir_opt_algebraic.py
@@ -62,6 +62,10 @@ optimizations = [
    (('iadd', ('imul', a, b), ('imul', a, c)), ('imul', a, ('iadd', b, c))),
    (('fadd', ('fneg', a), a), 0.0),
    (('iadd', ('ineg', a), a), 0),
+   (('iadd', ('ineg', a), ('iadd', a, b)), b),
+   (('iadd', a, ('iadd', ('ineg', a), b)), b),
+   (('fadd', ('fneg', a), ('fadd', a, b)), b),
+   (('fadd', a, ('fadd', ('fneg', a), b)), b),
    (('fmul', a, 0.0), 0.0),
    (('imul', a, 0), 0),
    (('umul_unorm_4x8', a, 0), 0),
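
To illustrate how search/replace pairs like the ones added above behave, here is a minimal, self-contained sketch of a tuple-based rewriter. It is not Mesa's actual matcher: the names `VARIABLES`, `RULES`, `match`, `substitute`, and `simplify` are hypothetical, pattern variables are written as strings rather than Python variables, and only the two integer rules are included. Unlike a real optimizer, this sketch does not try commuted operand orders.

```python
# Hypothetical, simplified rewriter for illustration only.
# Expressions are nested tuples: (opcode, operand, operand, ...).
# Leaves are plain strings (value names) or numeric constants.

VARIABLES = {"a", "b", "c"}  # names treated as pattern variables

# The two integer rules from the diff, in (search, replace) form.
RULES = [
    (("iadd", ("ineg", "a"), ("iadd", "a", "b")), "b"),
    (("iadd", "a", ("iadd", ("ineg", "a"), "b")), "b"),
]


def match(pattern, expr, bindings):
    """Return True if expr matches pattern, recording variable bindings."""
    if isinstance(pattern, str) and pattern in VARIABLES:
        if pattern in bindings:
            # A repeated variable must refer to the same sub-expression.
            return bindings[pattern] == expr
        bindings[pattern] = expr
        return True
    if isinstance(pattern, tuple):
        if (not isinstance(expr, tuple)
                or len(expr) != len(pattern)
                or expr[0] != pattern[0]):
            return False
        return all(match(p, e, bindings)
                   for p, e in zip(pattern[1:], expr[1:]))
    return pattern == expr  # constant leaf


def substitute(replacement, bindings):
    """Build the replacement expression from the recorded bindings."""
    if isinstance(replacement, str) and replacement in VARIABLES:
        return bindings[replacement]
    if isinstance(replacement, tuple):
        return (replacement[0],) + tuple(
            substitute(r, bindings) for r in replacement[1:])
    return replacement


def simplify(expr):
    """Simplify operands first, then try each rule once at this node."""
    if isinstance(expr, tuple):
        expr = (expr[0],) + tuple(simplify(e) for e in expr[1:])
    for pattern, replacement in RULES:
        bindings = {}
        if match(pattern, expr, bindings):
            return substitute(replacement, bindings)
    return expr


if __name__ == "__main__":
    # (-vid) + (vid + base) collapses to base.
    print(simplify(("iadd", ("ineg", "vid"), ("iadd", "vid", "base"))))
```

The repeated variable `a` is what makes the rule safe: the match only succeeds when both occurrences bind to the same sub-expression, so `(-x) + (z + y)` is left alone.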