author | Kenneth Graunke <kenneth@whitecape.org> | 2016-04-07 15:04:35 -0700 |
---|---|---|
committer | Kenneth Graunke <kenneth@whitecape.org> | 2016-04-11 18:44:17 -0700 |
commit | bfd17c76c1267756ea16051cbe174cb23ff49f44 | |
tree | b451c81beb61d850f0f726fc8437e81384cb39ee /src/mesa/drivers/dri/i965/brw_nir.c | |
parent | b0dffdc616801a1fd8534502e11ac840369041ab | |
i965: Port INTEL_PRECISE_TRIG=1 to NIR.
This makes the extra multiply visible to NIR's algebraic optimizations
(for constant reassociation) as well as constant folding. This means
that when the results of sin/cos are multiplied by a constant, we can
eliminate the extra multiply altogether, reducing the cost of the
workaround.
It also means we only have to implement it in one place, rather than in
both backends.
This makes INTEL_PRECISE_TRIG=1 cost nothing on GPUTest/Volplosion,
which has a ton of sin() calls, but always multiplies them by an
immediate constant. The extra multiply gets folded away.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Diffstat (limited to 'src/mesa/drivers/dri/i965/brw_nir.c')
-rw-r--r-- | src/mesa/drivers/dri/i965/brw_nir.c | 3 |
1 files changed, 3 insertions, 0 deletions
diff --git a/src/mesa/drivers/dri/i965/brw_nir.c b/src/mesa/drivers/dri/i965/brw_nir.c
index 1821c0d..932979a 100644
--- a/src/mesa/drivers/dri/i965/brw_nir.c
+++ b/src/mesa/drivers/dri/i965/brw_nir.c
@@ -447,6 +447,9 @@ brw_preprocess_nir(const struct brw_compiler *compiler, nir_shader *nir)
    if (nir->stage == MESA_SHADER_GEOMETRY)
       OPT(nir_lower_gs_intrinsics);
 
+   if (compiler->precise_trig)
+      OPT(brw_nir_apply_trig_workarounds);
+
    static const nir_lower_tex_options tex_options = {
       .lower_txp = ~0,
    };