author | Francisco Jerez <currojerez@riseup.net> | 2015-09-03 18:15:19 +0300 |
---|---|---|
committer | Francisco Jerez <currojerez@riseup.net> | 2015-12-09 13:46:05 +0200 |
commit | 95ad0bd33ba491c2e34ce02f4ec2d75ffffe16ea (patch) | |
tree | 002ae2cd306ade0f85273cf9f71d9e75625ca46c /src/mesa/drivers/dri/i965/brw_compiler.h | |
parent | fa1300f75e80f32a0fd40ed53cfdeaecc4c2f012 (diff) | |
i965: Calculate appropriate L3 partition weights for the current pipeline state.

This calculates a rather conservative partitioning of the L3 cache
based on the shaders currently bound to the pipeline and whether they
use SLM, atomics, images or scratch space. The result is intended to
be fine-tuned later on based on other pipeline state.

Note that the L3 partitioning calculated for VLV in the non-SLM non-DC
case differs from the hardware defaults in that it doesn't include a
DC partition and has twice as much RO cache space -- This is an
intentional functional change that improves performance in several
bandwidth-bound benchmarks on VLV (5% significance): SynMark
OglTexFilterAniso by 14.18%, SynMark OglTexFilterTri by 7.15%, Unigine
Heaven by 4.91%, SynMark OglShMapPcf by 2.15%, GpuTest Fur by 1.83%,
SynMark OglDrvRes by 1.80%, SynMark OglVsTangent by 1.71%, and a few
other benchmarks from the Finnish system by less than 1%.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Diffstat (limited to 'src/mesa/drivers/dri/i965/brw_compiler.h')
-rw-r--r-- | src/mesa/drivers/dri/i965/brw_compiler.h | 1 |
1 files changed, 1 insertions, 0 deletions
diff --git a/src/mesa/drivers/dri/i965/brw_compiler.h b/src/mesa/drivers/dri/i965/brw_compiler.h
index b2251bf..218d9c7 100644
--- a/src/mesa/drivers/dri/i965/brw_compiler.h
+++ b/src/mesa/drivers/dri/i965/brw_compiler.h
@@ -300,6 +300,7 @@ struct brw_stage_prog_data {
 
    unsigned curb_read_length;
    unsigned total_scratch;
+   unsigned total_shared;
 
    /**
     * Register where the thread expects to find input data from the URB
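
For readers who only see this header hunk, the following is a rough sketch of how per-stage fields such as `total_scratch` and the new `total_shared` could drive the partitioning decision described in the commit message. It is not the code from this commit: `struct stage_info`, `enum l3_partition`, `struct l3_weights`, `sketch_l3_weights()` and all weight values are hypothetical and chosen only for illustration. The sketch assumes that atomics, images and scratch space call for a DC partition, and that a non-zero `total_shared` calls for an SLM partition.

```c
#include <stdbool.h>

/* Hypothetical stand-in for per-stage program data.  Only the names
 * total_scratch and total_shared mirror fields of
 * struct brw_stage_prog_data; everything else is an assumption. */
struct stage_info {
   bool bound;              /* is a shader bound to this stage? */
   bool uses_atomics;
   bool uses_images;
   unsigned total_scratch;  /* scratch space in bytes, 0 if unused */
   unsigned total_shared;   /* SLM in bytes, 0 if unused */
};

/* Hypothetical partition identifiers. */
enum l3_partition { L3P_SLM, L3P_URB, L3P_DC, L3P_RO, NUM_L3P };

struct l3_weights {
   float w[NUM_L3P];        /* relative sizes, normalized to sum to 1 */
};

/* Derive conservative relative weights from the bound stages. */
static struct l3_weights
sketch_l3_weights(const struct stage_info *stages, unsigned num_stages)
{
   bool needs_dc = false, needs_slm = false;

   for (unsigned i = 0; i < num_stages; i++) {
      if (!stages[i].bound)
         continue;

      /* Assumption: atomics, images and scratch go through the data
       * cache, so any of them requests a DC partition. */
      needs_dc |= stages[i].uses_atomics || stages[i].uses_images ||
                  stages[i].total_scratch > 0;

      /* Shared local memory is only needed if some stage declares it. */
      needs_slm |= stages[i].total_shared > 0;
   }

   struct l3_weights w = { { 0.0f } };
   w.w[L3P_SLM] = needs_slm ? 1.0f : 0.0f;  /* illustrative values only */
   w.w[L3P_URB] = 1.0f;
   w.w[L3P_DC]  = needs_dc ? 1.0f : 0.0f;
   w.w[L3P_RO]  = 1.0f;

   /* Normalize so the weights describe fractions of the cache. */
   float sum = 0.0f;
   for (unsigned i = 0; i < NUM_L3P; i++)
      sum += w.w[i];
   for (unsigned i = 0; i < NUM_L3P; i++)
      w.w[i] /= sum;

   return w;
}
```

With a single bound stage that uses none of these features, the sketch gives no SLM or DC partition and splits the cache between URB and RO, which matches the shape of the non-SLM non-DC case mentioned in the commit message (the actual proportions used by the driver are not shown here).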