author    Francisco Jerez <currojerez@riseup.net>  2015-11-23 19:18:26 +0200
committer Francisco Jerez <currojerez@riseup.net>  2015-11-26 14:07:58 +0200
commit    55ffa64daf765b1229364518106a4124bd84b9a7
tree      bbc31d37dc2eebaf480f08a1b3ba55ba4162d9f7 /src/mesa/drivers/dri/i965/brw_eu.h
parent    bc8182808aea111aea3cfcba4da3dd861689d890
i965/gen9+: Switch thread scratch space to non-coherent stateless access.
The thread scratch space is thread-local, so using the full IA-coherent stateless surface index (255 since Gen8) is unnecessary and potentially expensive. On Gen8 and early steppings of Gen9 this is not a functional change, because the kernel already sets bit 4 of HDC_CHICKEN0, which overrides all HDC memory access to be non-coherent in order to work around a hardware bug.

This happens to fix a full system hang when running any spilling code on a pre-production SKL GT4e machine I have on my desk (forcing all HDC access to non-coherent from the kernel up to stepping F0 might be a good idea regardless of this patch, though), and improves performance of the OglPSBump2 SynMark benchmark run with INTEL_DEBUG=spill_fs by 33% (11 runs, 5% significance) on a production SKL GT2 (on which HDC IA-coherency is apparently functional, so it wouldn't make sense to disable it globally).

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
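
For illustration only, the selection logic behind the newly declared brw_scratch_surface_idx() might look roughly like the sketch below. This is not the patch's actual implementation, which lives outside the header shown on this page. The struct names, the enum names, the Gen8 threshold (inferred from the message's mention of Gen8), and the value 253 for the non-coherent stateless index are all assumptions of this sketch; only the value 255 for the IA-coherent index comes from the commit message.

```c
/* Hypothetical stand-ins for the driver's types; only the hardware
 * generation matters for this sketch. */
struct example_devinfo {
   int gen;
};

struct example_codegen {
   const struct example_devinfo *devinfo;
};

enum {
   /* Stated in the commit message: 255 is IA-coherent since Gen8. */
   EXAMPLE_BTI_STATELESS_IA_COHERENT = 255,
   /* Assumed value for the non-coherent stateless index. */
   EXAMPLE_BTI_STATELESS_NON_COHERENT = 253,
   /* Before Gen8 there is a single stateless index (assumed 255). */
   EXAMPLE_BTI_STATELESS = 255
};

/* Pick the binding table index used for scratch reads and writes.
 * Scratch space is thread-local, so IA coherency is not needed where
 * a non-coherent stateless index is available. */
static unsigned
example_scratch_surface_idx(const struct example_codegen *p)
{
   if (p->devinfo->gen >= 8)
      return EXAMPLE_BTI_STATELESS_NON_COHERENT;

   return EXAMPLE_BTI_STATELESS;
}
```

Under these assumptions, a caller emitting a scratch block read or write would pass the returned index in place of a hard-coded 255, so the coherency choice stays in one place.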
Diffstat (limited to 'src/mesa/drivers/dri/i965/brw_eu.h')
-rw-r--r-- src/mesa/drivers/dri/i965/brw_eu.h | 2
1 file changed, 2 insertions, 0 deletions
diff --git a/src/mesa/drivers/dri/i965/brw_eu.h b/src/mesa/drivers/dri/i965/brw_eu.h
index ef94347..686b42b 100644
--- a/src/mesa/drivers/dri/i965/brw_eu.h
+++ b/src/mesa/drivers/dri/i965/brw_eu.h
@@ -315,6 +315,8 @@ void brw_oword_block_read(struct brw_codegen *p,
                           uint32_t offset,
                           uint32_t bind_table_index);
 
+unsigned brw_scratch_surface_idx(const struct brw_codegen *p);
+
 void brw_oword_block_read_scratch(struct brw_codegen *p,
                                   struct brw_reg dest,
                                   struct brw_reg mrf,