author | Juan A. Suarez Romero <jasuarez@igalia.com> | 2016-03-02 13:21:02 +0100 |
---|---|---|
committer | Matt Turner <mattst88@gmail.com> | 2016-03-04 19:16:52 -0800 |
commit | 2f76a9924e7b0b33a508ee3651b0cb2ab536a7dc (patch) | |
tree | de4d250338ff40976ff66d54ea14ee4272a6fd92 /src/mesa/drivers/dri/i965/brw_vec4.h | |
parent | feb71117aebc0932a96b548b4c402b010a008b2d (diff) | |
i965/vec4: add opportunistic behaviour to opt_vector_float()
opt_vector_float() transforms several scalar MOV operations into a single
vector MOV.
This is done when those MOVs cover all the components of the destination
register. So something like:
mov vgrf3.0.xy:D, 0D
mov vgrf3.0.w:D, 1065353216D
mov vgrf3.0.z:D, 0D
is transformed into:
mov vgrf3.0:F, [0F, 0F, 0F, 1F]
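The integer immediates above are the raw bit patterns of the floats in the
resulting vector: 1065353216 is 0x3F800000, the IEEE 754 encoding of 1.0f.
A minimal standalone sketch to check this, assuming a 32-bit float
(bits_to_float is a hypothetical helper, not part of Mesa):

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper, not part of Mesa: reinterpret a 32-bit integer
// immediate as the float that shares its bit pattern.
static float bits_to_float(uint32_t bits)
{
   float f;
   static_assert(sizeof f == sizeof bits, "float must be 32 bits wide");
   std::memcpy(&f, &bits, sizeof f);
   return f;
}
```

With this helper, bits_to_float(1065353216u) compares equal to 1.0f and
bits_to_float(0u) to 0.0f, which is why the :D immediates can be rewritten
as :F vector components.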
But there are cases where not all the components are written. For
example, in:
mov vgrf2.0.x:D, 1073741824D
mov vgrf3.0.xy:D, 0D
mov vgrf3.0.w:D, 1065353216D
mov vgrf4.0.xy:D, 1065353216D
mov vgrf4.0.w:D, 0D
mov vgrf6.0:UD, u4.xyzw:UD
Neither the vgrf3 nor the vgrf4 .z component is written, so the
optimization is not applied.
But it could still be applied to the components that are covered, using a
writemask to select the ones written. So we could transform it into:
mov vgrf2.0.x:D, 1073741824D
mov vgrf3.0.xyw:F, [0F, 0F, 0F, 1F]
mov vgrf4.0.xyw:F, [1F, 1F, 0F, 0F]
mov vgrf6.0:UD, u4.xyzw:UD
This commit does precisely that: opportunistically apply
opt_vector_float() when possible.
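The idea can be sketched outside the compiler as follows. This is only an
illustrative model, not the code in brw_vec4.cpp: the ScalarMov and
VectorMov types, the MASK_* constants and merge_movs() are all assumptions
made for the example. It also ignores the conditions the real pass has to
check, such as predicates and whether each value fits the hardware's packed
vector-float immediate.

```cpp
#include <array>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical writemask bits for the x, y, z and w components.
constexpr unsigned MASK_X = 1u << 0;
constexpr unsigned MASK_Y = 1u << 1;
constexpr unsigned MASK_Z = 1u << 2;
constexpr unsigned MASK_W = 1u << 3;

// Hypothetical model of "mov vgrfN.0.<mask>:D, <imm>D".
struct ScalarMov {
   unsigned writemask;  // destination components written
   uint32_t imm;        // integer immediate holding raw float bits
};

// Hypothetical model of "mov vgrfN.0.<mask>:F, [a, b, c, d]".
struct VectorMov {
   unsigned writemask;          // union of the merged writemasks
   std::array<float, 4> value;  // components never written stay 0.0f
};

// Merge a run of immediate MOVs to the same register into one vector MOV,
// even when the run does not cover all four components.
static VectorMov merge_movs(const std::vector<ScalarMov> &movs)
{
   VectorMov out{0u, {0.0f, 0.0f, 0.0f, 0.0f}};
   for (const ScalarMov &mov : movs) {
      float f;
      std::memcpy(&f, &mov.imm, sizeof f);  // reinterpret bits as float
      for (unsigned c = 0; c < 4; c++) {
         if (mov.writemask & (1u << c))
            out.value[c] = f;
      }
      out.writemask |= mov.writemask;
   }
   return out;
}
```

Feeding it the two vgrf3 MOVs from the example (.xy with 0 and .w with
1065353216) gives an .xyw writemask and the vector [0F, 0F, 0F, 1F]. The .z
slot holds a filler value that the writemask prevents from being stored.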
total instructions in shared programs: 7124660 -> 7114784 (-0.14%)
instructions in affected programs: 443078 -> 433202 (-2.23%)
helped: 4998
HURT: 0
total cycles in shared programs: 64757760 -> 64728016 (-0.05%)
cycles in affected programs: 1401686 -> 1371942 (-2.12%)
helped: 3243
HURT: 38
v2: change vectorize_mov() signature (Matt).
v3: take predicates into account (Juan).
v4 [mattst88]: Update shader-db numbers. Fix some whitespace issues.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Diffstat (limited to 'src/mesa/drivers/dri/i965/brw_vec4.h')
-rw-r--r-- | src/mesa/drivers/dri/i965/brw_vec4.h | 4 |
1 files changed, 4 insertions, 0 deletions
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h
index 633f13c..91771b8 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -369,6 +369,10 @@ protected:
    virtual void gs_end_primitive();
 
 private:
+   bool vectorize_mov(bblock_t *block, vec4_instruction *inst,
+                      uint8_t imm[4], vec4_instruction *imm_inst[4],
+                      int inst_count, unsigned writemask);
+
    /**
     * If true, then register allocation should fail instead of spilling.
     */