| Commit message (Collapse) | Author | Age | Files | Lines |
| |
The previous regs_written field can be recovered by rewriting each
rvalue reference of regs_written like 'x = i.regs_written' to 'x =
DIV_ROUND_UP(i.size_written, reg_unit)', and each lvalue reference
like 'i.regs_written = x' to 'i.size_written = x * reg_unit'.
For the same reason as in the previous patches, this doesn't attempt
to be particularly clever about simplifying the result in the interest
of keeping the rather lengthy patch as obvious as possible. I'll come
back later to clean up any ugliness introduced here.
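The two rewrite rules above can be sketched as a pair of helpers. This is only an illustration in Python, not code from the patch: the `Instruction` class, the helper names, and the value `REG_UNIT = 32` are assumptions made for the example; only `size_written`, `regs_written`, `reg_unit` and `DIV_ROUND_UP` come from the message.

```python
REG_UNIT = 32  # assumed register size in bytes, for illustration only


def div_round_up(n, d):
    """Integer ceiling division, mirroring the C macro DIV_ROUND_UP."""
    return (n + d - 1) // d


class Instruction:
    """Hypothetical stand-in for an IR instruction tracking bytes written."""

    def __init__(self, size_written=0):
        self.size_written = size_written


def get_regs_written(inst, reg_unit=REG_UNIT):
    # rvalue rewrite: x = DIV_ROUND_UP(i.size_written, reg_unit)
    return div_round_up(inst.size_written, reg_unit)


def set_regs_written(inst, regs, reg_unit=REG_UNIT):
    # lvalue rewrite: i.size_written = regs * reg_unit
    inst.size_written = regs * reg_unit
```

A partially written register still counts as a whole register on the way back, which is why the rvalue form rounds up rather than truncating.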
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
| |
Generated by:
sed -i -e 's/brw_device_info/gen_device_info/g' src/intel/**/*.c
sed -i -e 's/brw_device_info/gen_device_info/g' src/intel/**/*.h
sed -i -e 's/brw_device_info/gen_device_info/g' **/i965/*.c
sed -i -e 's/brw_device_info/gen_device_info/g' **/i965/*.cpp
sed -i -e 's/brw_device_info/gen_device_info/g' **/i965/*.h
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
| |
None of them are actually using it. It's a relic of an older compiler
interface that required a gl_program.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
| |
Cuts 10k of .text, of which only 776 bytes are the fs_reg constructor
implementations themselves.
   text    data     bss     dec     hex filename
5204535  214112   27784 5446431  531b1f i965_dri.so before
5193977  214112   27784 5435873  52f1e1 i965_dri.so after
Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
| |
tests
The complete way to do this would be to parse INTEL_DEBUG and
print the output if DEBUG_VS (or a new one) is present
(see intel_debug.c).
But that seems like overkill for the unit tests, whose most
common use case, after all, is being run by make check.
v2: use the same idea for the fs counterpart too, as suggested by
Matt Turner
Reviewed-by: Matt Turner <mattst88@gmail.com>
| |
Rather than accepting a void pointer, only to downcast and upcast around
it, convert the function to take the base (struct gl_program) pointer.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
| |
They didn't do anything useful.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
| |
Unfortunately, we can't get rid of them entirely. The FS backend still
needs gl_program for handling TEXTURE_RECTANGLE. The GS vec4 backend still
needs gl_shader_program for handling transform feedback. However, the VS
needs neither and we can substantially reduce the amount they are used.
One day we will be free from their tyranny.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
| |
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Acked-by: Francisco Jerez <currojerez@riseup.net>
| |
As of this commit, nothing actually needs the brw_context.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
| |
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
| |
v2: Use set_predicate/condmod. Use fs_builder::OPCODE instead of
::emit.
Reviewed-by: Matt Turner <mattst88@gmail.com>
| |
Commit 3687d75 changed the fs_visitor constructors, but it didn't update
all the users. As a result, 'make check' fails.
I added the explicit cast to the gl_program* parameter to make it more
clear which NULL was which.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@Whitecape.org>
| |
In future tests, we will start relying on devinfo and not just brw in the
compiler. Changing this now keeps these tests from failing in the future.
Reviewed-by: Matt Turner <mattst88@gmail.com>
| |
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89670
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: Vinson Lee <vlee@freedesktop.org>
| |
Especially on platforms that do not natively generate 0u and ~0u for
Boolean results, we generate a lot of sequences where a CMP is
followed by an AND with 1. emit_bool_to_cond_code does this, for
example. On ILK, this results in a sequence like:
add(8) g3<1>F g8<8,8,1>F -g4<0,1,0>F
cmp.l.f0(8) g3<1>D g3<8,8,1>F 0F
and.nz.f0(8) null g3<8,8,1>D 1D
(+f0) iff(8) Jump: 6
The AND.nz is obviously redundant. By propagating the cmod, we can
instead generate
add.l.f0(8) null g8<8,8,1>F -g4<0,1,0>F
(+f0) iff(8) Jump: 6
Existing code already handles the propagation from the CMP to the ADD.
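As a rough illustration of why the AND.nz can be dropped, here is a toy peephole in Python. It is not the actual pass: the `Inst` record and `drop_redundant_and_nz` are hypothetical names, and the sketch only looks at the immediately preceding instruction, whereas a real pass would have to scan further back and check intervening flag and register writes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Inst:
    op: str
    dst: Optional[str]           # None models the null register
    srcs: tuple
    cmod: Optional[str] = None   # conditional mod that writes f0, e.g. "l"


def drop_redundant_and_nz(insts):
    """Drop `and.nz null, x, 1` when x was just written by a flag-writing CMP."""
    out = []
    for inst in insts:
        prev = out[-1] if out else None
        redundant = (
            inst.op == "and" and inst.cmod == "nz" and inst.dst is None
            and prev is not None and prev.op == "cmp"
            and prev.cmod is not None and prev.dst is not None
            and inst.srcs == (prev.dst, 1)
        )
        if redundant:
            continue  # the CMP already left the same condition in f0
        out.append(inst)
    return out
```

Applied to the four-instruction ILK sequence above, this leaves the ADD, the CMP and the IFF; folding the CMP into the ADD is the separate, already existing propagation step.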
Shader-db results:
GM45 (0x2A42):
total instructions in shared programs: 3550829 -> 3550788 (-0.00%)
instructions in affected programs: 10028 -> 9987 (-0.41%)
helped: 24
Iron Lake (0x0046):
total instructions in shared programs: 4993146 -> 4993105 (-0.00%)
instructions in affected programs: 9675 -> 9634 (-0.42%)
helped: 24
Ivy Bridge (0x0166):
total instructions in shared programs: 6291870 -> 6291794 (-0.00%)
instructions in affected programs: 17914 -> 17838 (-0.42%)
helped: 48
Haswell (0x0426):
total instructions in shared programs: 5779256 -> 5779180 (-0.00%)
instructions in affected programs: 16694 -> 16618 (-0.46%)
helped: 48
Broadwell (0x162E):
total instructions in shared programs: 6823088 -> 6823014 (-0.00%)
instructions in affected programs: 15824 -> 15750 (-0.47%)
helped: 46
No change on Sandy Bridge or on any platform when NIR is used.
v2: Add unit tests suggested by Matt. Remove spurious writes_flag()
check on scan_inst when scan_inst is known to be BRW_OPCODE_CMP (also
suggested by Matt).
v3: Fix some comments and remove some explicit int() casts in fs_reg
constructors in the unit tests. Both suggested by Matt.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
| |
Cc: 10.5 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89317
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
| |
For some reason, we occasionally write the flag register with a MOV.NZ
instruction:
add(8) g25<1>F -g6<0,1,0>F g15<8,8,1>F
cmp.l.f0(8) g26<1>D g25<8,8,1>F 0F
mov.nz.f0(8) null g26<8,8,1>D
A MOV.NZ instruction on the result of a CMP is like comparing for
equality with true in C. It's useless. Removing it allows us to
generate:
add.l.f0(8) null -g6<0,1,0>F g15<8,8,1>F
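The "comparing for equality with true" remark can be checked with a small model. This is a sketch under the assumption that a true CMP result is some nonzero value and a false one is zero; `cmp_l`, `mov_nz` and `flags_agree` are hypothetical helpers, not driver code.

```python
def cmp_l(a, b):
    """Model CMP.l: return (destination value, flag)."""
    flag = a < b
    # Assumption for this sketch: true is all ones, false is zero.
    return (0xFFFFFFFF if flag else 0), flag


def mov_nz(value):
    """Model MOV.nz to the null register: only the resulting flag matters."""
    return value != 0


def flags_agree(a, b):
    """True when MOV.nz on the CMP result reproduces the CMP's own flag."""
    value, flag = cmp_l(a, b)
    return mov_nz(value) == flag
```

Because the flag after the MOV.nz always matches the flag the CMP already wrote, the MOV contributes nothing.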
total instructions in shared programs: 5955701 -> 5951657 (-0.07%)
instructions in affected programs: 302910 -> 298866 (-1.34%)
GAINED: 1
LOST: 0
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
| |
This allows us to apply the optimization in cases where the CMP's
argument is negated, by flipping the conditional mod. For example, it
allows us to optimize this:
add(8) temp a b
cmp.l.f0(8) null -temp 0.0
into
add.g.f0(8) temp a b
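The flip relies on the identity that comparing -temp against zero is the same as comparing temp against zero with the inequality reversed. A minimal Python check of that identity follows; the `FLIPPED` table and helper names are illustrative assumptions, not identifiers from the patch.

```python
import operator

# Conditional mod to use once the negation on the compared value is dropped.
FLIPPED = {"l": "g", "g": "l", "le": "ge", "ge": "le", "z": "z", "nz": "nz"}

COMPARE = {
    "l": operator.lt, "g": operator.gt,
    "le": operator.le, "ge": operator.ge,
    "z": operator.eq, "nz": operator.ne,
}


def flag_with_negated_source(cmod, temp):
    """Flag produced by cmp.<cmod> -temp, 0.0."""
    return COMPARE[cmod](-temp, 0.0)


def flag_with_flipped_cmod(cmod, temp):
    """Flag produced by applying the flipped cmod to temp itself."""
    return COMPARE[FLIPPED[cmod]](temp, 0.0)
```

For the example above, `cmp.l` on `-temp` becomes `.g` on `temp`, which is why the ADD ends up carrying `add.g.f0`.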
total instructions in shared programs: 5958360 -> 5955701 (-0.04%)
instructions in affected programs: 466880 -> 464221 (-0.57%)
GAINED: 0
LOST: 1
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
| |
total instructions in shared programs: 5959463 -> 5958900 (-0.01%)
instructions in affected programs: 70031 -> 69468 (-0.80%)
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|