path: root/src/mesa/drivers/dri/i965/brw_fs.cpp
Commit message | Author | Age | Files | Lines
* i965/fs: Relax fs_builder channel group assertion when force_writemask_all is on. | Francisco Jerez | 2015-07-01 | 1 | -3/+3
  This assertion was meant to catch code inadvertently escaping the control flow jail determined by the group of channel enable signals selected by some caller. However, it seems useful to be able to increase the default execution size as long as force_writemask_all is enabled, because force_writemask_all is an explicit indication that there is no longer a one-to-one correspondence between channels and SIMD components, so the restriction doesn't apply.
  In addition, reorder the calls to fs_builder::group and ::exec_all in a couple of places to make sure that we don't temporarily break this invariant in the future for instructions with exec_size higher than the dispatch width.
  Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
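The relaxed check and the call ordering described above can be sketched roughly as follows. This is a simplified illustration, not the Mesa class: `ToyBuilder`, its fields, and `can_group()` are hypothetical names, and the exact condition is an assumption based on the commit message.

```cpp
#include <cassert>

// Hypothetical, heavily simplified stand-in for fs_builder (assumption).
struct ToyBuilder {
    unsigned dispatch_width;   // number of channels currently selected
    unsigned channel_offset;   // first channel of the selected group
    bool force_writemask_all;  // true: channel enables are ignored

    // True when selecting group i of width n is allowed. With
    // force_writemask_all set, the width restriction no longer applies.
    bool can_group(unsigned n, unsigned i) const {
        return force_writemask_all ||
               (n > 0 && n <= dispatch_width && i < dispatch_width / n);
    }

    // Select the i-th group of n channels.
    ToyBuilder group(unsigned n, unsigned i) const {
        assert(can_group(n, i));
        ToyBuilder b = *this;
        b.dispatch_width = n;
        b.channel_offset += i * n;
        return b;
    }

    // Return a builder that ignores the channel enables.
    ToyBuilder exec_all() const {
        ToyBuilder b = *this;
        b.force_writemask_all = true;
        return b;
    }
};
```

In this sketch, widening a SIMD8 builder to 16 channels only passes the check when `exec_all()` is called before `group(16, 0)`, which is why the order of the two calls matters.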
* i965/fs: Fix PIXEL_X/Y in regs_read() | Jason Ekstrand | 2015-06-30 | 1 | -1/+1
  PIXEL_X/Y takes a vec2 in the first argument.
* i965/fs: Remove the width field from fs_reg | Jason Ekstrand | 2015-06-30 | 1 | -53/+9
  As of now, the width field is no longer used for anything. The width field "seemed like a good idea at the time" but is actually entirely redundant with the instruction's execution size. Initially, it gave us the ability to easily set the instruction's execution size based entirely on register widths. With the builder, we can easily set the sizes explicitly and the width field doesn't have as much purpose. At this point, it's just redundant information that can get out of sync, so it really needs to go.
  Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
  Acked-by: Francisco Jerez <currojerez@riseup.net>
* i965/fs: Use exec_size instead of dst.width for computing component size | Jason Ekstrand | 2015-06-30 | 1 | -3/+3
  There are a variety of places where we use dst.width / 8 to compute the size of a single logical channel. Instead, we should be using exec_size.
  Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
  Acked-by: Francisco Jerez <currojerez@riseup.net>
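As a rough illustration of the exec_size / 8 computation, the sketch below assumes 32-bit channels and 32-byte registers, so eight channels fit in one register. The helper names are hypothetical and do not come from the driver.

```cpp
#include <cassert>

// Assumption: a register holds eight 32-bit channels (32 bytes / 4 bytes).
constexpr unsigned kChannelsPerReg = 8;

// Registers occupied by one logical component at the given execution size.
unsigned regs_per_component(unsigned exec_size) {
    assert(exec_size % kChannelsPerReg == 0);
    return exec_size / kChannelsPerReg;
}

// Register offset of the n-th component of a multi-component value.
unsigned component_reg_offset(unsigned component, unsigned exec_size) {
    return component * regs_per_component(exec_size);
}
```

Under these assumptions a SIMD16 component spans two registers, so the fourth component of a SIMD16 value starts six registers in.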
* i965/fs: Use the builder dispatch width instead of dst.width for pull constants | Jason Ekstrand | 2015-06-30 | 1 | -4/+4
  Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
  Acked-by: Francisco Jerez <currojerez@riseup.net>
* i965/fs: Remove exec_size guessing from fs_inst::init() | Jason Ekstrand | 2015-06-30 | 1 | -22/+0
  Now that all of the non-explicit constructors are gone, we don't need to guess anymore.
  Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
  Acked-by: Francisco Jerez <currojerez@riseup.net>
* i965/fs: Use exec_size for determining regs read/written and partial writes | Jason Ekstrand | 2015-06-30 | 1 | -3/+3
  Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
  Acked-by: Francisco Jerez <currojerez@riseup.net>
* i965/fs: Remove fs_inst constructors that don't take an explicit exec_size | Jason Ekstrand | 2015-06-30 | 1 | -28/+2
  Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
  Acked-by: Francisco Jerez <currojerez@riseup.net>
* i965/fs: Make better use of the builder in shader_time | Jason Ekstrand | 2015-06-30 | 1 | -6/+8
  Previously, we were just depending on register widths to ensure that various things were exec_size of 1 etc. Now, we do so explicitly using the builder.
  Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
  Acked-by: Francisco Jerez <currojerez@riseup.net>
* i965/fs: Add a builder argument to offset() | Jason Ekstrand | 2015-06-30 | 1 | -17/+25
  Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
  Acked-by: Francisco Jerez <currojerez@riseup.net>
* i965/fs: Properly handle LOAD_PAYLOAD in fs_inst::regs_read | Jason Ekstrand | 2015-06-30 | 1 | -0/+5
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965/fs: Report the right value in fs_inst::regs_read() for PIXEL_X/Y | Jason Ekstrand | 2015-06-30 | 1 | -2/+9
  Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
  Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
  Acked-by: Francisco Jerez <currojerez@riseup.net>
* i965/fs: Fix fs_inst::regs_read() for uniform pull constant loads | Jason Ekstrand | 2015-06-30 | 1 | -0/+6
  Previously, fs_inst::regs_read() fell back to depending on the register width for the second source. This isn't really correct since it isn't a SIMD8 value at all, but a SIMD4x2 value. This commit changes it to explicitly be always one register.
  v2: Use mlen for determining the number of registers read
  Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
  Acked-by: Francisco Jerez <currojerez@riseup.net>
* i965/fs: Actually set/use the mlen for gen7 uniform pull constant loads | Jason Ekstrand | 2015-06-30 | 1 | -7/+12
  Previously, we were allocating the payload with different sizes per gen and then figuring out the mlen in the generator based on gen. This meant, among other things, that the higher level passes knew nothing about it.
  Acked-by: Francisco Jerez <currojerez@riseup.net>
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965/fs: Use a switch statement in fs_inst::regs_read() | Jason Ekstrand | 2015-06-30 | 1 | -22/+23
  This makes things a little simpler, more efficient, and quite a bit more readable.
  Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
  Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
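A switch-based regs_read() might look roughly like the sketch below. The opcode enum, the `ToyInst` struct, and the per-case rules are hypothetical simplifications, loosely following the commit messages in this log (mlen for the pull constant payload, a vec2 first argument for PIXEL_X/Y). It is not the driver's implementation.

```cpp
#include <cassert>

// Hypothetical opcode subset for illustration only (assumption).
enum class ToyOpcode { Mov, UniformPullConstantLoad, PixelXY };

struct ToyInst {
    ToyOpcode opcode;
    unsigned exec_size;  // channels executed by the instruction
    unsigned mlen;       // message length in registers, if any
};

// Number of registers read through source `arg`, assuming 32-bit channels.
unsigned toy_regs_read(const ToyInst &inst, unsigned arg) {
    assert(arg < 3);
    switch (inst.opcode) {
    case ToyOpcode::UniformPullConstantLoad:
        // The second source is the message payload; size it by mlen
        // rather than by the execution size.
        if (arg == 1)
            return inst.mlen;
        break;
    case ToyOpcode::PixelXY:
        // The first source is a vec2, so it covers two components.
        if (arg == 0)
            return 2 * inst.exec_size / 8;
        break;
    default:
        break;
    }
    // Generic case: one component at the instruction's execution size.
    return inst.exec_size / 8;
}
```

Keeping the special cases in one switch leaves a single generic fallback at the end, which is the readability gain the commit message points at.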
* i965/vs: Move compute_clip_distance() out of emit_urb_writes(). | Kenneth Graunke | 2015-06-28 | 1 | -1/+3
  Legacy user clipping (using gl_Position or gl_ClipVertex) is handled by turning those into the modern gl_ClipDistance equivalents.
  This is unnecessary in Core Profile: if user clipping is enabled, but the shader doesn't write the corresponding gl_ClipDistance entry, results are undefined. Hence, it is also unnecessary for geometry shaders.
  This patch moves the call up to run_vs(). This is equivalent for VS, but removes the need to pass clip distances into emit_urb_writes().
  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
  Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
* i965: Remove the brw_context from the visitors | Jason Ekstrand | 2015-06-23 | 1 | -6/+6
  As of this commit, nothing actually needs the brw_context.
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
* i965/vs: Pass the current set of clip planes through run() and run_vs() | Jason Ekstrand | 2015-06-23 | 1 | -2/+2
  Previously, these were pulled out of the GL context conditionally based on whether we were running ff/ARB or a GLSL program. Now, we just pass them in so that the visitor doesn't have to grab them itself.
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
* i965/fs: Add a do_rep_send flag to run_fs | Jason Ekstrand | 2015-06-23 | 1 | -4/+5
  Previously, we were pulling it from brw->do_rep_send.
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
* i965: Pull calls to get_shader_time_index out of the visitor | Jason Ekstrand | 2015-06-23 | 1 | -37/+18
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
* i965: Use a single index per shader for shader_time. | Jason Ekstrand | 2015-06-23 | 1 | -18/+10
  Previously, each shader took 3 shader time indices which were potentially at arbitrary points in the shader time buffer. Now, each shader gets a single index which refers to 3 consecutive locations in the buffer. This simplifies some of the logic at the cost of having a magic 3 in a few places.
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
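The single-index scheme amounts to simple index arithmetic, sketched below. The slot names and the helper are hypothetical; the commit message only states that one index refers to 3 consecutive buffer locations.

```cpp
#include <cassert>

// Hypothetical names for the three per-shader slots (assumption).
enum ToySlot : unsigned { kSlotA = 0, kSlotB = 1, kSlotC = 2 };

// The "magic 3": consecutive buffer locations owned by each shader.
constexpr unsigned kSlotsPerShader = 3;

// Buffer location of a given slot for the shader with this index.
unsigned shader_time_location(unsigned shader_index, ToySlot slot) {
    assert(slot < kSlotsPerShader);
    return shader_index * kSlotsPerShader + slot;
}
```

With this layout, a caller only needs to carry one index per shader and can derive all three locations from it.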
* i965/fs: Plumb compiler debug logging through brw_compiler | Jason Ekstrand | 2015-06-23 | 1 | -4/+9
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
* i965/fs: Do the no16 perf logging directly in fs_visitor::no16() | Jason Ekstrand | 2015-06-23 | 1 | -11/+2
  While we're at it, we'll drop the note about 10-20% performance loss.
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
* i965/fs: Make no16 non-variadic | Jason Ekstrand | 2015-06-23 | 1 | -10/+4
  We never used the fact that it was variadic anyway.
  Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Remove the dependance on brw_context from the generators | Jason Ekstrand | 2015-06-23 | 1 | -1/+1
  Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Plumb compiler debug logging through a function pointer in brw_compiler | Jason Ekstrand | 2015-06-23 | 1 | -1/+2
  v2 (Ken): Make shader_debug_log a printf-like function.
  v3 (Jason): Add a void * to pass the brw_context through
  Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
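A printf-like callback that carries an opaque `void *` can be sketched as follows. Only the `shader_debug_log` name comes from the commit message; `ToyCompiler`, the exact signature, and `log_to_string` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical stand-in for the compiler struct holding the callback.
struct ToyCompiler {
    // printf-like logging hook; `data` is passed through untouched so the
    // caller can recover its own context inside the callback.
    void (*shader_debug_log)(void *data, const char *fmt, ...);
};

// Example callback: formats the message and appends it to a std::string
// supplied through the opaque pointer.
void log_to_string(void *data, const char *fmt, ...) {
    assert(data != nullptr);
    char buf[256];
    va_list args;
    va_start(args, fmt);
    std::vsnprintf(buf, sizeof(buf), fmt, args);
    va_end(args);
    static_cast<std::string *>(data)->append(buf);
}
```

The opaque pointer keeps the compiler side free of any knowledge about who is listening: a driver could pass its context, while a test can pass a plain string buffer.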
* i965: Replace some instances of brw->gen with devinfo->gen | Jason Ekstrand | 2015-06-23 | 1 | -2/+2
* i965/fs: Don't mess up stride for uniform integer multiplication. | Matt Turner | 2015-06-23 | 1 | -4/+16
  If the stride is 0, the source is a uniform and we should not modify the stride.
  Cc: "10.6" <mesa-stable@lists.freedesktop.org>
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91047
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
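The stride rule can be illustrated with a small sketch. It assumes the lowering reinterprets a 32-bit source as 16-bit words, which requires a stride of 2 for per-channel data. That context, `ToyReg`, and `as_low_words()` are assumptions for illustration; the commit message itself only states the stride-0 rule.

```cpp
#include <cassert>

// Hypothetical register description: stride 0 means every channel reads
// the same element (a uniform), stride 1 means one element per channel.
struct ToyReg {
    unsigned stride;
};

// View a 32-bit source as 16-bit words, selecting the low word of each
// element. Per-channel data must skip every other word; a uniform keeps
// stride 0 so all channels still read the same word.
ToyReg as_low_words(ToyReg src) {
    assert(src.stride <= 1);
    if (src.stride != 0)
        src.stride = 2;
    return src;
}
```

Unconditionally setting the stride to 2 would turn a uniform into a per-channel read of unrelated data, which is the kind of breakage the guard avoids in this sketch.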
* i965/fs: Remove one more fixed brw_null_reg() from the visitor. | Francisco Jerez | 2015-06-12 | 1 | -1/+1
  Instead use fs_builder::null_reg_f() which has the correct register width. Avoids the assertion failure in fs_builder::emit() hit by the "ES3-CTS.shaders.loops.for_dynamic_iterations.unconditional_break_fragment" GLES3 conformance test introduced by 4af4cfba9ee1014baa4a777660fc9d53d57e4c82.
  Reported-and-reviewed-by: Tapani Pälli <tapani.palli@intel.com>
  Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
* i965/fs: Remove dead IR construction code from the visitor. | Francisco Jerez | 2015-06-09 | 1 | -284/+0
  Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965/fs: Migrate translation of NIR ALU instructions to the IR builder. | Francisco Jerez | 2015-06-09 | 1 | -2/+2
  Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965/fs: Migrate FS framebuffer writes to the IR builder. | Francisco Jerez | 2015-06-09 | 1 | -5/+4
  The explicit call to fs_builder::group() in emit_single_fb_write() is required by the builder (otherwise the assertion in fs_builder::emit() would fail) because the subsequent LOAD_PAYLOAD and FB_WRITE instructions are in some cases emitted with a non-native execution width. The previous code would always use the channel enables for the first quarter, which is dubious but probably worked in practice because FB writes are never emitted inside non-uniform control flow and we don't pass the kill-pixel mask via predication in the cases where we have to fall back to SIMD8 writes.
  Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965/fs: Migrate FS discard handling to the IR builder. | Francisco Jerez | 2015-06-09 | 1 | -3/+3
  Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965/fs: Migrate FS gl_SamplePosition/ID computation code to the IR builder. | Francisco Jerez | 2015-06-09 | 1 | -25/+24
  v2: Use fs_builder::AND/SHR/MOV instead of ::emit.
  Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965/fs: Migrate FS interpolation code to the IR builder. | Francisco Jerez | 2015-06-09 | 1 | -14/+14
  v2: Fix some preexisting trivial codestyle issues.
  Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965/fs: Migrate shader time to the IR builder. | Francisco Jerez | 2015-06-09 | 1 | -34/+20
  v2: Change null register destination type to UD so it can be compacted.
  Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965/fs: Migrate pull constant loads to the IR builder. | Francisco Jerez | 2015-06-09 | 1 | -25/+14
  Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965/fs: Migrate Gen4 send dependency workarounds to the IR builder. | Francisco Jerez | 2015-06-09 | 1 | -16/+10
  v2: Change brw_null_reg() to bld.null_reg_f().
  Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965/fs: Migrate lower_integer_multiplication to the IR builder. | Francisco Jerez | 2015-06-09 | 1 | -13/+11
  Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965/fs: Migrate lower_load_payload to the IR builder. | Francisco Jerez | 2015-06-09 | 1 | -23/+11
  Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965/fs: Migrate opt_sampler_eot to the IR builder. | Francisco Jerez | 2015-06-09 | 1 | -2/+3
  Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965/fs: Allocate a common IR builder object in fs_visitor. | Francisco Jerez | 2015-06-09 | 1 | -0/+11
  v2: Call fs_builder::at_end() to point the builder at the end of the program explicitly.
  Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965/fs: Print mlen in dump_instructions() output. | Kenneth Graunke | 2015-06-04 | 1 | -0/+3
  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965/fs: Remove the ir_visitor code | Jason Ekstrand | 2015-05-28 | 1 | -99/+0
  Now that everything is running through NIR, this is all dead.
  Reviewed-by: Matt Turner <mattst88@gmail.com>
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Make NIR non-optional for scalar shaders | Jason Ekstrand | 2015-05-28 | 1 | -22/+3
  Reviewed-by: Matt Turner <mattst88@gmail.com>
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Rename backend_visitor to backend_shader | Jason Ekstrand | 2015-05-28 | 1 | -2/+2
  The backend_shader class really is a representation of a shader. The fact that it inherits from ir_visitor is somewhat immaterial.
  Reviewed-by: Matt Turner <mattst88@gmail.com>
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965/fs: Fix lowering of integer multiplication with cmod. | Matt Turner | 2015-05-28 | 1 | -0/+11
  If the multiplication's result is unused, except by a conditional_mod, the destination will be null. Since the final instruction in the lowered sequence is a partial-write, we can't put the conditional mod on it and we have to store the full result to a register and do a MOV with a conditional mod.
  Cc: "10.6" <mesa-stable@lists.freedesktop.org>
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90580
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965/fs: Fix implied_mrf_writes for scratch writes | Jason Ekstrand | 2015-05-23 | 1 | -1/+1
  We build the entire message in the generator so all the MRF writes are implied.
  Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965/fs: Implement integer multiply without mul/mach. | Matt Turner | 2015-05-18 | 1 | -28/+66
  Ivybridge and Baytrail can't use mach with 2Q quarter control, so just do it without the accumulator. Stupid accumulator.
  Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
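One way to form the low 32 bits of a 32x32 product without an accumulator is to combine two 32x16 partial products, sketched below in plain C++. This only illustrates the arithmetic identity; it is an assumption that the lowering follows the same decomposition, and the function name is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Low 32 bits of a * b, built from two partial products that each use
// only a 16-bit half of b. Unsigned arithmetic wraps modulo 2^32.
uint32_t mul32_via_16bit_halves(uint32_t a, uint32_t b) {
    const uint32_t b_lo = b & 0xffffu;
    const uint32_t b_hi = b >> 16;

    const uint32_t low = a * b_lo;   // contributes to all 32 result bits
    const uint32_t high = a * b_hi;  // only its low 16 bits survive << 16

    // Add the low word of `high` into the upper word of `low`.
    const uint16_t upper =
        static_cast<uint16_t>((low >> 16) + (high & 0xffffu));
    const uint32_t result =
        (static_cast<uint32_t>(upper) << 16) | (low & 0xffffu);

    // Sanity check against the native wrapping multiply.
    assert(result == a * b);
    return result;
}
```

The identity used is a * b = a * b_lo + ((a * b_hi) << 16) modulo 2^32, so the final step is a 16-bit add confined to the upper half of the result.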
* i965/fs: Support integer multiplication in SIMD16 on Haswell. | Matt Turner | 2015-05-18 | 1 | -5/+47
  Ivybridge (and presumably Baytrail) have a bug that prevents this from working.
  Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>