summaryrefslogtreecommitdiffstats
path: root/src/mesa/drivers/dri/i965/brw_eu.c
Commit message (Collapse)AuthorAgeFilesLines
* i965: Add support for swizzling arbitrary immediates to (brw_)swizzle().Francisco Jerez2016-03-061-0/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Scalar immediates used to be handled correctly by swizzle() (as the identity) but since commit 58fa9d47b536403c4e3ca5d6a2495691338388fd it will corrupt the contents of the immediate. Vector immediates were never handled correctly, but we had ad-hoc code to swizzle VF immediates in the vec4 copy propagation pass. This takes care of swizzling V and UV in addition. v2: Don't implement swizzling of V/UV immediates (Matt). If you need to swizzle an integer vector immediate in the future apply the following diff to go back to v1: --- a/src/mesa/drivers/dri/i965/brw_eu.c +++ b/src/mesa/drivers/dri/i965/brw_eu.c @@ -119,11 +119,10 @@ brw_swap_cmod(uint32_t cmod) static unsigned imm_shift(enum brw_reg_type type, unsigned i) { - assert(type != BRW_REGISTER_TYPE_UV && type != BRW_REGISTER_TYPE_V && - "Not implemented."); - if (type == BRW_REGISTER_TYPE_VF) return 8 * (i & 3); + else if (type == BRW_REGISTER_TYPE_UV || type == BRW_REGISTER_TYPE_V) + return 4 * (i & 7); else return 0; } Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
* i965: Add INTEL_DEBUG=hex to print the hex with the disassembly.Matt Turner2015-10-291-1/+1
| | | | Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Rename brw_compile to brw_codegenJason Ekstrand2015-04-221-14/+14
| | | | | | | | | | | | This name better matches what it's actually used for. The patch was generated with the following command: for file in *; do sed -i -e s/brw_compile/brw_codegen/g $file done Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965: Remove the context field from brw_compilerJason Ekstrand2015-04-221-11/+7
| | | | Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965: Make the disassembler take a device_info instead of a contextJason Ekstrand2015-04-221-4/+4
| | | | Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965: Make instruction compaction take a device_info instead of a contextJason Ekstrand2015-04-221-1/+1
| | | | Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965: Make the brw_inst helpers take a device_info instead of a contextJason Ekstrand2015-04-221-14/+14
| | | | Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965/eu: Add a devinfo parameter to brw_compileJason Ekstrand2015-04-221-0/+1
| | | | Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965: Replace guess_execution_size with something simpler.Matt Turner2015-04-211-0/+7
| | | | | | | | | | | | | | | | | | | | | | | guess_execution_size() does two things: 1. Cope with small destination registers. 2. Cope with SIMD8 vs SIMD16 mode. This patch replaces the first with a simple if block in brw_set_dest: if the destination register width is less than 8, you probably want the execution size to match. (I didn't put this in the 3src block because it doesn't seem to matter.) Since only the FS compiler cares about SIMD16 mode, it's easy to just set the default execution size there. This pattern was already been proven in the Gen8+ generator, but we didn't port it back to the existing generator when we combined the two. This is based on a patch from Ken from about a year ago. I've rebased it and and fixed a few bugs. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
* i965: Fix assertion in brw_reg_type_lettersBen Widawsky2015-03-021-1/+1
| | | | | | | | | | | | | | | | | While using various debugging features (optimization debug, instruction dumping, etc) this function is called in order to get a readable letter for the type of unit. On GEN8, two new units were added, the Qword and the Unsigned Qword (Q, and UQ respectively). The existing assertion tries to determine that the argument passed in is within the correct boundary, however, it was using UQ as the upper limit instead of Q. To my knowledge you can only hit this case with the branch I am currently working on, so it doesn't fix any known issues. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965/fs: Introduce brw_negate_cmod().Kenneth Graunke2015-02-271-0/+22
| | | | | Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965: Return NONE from brw_swap_cmod on unknown input.Matt Turner2014-08-121-1/+1
| | | | | | | | Comparing ~0u with a packed enum (i.e., 1 byte) always evaluates to false. Shouldn't gcc warn about this? Reported-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* util: Move ralloc to a new src/util directory.Kenneth Graunke2014-08-041-1/+1
| | | | | | | | | | | | | | | | | | For a long time, we've wanted a place to put utility code which isn't directly tied to Mesa or Gallium internals. This patch creates a new src/util directory for exactly that purpose, and builds the contents as libmesautil.la. ralloc seemed like a good first candidate. These days, it's directly used by mesa/main, i965, i915, and r300g, so keeping it in src/glsl didn't make much sense. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> v2 (Jason Ekstrand): More realloc uses and some scons fixes Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>
* i965: Disable hex offset printing in disassembly.Kenneth Graunke2014-07-211-1/+2
| | | | | | | | | | | | | | | Printing the hex offsets makes it basically impossible to diff assembly: if you add even a single instruction, the entire shader shows up as a difference. So, every time I want to compare assembly, I have to strip this out. The hex offsets might be useful when debugging compaction, or when inspecting the program cache buffer. Since it's occasionally useful, but uncommon, this patch disables it by default, but makes it easy to re-enable it temporarily when the need arises. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965: Make a brw_conditional_mod enum.Matt Turner2014-07-051-1/+1
| | | | Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
* i965: Use unreachable() instead of unconditional assert().Matt Turner2014-07-011-3/+1
| | | | Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
* i965: Replace struct brw_compact_instruction with brw_compact_inst.Matt Turner2014-06-261-1/+1
| | | | | Signed-off-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Replace 'struct brw_instruction' with 'brw_inst'.Matt Turner2014-06-261-4/+4
| | | | | | | | Use this an an opportunity to clean up the formatting of some old code (brw_ADD, for instance). Signed-off-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Pass brw rather than gen to brw_disassemble_inst().Matt Turner2014-06-261-1/+1
| | | | | | | We will need it in order to use the new brw_inst API. Signed-off-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Convert brw_eu.[ch] to use the new brw_inst API.Kenneth Graunke2014-06-261-15/+17
| | | | | | | v2: Don't set flag_reg_nr prior to Gen7 (as it doesn't exist). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965: Use brw->gen in some generation checks.Matt Turner2014-06-111-2/+6
| | | | | | | Will simplify the automated conversion if we want to allow compiling the driver for a single generation. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
* i965: Put '_default_' in the name of functions that set default state.Kenneth Graunke2014-06-021-11/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | Eventually we're going to use functions to set bits on an instruction. Putting 'default' in the name of functions that alter default state will help distinguins them. This patch was generated entirely mechanically, by the following: for file in brw*.{cpp,c,h}; do sed -i \ -e 's/brw_set_mask_control/brw_set_default_mask_control/g' \ -e 's/brw_set_saturate/brw_set_default_saturate/g' \ -e 's/brw_set_access_mode/brw_set_default_access_mode/g' \ -e 's/brw_set_compression_control/brw_set_default_compression_control/g' \ -e 's/brw_set_predicate_control/brw_set_default_predicate_control/g' \ -e 's/brw_set_predicate_inverse/brw_set_default_predicate_inverse/g' \ -e 's/brw_set_flag_reg/brw_set_default_flag_reg/g' \ -e 's/brw_set_acc_write_control/brw_set_default_acc_write_control/g' \ $file; done No manual changes were done after running that command. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965: Delete brw_set_conditionalmod.Kenneth Graunke2014-06-021-5/+0
| | | | | | | | | This removes the ability to set the default conditional modifier on all future instructions. Nothing uses it, and it's not really a sensible thing to do anyway. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965/sf: Move brw_set_predicate_control_flag_value to brw_sf_emit.c.Kenneth Graunke2014-05-271-18/+0
| | | | | | | | Only the Gen4-5 SF program compiler actually uses this function; move it there. Soon the fields will be moved out of brw_compile. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965/sf: Drop useless push/pop state from flag register mashing code.Kenneth Graunke2014-05-271-2/+0
| | | | | | | | There's no point in pushing and popping the default state; the code between the two stack operations doesn't alter anything. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965/sf: Reset flag_value to 0xff before emitting SF subroutines.Kenneth Graunke2014-05-271-1/+0
| | | | | | | | | | | | | | | | | | | | | When compiling any of the SF program variants, flag_value starts off as 0xff and will be modified when generating code. brw_emit_anyprim_setup emits several subroutines, saving and restoring flag_value across each of them. Since it starts out as 0xff, this is equivalent to simply setting it to 0xff at the start of each subroutine. Resetting the value makes more logical sense; each subroutine doesn't know whether one of the others even executed, much less what it did to the flag register. This also lets us to drop the brw_set_predicate_control_flag_value call from brw_init_compile: predicate is already initialized to BRW_PREDICATE_NONE by the memset, and the value of flag_value is irrelevant (as it's only used by the SF compiler). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965: Rename brw/gen8_dump_compile to brw/gen8_disassemble.Kenneth Graunke2014-05-181-1/+2
| | | | | | | "Disassemble" is an accurate description of what this function does. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965: Rename brw_disasm/gen8_disassemble to brw/gen8_disassemble_inst.Kenneth Graunke2014-05-181-1/+1
| | | | | | | | We're going to use "disassemble" for the function that disassembles the whole program. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965: Pass brw_context and assembly separately to brw_dump_compile.Matt Turner2014-05-151-5/+3
| | | | | | | | brw_dump_compile will be called indirectly by code common used by generations before and after the gen8 instruction format change. Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Pull brw_compact_instructions() out of brw_get_program().Matt Turner2014-05-151-2/+0
| | | | | Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965/disasm: Disassemble the compaction control bit.Matt Turner2014-05-151-1/+2
| | | | | | | | | brw_disasm doesn't disassemble compacted instructions, so we uncompact before disassembling them which would unset the compaction control bit. Instead pass it as a separate argument. Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Fix register types in dump_instructions().Kenneth Graunke2014-02-051-0/+29
| | | | | | | | | | | | | | This regressed when I converted BRW_REGISTER_TYPE_* to be an abstract type that doesn't match the hardware description. dump_instruction() was using reg_encoding[] from brw_disasm.c, which no longer matches (and was incorrect for Gen8+ anyway). This patch introduces a new function to convert the abstract enum values into the letter suffix we expect. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reported-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965: Replace 8-wide and 16-wide with SIMD8 and SIMD16.Eric Anholt2014-01-171-4/+4
| | | | | | | | Those are the terms used in the docs, and think "n-wide" was something I just happened to say. Note that shader-db needs updating for the INTEL_DEBUG=fs parsing. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
* s/Tungsten Graphics/VMware/José Fonseca2014-01-171-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Tungsten Graphics Inc. was acquired by VMware Inc. in 2008. Leaving the old copyright name is creating unnecessary confusion, hence this change. This was the sed script I used: $ cat tg2vmw.sed # Run as: # # git reset --hard HEAD && find include scons src -type f -not -name 'sed*' -print0 | xargs -0 sed -i -f tg2vmw.sed # # Rename copyrights s/Tungsten Gra\(ph\|hp\)ics,\? [iI]nc\.\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./g /Copyright/s/Tungsten Graphics\(,\? [iI]nc\.\)\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./ s/TUNGSTEN GRAPHICS/VMWARE/g # Rename emails s/alanh@tungstengraphics.com/alanh@vmware.com/ s/jens@tungstengraphics.com/jowen@vmware.com/g s/jrfonseca-at-tungstengraphics-dot-com/jfonseca-at-vmware-dot-com/ s/jrfonseca\?@tungstengraphics.com/jfonseca@vmware.com/g s/keithw\?@tungstengraphics.com/keithw@vmware.com/g s/michel@tungstengraphics.com/daenzer@vmware.com/g s/thomas-at-tungstengraphics-dot-com/thellstom-at-vmware-dot-com/ s/zack@tungstengraphics.com/zackr@vmware.com/ # Remove dead links s@Tungsten Graphics (http://www.tungstengraphics.com)@Tungsten Graphics@g # C string src/gallium/state_trackers/vega/api_misc.c s/"Tungsten Graphics, Inc"/"VMware, Inc"/ Reviewed-by: Brian Paul <brianp@vmware.com>
* i965: dump the disassembly to the given fileTopi Pohjolainen2013-12-271-10/+10
| | | | | | | | instead of ignoring the argument and always dumping to standard output. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965: Don't use GL types in files shared with intel-gpu-tools.Kenneth Graunke2013-12-051-9/+9
| | | | | | | | | sed -i -e 's/GLuint/unsigned/g' -e 's/GLint/int/g' \ -e 's/GLfloat/float/g' -e 's/GLubyte/uint8_t/g' \ -e 's/GLshort/int16_t/g' \ brw_eu* brw_disasm.c brw_structs.h Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Drop trailing whitespace from files shared with intel-gpu-tools.Kenneth Graunke2013-12-051-8/+8
| | | | | | Performed via s/ *$//g. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Move intel_context::gen and gt fields to brw_context.Kenneth Graunke2013-07-091-3/+3
| | | | | | | | | | Most functions no longer use intel_context, so this patch additionally removes the local "intel" variables to avoid compiler warnings. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
* i965: Pass brw_context to functions rather than intel_context.Kenneth Graunke2013-07-091-3/+2
| | | | | | | | | | | | | | This makes brw_context available in every function that used intel_context. This makes it possible to start migrating fields from intel_context to brw_context. Surprisingly, this actually removes some code, as functions that use OUT_BATCH don't need to declare "intel"; they just use "brw." Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
* i965/fs: Add an instruction flag for choosing the flag subregister.Eric Anholt2012-12-111-0/+6
| | | | | | | | We're going to redo discard handling to track discards in the other flag subregister, saving instructions in the discard and allowing predicated jumps out to the end of the shader. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Let brw_flag_reg() choose the flag reg and subreg.Eric Anholt2012-12-111-1/+1
| | | | | | We're about to start using the f0.1 subregister. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Stop putting 8 NOPs after each prorgam.Eric Anholt2012-09-171-8/+0
| | | | | | | | | | | | | As far as I can see, the intention of the requirement that we do so is to prevent instruction prefetch from wandering out into either unmapped memory or memory with a different caching type, and hanging the chip. The kernel makes sure that the page after your BO has a valid page of the same caching type, which meets this requirement, so there's no need to waste space between our programs (and in instruction cache) on this. Saves another 9kb instructions in l4d2 shaders. Acked-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Add support for instruction compaction on Gen7.Kenneth Graunke2012-09-171-0/+2
| | | | | | | | | | Reduces l4d2 program size from 1195kb to 919kb. Improves performance by 0.22% +/- 0.11% (n=70). v2: Rebase on compaction v2, fix up flag reg handling (by anholt). v3: Fix uncompaction of the flag register number. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Add support for instruction compaction.Eric Anholt2012-09-171-8/+30
| | | | | | | | | | | | | | | This reduces program size by using some smaller encodings for common bit patterns in the Gen ISA, with the hope of making programs fit in the instruction cache better. v2: Use larger bitshifts for the uncompressed field setups, in line with the way it's described in the spec. Consistently name a brw_compile "p" like all other code. Add a couple more tests. Consistently call things "compacted" not "compressed" (which is a different feature). Drop the explicit check for not compacting SENDs, which is unjustified and already implied by our lack of support for immediate values. Reviewed-by: Paul Berry <stereotype441@gmail.com>
* i965: Move program dump to a helper function in brw_eu.c.Eric Anholt2012-09-171-1/+23
| | | | | | | | | It's going to get more complicated when we do instruction compaction. This also introduces putting the program offset in the output. v2: Use next_insn_offset in brw_get_program(), too. Reviewed-by: Paul Berry <stereotype441@gmail.com>
* i965: Clear brw_compile on setup.Eric Anholt2012-09-171-0/+2
| | | | | | | | I noticed in valgrind that p->single_program_flow was used while uninitialized. Everything else zeroed out brw_compile, but this is better API. Reviewed-by: Paul Berry <stereotype441@gmail.com>
* i965: Make brw_set_saturate() use stdbool.Eric Anholt2012-08-081-2/+2
| | | | | | There was a chance for brw_wm_emit.c to screw up and pass (1 << 4) instead of 1, which would get converted to 0 when stored. Instead, use stdbool which converts nonzero to true/1 like we want.
* i965: Fix brw_swap_cmod() for LE/GE comparisons.Kenneth Graunke2012-06-181-4/+4
| | | | | | | | | | | | | | | | | | | | | The idea here is to rewrite comparisons like 2 >= x with x <= 2; we want to simply exchange arguments, not negate the condition. If equality was part of the original comparison, it should remain part of the swapped version. This is the true cause of bug #50298. It didn't manifest itself on Sandybridge because we embed the conditional modifier in the IF instruction rather than emitting a CMP. All other platforms use CMP. It also didn't manifest itself on the master branch because commit be5f27a84d ("glsl: Refine the loop instruction counting.") papered over the problem. NOTE: This is a candidate for stable release branches. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=50298 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
* i965: Remove vestiges of function call support from the old VS backend.Kenneth Graunke2012-04-091-124/+0
| | | | | | | | This never worked. brwProgramStringNotify also explicitly rejects programs that use CAL and RET. So there's no need for this to exist. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
* i965: increase the brw eu instruction store size dynamicallyYuanhan Liu2011-12-261-0/+7
| | | | | | | | | | | | | | Here is the final patch to enable dynamic eu instruction store size: increase the brw eu instruction store size dynamically instead of just allocating it statically with a constant limit. This would fix something that 'GL_MAX_PROGRAM_INSTRUCTIONS_ARB was 16384 while the driver would limit it to 10000'. v2: comments from ken, do not hardcode the eu limit to (1024 * 1024) Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>