path: root/src/mesa/drivers/dri/i965/brw_fs_live_variables.h
Commit message | Author | Age | Files | Lines
* i965/fs: Replace fs_reg::reg_offset with fs_reg::offset expressed in bytes. | Francisco Jerez | 2016-09-14 | 1 | -1/+1

  The fs_reg::offset field in byte units introduced in this patch is a more straightforward alternative to the current register offset representation split between fs_reg::reg_offset and ::subreg_offset. The split representation makes it too easy to forget about one of the offsets while dealing with the other, which has led to multiple back-end bugs in the past. To make matters worse, the unit reg_offset was expressed in was rather inconsistent: for uniforms it would be expressed in either 4B or 16B units depending on the back-end, and for most other things it would be expressed in 32B units.

  This encodes reg_offset as a new offset field expressed consistently in byte units. Each rvalue reference of reg_offset in existing code like 'x = r.reg_offset' is rewritten to 'x = r.offset / reg_unit', and each lvalue reference like 'r.reg_offset = x' is rewritten to 'r.offset = r.offset % reg_unit + x * reg_unit'.

  Because the change affects a lot of places and is rather non-trivial to verify due to the inconsistent value of reg_unit, I've tried to avoid making any additional changes other than applying the rewrite rule above in order to keep the patch as simple as possible, sometimes at the cost of introducing obvious stupidity (e.g. algebraic expressions that could be simplified given some knowledge of the context) -- I'll clean those up later on in a second pass.

  Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
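
  The rewrite rule described in this commit message can be sketched as a pair of helpers. This is only an illustration: reg_sketch, get_reg_offset() and set_reg_offset() are hypothetical names that do not appear in the patch, and reg_unit is passed in explicitly because the message says its value varied between back-ends.

```cpp
#include <cassert>

// Hypothetical stand-in for fs_reg; only the byte offset is modelled.
struct reg_sketch {
   unsigned offset = 0; // offset in bytes, as introduced by the patch
};

// rvalue rule: 'x = r.reg_offset' becomes 'x = r.offset / reg_unit'.
inline unsigned get_reg_offset(const reg_sketch &r, unsigned reg_unit)
{
   assert(reg_unit > 0);
   return r.offset / reg_unit;
}

// lvalue rule: 'r.reg_offset = x' becomes
// 'r.offset = r.offset % reg_unit + x * reg_unit'.
// The sub-register remainder (offset % reg_unit) is preserved.
inline void set_reg_offset(reg_sketch &r, unsigned x, unsigned reg_unit)
{
   assert(reg_unit > 0);
   r.offset = r.offset % reg_unit + x * reg_unit;
}
```

  With a 32B unit, a register holding byte offset 36 reads back as register offset 1; assigning register offset 3 then yields byte offset 100, keeping the 4-byte remainder intact.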
* i965: Use brw_reg's nr field to store register number. | Matt Turner | 2015-11-13 | 1 | -1/+1

  In addition to combining another field, we get to replace silliness like "reg.reg" with something that actually makes sense, "reg.nr"; and no one will ever wonder again why dst.reg isn't a dst_reg.

  Moving the now 16-bit nr field to a 16-bit boundary decreases code size by about 3k.

  Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* util: Move Mesa's bitset.h to util/. | Eric Anholt | 2015-02-20 | 1 | -1/+1

  Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
* i965: Factor out virtual GRF allocation to a separate object. | Francisco Jerez | 2015-02-10 | 1 | -1/+1

  Right now virtual GRF book-keeping and allocation is performed in each visitor class separately (among a hundred other things), leading to duplicated logic in each visitor and preventing layering, as it forces any code that manipulates i965 IR and needs to allocate virtual registers to depend on the specific visitor that happens to be used to translate from GLSL IR.

  v2: Use realloc()/free() to allocate VGRF book-keeping arrays (Connor).

  Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965/fs: Use const fs_reg & rather than a copy or pointer. | Matt Turner | 2014-12-01 | 1 | -3/+8

  Also while we're touching var_from_reg, just make it an inline function.

  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965/fs: Track liveness of the flag register. | Matt Turner | 2014-12-01 | 1 | -0/+5

  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Use local pointer to block_data in live intervals. | Matt Turner | 2014-12-01 | 1 | -3/+3

  This simplifies the next patch and makes the code a lot easier to read.

  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965/cfg: Make cfg_t usable from C. | Matt Turner | 2014-07-05 | 1 | -1/+1

  Acked-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
* i965/fs: Pass cfg to calculate_live_intervals(). | Matt Turner | 2014-07-01 | 1 | -2/+2

  We've often created the CFG immediately before, so use it when available.

  Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
* i965: Mark fields in the live interval classes protected. | Matt Turner | 2014-07-01 | 1 | -10/+12

  cfg, for instance, is a pointer to a local variable in calculate_live_intervals, certainly not valid after that function has returned.

  Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
* i965/fs: Use per-channel interference for register_coalesce_2(). | Eric Anholt | 2013-10-10 | 1 | -0/+3

  This will let us coalesce into texture-from-GRF arguments, which would otherwise be prevented due to the live interval for the whole vgrf extending across all the MOVs setting up the channels of the message.

  v2 (Kenneth Graunke): Rebase for renames.

  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965/fs: Use the new per-channel live ranges for dead code elimination. | Eric Anholt | 2013-10-10 | 1 | -0/+2

  v2 (Kenneth Graunke): Rebase on s/live_variables/live_intervals/g.

  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965/fs: Track live variable ranges on a per-channel level. | Eric Anholt | 2013-10-10 | 1 | -0/+9

  This is the information we'll actually use to replace the virtual_grf_start[]/end[] arrays.

  No change in shader-db.

  v2 (Kenneth Graunke): Rebase; minor comment updates.

  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965/fs: Factor def[]/use[] setup out to a separate function. | Eric Anholt | 2013-10-10 | 1 | -0/+2

  These blocks are about to grow some more code, and the indentation was getting out of hand.

  v2 (Kenneth Graunke): Rebase, minor typo fixes and style changes.

  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965/fs: Do live variables dataflow analysis on a per-channel level. | Eric Anholt | 2013-10-10 | 1 | -0/+12

  This significantly improves our handling of VGRFs of size > 1.

  Previously, we only marked VGRFs as def'd if the whole register was written by a single instruction. Large VGRFs which were written piecemeal would not be considered def'd at all, even if they were ultimately completely written. Without being def'd, these were then marked "live in" to the basic block, often extending the range to preceding blocks and sometimes even the start of the program.

  The new per-component tracking gives more accurate live intervals, which makes register coalescing more effective.

  In the future, this should help with texturing from GRFs on Gen7+. A sampler message might be represented by a 2-register VGRF which holds the texture coordinates. If those are incoming varyings, they'll be produced by two PLN instructions, which are piecemeal writes.

  No reduction in shader-db instruction counts. However, code which prints the live interval ranges does show that some VGRFs now have smaller (and more correct) live intervals.

  v2: Rebase on current send-from-GRF code requiring adding extra use[]s.
  v3: Rebase on live intervals fix to include defs in the end of the interval.
  v4 (Kenneth Graunke): Rebase; split off a few preparatory patches; add lots of comments; minor style changes; rewrite commit message.
  v5 (Eric Anholt): whitespace nit.

  Written-by: Eric Anholt <eric@anholt.net> [v1-3]
  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> [v4]
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Eric Anholt <eric@anholt.net> (v4)
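
  The difference between whole-register and per-component tracking described in this commit message can be illustrated with a small model of a single basic block. This is a sketch under stated assumptions, not the driver's code: access_sketch and upward_exposed_uses() are hypothetical, and every instruction is assumed to touch exactly one component of one VGRF.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical instruction model: one read or write of one VGRF component.
struct access_sketch {
   bool is_write;
   unsigned vgrf;
   unsigned component;
};

// Returns, per tracked variable, whether it is read before being defined
// in the block (and would therefore be marked "live in").
// per_component == false models the old scheme: one variable per VGRF, and
// a write only counts as a def if it covers the whole register.
// per_component == true models the new scheme: one variable per component.
inline std::vector<bool>
upward_exposed_uses(const std::vector<access_sketch> &insts,
                    const std::vector<unsigned> &vgrf_sizes,
                    bool per_component)
{
   // Map each VGRF to the index of its first variable.
   std::vector<unsigned> var_from_vgrf(vgrf_sizes.size());
   unsigned num_vars = 0;
   for (std::size_t i = 0; i < vgrf_sizes.size(); i++) {
      var_from_vgrf[i] = num_vars;
      num_vars += per_component ? vgrf_sizes[i] : 1;
   }

   std::vector<bool> def(num_vars, false), use(num_vars, false);
   for (const access_sketch &a : insts) {
      const unsigned var =
         var_from_vgrf[a.vgrf] + (per_component ? a.component : 0);
      if (!a.is_write) {
         if (!def[var])
            use[var] = true;   // read before any def in this block
      } else {
         // A single-component write fully defines a per-component variable,
         // but only defines a whole VGRF if that VGRF has size 1.
         const bool full_write = per_component || vgrf_sizes[a.vgrf] == 1;
         if (full_write && !use[var])
            def[var] = true;
      }
   }
   return use;
}
```

  For a 2-register VGRF written by two piecemeal writes and then read, the whole-register model reports the VGRF as read before definition, while the per-component model reports no upward-exposed use for either component.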
* i965/fs: Rename num_vars to num_vgrfs in live interval analysis. | Kenneth Graunke | 2013-10-10 | 1 | -1/+1

  num_vars was shorthand for the number of virtual GRFs. num_vgrfs is a bit clearer. Plus, the next patch will introduce "vars" which are distinct from vgrfs.

  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Eric Anholt <eric@anholt.net>
* i965: Switch fs_live_variables to the non-zeroing allocator. | Francisco Jerez | 2013-10-01 | 1 | -1/+1

  All member variables of fs_live_variables are already being initialized from its constructor, so it's not necessary to use rzalloc to allocate its memory, and doing so makes it more likely that we will start relying on the allocator to zero out all memory if the class is ever extended with new member variables.

  That's bad because it ties objects to some specific allocation scheme, and gives unpredictable results when an object is created with a different allocator -- stack allocation, array allocation, or aggregation inside a different object are some of the useful possibilities that come to my mind.

  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965, mesa: Use the new DECLARE_R[Z]ALLOC_CXX_OPERATORS macros. | Kenneth Graunke | 2013-09-21 | 1 | -9/+1

  These classes declared a placement new operator, but didn't declare a delete operator. Switching to the macro gives them a delete operator, which probably is a good idea anyway.

  This also eliminates a lot of boilerplate.

  v2: Properly use RZALLOC in Mesa IR/TGSI translators. Caught by Eric and Chad.

  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Eric Anholt <eric@anholt.net>
  Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
  Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
* i965/fs: Improve live variables calculation performance. | Eric Anholt | 2013-03-11 | 1 | -4/+6

  We can execute way fewer instructions by doing our boolean manipulation on an "int" of bits at a time, while also reducing our working set size.

  Reduces compile time of L4D2's slowest shader from 4s to 1.1s (-72.4% +/- 0.2%, n=10)

  v2: Remove redundant masking (noted by Ken)

  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
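
  Processing an "int" of bits at a time can be sketched as follows. The example applies the textbook backward liveness equation, livein = use | (liveout & ~def), one 32-bit word per iteration; it is an assumption that the patch uses this exact form, and update_livein() is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Updates a block's livein set from its liveout, use and def sets, handling
// 32 variables per loop iteration instead of one.
// Returns true if any bit was newly set, so a caller can iterate to a
// fixed point.
inline bool update_livein(std::vector<uint32_t> &livein,
                          const std::vector<uint32_t> &liveout,
                          const std::vector<uint32_t> &use,
                          const std::vector<uint32_t> &def)
{
   assert(livein.size() == liveout.size());
   assert(livein.size() == use.size());
   assert(livein.size() == def.size());

   bool progress = false;
   for (std::size_t i = 0; i < livein.size(); i++) {
      // One word covers 32 variables with a handful of bitwise operations.
      const uint32_t new_livein = use[i] | (liveout[i] & ~def[i]);
      if (new_livein & ~livein[i]) {
         livein[i] |= new_livein;
         progress = true;
      }
   }
   return progress;
}
```

  Packing the sets into words also shrinks the working set compared with one flag per array element, which matches the second benefit the message mentions.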
* i965: Rename fs_cfg types to not mention fs. | Eric Anholt | 2012-10-17 | 1 | -2/+2

  fs_bblock_link -> bblock_link
  fs_bblock -> bblock_t (to avoid conflicting with all the fs_bblock *bblock)
  fs_cfg -> cfg_t (to avoid conflicting with all the fs_cfg *cfg)

  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965/fs: Remove a dead member from live variables analysis. | Eric Anholt | 2012-08-29 | 1 | -5/+0

  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Add support for live variable analysis using dataflow analysis. | Eric Anholt | 2012-04-19 | 1 | -0/+86