summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/freedreno/ir3/ir3.c
Commit message (Collapse)AuthorAgeFilesLines
* freedreno/ir3: drop unused instr category argRob Clark2016-04-041-5/+3
| | | | | | No longer used, so drop the extra arg to ir3_instr_create() Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/ir3: remove ir3_instruction::categoryRob Clark2016-04-041-2/+1
| | | | Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/ir3: encode instruction category in opc_tRob Clark2016-04-041-0/+1
| | | | | | | | | | | | | Been on my TODO list for a while. If nothing else this will make gdb properly grok the opc_t enum. This first step preserves ir3_instruction::category (with an added assert that category matches what is encoded in opc_t). Next step is to drop the category field (and arg to ir3_instr_create()), but that is split into next commit for bisectability and so that we can run piglit in the intermediate state to flush out any problems. Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/ir3: array offset can be negativeRob Clark2016-01-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | It at least happens with some piglit tests, like $piglit/bin/vp-address-01 VERT DCL IN[0] DCL IN[1] DCL OUT[0], POSITION DCL OUT[1], COLOR DCL CONST[0..7] DCL ADDR[0] 0: ARL ADDR[0].x, IN[1].xxxx 1: MOV_SAT OUT[1], CONST[ADDR[0].x-1] 2: DP4 OUT[0].x, CONST[4], IN[0] 3: DP4 OUT[0].y, CONST[5], IN[0] 4: DP4 OUT[0].z, CONST[6], IN[0] 5: DP4 OUT[0].w, CONST[7], IN[0] 6: END Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/ir3: array reworkRob Clark2016-01-161-8/+27
| | | | Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/ir3: add support for store instructionsRob Clark2015-07-271-4/+20
| | | | | | | | | | For store instructions, the "dst" register is a read register, not a written register. (Ie. it is the address to store to.) Lets not confuse register allocation, scheduling, etc, with these details. Instead just leave a dummy instr->regs[0], and take "dst" from instr->regs[1] and srcs following. Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/ir3: updated cat6 encodingRob Clark2015-07-271-6/+21
| | | | | | | Sync updated cat6 encoding from freedreno.git, needed to properly encode store instructions. Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/ir3: fix indirects trackingRob Clark2015-07-031-0/+11
| | | | | | | | | | cp would update instr->address but not update the indirects array resulting in sched getting confused when it had to 'spill' the address register. Add an ir3_instr_set_address() helper to set instr->address and also update ir->indirects, and update all places that were writing instr->address to use helper instead. Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/ir3: cache defining instructionRob Clark2015-06-301-3/+4
| | | | | | | | | It is silly to traverse back to find first instruction that writes part of a larger "virtual" register many times per instruction (plus per use as a src to later instructions). Cache this information so we only figure it out once. Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/ir3: fixes for indirect writesRob Clark2015-06-301-1/+0
| | | | Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/ir3: block reshuffling and loops!Rob Clark2015-06-211-10/+50
| | | | | | | | | | | | | | | | This shuffles things around to allow the shader to have multiple basic blocks. We drop the entire CFG structure from nir and just preserve the blocks. At scheduling we know whether to schedule conditional branches or unconditional jumps at the end of the block based on the # of block successors. (Dropping jumps to the following instruction, etc.) One slight complication is that variables (load_var/store_var, ie. arrays) are not in SSA form, so we have to figure out where to put the phi's ourself. For this, we use the predecessor set information from nir_block. (We could perhaps use NIR's dominance frontier information to help with this?) Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/ir3: a4xx encodes larger immed offsetRob Clark2015-06-211-1/+6
| | | | | | | Without this, negative branch/jump offsets look like very large positive offsets. Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/ir3: move inputs/outputs to shaderRob Clark2015-06-211-33/+13
| | | | | | | | | These belong in the shader, rather than the block. Mostly a lot of churn and nothing too interesting. But splitting this out from the rest of ir3_block reshuffling to cut down the noise in the later patch. Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/ir3: introduce ir3_compiler objectRob Clark2015-06-211-1/+2
| | | | | | | | Right now, just provides a cleaner way to get at the gpu-id, given the separation between compiler and context. But we will need this also to hold the reg-set for new register allocation. Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/ir3/sched: convert to priority queueRob Clark2015-06-211-0/+1
| | | | | | | | Use a more standard priority-queue based scheduling algo. It is simpler and will make things easier once we have multiple basic blocks and flow control. Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/ir3: use standard list implementationRob Clark2015-06-211-10/+17
| | | | | | | | | | Use standard list_head double-linked list and related iterators, helpers, etc, rather than weird combo of instruction array and next pointers depending on stage. Now block has an instrs_list. In certain stages where we want to remove and re-add to the blocks list we just use list_replace() to copy the list to a new list_head. Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/ir3/asm: change assert to warningRob Clark2015-04-111-1/+4
| | | | | | | | It probably *should* be an assert, but for now TGSI f/e isn't very good about dealing w/ CONST vs ABS/NEG. So for debug builds, print a warning instead of crashing with an assert for now. Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/a3xx: add UBO supportIlia Mirkin2015-04-051-1/+1
| | | | Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
* freedreno/ir3: split float/int abs/negRob Clark2015-04-051-37/+34
| | | | | | | | | | | | Even though in the end, they map to the same bits, the backend will need to be able to differentiate float abs/neg vs integer abs/neg. Rather than making the backend figure it out based on instruction opcode (which when combined with mov/absneg instructions, can be awkward), just split out different flags for each so the frontend can signal it's intentions more clearly. Also, since (neg) for bitwise op's is actually a bitwise- not, split it out into bnot flag. Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/ir3: bit of cleanupRob Clark2015-03-151-14/+3
| | | | | | | | Add an array_insert() macro to simplify inserting into dynamically sized arrays, add a comment, and remove unused prototype inherited from the original freedreno.git/fdre-a3xx test code, etc. Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/ir3: fix register usage calculationsRob Clark2015-03-081-7/+14
| | | | | | | | | For cat1 instructions, use reg() as well for relative src, to ensure proper accounting of register usage. Also, for relative instructions, use reg->size rather than reg->wrmask to determine the number of components read/written. Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/ir3: handle flat bypass for a4xxRob Clark2015-03-031-0/+2
| | | | | | | | | | | We may not need this for later a4xx patchlevels, but we do at least need this for patchlevel 0. Bypass bary.f for fetching varyings when flat shading is needed (rather than configure via cmdstream). This requires a special dummy bary.f w/ (ei) flag to signal to scheduler when all varyings are consumed. And requires shader variants based on rasterizer flatshade state to handle TGSI_INTERPOLATE_COLOR. Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/ir3: fix up cat6 instruction encodingsRob Clark2015-03-031-40/+22
| | | | | | | I think there is at least one more sub-encoding, but these two should be enough to cover the common load/store instructions. Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/ir3: make reg array dynamicRob Clark2015-01-071-7/+39
| | | | | | | | | To use fanin's to group registers in an array, we can potentially have a much larger array of registers. Rather than continuing to bump up the array size, just make it dynamically allocated when the instruction is created. Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno: add adreno 420 supportRob Clark2014-11-151-5/+10
| | | | | | | | Very initial support. Basic stuff working (es2gears, es2tri, and maybe about half of glmark2). Expect broken stuff. Still missing: mem->gmem (restore), queries, mipmaps (blob segfaults!), hw binning, etc. Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/ir3: fix potential gpu lockup with killRob Clark2014-10-201-0/+11
| | | | | | | | It seems like the hardware is unhappy if we execute a kill instruction prior to last input (ei). Probably the shader thread stops executing and the end-input flag is never set. Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/ir3: large const supportRob Clark2014-10-151-1/+1
| | | | Signed-off-by: Rob Clark <robclark@freedesktop.org>
* Eliminate several cases of multiplication in arguments to callocCarl Worth2014-09-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | In commit 32f2fd1c5d6088692551c80352b7d6fa35b0cd09, several calls to _mesa_calloc(x) were replaced with calls to calloc(1, x). This is strictly equivalent to what the code was doing previously. But for cases where "x" involves multiplication, now that we are explicitly using the two-argument calloc, we can do one step better and replace: calloc(1, A * B); with: calloc(A, B); The advantage of the latter is that calloc will detect any overflow that would have resulted from the multiplication and will fail the allocation, (whereas the former would return a small allocation). So this fix can change potentially exploitable buffer overruns into segmentation faults. Reviewed-by: Matt Turner <mattst88@gmail.com>
* freedreno/ir3: split out shader compiler from a3xxRob Clark2014-07-251-0/+675
Move the bits we want to share between generations from fd3_program to ir3_shader. So overall structure is: fdN_shader_stateobj -> ir3_shader -> ir3_shader_variant -> ir3 |- ... \- ir3_shader_variant -> ir3 So the ir3_shader becomes the topmost generation neutral object, which manages the set of variants each of which generates, compiles, and assembles it's own ir. There is a bit of additional renaming to s/fd3_compiler/ir3_compiler/, etc. Keep the split between the gallium level stateobj and the shader helper object because it might be a good idea to pre-compute some generation specific register values (ie. anything that is independent of linking). Signed-off-by: Rob Clark <robclark@freedesktop.org>