path: root/src/glsl/nir
Commit message | Author | Age | Files | Lines
* nir: Optimize useless comparisons against true/false. | Matt Turner | 2015-12-08 | 1 | -2/+4
  Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> [v1]
  Reviewed-by: Eric Anholt <eric@anholt.net> [v1]
  v2: Move new rule to Boolean simplification section
      Add a a@bool != true simplification
  Suggested-by: Neil Roberts <neil@linux.intel.com>
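The idea behind this change can be checked with a small truth-table sketch in plain C. This is not NIR code: the helper names are hypothetical, and apart from `a != true`, which the v2 note names, the set of comparisons covered here is an assumption about what "useless comparisons against true/false" includes.

```c
#include <stdbool.h>

/* Hypothetical helpers, not NIR code: each one is a comparison as it
 * might be written, with the simpler equivalent noted beside it. */
static bool eq_true(bool a)  { return a == true; }   /* same as  a */
static bool ne_true(bool a)  { return a != true; }   /* same as !a */
static bool eq_false(bool a) { return a == false; }  /* same as !a */
static bool ne_false(bool a) { return a != false; }  /* same as  a */

/* Returns true if every comparison agrees with its simpler form for
 * both possible Boolean inputs. */
bool comparisons_are_redundant(void)
{
   const bool values[2] = { false, true };
   for (int i = 0; i < 2; i++) {
      bool a = values[i];
      if (eq_true(a) != a || ne_false(a) != a)
         return false;
      if (ne_true(a) != !a || eq_false(a) != !a)
         return false;
   }
   return true;
}
```

Because a Boolean has only two values, checking both inputs is an exhaustive proof that each comparison can be replaced by the value itself or by its negation.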
* nir/lower_io: Pass the builder and type_size into get_io_offset | Jason Ekstrand | 2015-12-03 | 1 | -15/+15
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* Remove Sun CC specific code. | Jose Fonseca | 2015-12-02 | 1 | -8/+0
  Reviewed-by: Matt Turner <mattst88@gmail.com>
  Acked-by: Alan Coopersmith <alan.coopersmith@oracle.com>
* glsl: Rename safe_reverse -> reverse_safe. | Matt Turner | 2015-12-01 | 2 | -6/+6
  To match existing foreach_in_list_reverse_safe.
  Reviewed-by: Francisco Jerez <currojerez@riseup.net>
* nir: remove recursive inclusion in builtin_type_macros.h | Emil Velikov | 2015-11-29 | 1 | -2/+0
  The header is already included by glsl_types.{cpp,h}.
  Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
  Reviewed-by: Matt Turner <mattst88@gmail.com>
* nir: remove unneeded include | Emil Velikov | 2015-11-29 | 1 | -1/+0
  Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
  Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
* nir: include what we want/need | Emil Velikov | 2015-11-25 | 1 | -1/+1
  Swap core.h with macros.h, as the latter provides the required MAX2 macro.
  Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
  Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
* nir/lower_tex: Add support for lowering texture swizzle | Jason Ekstrand | 2015-11-23 | 2 | -0/+80
  Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
* nir: Add a tex_instr_is_query helper | Jason Ekstrand | 2015-11-23 | 1 | -0/+25
  Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
* nir: Add a ssa_def_rewrite_uses_after helper | Jason Ekstrand | 2015-11-23 | 2 | -0/+51
  Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
* nir: Use instr/if_rewrite in nir_ssa_def_rewrite_uses | Jason Ekstrand | 2015-11-23 | 1 | -12/+4
  nir_ssa_def_rewrite_uses is one of the older helpers in NIR and predated both of those. Now it can be substantially simplified.
  Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
* nir/validate: Validated dests after sources | Jason Ekstrand | 2015-11-23 | 1 | -9/+9
  Previously, if someone accidentally made an instruction that refers to its own SSA destination, the validator wouldn't catch it. The reason for this is that it validated the destination too early and, by the time it got to the source, the destination SSA value was already added to the set of seen SSA values so it would assume that it came from some previous instruction. By moving destination validation to be after source validation, the SSA value is not in the list of seen values and the validator will catch self-referential instructions.
  Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
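The ordering problem described in this message can be sketched in a few lines of C. Everything here is hypothetical (the `sketch_` types, the fixed-size `seen` array, the function names); it only illustrates why checking sources before recording the destination catches a self-reference, and it is not the validator's actual code.

```c
#include <stdbool.h>

#define MAX_SSA_DEFS 64   /* arbitrary bound for this sketch */
#define MAX_SRCS 4

/* Hypothetical stand-in for an instruction: one SSA destination index
 * and a few SSA source indices (all assumed < MAX_SSA_DEFS). */
struct sketch_instr {
   unsigned dest;
   unsigned num_srcs;
   unsigned srcs[MAX_SRCS];
};

/* Hypothetical validator state: seen[i] is true once SSA value i has
 * been defined by an instruction validated earlier. */
struct sketch_state {
   bool seen[MAX_SSA_DEFS];
};

/* Sources first, destination last: a source equal to this
 * instruction's own destination is not marked as seen yet, so the
 * self-reference is reported as a failure. */
bool validate_srcs_then_dest(struct sketch_state *state,
                             const struct sketch_instr *instr)
{
   for (unsigned i = 0; i < instr->num_srcs; i++) {
      if (!state->seen[instr->srcs[i]])
         return false;
   }
   state->seen[instr->dest] = true;
   return true;
}

/* The earlier ordering, for comparison: the destination is marked as
 * seen before the sources are checked, so a self-reference passes. */
bool validate_dest_then_srcs(struct sketch_state *state,
                             const struct sketch_instr *instr)
{
   state->seen[instr->dest] = true;
   for (unsigned i = 0; i < instr->num_srcs; i++) {
      if (!state->seen[instr->srcs[i]])
         return false;
   }
   return true;
}
```

Feeding both functions an instruction whose sources include its own destination shows the difference: the first rejects it, the second accepts it.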
* nir/lower_tex: Set the dest_type for txs instructions | Jason Ekstrand | 2015-11-23 | 1 | -0/+1
  Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
* nir/lower_tex: Report progress | Jason Ekstrand | 2015-11-23 | 2 | -5/+16
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
* nir: s/nir_type_unsigned/nir_type_uint | Jason Ekstrand | 2015-11-23 | 5 | -44/+44
  v2: do the same in tgsi_to_nir (Samuel)
  v3: added missing cases after rebase (Iago)
  v4: Add a blank space after '#' in one of the comments (Matt)
  Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
  Reviewed-by: Matt Turner <mattst88@gmail.com>
* nir/builder: only read meaningful channels in nir_swizzle() | Connor Abbott | 2015-11-23 | 1 | -1/+1
  This way the caller doesn't have to initialize all 4 channels when they aren't using them.
  v2: Fix signed/unsigned comparison warning (Iago)
  Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
  Reviewed-by: Matt Turner <mattst88@gmail.com>
* nir: Add support for gl_HelperInvocation system value. | Matt Turner | 2015-11-20 | 2 | -0/+5
  Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
* nir: Add nir_texop_samples_identical opcode | Ian Romanick | 2015-11-19 | 3 | -1/+13
  This is the NIR analog to GLSL IR ir_samples_identical.
  v2: Don't add the second nir_tex_src_ms_index parameter. Suggested by Ken and Jason.
  Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
  Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
  Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
* nir: add nir_ssa_for_alu_src() | Rob Clark | 2015-11-19 | 2 | -4/+23
  Using something like:
      numer = nir_ssa_for_src(bld, alu->src[0].src,
                              nir_ssa_alu_instr_src_components(alu, 0));
  for alu src's with swizzle, like:
      vec1 ssa_10 = intrinsic load_uniform () () (0, 0)
      vec2 ssa_11 = intrinsic load_uniform () () (1, 0)
      vec2 ssa_2 = udiv ssa_10.xx, ssa_11
  ends up turning into something like:
      vec1 ssa_10 = intrinsic load_uniform () () (0, 0)
      vec2 ssa_11 = intrinsic load_uniform () () (1, 0)
      vec2 ssa_13 = imov ssa_10
      ...
  because nir_ssa_for_src() ignores the original nir_alu_src's swizzle.
  Instead for alu instructions, nir_ssa_for_alu_src() should be used to ensure the original alu src's swizzle doesn't get lost in translation:
      vec1 ssa_10 = intrinsic load_uniform () () (0, 0)
      vec2 ssa_11 = intrinsic load_uniform () () (1, 0)
      vec2 ssa_13 = imov ssa_10.xx
      ...
  v2: check for abs/neg, and re-use existing nir_alu_src
  Signed-off-by: Rob Clark <robclark@freedesktop.org>
  Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
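What "honoring the swizzle" means for the `ssa_10.xx` case can be sketched with a plain C model. The struct and function below are hypothetical and model values as float arrays; they are not the NIR data structures or the helper added by this commit.

```c
/* Hypothetical stand-in for an ALU source: a pointer to the value's
 * channels plus a swizzle that picks one source channel for each
 * destination channel. */
struct sketch_alu_src {
   const float *value;
   unsigned swizzle[4];
};

/* Honors the swizzle: destination channel i reads source channel
 * swizzle[i]. Only the first num_components swizzle entries are read,
 * so callers need not fill in the rest. */
void read_with_swizzle(const struct sketch_alu_src *src,
                       unsigned num_components, float *dest)
{
   for (unsigned i = 0; i < num_components; i++)
      dest[i] = src->value[src->swizzle[i]];
}
```

With a one-channel value and a swizzle of `{0, 0}` (the `.xx` case), both destination channels receive channel x. Dropping the swizzle would instead ask a one-channel value for two distinct channels, which is the mismatch the message describes.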
* nir: fix missing increments of num_inputs/num_outputs | Rob Clark | 2015-11-19 | 2 | -0/+4
  Note: not quite perfect, we should use type_size vfunc (in compiler_options or nir_shader?) to determine how much we increment num_inputs/outputs/uniforms. But we don't have that yet, so let's at least fix things for the existing users of these passes.
  Signed-off-by: Rob Clark <robclark@freedesktop.org>
  Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
* nir/print: show # of uniforms/inputs/outputs | Rob Clark | 2015-11-19 | 1 | -0/+4
  Signed-off-by: Rob Clark <robclark@freedesktop.org>
* nir/print: show shader name/label if set | Rob Clark | 2015-11-19 | 1 | -0/+6
  Signed-off-by: Rob Clark <robclark@freedesktop.org>
  Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
* nir: add nir_var_all enum | Rob Clark | 2015-11-19 | 3 | -1/+6
  Otherwise, passing -1 gets you:
      error: invalid conversion from 'int' to 'nir_variable_mode' [-fpermissive]
  Signed-off-by: Rob Clark <robdclark@gmail.com>
  Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
* nir: fix constant folding of bfi | Connor Abbott | 2015-11-19 | 1 | -2/+2
  Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
* nir: Add support for cloning shaders | Jason Ekstrand | 2015-11-18 | 3 | -0/+681
  This commit is heavily based on one by Rob Clark <robdclark@gmail.com> but reworked to re-use nir_create functions and do less hashing.
  Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
  Reviewed-by: Rob Clark <robclark@freedesktop.org>
* i965/nir: Validate that NIR passes call nir_metadata_preserve(). | Kenneth Graunke | 2015-11-18 | 2 | -0/+41
  Failing to call nir_metadata_preserve() can have nasty consequences: some pass breaks dominance information, but leaves it marked as valid, causing some subsequent pass to go haywire and probably crash.
  This pass adds a simple validation mechanism to ensure passes handle this properly. We add a new bogus metadata flag that isn't used for anything in particular, set it before each pass, and ensure it *isn't* still set after the pass. nir_metadata_preserve will reset the flag, so correct passes will work, and bad passes will assert fail.
  (I would have made these functions static inline, but nir.h is included in C++, so we can't bit-or enums without lots of casting...)
  Thanks to Dylan Baker for the idea.
  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
  Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
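The set-then-check mechanism described here can be sketched as follows. All names are hypothetical (`sketch_impl`, the `SKETCH_METADATA_*` bits, the three functions), and the check returns a Boolean where the message describes an assertion failure; it is a model of the idea, not the code from this commit.

```c
#include <stdbool.h>

/* Hypothetical metadata bits; only the last one plays the role of the
 * "bogus" flag described above. */
enum {
   SKETCH_METADATA_NONE = 0x0,
   SKETCH_METADATA_BLOCK_INDEX = 0x1,
   SKETCH_METADATA_DOMINANCE = 0x2,
   SKETCH_METADATA_NOT_PROPERLY_RESET = 0x4
};

struct sketch_impl {
   unsigned valid_metadata;
};

/* What a well-behaved pass calls: keep only the bits it preserved.
 * A pass never lists the bogus bit, so it is cleared as a side effect. */
void sketch_metadata_preserve(struct sketch_impl *impl, unsigned preserved)
{
   impl->valid_metadata &= preserved;
}

/* Run before the pass: plant the bogus bit. */
void sketch_set_bogus_flag(struct sketch_impl *impl)
{
   impl->valid_metadata |= SKETCH_METADATA_NOT_PROPERLY_RESET;
}

/* Run after the pass: true if the bogus bit is gone, meaning the pass
 * went through sketch_metadata_preserve(). */
bool sketch_pass_reset_metadata(const struct sketch_impl *impl)
{
   return !(impl->valid_metadata & SKETCH_METADATA_NOT_PROPERLY_RESET);
}
```

A pass that forgets to call the preserve function leaves the planted bit in place, so the check after the pass fails without the pass needing any instrumentation of its own.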
* nir: add array length field | Rob Clark | 2015-11-18 | 2 | -0/+10
  This will simplify things somewhat in clone.
  Signed-off-by: Rob Clark <robclark@freedesktop.org>
  Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
  Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
* nir: remove nir_variable::max_ifc_array_access | Rob Clark | 2015-11-18 | 2 | -22/+0
  No users.
  Signed-off-by: Rob Clark <robclark@freedesktop.org>
  Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
  Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
* nir: fix typo in idiv lowering, causing large-udiv-udiv failures | Ilia Mirkin | 2015-11-18 | 1 | -1/+1
  In nv50, and in the python script that Rob circulated, we do:
      bld.mkCmp(OP_SET, CC_GE, TYPE_U32, (s = bld.getSSA()), TYPE_U32, m, b);
  Do the same in the nir div lowering pass. This fixes the large-udiv-udiv piglit tests on freedreno.
  Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
  Cc: mesa-stable@lists.freedesktop.org
  Signed-off-by: Rob Clark <robclark@freedesktop.org>
* nir: Store the size of the TCS output patch in nir_shader_info. | Kenneth Graunke | 2015-11-18 | 2 | -0/+9
  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
* glsl: copy each field's precision information in glsl_types's structure constructor | Samuel Iglesias Gonsálvez | 2015-11-17 | 1 | -0/+1
  Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
  Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
* glsl/nir: initialize precision field in glsl_struct_field constructor | Samuel Iglesias Gonsálvez | 2015-11-17 | 1 | -1/+2
  Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
  Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
* nir: reduce memory footprint of glsl_struct_field's precision | Samuel Iglesias Gonsálvez | 2015-11-17 | 1 | -1/+1
  Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
  Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
* nir/glsl: Fix copy-n-paste mistakes from commit 213f864. | Matt Turner | 2015-11-16 | 1 | -3/+3
  Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
* nir/glsl_to_nir: use _mesa_fls() to compute num_textures | Juan A. Suarez Romero | 2015-11-16 | 1 | -7/+2
  Replace the current loop by a direct call to _mesa_fls() function.
  It also fixes an implicit bug in the current code where num_textures seems to be one value less than it should be when sh->Program->SamplersUsed > 0. For instance, num_textures is 0 instead of 1 when sh->Program->SamplersUsed is 1.
  Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
  Reviewed-by: Matt Turner <mattst88@gmail.com>
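The relationship between the used-sampler bitmask and the texture count can be sketched with a portable "find last set" helper. `sketch_fls()` and `sketch_num_textures()` are hypothetical names; the helper stands in for `_mesa_fls()`, whose implementation does not appear in this log, and the loop that the commit replaced is not reproduced here.

```c
/* Hypothetical portable "find last set": 1-based position of the
 * highest set bit, or 0 when no bit is set. */
unsigned sketch_fls(unsigned mask)
{
   unsigned index = 0;
   while (mask != 0) {
      index++;
      mask >>= 1;
   }
   return index;
}

/* Number of texture slots needed to cover every used sampler: one
 * past the index of the highest used sampler. */
unsigned sketch_num_textures(unsigned samplers_used)
{
   return sketch_fls(samplers_used);
}
```

With this definition a mask of 1 (only sampler 0 in use) gives a count of 1, which matches the corrected behavior the message describes, and an empty mask gives 0.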
* nir/copy_propagate: do not copy-propagate MOV srcs with source modifiers | Iago Toral Quiroga | 2015-11-16 | 1 | -1/+6
  If a source operand in a MOV has source modifiers, then we cannot copy-propagate it from the parent instruction and remove the MOV.
  v2: remove the check for source modifiers from is_move() (Jason)
  v3: Put the check for source modifiers back into is_move() since this function is called from copy_prop_alu_src(). Add source modifiers checks to is_vec() instead.
  Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
* nir: Silence GCC maybe-uninitialized warnings. | Vinson Lee | 2015-11-13 | 1 | -0/+3
  nir/nir_control_flow.c: In function ‘split_block_cursor.isra.11’:
  nir/nir_control_flow.c:460:15: warning: ‘after’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      *_after = after;
                ^
  nir/nir_control_flow.c:458:16: warning: ‘before’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      *_before = before;
                 ^
  Signed-off-by: Vinson Lee <vlee@freedesktop.org>
  Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
* nir: Add helpers for getting input/output intrinsic sources. | Kenneth Graunke | 2015-11-13 | 2 | -0/+45
  With the many variants of IO intrinsics, particular sources are often in different locations. It's convenient to say "give me the indirect offset" or "give me the vertex index" and have it just work, without having to think about exactly which kind of intrinsic you have.
  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
* nir: Don't lower TCS outputs to temporaries. | Kenneth Graunke | 2015-11-13 | 1 | -0/+3
  We'd like to shadow these when possible, but the current code doesn't work properly for TCS outputs. For now, disable it.
  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
* nir: Allow outputs reads and add the relevant intrinsics. | Kenneth Graunke | 2015-11-13 | 4 | -8/+21
  Normally, we rely on nir_lower_outputs_to_temporaries to create shadow variables for outputs, buffering the results and writing them all out at the end of the program. However, this is infeasible for tessellation control shader outputs.
  Tessellation control shaders can generate multiple output vertices, and write per-vertex outputs. These are arrays indexed by the vertex number; each thread only writes one element, but can read any other element - including those being concurrently written by other threads. The barrier() intrinsic synchronizes between threads.
  Even if we tried to shadow every output element (which is of dubious value), we'd have to read updated values in at barrier() time, which means we need to allow output reads.
  Most stages should continue using nir_lower_outputs_to_temporaries(), but in theory drivers could choose not to if they really wanted.
  v2: Rebase to accommodate Jason's review feedback.
  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
* nir/lower_io: Introduce nir_store_per_vertex_output intrinsics. | Kenneth Graunke | 2015-11-13 | 3 | -5/+26
  Similar to nir_load_per_vertex_input, but for outputs. This is not useful in geometry shaders, but will be useful in tessellation shaders.
  v2: Change stage_uses_per_vertex_outputs() to is_per_vertex_output(), taking a nir_variable (requested by Jason Ekstrand).
  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
* nir/lower_io: Use load_per_vertex_input intrinsics for TCS and TES. | Kenneth Graunke | 2015-11-13 | 1 | -4/+8
  Tessellation control shader inputs are an array indexed by the vertex number, like geometry shader inputs. There aren't per-patch TCS inputs.
  Tessellation evaluation shaders have both per-vertex and per-patch inputs. Per-vertex inputs get the new intrinsics; per-patch inputs continue to use the ordinary load_input intrinsics, as they already work like we want them to.
  v2: Change stage_uses_per_vertex_inputs into is_per_vertex_input(), which takes a variable (requested by Jason Ekstrand).
  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
* Revert "nir/copy_propagate: do not copy-propagate MOV srcs with source modifiers" | Iago Toral Quiroga | 2015-11-13 | 1 | -10/+7
  The change proposed in the review leads to piglit regressions because is_move() is used in other places and relies on the checks for source modifiers to be there. Revert this until we agree on a better solution.
* nir/copy_propagate: do not copy-propagate MOV srcs with source modifiers | Iago Toral Quiroga | 2015-11-13 | 1 | -7/+10
  If a source operand in a MOV has source modifiers, then we cannot copy-propagate it from the parent instruction and remove the MOV.
  v2: remove the check for source modifiers from is_move() (Jason)
  Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
* nir/vars_to_ssa: Delete dead output set code | Jason Ekstrand | 2015-11-12 | 1 | -7/+0
  This was a remnant of an early attempt to handle output reads in vars_to_ssa. That attempt was abandoned a long time ago but these few lines were apparently left in the pass and managed to evade review.
  Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
* nir/vars_to_ssa: Rework copy set handling in lower_copies_to_load_store | Jason Ekstrand | 2015-11-12 | 1 | -1/+4
  Previously, we walked through a given deref_node's copies and, after lowering the copy away, removed it from both the source and destination copy sets. This commit changes this to only remove it from the other node's copy set (not the one we're lowering). At the end of the loop, we just throw away the copy set for the node we're lowering since that node no longer has any copies. This has two advantages:
  1) It's more efficient because we're doing potentially half as many set search operations.
  2) It now properly handles copies from a node to itself. Previously, it would delete the copy from the set when processing the destination and then assert-fail when we couldn't find it for the source.
  Cc: "11.0" <mesa-stable@lists.freedesktop.org>
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92588
  Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
  Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
* nir/validate: Allow subroutine types for the tails of derefs | Jason Ekstrand | 2015-11-12 | 1 | -2/+6
  The shader-subroutine code creates uniforms of type SUBROUTINE for subroutines that are then read as integers in the backends. If we ever want to do any optimizations on these, we'll need to come up with a better plan where they are actual scalars or something, but this works for now.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92859
  Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
* glsl: add gl_HelperInvocation system value | Ilia Mirkin | 2015-11-12 | 1 | -0/+1
  Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
  Reviewed-by: Matt Turner <mattst88@gmail.com>
* glsl: Add precision information to ir_variable | Iago Toral Quiroga | 2015-11-12 | 2 | -1/+16
  We will need this later on when we implement proper support for precision qualifiers in the drivers and also to do link time checks for uniforms as indicated by the spec.
  This patch also adds compile-time checks for variables without precision information (currently, Mesa only checks that a default precision is set for floats in fragment shaders).
  As indicated by Ian, the addition of the precision information to ir_variable has been done using a bitfield and pahole to identify an available hole so that memory requirements for ir_variable stay the same.
  v2 (Ian):
  - Avoid if-ladders by defining arrays of supported sampler names and indexing into them with type->sampler_array + 2 * type->sampler_shadow
  - Make the code that selects the precision qualifier to use a utility function
  - Fix a typo
  v3 (Tapani):
  - rebased
  - squashed in "Precision qualifiers are not allowed on structs"
  - fixed select_gles_precision for sampler arrays
  - fixed precision_qualifier_allowed for arrays of structs
  v4 (Tapani):
  - add atomic_uint handling
  - do not allow precision qualifier on images
  (issues reported by Marta)
  v5 (Tapani):
  - support precision qualifier on image types
  v6 (Tapani):
  - set precision qualifier on interface block members
  Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
* nir/nir_opt_peephole_ffma: Move this lowering pass to the i965 driver | Eduardo Lima Mitev | 2015-11-10 | 2 | -269/+0
  Because the next patch will add an optimization that is specific to i965, we want to move this lowering pass to that driver altogether. This is safe because i965 is the only consumer.
  Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>