Commit message Author Age Files Lines
* gallium/util: avoid unreferencing random memory on buffer alloc failureIlia Mirkin2015-09-281-1/+1
| | | | | | | | Found by Coverity Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Albert Freeman <albertwdfreeman@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>
* mesa: don't leak interface_nameIlia Mirkin2015-09-281-0/+1
| | | | | | | Found by Coverity Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
* glsl: fix component size calculation for tessellation and geom shadersTimothy Arceri2015-09-281-1/+1
| | | | | | Broken in commit abdab88b30ab when adding arrays of arrays support Reviewed-by: Dave Airlie <airlied@redhat.com>
* docs/GL3.txt: fix typoBoyan Ding2015-09-271-1/+1
| | | | | Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> Reviewed-by: Albert Freeman <albertwdfreeman@gmail.com>
* i965/gs: Optimize away the EOT write on Gen8+ with static vertex count.Kenneth Graunke2015-09-261-0/+15
| | | | | | | | | | | | | | | | | | | With static vertex counts, the final EOT write doesn't actually write any data - it's just there to end the thread. Typically, the last thing before ending the thread will be an EmitVertex() call, resulting in a URB write. We can just set EOT on that. Note that this isn't always possible - there might be an intervening SSBO write/image store, or the URB write may have been in a loop. shader-db statistics for geometry shaders only: total instructions in shared programs: 3173 -> 3149 (-0.76%) instructions in affected programs: 176 -> 152 (-13.64%) helped: 8 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965/gs: Allow src0 immediates in GS_OPCODE_SET_WRITE_OFFSET.Kenneth Graunke2015-09-262-2/+14
| | | | | | | | | | | | | | | | | | | GS_OPCODE_SET_WRITE_OFFSET is a MUL with a constant src[1] and special strides. We can easily make the generator handle constant src[0] arguments by instead generating a MOV with the product of both operands. This isn't necessarily a win in and of itself - instead of a MUL, we generate a MOV, which should be basically the same cost. However, we can probably avoid the earlier MOV to put src[0] into a register. shader-db statistics for geometry shaders only: total instructions in shared programs: 3207 -> 3173 (-1.06%) instructions in affected programs: 3207 -> 3173 (-1.06%) helped: 11 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
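The folding described above can be sketched in a few lines of C. This is only an illustration under stated assumptions: `struct operand`, `struct emitted`, and `emit_set_write_offset()` are hypothetical names invented here, not the i965 generator's actual types or functions, and the special strides mentioned in the commit are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical operand: either an immediate or a register reference. */
struct operand {
   bool is_imm;
   uint32_t value;   /* immediate value, or register index if !is_imm */
};

enum emitted_op { EMIT_MUL, EMIT_MOV };

/* Hypothetical record of what the generator would emit. */
struct emitted {
   enum emitted_op op;
   uint32_t imm;     /* immediate operand of the emitted instruction */
};

/* src1 is always a constant for this opcode; src0 may or may not be. */
static struct emitted
emit_set_write_offset(struct operand src0, uint32_t src1_imm)
{
   struct emitted out;

   if (src0.is_imm) {
      /* Both operands are known: fold the multiply at compile time
       * and emit a MOV of the product. */
      out.op = EMIT_MOV;
      out.imm = src0.value * src1_imm;
   } else {
      /* src0 lives in a register: keep the MUL by the constant. */
      out.op = EMIT_MUL;
      out.imm = src1_imm;
   }
   return out;
}
```

With an immediate `src0` of 3 and a constant `src1` of 4, the sketch yields a MOV of 12; with a register `src0` it keeps the MUL. The saving the commit message hopes for comes from no longer needing the earlier MOV that loaded `src0` into a register.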
* i965: Implement "Static Vertex Count" geometry shader optimization.Kenneth Graunke2015-09-265-4/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Broadwell's 3DSTATE_GS contains new "Static Output" and "Static Vertex Count" fields, which control a new optimization. Normally, geometry shaders can output arbitrary numbers of vertices, which means that resource allocation has to be done on the fly. However, if the number of vertices is statically known, the hardware can pre-allocate resources up front, which is more efficient. Thanks to the new NIR GS intrinsics, this is easy. We just call the function introduced in the previous commit to get the vertex count. If it obtains a count, we stop emitting the extra 32-bit "Vertex Count" field in the VUE, and instead fill out the 3DSTATE_GS fields. Improves performance of Gl32GSCloth by 5.16347% +/- 0.12611% (n=91) on my Lenovo X250 laptop (Broadwell GT2) at 1024x768. shader-db statistics for geometry shaders only: total instructions in shared programs: 3227 -> 3207 (-0.62%) instructions in affected programs: 242 -> 222 (-8.26%) helped: 10 v2: Don't break non-NIR paths (just skip this optimization). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
* i965: Move GS_THREAD_END mlen calculations out of the generator.Kenneth Graunke2015-09-262-2/+2
| | | | | | | | | | The visitor was setting a mlen that was wrong for Broadwell, but the generator was ignoring it and doing the right thing regardless. We may as well move the logic fully into the visitor. This will be useful in the next commit as well. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
* nir: Add a function to count the number of vertices a GS emits.Kenneth Graunke2015-09-263-0/+96
| | | | | | | | | | | Some hardware (such as Broadwell) can run geometry shaders more efficiently when the number of vertices emitted is statically known. This pass provides a way to obtain the constant vertex count, or -1 indicating that the vertex count is unknown/non-constant. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
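A minimal sketch of the "constant count or -1" contract follows. It assumes a toy, flattened instruction list; `struct toy_instr` and `toy_gs_count_vertices()` are hypothetical and do not reflect the real NIR data structures or the pass's actual traversal.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified instruction record (not real NIR). */
struct toy_instr {
   enum { TOY_SET_VERTEX_COUNT, TOY_OTHER } kind;
   int is_const;   /* nonzero if the count operand is a compile-time constant */
   int value;      /* the constant count, meaningful only if is_const */
};

/* Returns the vertex count if every "set vertex count" instruction agrees
 * on one compile-time constant, or -1 if the count is unknown/non-constant. */
static int
toy_gs_count_vertices(const struct toy_instr *instrs, size_t n)
{
   int count = -1;

   for (size_t i = 0; i < n; i++) {
      if (instrs[i].kind != TOY_SET_VERTEX_COUNT)
         continue;
      if (!instrs[i].is_const)
         return -1;               /* count depends on runtime data */
      if (count != -1 && count != instrs[i].value)
         return -1;               /* different paths disagree */
      count = instrs[i].value;
   }
   return count;
}
```

A caller would treat any non-negative result as permission to use the static-count path and fall back to the dynamic path on -1.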
* i965: Simplify handling of VUE map changes.Kenneth Graunke2015-09-264-42/+17
| | | | | | | | | | | | | | | | | | | | The old code was disastrously complex - spread across multiple atoms which may not even run, inspecting the dirty bits to try and decide whether it was necessary to do checks...storing VS information in brw_context...extra flagging... This code tripped me and Carl up very badly when working on the shader cache code. It's very fragile and hard to maintain. Now that geometry shaders only depend on their inputs and don't have to worry about the VS VUE map, we can dramatically simplify this: just compute the VUE map coming out of the geometry shader stage in brw_upload_programs. If it changes, flag it. Done. v2: Also check vue_map.separable. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
* i965/gs: Remove the dependency on the VS VUE map.Kenneth Graunke2015-09-262-11/+14
| | | | | | | | | | | | | | | | Because we only support geometry shaders in core profile, we can safely ignore any driver-extending of VS outputs. Those are: - Legacy userclipping (doesn't exist in core profile) - Edgeflag copying (Gen4-5 only, no GS support) - Point coord replacement (Gen4-5 only, no GS support) - front/back color hacks (Gen4-5 only, no GS support) v2: Rebase; leave a comment about why SSO works. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
* i965: Don't re-layout varyings for separate shader programs.Kenneth Graunke2015-09-265-18/+67
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, our VUE map code always assigned slots to varyings sequentially, in one contiguous block. This was a bad fit for separate shaders - the GS input layout depended on the VS output layout, so if we swapped out vertex shaders, we might have to recompile the GS on the fly - which rather defeats the point of using separate shader objects. (Tessellation would suffer from this as well - we could have to recompile the HS, DS, and GS.) Instead, this patch makes the VUE map for separate shaders use a fixed layout, based on the input/output variable's location field. (This is either specified by layout(location = ...) or assigned by the linker.) Corresponding inputs/outputs will match up by location; if there's a mismatch, we're allowed to have undefined behavior. This may be less efficient - depending on what locations were chosen, we may have empty padding slots in the VUE. But applications presumably use small consecutive integers for locations, so it hopefully won't be much worse in practice. 3% of Dota 2 Reborn shaders are hurt, but only by 2 instructions. This seems like a small price to pay for avoiding recompiles. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
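The fixed layout can be illustrated with a small C sketch. Everything here is an assumption made for illustration: `TOY_MAX_SLOTS`, `TOY_SLOT_PAD`, and `toy_layout_fixed()` are hypothetical, and the real VUE map also reserves header slots and built-in varyings that this toy ignores.

```c
#include <assert.h>

#define TOY_MAX_SLOTS 8
#define TOY_SLOT_PAD  (-1)   /* hypothetical marker for an empty slot */

/* Fixed layout: a varying with location L always lands in slot L, no matter
 * which other varyings the shader on the other side declares.
 * slots[s] receives the location stored in slot s, or TOY_SLOT_PAD.
 * Returns the number of slots the map occupies. */
static int
toy_layout_fixed(const int *locations, int n, int *slots)
{
   int num_slots = 0;

   for (int s = 0; s < TOY_MAX_SLOTS; s++)
      slots[s] = TOY_SLOT_PAD;

   for (int i = 0; i < n; i++) {
      int loc = locations[i];
      assert(loc >= 0 && loc < TOY_MAX_SLOTS);
      slots[loc] = loc;
      if (loc + 1 > num_slots)
         num_slots = loc + 1;
   }
   return num_slots;
}
```

For outputs at locations {0, 3} and at locations {0, 1, 3}, location 3 sits in slot 3 in both maps, so a consumer reading location 3 would not need recompiling when the producer is swapped. The cost is visible in the first case, where slots 1 and 2 are padding.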
* i965/vue: Make assign_vue_map() take an explicit slot.Kenneth Graunke2015-09-261-16/+19
| | | | | | | | | | | | Our plan of assigning consecutive slots doesn't work properly for separate shader objects - at least, if we want to avoid recompiling them whenever the interface changes. As a first step, make assign_vue_map take an explicit slot parameter, rather than implicitly incrementing it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
* i965: Initialize unused VUE map slots to BRW_VARYING_SLOT_PAD.Kenneth Graunke2015-09-261-1/+1
| | | | | | | | | | Nothing actually relies on unused slots being initialized to BRW_VARYING_SLOT_COUNT. Soon, we're going to have VUE maps with holes in them, at which point pre-filling with BRW_VARYING_SLOT_PAD makes a lot more sense. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
* i965: Fix BRW_VARYING_SLOT_PAD handling in the scalar VS backend.Kenneth Graunke2015-09-261-4/+2
| | | | | | | | | | | | | We can't just break for padding slots. Instead, treat them like unwritten output variables, so we handle flushing and incrementing urb_offset correctly. Paul introduced the concept of padding slots back in 2011, but we've never actually used them for anything. So it's unsurprising that the scalar VS backend didn't handle them quite right. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
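The difference between breaking on a padding slot and treating it as an unwritten output can be sketched as follows. This is a toy model only: `toy_emit_urb_slots()` and `TOY_SLOT_PAD` are hypothetical, and the real backend also batches and flushes URB writes, which is not modeled here.

```c
#include <assert.h>

#define TOY_SLOT_PAD (-1)   /* hypothetical marker for a padding slot */

/* Walks the slot map and records the URB offset at which each real varying
 * is written. Padding slots are skipped like unwritten outputs, but they
 * still advance urb_offset so later varyings land at the right place.
 * Returns the number of varyings written. */
static int
toy_emit_urb_slots(const int *slots, int num_slots, int *written_offsets)
{
   int urb_offset = 0;
   int n_written = 0;

   for (int s = 0; s < num_slots; s++) {
      if (slots[s] != TOY_SLOT_PAD)
         written_offsets[n_written++] = urb_offset;
      /* Not a 'break': a hole in the map does not end the output list. */
      urb_offset++;
   }
   return n_written;
}
```

For a map of {varying, PAD, varying}, this writes two varyings at offsets 0 and 2. A loop that stopped at the first padding slot would have dropped the second one.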
* main/tests: Enable glShaderStorageBlockBinding() check in dispatch_sanity testSamuel Iglesias Gonsalvez2015-09-261-1/+1
| | | | | Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
* docs: add news item and link release notes for 11.0.1Emil Velikov2015-09-262-0/+7
| | | | Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
* docs: add sha256 checksums for 11.0.1Emil Velikov2015-09-261-1/+2
| | | | | Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit 7f1a77ae664cca29208edc32ff82dc7ff4faa02b)
* docs: add release notes for 11.0.1Emil Velikov2015-09-261-0/+133
| | | | | Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit bcb9e1d26ba4198359300b50e5c188977cef932e)
* glsl: calculate component size for arrays of arrays when varying packing disabledTimothy Arceri2015-09-261-3/+10
| | | | | | | | Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
* glsl: validate binding qualifier for AoATimothy Arceri2015-09-261-1/+1
| | | | | Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
* glsl: add helper for calculating size of AoATimothy Arceri2015-09-261-0/+19
| | | | | | | V2: return 0 if not array rather than -1 Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
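A helper with the V2 behaviour (0 for non-arrays) might look roughly like the sketch below. `struct toy_type` and `toy_arrays_of_arrays_size()` are hypothetical stand-ins, not the glsl_type API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type node: an array has a length and an element type;
 * anything else has element == NULL. */
struct toy_type {
   unsigned length;
   const struct toy_type *element;
};

/* Total number of innermost elements in an array of arrays,
 * or 0 if the type is not an array at all. */
static unsigned
toy_arrays_of_arrays_size(const struct toy_type *t)
{
   if (t->element == NULL)
      return 0;

   unsigned size = t->length;
   /* Multiply in the length of every nested array dimension. */
   for (const struct toy_type *e = t->element; e->element != NULL; e = e->element)
      size *= e->length;
   return size;
}
```

Returning 0 rather than -1 keeps the result unsigned and lets callers multiply or compare it without a special case for a negative sentinel.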
* glsl: clean-up link uniform codeTimothy Arceri2015-09-261-11/+6
| | | | | | | These changes are also needed to allow linking of struct and interface arrays of arrays. Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
* radeonsi: add scratch buffer to the buffer list when it's re-allocatedMarek Olšák2015-09-261-0/+1
| | | | | Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Cc: mesa-stable@lists.freedesktop.org
* radeon/vce: fix vui time_scale zero errorLeo Liu2015-09-251-0/+3
| | | | | | | | | If the app passes 0 as frame_rate_num, it should not be encoded into the VUI. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
* mesa: Add locking to programs.Matt Turner2015-09-252-8/+12
| | | | | Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Brian Paul <brianp@vmware.com>
* mesa: Add locking to sampler objects.Matt Turner2015-09-252-4/+7
| | | | | Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Brian Paul <brianp@vmware.com>
* mesa: Remove debugging code from _mesa_reference_*.Matt Turner2015-09-256-61/+0
| | | | | Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Brian Paul <brianp@vmware.com>
* c11/threads: Assert that mtx is non-NULL and check return values.Matt Turner2015-09-251-25/+23
| | | | | | | | | | | | Passing NULL to C11 threads functions isn't safe, so there's no need for our implementation to handle it. Cuts about 1k of .text. text data bss dec hex filename 5009514 198440 26328 5234282 4fde6a i965_dri.so before 5008346 198440 26328 5233114 4fd9da i965_dri.so after Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Brian Paul <brianp@vmware.com>
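The pattern of asserting on the mutex pointer and mapping the underlying return value can be sketched like this. The sketch assumes a pthreads backend; the `toy_` names are hypothetical and this is not the actual contents of Mesa's c11/threads implementation.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical result codes standing in for thrd_success / thrd_error. */
enum { toy_thrd_success = 0, toy_thrd_error = 1 };

/* A NULL mutex is a caller bug, so assert instead of handling it at runtime;
 * the result of the underlying call is checked rather than ignored. */
static int
toy_mtx_lock(pthread_mutex_t *mtx)
{
   assert(mtx != NULL);
   return pthread_mutex_lock(mtx) == 0 ? toy_thrd_success : toy_thrd_error;
}

static int
toy_mtx_unlock(pthread_mutex_t *mtx)
{
   assert(mtx != NULL);
   return pthread_mutex_unlock(mtx) == 0 ? toy_thrd_success : toy_thrd_error;
}
```

In a release build with `NDEBUG` defined the assertions compile away, which is consistent with the code-size reduction the commit reports.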
* glsl: fix packed varyings interface type and add default caseTapani Pälli2015-09-251-0/+4
| | | | | | | | fixes Piglit test: arb_program_interface_query/linker/query-varyings.shader_test Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
* glsl: Mark as active all elements of shared/std140 block arraysAntia Puentes2015-09-251-0/+24
| | | | | | | | | | | | | | | | | | | | | | | Commit 1ca25ab (glsl: Do not eliminate 'shared' or 'std140' blocks or block members) considered as active 'shared' and 'std140' uniform blocks and uniform block arrays, but did not include the block array elements. Because of that, it was possible to have an active uniform block array without any elements marked as used, making the assertion ((b->num_array_elements > 0) == b->type->is_array()) in link_uniform_blocks() fail. Fixes the following 5 dEQP tests: * dEQP-GLES3.functional.ubo.random.nested_structs_instance_arrays.18 * dEQP-GLES3.functional.ubo.random.nested_structs_instance_arrays.24 * dEQP-GLES3.functional.ubo.random.nested_structs_arrays_instance_arrays.19 * dEQP-GLES3.functional.ubo.random.all_per_block_buffers.49 * dEQP-GLES3.functional.ubo.random.all_shared_buffer.36 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83508 Tested-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
* docs: Mark ARB_shader_storage_buffer_object as done for i965Iago Toral Quiroga2015-09-251-2/+2
| | | | | | | v2: - Mark it too for GLES 3.1 Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
* i965: Enable ARB_shader_storage_buffer_object extension for gen7+Samuel Iglesias Gonsalvez2015-09-251-0/+1
| | | | | Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
* mesa: enable ARB_shader_storage_buffer_object extension for GLES 3.1Samuel Iglesias Gonsalvez2015-09-252-2/+2
| | | | | | Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
* mesa: Add getters for the GL_ARB_shader_storage_buffer_object max constantsSamuel Iglesias Gonsalvez2015-09-252-0/+23
| | | | | | | | | | | | | | | v2: - Add tessellation shader constants support v3: - Add GLES 3.1 support. v4: - Move the getters to the proper place Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
* glapi: add ARB_shader_storage_block_buffer_objectSamuel Iglesias Gonsalvez2015-09-254-2/+59
| | | | | | Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
* main/tests: add ARB_shader_storage_buffer_object tokens to enum_stringsSamuel Iglesias Gonsalvez2015-09-251-0/+15
| | | | | | Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
* main: Add SHADER_STORAGE_BLOCK and BUFFER_VARIABLE support for ARB_program_interface_querySamuel Iglesias Gonsalvez2015-09-255-12/+278
| | | | | | | | | | | | | Including TOP_LEVEL_ARRAY_SIZE and TOP_LEVEL_ARRAY_STRIDE queries. v2: - Use std430_array_stride() to get top level array stride following std430's rules. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
* glsl: Do not allow reads from write-only buffer variablesIago Toral Quiroga2015-09-251-0/+56
| | | | | | | | | | | The error location won't be right, but fixing that would require checking for this as we process each type of AST node that can involve a variable read. v2: - Limit the check to buffer variables, image variables have different semantics involved. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
* glsl: Do not allow assignments to read-only buffer variablesIago Toral Quiroga2015-09-251-1/+10
| | | | | | | | | | | v2: - Merge the error check for the readonly qualifier with the already existing check for variables flagged as readonly (Timothy). - Limit the check to buffer variables, image variables have different semantics involved (Curro). Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
* glsl: Allow memory qualifiers on shader storage buffer blocksSamuel Iglesias Gonsalvez2015-09-251-0/+14
| | | | | | | | v2: - Memory qualifiers on shader storage buffer objects do not come in the form of layout qualifiers, they are block-level qualifiers. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
* glsl: Apply memory qualifiers to buffer variablesIago Toral Quiroga2015-09-253-3/+91
| | | | | | | | | | v2: - Save memory qualifier info in the top level members of a shader storage block. - Add checks to record_compare(), which is used when comparing shader storage buffer declarations in different shaders. - Always report an error for incompatible readonly/writeonly definitions, whether they are present at block or field level. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
* glsl: Allow use of memory qualifiers with ARB_shader_storage_buffer_object.Iago Toral Quiroga2015-09-251-5/+5
| | | | Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
* glsl: fix UNIFORM_BUFFER_START or UNIFORM_BUFFER_SIZE query when no buffer object is boundSamuel Iglesias Gonsalvez2015-09-251-2/+4
| | | | | | | | | | | | | | According to ARB_uniform_buffer_object spec: "If the parameter (starting offset or size) was not specified when the buffer object was bound (e.g. if bound with BindBufferBase), or if no buffer object is bound to <index>, zero is returned." Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
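The rule quoted from the spec can be sketched in C as below. The `struct toy_binding` layout and both query functions are hypothetical and exist only to show the two cases that must return zero; they are not Mesa's binding-point structures.

```c
#include <assert.h>

/* Hypothetical binding-point state. */
struct toy_binding {
   int has_buffer;   /* nonzero if a buffer object is bound to this index */
   int has_range;    /* nonzero if offset/size were given at bind time */
   long offset;
   long size;
};

/* Both queries return zero when no buffer is bound, or when the buffer was
 * bound without an explicit range (BindBufferBase-style). */
static long
toy_query_buffer_start(const struct toy_binding *b)
{
   return (b->has_buffer && b->has_range) ? b->offset : 0;
}

static long
toy_query_buffer_size(const struct toy_binding *b)
{
   return (b->has_buffer && b->has_range) ? b->size : 0;
}
```

Only a binding made with an explicit range reports its stored offset and size; an unbound index and a whole-buffer binding both report zero.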
* mesa: Add queries for GL_SHADER_STORAGE_BUFFERIago Toral Quiroga2015-09-251-0/+31
| | | | | | These handle querying the buffer name attached to a given binding point as well as the start offset and size of that buffer. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
* mesa: add glShaderStorageBlockBinding()Samuel Iglesias Gonsalvez2015-09-252-0/+56
| | | | | | | Defined in ARB_shader_storage_buffer_object extension. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
* glsl: First argument to atomic functions must be a buffer variableIago Toral Quiroga2015-09-251-0/+42
| | | | | | | | | | | v2: - Add ssbo_in the names of the static functions so it is clear that this is specific to SSBO atomics. v3: - Move the check after the loop (Kristian Høgsberg) Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
* i965/nir/vec4: Implement nir_intrinsic_ssbo_atomic_*Iago Toral Quiroga2015-09-252-0/+79
| | | | Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
* i965/nir/fs: Implement nir_intrinsic_ssbo_atomic_*Iago Toral Quiroga2015-09-252-0/+79
| | | | Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
* nir: Implement lowered SSBO atomic intrinsicsIago Toral Quiroga2015-09-252-0/+82
| | | | | | | | | | | | The original GLSL IR intrinsics have been lowered to an internal version that accepts a block index and an offset instead of a SSBO reference. v2 (Connor): - Document the sources used by the atomic intrinsics. Reviewed-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>