path: root/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
Commit message | Author | Age | Files | Lines
* i965: Allocate at least some URB space even when max_vertices = 0. | Kenneth Graunke | 2016-12-14 | 1 | -1/+7
  Allocating zero URB space is a really bad idea. The hardware has to give threads a handle to their URB space, and threads have to use that to terminate the thread. Having it be an empty region just breaks a lot of assumptions. Hence, why we asserted that it isn't possible.

  Unfortunately, it /is/ possible prior to Gen8, if max_vertices = 0. In theory a geometry shader could do SSBO/image access and maybe still accomplish something. In reality, this is tripped up by conformance tests.

  Gen8+ already avoids this problem by placing the vertex count DWord in the URB entry header. This fixes things on earlier generations.

  Cc: mesa-stable@lists.freedesktop.org
  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
  Tested-by: Ian Romanick <ian.d.romanick@intel.com>
  (cherry picked from commit a41f5dcb141a11ca5ca0c765c305027b0f0b609e)
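A minimal sketch of the idea in this commit, not the driver's actual code: clamp the computed URB output size so it is never zero, even when max_vertices = 0 and there is no header. The function names, parameters, and the 64-byte allocation granularity are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper (not a Mesa function): size in bytes of one geometry
// shader URB output entry.
//   vertex_size_bytes - bytes written per emitted vertex (assumed input)
//   max_vertices      - the shader's declared max_vertices, may be 0
//   header_size_bytes - control data / vertex count header, may be 0
inline uint32_t gs_urb_output_bytes(uint32_t vertex_size_bytes,
                                    uint32_t max_vertices,
                                    uint32_t header_size_bytes)
{
    const uint32_t bytes = header_size_bytes + vertex_size_bytes * max_vertices;
    // Enforce a non-zero size so the thread always receives a usable handle.
    return std::max<uint32_t>(bytes, 1u);
}

// Hypothetical helper: round a byte count up to whole allocation units.
// The 64-byte default granularity is an assumption for this sketch.
inline uint32_t gs_urb_entry_units(uint32_t bytes, uint32_t granularity = 64)
{
    return (bytes + granularity - 1) / granularity;
}
```

With these assumptions, a shader with max_vertices = 0 and no header still rounds up to one allocation unit rather than zero.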
* i965/gs: Allow primitive id to be a system value | Jason Ekstrand | 2016-11-24 | 1 | -1/+2
  This allows for gl_PrimitiveId to come in as a system value rather than as an input. This is the way it will come in from SPIR-V. We keep the input path working for now so we don't break GL.

  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Cc: "13.0" <mesa-stable@lists.freedesktop.org>
  (cherry picked from commit a5e88e66e633aaeb587b274d80e21cd46c8ee2cb)
  [Emil Velikov: nir_shader::info is not a pointer in branch]
  Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

  Conflicts:
    src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
* i965: Fix gl_InvocationID in dual object GS where invocations == 1. | Kenneth Graunke | 2016-10-17 | 1 | -1/+4
  dEQP-GLES31.functional.geometry_shading.instanced.geometry_1_invocations draws using a geometry shader that specifies

    layout(points, invocations = 1) in;

  and then uses gl_InvocationID. According to the Haswell PRM, the "GS Instance ID 0" (and 1) thread payload fields are undefined in dual object mode:

    "If 'dispatch mode' is DUAL_OBJECT this field is not valid."

  But there's no point in using them - if there's only one invocation, the ID will be 0. So just load a constant.

  Cc: mesa-stable@lists.freedesktop.org
  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
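The reasoning in this commit can be sketched as a small selection function. This is an illustration only: `gs_invocation_id` and its parameters are hypothetical names, and the real visitor emits instructions rather than returning a value.

```cpp
#include <cstdint>

// Hypothetical helper (not a Mesa function): choose the value to use for
// gl_InvocationID.
//   invocations         - the shader's declared invocation count
//   payload_instance_id - the instance ID field from the thread payload,
//                         which may hold garbage in dual object mode
inline uint32_t gs_invocation_id(uint32_t invocations,
                                 uint32_t payload_instance_id)
{
    // With a single invocation the ID can only be 0, so the (possibly
    // undefined) payload field is never read.
    if (invocations == 1)
        return 0;
    return payload_instance_id;
}
```

The design point is that the constant path avoids depending on a field the hardware documentation calls invalid, instead of trying to detect when it is valid.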
* util: Move _mesa_fsl/util_last_bit into util/bitscan.h | Mathias Fröhlich | 2016-08-09 | 1 | -1/+1
  As requested with the initial creation of util/bitscan.h now move other bitscan related functions into util.

  v2: Split into two patches.

  Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
  Tested-by: Brian Paul <brianp@vmware.com>
  Reviewed-by: Brian Paul <brianp@vmware.com>
* i965: If control_data_header_size_bits is zero, don't do EndPrimitive | Ian Romanick | 2016-06-01 | 1 | -0/+3
  This can occur when max_vertices=0 is explicitly specified.

  Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Cc: "12.0" <mesa-stable@lists.freedesktop.org>
* i965: Silence unused parameter warnings | Ian Romanick | 2016-05-18 | 1 | -3/+2
  The only place that actually used the type parameter was the GS visitor, and it was always passed glsl_type::int. Just remove the parameter.

    brw_vec4_vs_visitor.cpp:38:61: warning: unused parameter ‘type’ [-Wunused-parameter]
        const glsl_type *type)
                         ^

  Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
  Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
* i965/fs: Stop setting dispatch_grf_start_reg from the visitor | Jason Ekstrand | 2016-05-14 | 1 | -0/+1
  Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Support instanced GS inputs in the scalar backend. | Kenneth Graunke | 2016-05-12 | 1 | -3/+0
  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
* i965: Avoid recalculating the normal VUE map for IO lowering. | Kenneth Graunke | 2016-02-26 | 1 | -18/+19
  The caller already computes it. Now that we have stage specific functions, it's really easy to pass this in.

  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
* i965: Eliminate brw_nir_lower_{inputs,outputs,io} functions. | Kenneth Graunke | 2016-02-26 | 1 | -1/+2
  Now that each stage is directly calling brw_nir_lower_io(), and we have per-stage helper functions, it makes sense to just call the relevant one directly, rather than going through multiple switch statements.

  This also eliminates stupid function parameters, such as the two that only apply to vertex attributes.

  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
* i965: Always do NIR IO lowering at specialization time. | Kenneth Graunke | 2016-02-26 | 1 | -0/+1
  We've now hit literally every case other than geometry shaders (and compute shaders, but those are a no-op). So, let's just move geometry shaders over too and be done with it.

  The only advantage to doing this at link time was to save the expense of running the pass on recompiles. But we're already running a lot of passes, and the extra code complexity isn't worth it.

  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
  Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
* i965: Make an is_scalar boolean in brw_compile_gs(). | Kenneth Graunke | 2016-02-26 | 1 | -4/+4
  Shorter than compiler->scalar_stage[MESA_SHADER_GEOMETRY], which can help with line-wrapping.

  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
  Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
* i965/gs: Pass VerticesIn though prog_data | Jason Ekstrand | 2016-02-11 | 1 | -0/+2
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965/vec4/gs: Stop munging the ATTR containing gl_PointSize. | Kenneth Graunke | 2016-02-09 | 1 | -23/+0
  gl_PointSize is delivered in the .w component of the VUE header, while the language expects it to be a float (and thus in the .x component).

  Previously, we emitted MOVs to copy it over to the .x component. But this is silly - we can just use a .wwww swizzle and access it without copying anything or clobbering the value stored at .x (which admittedly is useless).

  Removes the last use of ATTR destinations.

  v2: Use BRW_SWIZZLE_WWWW, not SWIZZLE_WWWW (caught by GCC).

  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Matt Turner <mattst88@gmail.com>
  Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
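To illustrate why a .wwww swizzle makes the copy unnecessary, here is a small standalone model of swizzled reads. The names `make_swizzle`, `apply_swizzle`, and `SWIZZLE_WWWW`, and the two-bits-per-channel packing, are assumptions for this sketch and are not taken from the driver source.

```cpp
#include <array>
#include <cstdint>

// Channel indices within a four-component register.
enum Channel : unsigned { X = 0, Y = 1, Z = 2, W = 3 };

// Pack four channel selectors into one byte, two bits per destination
// channel (assumed encoding for this sketch).
constexpr uint8_t make_swizzle(Channel a, Channel b, Channel c, Channel d)
{
    return static_cast<uint8_t>(a | (b << 2) | (c << 4) | (d << 6));
}

// Every destination channel reads the source's .w component.
constexpr uint8_t SWIZZLE_WWWW = make_swizzle(W, W, W, W);

using Vec4 = std::array<float, 4>;

// Read `src` through a swizzle without modifying `src` itself.
inline Vec4 apply_swizzle(const Vec4 &src, uint8_t swz)
{
    Vec4 out{};
    for (unsigned i = 0; i < 4; ++i)
        out[i] = src[(swz >> (2 * i)) & 3u];
    return out;
}
```

Reading a header whose .w holds the point size through `SWIZZLE_WWWW` yields that value in .x, and the header register is left untouched, which is the property the commit relies on.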
* i965/fs/generator: Take an actual shader stage rather than a string | Jason Ekstrand | 2016-01-15 | 1 | -1/+1
  Cc: "11.1" <mesa-stable@lists.freedesktop.org>
  Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
  Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965: Use nir_lower_tex for texture coordinate lowering | Jason Ekstrand | 2015-11-23 | 1 | -0/+2
  Previously, we had a rescale_texcoords helper in the FS backend for handling rescaling of texture coordinates. Now that we can do variants in NIR, we can use nir_lower_tex to do the rescaling for us. This allows us to delete the i965-specific code and gives us proper TEXTURE_RECTANGLE and GL_CLAMP handling in vertex and geometry shaders.

  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Move postprocess_nir to codegen time | Jason Ekstrand | 2015-11-23 | 1 | -1/+6
  This allows us to insert NIR passes between initial NIR compilation and optimization (link time) and actual backend code-gen. In particular, it will allow us to do shader variants in NIR and share some of that shader variant code between backends.

  Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
* i965/vec4: Replace src_reg(imm) constructors with brw_imm_*(). | Matt Turner | 2015-11-19 | 1 | -18/+20
  Cuts 1.5k of .text.

  Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Return the correct value type from brw_compile_gs() | Eduardo Lima Mitev | 2015-11-17 | 1 | -1/+1
  brw_compile_gs() should return a pointer to unsigned, but it is returning the bool 'false' at some point, hence annoying us with a compiler warning:

    In function 'const unsigned int* brw::brw_compile_gs(const brw_compiler*, void*, void*, const brw_gs_prog_key*, brw_gs_prog_data*, const nir_shader*, gl_shader_program*, int, unsigned int*, char**)':
    brw_vec4_gs_visitor.cpp:776:14: warning: converting 'false' to pointer type 'const unsigned int*' [-Wconversion-null]
         return false;
                ^

  Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
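The class of fix described here can be shown with a tiny stand-in function. `compile_or_null` is a hypothetical name, not the real brw_compile_gs() signature; the point is only that a pointer-returning function should signal failure with a null pointer rather than a bool.

```cpp
#include <cstddef>

// Hypothetical stand-in for a compile entry point: returns a pointer to
// the generated assembly on success, or a null pointer on failure.
inline const unsigned *compile_or_null(bool ok,
                                       const unsigned *assembly,
                                       const char **error_str)
{
    if (!ok) {
        if (error_str)
            *error_str = "compile failed";
        // Returning `false` here would compile via implicit conversion but
        // triggers -Wconversion-null; a null pointer states the intent.
        return nullptr;
    }
    return assembly;
}
```

Callers can then test the result with an ordinary null check and read the error string only on failure.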
* i965: Convert scalar_* flags to a scalar_stage array. | Kenneth Graunke | 2015-11-16 | 1 | -1/+1
  I was going to add scalar_tcs and scalar_tes flags, and then thought better of it and decided to convert this to an array. Simpler.

  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
  Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
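A rough sketch of the refactoring pattern, with hypothetical type and enumerator names rather than the actual Mesa definitions: one boolean per stage is replaced by an array indexed by a stage enum, so adding a stage needs no new field.

```cpp
#include <array>

// Hypothetical stage enum for this sketch (not Mesa's gl_shader_stage).
enum ShaderStage {
    STAGE_VERTEX,
    STAGE_TESS_CTRL,
    STAGE_TESS_EVAL,
    STAGE_GEOMETRY,
    STAGE_FRAGMENT,
    STAGE_COMPUTE,
    STAGE_COUNT
};

// Hypothetical compiler settings: one flag per stage, all false by default.
struct CompilerSettings {
    std::array<bool, STAGE_COUNT> scalar_stage{};
};

// Look up whether a given stage uses the scalar backend.
inline bool is_scalar(const CompilerSettings &c, ShaderStage stage)
{
    return c.scalar_stage[stage];
}
```

A caller that queries the same stage repeatedly can cache the result in a local boolean to keep lines short.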
* i965: Print input/output VUE maps on INTEL_DEBUG=vs, gs. | Kenneth Graunke | 2015-11-13 | 1 | -0/+6
  I've been carrying around a patch to do this for the last few months, and it's been exceedingly useful for debugging GS and tessellation problems. I've caught lots of bugs by inspecting the interface expectations of two adjacent stages.

  It's not that much spam, so I figure we may as well just print it.

  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Acked-by: Matt Turner <mattst88@gmail.com>
* i965: Add scalar geometry shader support. | Kenneth Graunke | 2015-11-03 | 1 | -0/+25
  This is hidden behind INTEL_SCALAR_GS=1 for now, as we don't yet support instanced geometry shaders, and Orbital Explorer's shader spills like crazy. But the infrastructure is in place, and it's largely working.

  v2: Lots of rebasing.

  v3: (feedback from Kristian Høgsberg)
  - Handle stride and subreg_offset correctly for ATTRs; use a helper.
  - Fix missing emit_shader_time_end() call.
  - Delete dead code after early EOT in static vertex case to avoid tripping asserts in emit_shader_time_end().
  - Use proper D/UD type in intexp2().
  - Fix "EndPrimitve" and "to that" typos.
  - Assert that invocations == 1 so we know this is missing.

  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
* i965/vec4: Wrap vec4_generator in a C function. | Kenneth Graunke | 2015-10-29 | 1 | -6/+6
  vec4_generator is a class for convenience, but only exports a single method as its public API. It makes much more sense to just export a single function.

  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965/vec4/gs: Fix signed/unsigned comparison warning. | Matt Turner | 2015-10-22 | 1 | -1/+1
* i965/gs: Do prog_data setup and other calculations in brw_compile_gs | Jason Ekstrand | 2015-10-21 | 1 | -4/+207
  This commit moves the large pile of setup calculations we have to do for geometry shaders out of brw_gs_emit and into brw_compile_gs. This has a couple of nice implications. First, it's less work that the caller of brw_compile_gs has to do. Second, it's consistent with the vertex and fragment stages. Finally, it allows us to put brw_gs_compile back behind the API boundary where it belongs.

  v2 (Jason Ekstrand):
  - Pull the changes to use nir info into a separate patch
  - Put brw_gs_compile into brw_shader.h rather than brw_vec4_gs_visitor.h so that we can use it for scalar GS.

  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965/gs: Pull prog_data out of brw_gs_compile | Jason Ekstrand | 2015-10-21 | 1 | -24/+27
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965/gs: Use NIR instead of the brw_geometry_program for GS metadata | Jason Ekstrand | 2015-10-21 | 1 | -3/+3
  With this, we can remove the geometry program from brw_gs_compile.

  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965/gs: Move the mem_ctx argument to brw_compile_gs | Jason Ekstrand | 2015-10-21 | 1 | -1/+1
  This makes it better match the other brw_compile_* functions.

  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Rename brw_foo_emit to brw_compile_foo | Jason Ekstrand | 2015-10-19 | 1 | -8/+8
  Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
* i965/gs: Rework gs_emit to take a nir_shader and a brw_compiler | Jason Ekstrand | 2015-10-19 | 1 | -36/+19
  This commit removes all dependence on GL state by getting rid of the brw_context parameter and the GL data structures. Unfortunately, we still have to pass in the gl_shader_program for gen6 because it's needed for transform feedback.

  Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
* i965: Use a const nir_shader in backend_shader | Jason Ekstrand | 2015-10-19 | 1 | -1/+1
  Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
* i965/vec4: Remove gl_program and gl_shader_program from the generator | Jason Ekstrand | 2015-10-19 | 1 | -6/+5
  Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
* i965/gs: Make MAX_GS_INPUT_VERTICES a #define in brw_context.h. | Kenneth Graunke | 2015-10-10 | 1 | -2/+0
  For scalar VS, I'll need this in brw_fs.cpp as well. It seems silly to redeclare it in three places.

  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965: Move brw_get_shader_time_index() call out of emit functions | Kristian Høgsberg Kristensen | 2015-10-08 | 1 | -7/+4
  brw_get_shader_time_index() is all tangled up in brw_context state and we can't call it from the compiler. Thanks to Jason's recent refactoring, we can just get the index and pass it to the emit functions instead.

  Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
  Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
* i965: Move brw_dump_ir() out of brw_*_emit() functions | Kristian Høgsberg Kristensen | 2015-10-08 | 1 | -3/+0
  We move these calls one level up into the codegen functions.

  Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
  Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
* i965: Remove shader_prog from vec4_gs_visitor. | Kenneth Graunke | 2015-10-04 | 1 | -4/+2
  Unfortunately it has to stay in gen6_gs_visitor.

  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965: Use nir->has_transform_feedback_varyings to avoid shader_prog. | Kenneth Graunke | 2015-10-04 | 1 | -1/+1
  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965/shader: Get rid of the shader, prog, and shader_prog fields | Jason Ekstrand | 2015-10-02 | 1 | -11/+13
  Unfortunately, we can't get rid of them entirely. The FS backend still needs gl_program for handling TEXTURE_RECTANGLE. The GS vec4 backend still needs gl_shader_program for handling transform feedback. However, the VS needs neither and we can substantially reduce the amount they are used. One day we will be free from their tyranny.

  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965/vec4: Delete the old vec4_vp code | Jason Ekstrand | 2015-10-02 | 1 | -9/+0
  Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965/vec4: Delete the old ir_visitor code | Jason Ekstrand | 2015-10-02 | 1 | -45/+0
  Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965/gs: Optimize away the EOT write on Gen8+ with static vertex count. | Kenneth Graunke | 2015-09-26 | 1 | -0/+15
  With static vertex counts, the final EOT write doesn't actually write any data - it's just there to end the thread. Typically, the last thing before ending the thread will be an EmitVertex() call, resulting in a URB write. We can just set EOT on that.

  Note that this isn't always possible - there might be an intervening SSBO write/image store, or the URB write may have been in a loop.

  shader-db statistics for geometry shaders only:

    total instructions in shared programs: 3173 -> 3149 (-0.76%)
    instructions in affected programs: 176 -> 152 (-13.64%)
    helped: 8

  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965: Implement "Static Vertex Count" geometry shader optimization. | Kenneth Graunke | 2015-09-26 | 1 | -4/+7
  Broadwell's 3DSTATE_GS contains new "Static Output" and "Static Vertex Count" fields, which control a new optimization. Normally, geometry shaders can output arbitrary numbers of vertices, which means that resource allocation has to be done on the fly. However, if the number of vertices is statically known, the hardware can pre-allocate resources up front, which is more efficient.

  Thanks to the new NIR GS intrinsics, this is easy. We just call the function introduced in the previous commit to get the vertex count. If it obtains a count, we stop emitting the extra 32-bit "Vertex Count" field in the VUE, and instead fill out the 3DSTATE_GS fields.

  Improves performance of Gl32GSCloth by 5.16347% +/- 0.12611% (n=91) on my Lenovo X250 laptop (Broadwell GT2) at 1024x768.

  shader-db statistics for geometry shaders only:

    total instructions in shared programs: 3227 -> 3207 (-0.62%)
    instructions in affected programs: 242 -> 222 (-8.26%)
    helped: 10

  v2: Don't break non-NIR paths (just skip this optimization).

  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
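The decision this commit describes can be modeled roughly as follows. This is a sketch under stated assumptions, not the driver's code: `GsVertexCountPlan` and `plan_gs_vertex_count` are hypothetical names, a negative count stands for "not statically known", and treating Gen8+ as the only generations that carry the vertex count DWord is a simplification.

```cpp
// Hypothetical result type for this sketch.
struct GsVertexCountPlan {
    bool static_output;           // program the "Static Output" field
    int  static_vertex_count;     // value for "Static Vertex Count", or -1
    bool emit_vertex_count_dword; // write the 32-bit count into the VUE
};

// gen                - hardware generation number (8 = Broadwell)
// known_vertex_count - compile-time vertex count, or a negative value if
//                      the count could not be determined statically
inline GsVertexCountPlan plan_gs_vertex_count(int gen, int known_vertex_count)
{
    const bool is_static = gen >= 8 && known_vertex_count >= 0;

    GsVertexCountPlan plan;
    plan.static_output = is_static;
    plan.static_vertex_count = is_static ? known_vertex_count : -1;
    // Only the dynamic Gen8+ path needs the count written by the shader.
    plan.emit_vertex_count_dword = gen >= 8 && !is_static;
    return plan;
}
```

When the count is unknown, the plan falls back to the dynamic path, which matches the note that other paths simply skip the optimization.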
* i965: Move GS_THREAD_END mlen calculations out of the generator. | Kenneth Graunke | 2015-09-26 | 1 | -1/+1
  The visitor was setting a mlen that was wrong for Broadwell, but the generator was ignoring it and doing the right thing regardless. We may as well move the logic fully into the visitor.

  This will be useful in the next commit as well.

  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
* i965/gs: Fix extra level of indentation left by the previous commit. | Kenneth Graunke | 2015-09-23 | 1 | -63/+61
  I left a bunch of code indented a level in the previous patch to make the diff easier to read. But now we should fix that.

  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
* i965/gs: Use new NIR intrinsics. | Kenneth Graunke | 2015-09-23 | 1 | -13/+15
  By performing the vertex counting in NIR, we're able to elide a ton of useless safety checks around every EmitVertex() call:

    total instructions in shared programs: 3952 -> 3720 (-5.87%)
    instructions in affected programs: 3491 -> 3259 (-6.65%)
    helped: 11
    HURT: 0

  Improves performance in Gl32GSCloth by 0.671742% +/- 0.142202% (n=621) on Haswell GT3e at 1024x768.

  This should also make it easier to implement Broadwell's "Static Vertex Count" feature someday.

  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
* i965: Remove the brw_vue_prog_key base class. | Kenneth Graunke | 2015-09-03 | 1 | -1/+1
  The legacy userclip fields are only used for the vertex shader, and at that point there's only program_string_id and the tex struct, which are common to all keys. So there's no need for a "VUE" key base class.

  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
* i965: Move legacy clip plane handling to vec4_vs_visitor. | Kenneth Graunke | 2015-09-03 | 1 | -2/+2
  This is now only used for the vertex shader, so it makes sense to get it out of any paths run by the geometry shader.

  Instead of passing the gl_clip_plane array into the run() method (which is shared among all subclasses), we add it as a vec4_vs_visitor constructor parameter. This eliminates the bogus NULL parameter in the GS case.

  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
* i965/gs: Refactor ir_emit_vertex and ir_end_primitive | Iago Toral Quiroga | 2015-08-03 | 1 | -4/+16
  So the implementation is independent of GLSL IR and the visit methods of the vec4 visitor. This way we will be able to reuse that implementation directly from the NIR vec4 backend.

  Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
* i965/vec4: Redefine make_reg_for_system_value() to allow reuse in NIR->vec4 pass | Alejandro Piñeiro | 2015-08-03 | 1 | -3/+4
  The new virtual method is more flexible, it has a signature:

    dst_reg *make_reg_for_system_value(int location, const glsl_type *type);

  v2 (Jason Ekstrand): Use the new version in unit tests so make check passes again

  Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
* i965: Fix indentation in emit_control_data_bits(). | Kenneth Graunke | 2015-07-10 | 1 | -72/+70
  The last patch left the code indented too far.

  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Matt Turner <mattst88@gmail.com>