summaryrefslogtreecommitdiffstats
path: root/src/mesa/drivers/dri/i965/gen6_sf_state.c
Commit message (Collapse)AuthorAgeFilesLines
* i965: get inputs read from nir infoTimothy Arceri2016-10-061-1/+3
| | | | | | | | | | This is a step towards dropping the GLSL IR version of do_set_program_inouts() in i965 and moving towards native nir support. This is important because we want to eventually convert to nir and use its optimisations passes before we can call this GLSL IR pass. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Eliminate brw->wm.prog_data pointer.Kenneth Graunke2016-10-051-4/+8
| | | | | | | | | | | | Just say no to: - brw->wm.base.prog_data = &brw->wm.prog_data->base.base; We'll just use the brw_stage_prog_data pointer in brw_stage_state and downcast it to brw_wm_prog_data as needed. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arcero@collabora.com>
* i965: Use gs_prog_data in is_drawing_points/lines().Kenneth Graunke2016-08-311-3/+5
| | | | | | | | State upload code should use prog_data rather than poking at core Mesa shader data structures wherever possible. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
* i965/state: Move is_drawing_lines/points to gen6_clip_state.cJason Ekstrand2016-08-191-2/+2
| | | | | Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Fix point size with tessellation/geometry shaders in GLES.Kenneth Graunke2016-06-221-3/+2
| | | | | | | | | | | | | Our previous code worked for desktop GL, and ES without geometry or tessellation shaders. But those features require fancier point size handling. Fortunately, we can use one rule for all APIs. Fixes a number of dEQP tests with EXT_tessellation_shader enabled: dEQP-GLES31.functional.tessellation_geometry_interaction.point_size.* Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
* mesa: Rename CoordReplaceBits back to CoordReplace.Mathias Fröhlich2016-06-161-1/+1
| | | | | | | | It used to be called like that and fits better with 80 columns. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
* i965: Convert i965 to use CoordsReplaceBits.Mathias Fröhlich2016-06-161-1/+1
| | | | | | | | Switch over to use the CoordsReplaceBits bitmask. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
* i965: Add _NEW_POINT to a couple of comments.Kenneth Graunke2016-06-021-1/+1
| | | | | Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
* i965: Make all atoms to track BRW_NEW_BLORP by defaultKenneth Graunke2016-04-231-1/+2
| | | | Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com
* i965/sf_state: Pull flat_enables out of prog_dataJason Ekstrand2016-04-061-19/+2
| | | | | | | | | | Previously, we were walking over the shader source to figure out which inputs should be marked flat. Now, we can just pull it out of prog_data. This is needed for properly setting up 3DSTATE_SF/SBE for Vulkan and it also means that it will get properly cached. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Scissor to the viewport when rendering points/lines.Kenneth Graunke2016-03-181-2/+3
| | | | | | | | | | | | | | We're about to start allowing wide points/lines whose vertices are outside the viewport past the clipper. This scissoring hack ensures that any fragments generated are still restricted to the viewport. It is not necessary on Gen8+ as those platforms already discard fragments which are outside the viewport. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94453 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94454 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
* i965: Move is_drawing_points to brw_state.h.Kenneth Graunke2016-03-181-24/+0
| | | | | | | | | I need to use this in multiple source files. v2: Rebase on TES output domain fix. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
* i965: Account for TES in is_drawing_points().Kenneth Graunke2016-03-181-1/+8
| | | | | | | | | | | | | Now that we implement tessellation shaders, the TES might be the last stage enabled. If it's outputting points, then the primitive type reaching the SF is points. We need to account for this. Caught by Ilia Mirkin. v2: Update dirty bit comment above caller (caught by Iago) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
* i965: Implement ARB_fragment_layer_viewport.Kenneth Graunke2015-10-281-0/+31
| | | | | | | | | | | | | | Normally, we could read gl_Layer from bits 26:16 of R0.0. However, the specification requires that bogus out-of-range 32-bit values written by previous stages need to appear in the fragment shader as-written. Instead, we pass in the full 32-bit value from the VUE header as an extra flat-shaded varying. We have the SF override the value to 0 when the previous stage didn't actually write a value (it's actually defined to return 0). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
* i965: Make calculate_attr_overrides return the URB read offset.Kenneth Graunke2015-10-281-5/+8
| | | | | | | | | | | | Traditionally, we've hardcoded "URB Entry Read Offset" to 1 (which represents 2 vec4 varying slots) to skip over the 8 DWord VUE header. In order to support ARB_fragment_layer_viewport, we'll need to read from that header. This patch adds the basic plumbing necessary to calculate a value dynamically and hook it up in the SBE packets. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
* i965: Use float calculations when double is unnecessary.Matt Turner2015-07-291-1/+1
| | | | | | | | | | | | | | | | | Literals without an f/F suffix are of type double, and implicit conversion rules specify that the float in (float op double) be converted to a double before the operation is performed. I believe float execution was intended (in nearly all cases) or is sufficient (in the case of gen7_urb.c). Removes a lot of float <-> double conversion instructions and replaces many double instructions with float instructions which are cheaper. text data bss dec hex filename 4928659 195160 26192 5150011 4e953b i965_dri.so before 4928315 195152 26192 5149659 4e93db i965_dri.so after Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
* i965: Use _mesa_geometric_ functions appropriatelyKevin Rogovin2015-06-171-1/+2
| | | | | | | | | | | | | | Change references to gl_framebuffer::Width, Height, MaxNumLayers and Visual::samples to use the _mesa_geometry_ convenience functions for those places where the geometry of the gl_framebuffer is needed (in contrast to the geometry of the intersection of the attachments of the gl_framebuffer). This patch is to pave the way to enable GL_ARB_framebuffer_no_attachments on Gen7 and higher in i965. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
* i965/gen8: Fix antialiased line rendering with width < 1.5Iago Toral Quiroga2015-06-111-21/+1
| | | | | | | | | | | The same fix Marius implemented for gen6 (commit a9b04d8a) and gen7 (commit 24ecf37a). Also, we need the same code to handle special cases of line width in gen6, gen7 and now gen8, so put that in the helper function we use to compute the line width. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: do not round line width when multisampling or antialiaing are enabledIago Toral Quiroga2015-06-111-5/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In commit fe74fee8fa721a we rounded the line width to the nearest integer to match the GLES3 spec requirements stated in section 13.4.2.1, but that seems to break a dEQP test that renders wide lines in some multisampling scenarios. Ian noted that the Open 4.4 spec has the following similar text: "The actual width of non-antialiased lines is determined by rounding the supplied width to the nearest integer, then clamping it to the implementation-dependent maximum non-antialiased line width." and suggested that when ES removed antialiased lines, they removed "non-antialised" from that paragraph but probably should not have. Going by that note, this patch restricts the quantization implemented in fe74fee8fa721a only to regular aliased lines. This seems to keep the tests fixed with that commit passing while fixing the broken test. v2: - Drop one of the clamps (Ken, Marius) - Add a rule to prevent advertising line widths that when rounded go beyond the limits allowed by the hardware (Ken) - Update comments in the code accordingly (Ian) - Put the code in a utility function (Ian) Fixes: dEQP-GLES3.functional.rasterization.fbo.rbo_multisample_max.primitives.lines_wide Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90749 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "10.6" <mesa-stable@lists.freedesktop.org>
* i965/aa: fixing anti-aliasing bug for thinnest width lines - GEN6Marius Predut2015-05-051-3/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | On SNB and IVB hw, for 1 pixel line thickness or less, the general anti-aliasing algorithm give up - garbage line is generated. Setting a Line Width of 0.0 specifies the rasterization of the “thinnest” (one-pixel-wide), non-antialiased lines. Lines rendered with zero Line Width are rasterized using Grid Intersection Quantization rules as specified by bspec section 6.3.12.1 Zero-Width (Cosmetic) Line Rasterization. v2: Daniel Stone: Fix = used instead of == in an if-statement. v3: Ian Romanick: Use "._Enabled" flag insteed ".Enabled". Add code comments. re-word wrap the commit message. Add a complete bugzillia list. Improve the hardcoded values to produce better results. v4: Matt Turner: typo fixes and adjust <= 1.49 to become < 1.5 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=28832 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=9951 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=27007 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60797 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=15006 Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Marius Predut <marius.predut@intel.com>
* i965: Implement support for ARB_clip_control.Mathias Fröhlich2015-04-051-1/+1
| | | | | | | | | | | | | Switch between the two clip space definitions already available in hardware. Update winding order dependent state according to the clip control state. This change did not introduce new piglit quick.test regressions on an Ivybridge Mobile and a GM45 Express chipset. Also it enables and passes the clip-control and clip-control-depth-precision tests on these two chipsets. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>
* i965: Fix non-AA wide line rendering with fractional line widthsIago Toral Quiroga2015-02-241-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | "(...)Let w be the width rounded to the nearest integer (...). If the line segment has endpoints given by (x0,y0) and (x1,y1) in window coordinates, the segment with endpoints (x0,y0-(w-1)/2) and (x1,y1-(w-1/2)) is rasterized, (...)" The hardware it not rounding the line width, so we should do it. Also, we should be careful not to go beyond the hardware limits for the line width after it gets rounded. Gen6-7 define a maximum line width slightly below 8.0, so we should advertise a maximum line width lower than 7.5 to make sure that 7.0 is the maximum integer line width that we can select. Since the line width granularity in these platforms is 0.125, we choose 7.375. Other platforms advertise rounded maximum line widths, so those are fine. Fixes the following 3 dEQP tests: dEQP-GLES3.functional.rasterization.primitives.lines_wide dEQP-GLES3.functional.rasterization.fbo.texture_2d.primitives.lines_wide dEQP-GLES3.functional.rasterization.fbo.rbo_singlesample.primitives.lines_wide Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965/gen6+: enable EXT_polygon_offset_clampIlia Mirkin2015-02-021-1/+1
| | | | | | | Replace the hard-coded 0's with the context clamp value. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965/Gen6-7: Fix point sprites with PolygonMode(GL_POINT)Chris Forbes2014-12-071-0/+6
| | | | | | | | | | | | | | | | | This was an oversight in the original patch. When PolygonMode is used, then front faces, back faces, or both may be rendered as points and are affected by point sprite state. Note that SNB/IVB can't actually be fully conformant here, for a legacy context -- we don't have separate sets of pointsprite enables for front and back faces. Haswell ignores pointsprite state correctly in hardware for non-point rasterization, so can do this correctly, but it doesn't seem worth it. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Cc: "10.4" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86764 Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965: Move BRW_NEW_*_PROG_DATA flags to .brw (not .cache).Kenneth Graunke2014-12-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | I put the BRW_NEW_*_PROG_DATA flags at the beginning so that brw_state_cache.c can still continue using 1 << brw_cache_id. I also added a comment explaining the difference between BRW_NEW_*_PROG_DATA and BRW_NEW_*_PROGRAM, as it took me a long time to remember it. Non-mechanical changes: - brw_state_cache.c and brw_ff_gs.c now signal .brw, not .cache. - brw_state_upload.c - INTEL_DEBUG=state changes. - brw_context.h - bit definition merging. v2: Correct the explanation of BRW_NEW_*_PROG_DATA to mention state-based recompiles, and nix the "proper subset" claim, as it's false. (Caught by Kristian Høgsberg). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965: Rename CACHE_NEW_*_PROG to BRW_NEW_*_PROG_DATA.Kenneth Graunke2014-12-021-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that we've moved a bunch of CACHE_NEW_* bits to BRW_NEW_*, the only ones that are left are legitimately related to the program cache. Yet, it seems a bit wasteful to have an entire bitfield for only 7 bits. State upload is one of the hottest paths in the driver. For each atom in the list, we call check_state() to see if it needs to be emitted. Currently, this involves comparing three separate bitfields (mesa, brw, and cache). Consolidating the brw and cache bitfields would save a small amount of CPU overhead per atom. Broadwell, for example, has 57 state atoms, so this small savings can add up. CACHE_NEW_*_PROG covers the brw_*_prog_data structures, as well as the offset into the program cache BO (prog_offset). Since most uses refer to brw_*_prog_data, I decided to use BRW_NEW_*_PROG_DATA as the name. Removing "cache" completely is a bit painful, so I decided to do it in several patches for easier review, and to separate mechanical changes from manual ones. This one simply renames things, and was made via: $ for file in *.[ch]; do sed -i -e 's/CACHE_NEW_\([A-Z_\*]*\)_PROG/BRW_NEW_\1_PROG_DATA/g' \ -e 's/BRW_NEW_WM_PROG_DATA/BRW_NEW_FS_PROG_DATA/g' $file done Note that BRW_NEW_*_PROG_DATA is still in .cache, not .brw! The next patch will remedy this flaw. It will also fix the alphabetization issues. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Acked-by: Matt Turner <mattst88@gmail.com>
* i965: Alphabetize brw_tracked_state flags and use a consistent style.Kenneth Graunke2014-11-291-13/+13
| | | | | | | | | | | | | | | | Most of the dirty flags were listed in some arbitrary order. Some used bonus parenthesis. Some put multiple flags on one line, others put one per line. Some used tabs instead of spaces...but only on some lines. This patch settles on one flag per line, in alphabetical order, using spaces instead of tabs, and sheds the unnecessary parentheses. Sorting was mostly done with vim's visual block feature and !sort, although I alphabetized short lists by hand; it was pretty manual. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965/Gen6-7: Do not replace texcoords with point coord if not drawing pointsChris Forbes2014-11-251-11/+46
| | | | | | | | | | | | | | | | | | | | | | Fixes broken rendering in Windows-based QtQuick2 apps run through Wine. This library sets all texture units' GL_COORD_REPLACE, leaves point sprite mode enabled, and then draws a triangle fan. Will need a slightly different fix for Gen4-5, but I don't have my old machines in a usable state currently. V2: - Simplify patch -- the real changes are no longer duplicated across the Gen6 and Gen7 atoms. - Also don't clobber attr overrides -- which matters on Haswell too, and fixes the other half of the problem - Fix newly-introduced warnings V3: - Use BRW_NEW_GEOMETRY_PROGRAM and brw->geometry_program rather than core flag and state; keep the state flags in order. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Cc: "10.4" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84651 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Use ctx->Const.MaxLineWidth when clamping ctx->Line.Width.Kenneth Graunke2014-11-081-1/+2
| | | | | | | | | | | | Rather than hardcoding platform values in every code path, just use the maximum value we set. Currently, ctx->Const.LineWidth == 5, which is smaller than the hardware limit. But applications shouldn't be using a value larger than we support anyway. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
* i965: Add missing /* BRW_NEW_FRAGMENT_PROGRAM */ comments.Kenneth Graunke2014-10-011-2/+3
| | | | | | | | | I had to dig a bit to figure out why this was necessary. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
* i965: Add context flag to disable the viewport transformKristian Høgsberg2014-08-151-2/+3
| | | | | | | | This lets us disable the viewport transform, which will be useful for emitting 3DPRIM_RECTLIST. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Use unreachable() instead of unconditional assert().Matt Turner2014-07-011-6/+3
| | | | Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
* mesa: Update gl_scissor_attrib to support ARB_viewport_arrayCourtney Goeltzenleuchter2014-01-201-1/+1
| | | | | | | | | | | | | | | | | | Update Mesa and drivers to access updated gl_scissor_attrib. Now have an enable bitfield and array of gl_scissor_rects. Drivers have been updated to the new scissor enable state attribute (gl_context.scissor.EnableFlags) but still treat it as a single boolean which is okay as mesa will only use bit 0 when communicating with a driver that does not support ARB_viewport_array. v2 (idr): Rebase fixes. v3 (idr): Small code formatting fix suggsted by Ken. Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Use {point_sprite,flat}_enable variable names instead of dw*.Kenneth Graunke2013-12-201-5/+7
| | | | | | | | | | | Calling the local variables flat_enable and point_sprite_enable is clearer than dw16 and such. It also matches the names used in calculate_attr_overrides, which computes them. v2: Add /* dw16 */ and /* dw10 */ comments, requested by Jordan. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
* i965: Zero out {point_sprite,flat}_enables in calculate_attr_overrides.Kenneth Graunke2013-12-201-2/+3
| | | | | | | | | | | | | | | calculate_attr_overrides is responsible for computing the point sprite and flat-shading enable bitfields. It does so by OR'ing in a bunch of bits. However, it relied on the caller to set the initial value to zero. This is pretty fragile - if the caller neglects to zero out those variables, then the enable bitfields end up full of garbage, which shows up as random things being flat-shaded. This patch moves the zero-initialization into calculate_attr_overrides, so that the computation is completely in one place. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
* i965: Make fs gl_PrimitiveID input work even when there's no gs.Paul Berry2013-10-271-5/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a geometry shader is present, the fragment shader gl_PrimitiveID input acts like an ordinary varying, receiving data from the gs gl_PrimitiveID output. When there's no geometry shader, we have to ask the fixed function SF hardware to provide the primitive ID to the fragment shader instead. Previously, the SF setup code would handle this situation by recognizing that the FS gl_PrimitiveID input didn't match to any VS output; since normally an FS input with no corresponding VS output leads to undefined data, the SF setup code used to just arbitrarily assign it to receive data from attribute 0. This patch changes the SF setup code so that instead of arbitrarily using attribute 0, it assigns the unmatched FS input to receive gl_PrimitiveID. In the case where the FS input really is gl_PrimitiveID, this produces the intended result. In all other cases, no harm is done since GL specifies that the behaviour is undefined. Fixes piglit test primitive-id-no-gs. v2: If an attribute is already being overridden with point coordinates, don't try to also override it with gl_PrimitiveID. This is necessary to avoid regressing piglit tests such as shaders/glsl-fs-pointcoord. Reviewed-by: Eric Anholt <eric@anholt.net>
* i965/fs: Stop wasting input attribute space on gl_FragCoord and gl_FrontFacing.Paul Berry2013-09-161-8/+0
| | | | | | | | | | | | | | | | | | Previously, if a fragment shader accessed gl_FragCoord or gl_FrontFacing, we would assign them their own slots in the fragment shader input attribute array, using up space that could be made available to real varyings. This was not strictly necessary (since these values are not true varyings, and are instead computed from other data available in the FS payload). But we had to do it anyway because the SF/SBE setup code assumed that every 1 bit in the gl_program::InputsRead bitfield corresponded to a genuine varying variable. Now that the SF/SBE code consults brw_wm_prog_data and only sets up the attributes that the fragment shader actually needs, we don't have to do this anymore. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965/sf: Consult brw_wm_prog_data when setting up SF/SBE state.Paul Berry2013-09-161-18/+27
| | | | | | | | | | | | | | | | | | | | Previously, the SF/SBE setup code delivered varying inputs to the FS in the order in which they appear in the gl_program::InputsRead bitfield, since that's what the FS expects. When we add support for more than 64 varying components, this will no longer always be the case, because the Gen6+ SF/SBE stage is only capable of performing arbitrary reorderings of 16 varying slots. So, when there are more than 16 vec4's worth of varying inputs, the FS will have to adjust the order its input varyings in order to partially match the order of outputs from the geometry or vertex shader. To allow extra flexibility in the ordering of FS varyings, this patch causes the SF/SBE to deliver varying inputs to the FS in exactly the order that the FS requests, by consulting brw_wm_prog_data::urb_setup and brw_wm_prog_data::num_varying_inputs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965/sf: Consolidate common code for setting up gen6-7 attribute overrides.Paul Berry2013-09-161-66/+87
| | | | Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965/sf: Use BRW_SF_URB_ENTRY_READ_OFFSET rather than hardcoded values.Paul Berry2013-09-161-1/+1
| | | | | | | | | We always program the SF unit to start reading the vertex URB entry at offset 1. In upcoming patches, we'll be adding FS code that relies on this. So consistently use the constant BRW_SF_URB_ENTRY_READ_OFFSET rather than hardcoding a 1. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Delete intel_context entirely.Kenneth Graunke2013-07-091-2/+1
| | | | | | | | | | This makes brw_context inherit directly from gl_context; that was the only thing left in intel_context. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
* i965: Shorten context base class dereference chains.Kenneth Graunke2013-07-091-1/+1
| | | | | | | | | ctx->DrawBuffer is much more sensible than brw->intel.ctx.DrawBuffer. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
* i965: Use brw.vue_map_geom_out instead of VS output VUE map where appropriate.Paul Berry2013-03-241-5/+5
| | | | | | | | | | | | | This patch modifies post-GS pipeline stages (transform feedback, clip, sf, fs) to refer to the VUE map through brw->vue_map_geom_out rather than brw->vs.prog_data->vue_map. This ensures that when geometry shader support is added, these pipeline stages will consult the geometry shader output VUE map when appropriate, rather than the vertex shader output VUE map. v2: Fixed some stale "CACHE_NEW_VS_PROG" comments. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Clarify nomenclature: vert_result -> varyingPaul Berry2013-03-231-7/+7
| | | | | | | | | | | | | | | | | | This patch removes the terminology "vert_result" from the i965 driver, replacing it with "varying". The old terminology, "vert_result", was confusing because (a) it referred to the enum gl_vert_result, which no longer exists (it was replaced with gl_varying_slot), and (b) it implied a vertex output, but with the advent of geometry shaders, it could be either a vertex or a geometry output, depending what shaders are in use. The generic term "varying" is less confusing. No functional change. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> v2: Whitespace fixes.
* Replace gl_frag_attrib enum with gl_varying_slot.Paul Berry2013-03-151-8/+8
| | | | | | | | | | | | This patch makes the following search-and-replace changes: gl_frag_attrib -> gl_varying_slot FRAG_ATTRIB_* -> VARYING_SLOT_* FRAG_BIT_* -> VARYING_BIT_* Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Brian Paul <brianp@vmware.com>
* Get rid of _mesa_frag_attrib_to_vert_result().Paul Berry2013-03-151-8/+7
| | | | | | | | | | Now that there is no difference between the enums that represent vertex outputs and fragment inputs, there's no need for a conversion function. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Brian Paul <brianp@vmware.com>
* Replace gl_vert_result enum with gl_varying_slot.Paul Berry2013-03-151-9/+9
| | | | | | | | | | | This patch makes the following search-and-replace changes: gl_vert_result -> gl_varying_slot VERT_RESULT_* -> VARYING_SLOT_* Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Brian Paul <brianp@vmware.com>
* i965: Consign COORD_REPLACE VS hacks to Pre-Gen6.Paul Berry2013-02-201-1/+12
| | | | | | | | | | | | | | | | | | | | | | | | | Pre-Gen6, the SF thread requires exact matching between VS output slots (aka VUE slots) and FS input slots, even when the corresponding VS output slot is unused due to being overwritten by point coordinate replacement (glTexEnvi(GL_POINT_SPRITE, GL_COORD_REPLACE, GL_TRUE)). As a result, we have a special hack in the VS to ensure when any texture coordinate is subject to point coordinate replacement, it is always allocated space in the VUE, even if it isn't written to by the VS. This hack isn't needed from Gen6 onwards, since SF (Gen7: SBE) swizzling has the ability to insert the point coordinate into gl_TexCoord[] without needing a corresponding unused VUE slot. Note that no modification of SF setup code is required for this patch--get_attr_override() already does the right thing. However, we make a slight comment change to clarify why this works. In addition to eliminating unnecessary VS recompiles and saving precious URB space on Gen6+, this will save us the trouble of having to adjust this hack when we implement geometry shaders. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Fix the SF Vertex URB Read Length calculation for Sandybridge.Kenneth Graunke2013-02-031-16/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (This commit message was primarily written by Paul Berry, who explained what's going on far better than I would have.) Previous to this patch, we thought that the only restrictions on 3DSTATE_SF's URB read length were (a) it needs to be large enough to read all the VUE data that the SF needs, and (b) it can't be so large that it tries to read VUE data that doesn't exist. Since the VUE map already tells us how much VUE data exists, we didn't bother worrying about restriction (a); we just did the easy thing and programmed the read length to satisfy restriction (b). However, we didn't notice this erratum in the hardware docs: "[errata] Corruption/Hang possible if length programmed larger than recommended". Judging by the context surrounding this erratum, it's pretty clear that it means "URB read length must be exactly the size necessary to read all the VUE data that the SF needs, and no larger". Which means that we can't program the read length based on restriction (b)--we have to program it based on restriction (a). The URB read size needs to precisely match the amount of data that the SF consumes; it doesn't work to simply base it on the size of the VUE. Thankfully, the PRM contains the precise formula the hardware expects. Fixes random UI corruption in Steam's "Big Picture Mode", random terrain corruption in PlaneShift, and Piglit's fbo-5-varyings test. NOTE: This is a candidate for all stable branches. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56920 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60172 Tested-by: Jordan Justen <jordan.l.justen@intel.com> (v1/Piglit) Tested-by: Martin Steigerwald <martin@lichtvoll.de> (PlaneShift) Reviewed-by: Paul Berry <stereotype441@gmail.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Compute the maximum SF source attribute.Kenneth Graunke2013-02-031-2/+8
| | | | | | | | | | The maximum SF source attribute is necessary to compute the Vertex URB read length properly, which will be done in the next commit. NOTE: This is a candidate for all stable branches. Reviewed-by: Paul Berry <stereotype441@gmail.com> Tested-by: Martin Steigerwald <martin@lichtvoll.de> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>