| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
| |
This is a step towards dropping the GLSL IR version of
do_set_program_inouts() in i965 and moving towards native nir support.
This is important because we want to eventually convert to nir and
use its optimisations passes before we can call this GLSL IR pass.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Just say no to:
- brw->wm.base.prog_data = &brw->wm.prog_data->base.base;
We'll just use the brw_stage_prog_data pointer in brw_stage_state
and downcast it to brw_wm_prog_data as needed.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arcero@collabora.com>
|
|
|
|
|
|
|
|
| |
State upload code should use prog_data rather than poking at core
Mesa shader data structures wherever possible.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
|
|
|
|
|
| |
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Our previous code worked for desktop GL, and ES without geometry or
tessellation shaders. But those features require fancier point size
handling. Fortunately, we can use one rule for all APIs.
Fixes a number of dEQP tests with EXT_tessellation_shader enabled:
dEQP-GLES31.functional.tessellation_geometry_interaction.point_size.*
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
|
|
|
|
|
|
|
|
| |
It used to be called like that and fits better with 80 columns.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
|
|
|
|
|
|
|
|
| |
Switch over to use the CoordsReplaceBits bitmask.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
|
|
|
|
|
| |
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com
|
|
|
|
|
|
|
|
|
|
| |
Previously, we were walking over the shader source to figure out which
inputs should be marked flat. Now, we can just pull it out of prog_data.
This is needed for properly setting up 3DSTATE_SF/SBE for Vulkan and it
also means that it will get properly cached.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We're about to start allowing wide points/lines whose vertices are
outside the viewport past the clipper. This scissoring hack ensures
that any fragments generated are still restricted to the viewport.
It is not necessary on Gen8+ as those platforms already discard
fragments which are outside the viewport.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94453
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94454
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
|
|
|
|
|
|
|
| |
I need to use this in multiple source files.
v2: Rebase on TES output domain fix.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that we implement tessellation shaders, the TES might be the last
stage enabled. If it's outputting points, then the primitive type
reaching the SF is points. We need to account for this.
Caught by Ilia Mirkin.
v2: Update dirty bit comment above caller (caught by Iago)
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Normally, we could read gl_Layer from bits 26:16 of R0.0. However, the
specification requires that bogus out-of-range 32-bit values written by
previous stages need to appear in the fragment shader as-written.
Instead, we pass in the full 32-bit value from the VUE header as an
extra flat-shaded varying. We have the SF override the value to 0
when the previous stage didn't actually write a value (it's actually
defined to return 0).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Traditionally, we've hardcoded "URB Entry Read Offset" to 1 (which
represents 2 vec4 varying slots) to skip over the 8 DWord VUE header.
In order to support ARB_fragment_layer_viewport, we'll need to read
from that header. This patch adds the basic plumbing necessary to
calculate a value dynamically and hook it up in the SBE packets.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Literals without an f/F suffix are of type double, and implicit
conversion rules specify that the float in (float op double) be
converted to a double before the operation is performed. I believe float
execution was intended (in nearly all cases) or is sufficient (in the
case of gen7_urb.c).
Removes a lot of float <-> double conversion instructions and replaces
many double instructions with float instructions which are cheaper.
text data bss dec hex filename
4928659 195160 26192 5150011 4e953b i965_dri.so before
4928315 195152 26192 5149659 4e93db i965_dri.so after
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change references to gl_framebuffer::Width, Height, MaxNumLayers
and Visual::samples to use the _mesa_geometry_ convenience functions
for those places where the geometry of the gl_framebuffer is needed
(in contrast to the geometry of the intersection of the attachments
of the gl_framebuffer).
This patch is to pave the way to enable GL_ARB_framebuffer_no_attachments
on Gen7 and higher in i965.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
|
|
|
|
|
|
|
|
|
|
|
| |
The same fix Marius implemented for gen6 (commit a9b04d8a) and
gen7 (commit 24ecf37a).
Also, we need the same code to handle special cases of line width
in gen6, gen7 and now gen8, so put that in the helper function
we use to compute the line width.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In commit fe74fee8fa721a we rounded the line width to the nearest integer to
match the GLES3 spec requirements stated in section 13.4.2.1, but that seems
to break a dEQP test that renders wide lines in some multisampling scenarios.
Ian noted that the Open 4.4 spec has the following similar text:
"The actual width of non-antialiased lines is determined by rounding the
supplied width to the nearest integer, then clamping it to the
implementation-dependent maximum non-antialiased line width."
and suggested that when ES removed antialiased lines, they removed
"non-antialised" from that paragraph but probably should not have.
Going by that note, this patch restricts the quantization implemented in
fe74fee8fa721a only to regular aliased lines. This seems to keep the
tests fixed with that commit passing while fixing the broken test.
v2:
- Drop one of the clamps (Ken, Marius)
- Add a rule to prevent advertising line widths that when rounded go beyond
the limits allowed by the hardware (Ken)
- Update comments in the code accordingly (Ian)
- Put the code in a utility function (Ian)
Fixes:
dEQP-GLES3.functional.rasterization.fbo.rbo_multisample_max.primitives.lines_wide
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90749
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On SNB and IVB hw, for 1 pixel line thickness or less,
the general anti-aliasing algorithm give up - garbage line is generated.
Setting a Line Width of 0.0 specifies the rasterization of
the “thinnest” (one-pixel-wide), non-antialiased lines.
Lines rendered with zero Line Width are rasterized using
Grid Intersection Quantization rules as specified
by bspec section 6.3.12.1 Zero-Width (Cosmetic) Line Rasterization.
v2: Daniel Stone: Fix = used instead of == in an if-statement.
v3: Ian Romanick: Use "._Enabled" flag insteed ".Enabled".
Add code comments. re-word wrap the commit message.
Add a complete bugzillia list.
Improve the hardcoded values to produce better results.
v4: Matt Turner: typo fixes and adjust <= 1.49 to become < 1.5
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=28832
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=9951
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=27007
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60797
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=15006
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Marius Predut <marius.predut@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Switch between the two clip space definitions already available
in hardware. Update winding order dependent state according
to the clip control state.
This change did not introduce new piglit quick.test regressions on
an Ivybridge Mobile and a GM45 Express chipset.
Also it enables and passes the clip-control and clip-control-depth-precision
tests on these two chipsets.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
"(...)Let w be the width rounded to the nearest integer (...). If the
line segment has endpoints given by (x0,y0) and (x1,y1) in window
coordinates, the segment with endpoints (x0,y0-(w-1)/2) and
(x1,y1-(w-1/2)) is rasterized, (...)"
The hardware it not rounding the line width, so we should do it.
Also, we should be careful not to go beyond the hardware limits
for the line width after it gets rounded. Gen6-7 define a maximum line
width slightly below 8.0, so we should advertise a maximum line
width lower than 7.5 to make sure that 7.0 is the maximum integer
line width that we can select. Since the line width granularity in these
platforms is 0.125, we choose 7.375. Other platforms advertise rounded
maximum line widths, so those are fine.
Fixes the following 3 dEQP tests:
dEQP-GLES3.functional.rasterization.primitives.lines_wide
dEQP-GLES3.functional.rasterization.fbo.texture_2d.primitives.lines_wide
dEQP-GLES3.functional.rasterization.fbo.rbo_singlesample.primitives.lines_wide
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
| |
Replace the hard-coded 0's with the context clamp value.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was an oversight in the original patch. When PolygonMode is
used, then front faces, back faces, or both may be rendered as
points and are affected by point sprite state.
Note that SNB/IVB can't actually be fully conformant here, for
a legacy context -- we don't have separate sets of pointsprite
enables for front and back faces. Haswell ignores pointsprite
state correctly in hardware for non-point rasterization, so can
do this correctly, but it doesn't seem worth it.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86764
Reviewed-by: Matt Turner <mattst88@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I put the BRW_NEW_*_PROG_DATA flags at the beginning so that
brw_state_cache.c can still continue using 1 << brw_cache_id.
I also added a comment explaining the difference between
BRW_NEW_*_PROG_DATA and BRW_NEW_*_PROGRAM, as it took me a long time
to remember it.
Non-mechanical changes:
- brw_state_cache.c and brw_ff_gs.c now signal .brw, not .cache.
- brw_state_upload.c - INTEL_DEBUG=state changes.
- brw_context.h - bit definition merging.
v2: Correct the explanation of BRW_NEW_*_PROG_DATA to mention
state-based recompiles, and nix the "proper subset" claim,
as it's false. (Caught by Kristian Høgsberg).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that we've moved a bunch of CACHE_NEW_* bits to BRW_NEW_*, the only
ones that are left are legitimately related to the program cache. Yet,
it seems a bit wasteful to have an entire bitfield for only 7 bits.
State upload is one of the hottest paths in the driver. For each atom
in the list, we call check_state() to see if it needs to be emitted.
Currently, this involves comparing three separate bitfields (mesa, brw,
and cache). Consolidating the brw and cache bitfields would save a
small amount of CPU overhead per atom. Broadwell, for example, has
57 state atoms, so this small savings can add up.
CACHE_NEW_*_PROG covers the brw_*_prog_data structures, as well as the
offset into the program cache BO (prog_offset). Since most uses refer
to brw_*_prog_data, I decided to use BRW_NEW_*_PROG_DATA as the name.
Removing "cache" completely is a bit painful, so I decided to do it in
several patches for easier review, and to separate mechanical changes
from manual ones. This one simply renames things, and was made via:
$ for file in *.[ch]; do
sed -i -e 's/CACHE_NEW_\([A-Z_\*]*\)_PROG/BRW_NEW_\1_PROG_DATA/g' \
-e 's/BRW_NEW_WM_PROG_DATA/BRW_NEW_FS_PROG_DATA/g' $file
done
Note that BRW_NEW_*_PROG_DATA is still in .cache, not .brw!
The next patch will remedy this flaw. It will also fix the
alphabetization issues.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Acked-by: Matt Turner <mattst88@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Most of the dirty flags were listed in some arbitrary order. Some used
bonus parenthesis. Some put multiple flags on one line, others put one
per line. Some used tabs instead of spaces...but only on some lines.
This patch settles on one flag per line, in alphabetical order, using
spaces instead of tabs, and sheds the unnecessary parentheses.
Sorting was mostly done with vim's visual block feature and !sort,
although I alphabetized short lists by hand; it was pretty manual.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes broken rendering in Windows-based QtQuick2 apps run through Wine.
This library sets all texture units' GL_COORD_REPLACE, leaves point
sprite mode enabled, and then draws a triangle fan.
Will need a slightly different fix for Gen4-5, but I don't have my old
machines in a usable state currently.
V2: - Simplify patch -- the real changes are no longer duplicated across
the Gen6 and Gen7 atoms.
- Also don't clobber attr overrides -- which matters on Haswell too,
and fixes the other half of the problem
- Fix newly-introduced warnings
V3: - Use BRW_NEW_GEOMETRY_PROGRAM and brw->geometry_program rather than
core flag and state; keep the state flags in order.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84651
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rather than hardcoding platform values in every code path, just use the
maximum value we set.
Currently, ctx->Const.LineWidth == 5, which is smaller than the hardware
limit. But applications shouldn't be using a value larger than we
support anyway.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
|
|
|
|
|
|
|
|
|
| |
I had to dig a bit to figure out why this was necessary.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
|
|
|
|
|
|
|
|
| |
This lets us disable the viewport transform, which will be useful for
emitting 3DPRIM_RECTLIST.
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
| |
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Update Mesa and drivers to access updated gl_scissor_attrib.
Now have an enable bitfield and array of gl_scissor_rects.
Drivers have been updated to the new scissor enable state
attribute (gl_context.scissor.EnableFlags) but still treat it
as a single boolean which is okay as mesa will only use
bit 0 when communicating with a driver that does not support
ARB_viewport_array.
v2 (idr): Rebase fixes.
v3 (idr): Small code formatting fix suggsted by Ken.
Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
|
| |
Calling the local variables flat_enable and point_sprite_enable is
clearer than dw16 and such. It also matches the names used in
calculate_attr_overrides, which computes them.
v2: Add /* dw16 */ and /* dw10 */ comments, requested by Jordan.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
calculate_attr_overrides is responsible for computing the point sprite
and flat-shading enable bitfields. It does so by OR'ing in a bunch of
bits. However, it relied on the caller to set the initial value to
zero. This is pretty fragile - if the caller neglects to zero out those
variables, then the enable bitfields end up full of garbage, which shows
up as random things being flat-shaded.
This patch moves the zero-initialization into calculate_attr_overrides,
so that the computation is completely in one place.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a geometry shader is present, the fragment shader gl_PrimitiveID
input acts like an ordinary varying, receiving data from the gs
gl_PrimitiveID output. When there's no geometry shader, we have to
ask the fixed function SF hardware to provide the primitive ID to the
fragment shader instead.
Previously, the SF setup code would handle this situation by
recognizing that the FS gl_PrimitiveID input didn't match to any VS
output; since normally an FS input with no corresponding VS output
leads to undefined data, the SF setup code used to just arbitrarily
assign it to receive data from attribute 0.
This patch changes the SF setup code so that instead of arbitrarily
using attribute 0, it assigns the unmatched FS input to receive
gl_PrimitiveID. In the case where the FS input really is
gl_PrimitiveID, this produces the intended result. In all other
cases, no harm is done since GL specifies that the behaviour is
undefined.
Fixes piglit test primitive-id-no-gs.
v2: If an attribute is already being overridden with point
coordinates, don't try to also override it with gl_PrimitiveID. This
is necessary to avoid regressing piglit tests such as
shaders/glsl-fs-pointcoord.
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, if a fragment shader accessed gl_FragCoord or
gl_FrontFacing, we would assign them their own slots in the fragment
shader input attribute array, using up space that could be made
available to real varyings. This was not strictly necessary (since
these values are not true varyings, and are instead computed from
other data available in the FS payload). But we had to do it anyway
because the SF/SBE setup code assumed that every 1 bit in the
gl_program::InputsRead bitfield corresponded to a genuine varying
variable.
Now that the SF/SBE code consults brw_wm_prog_data and only sets up
the attributes that the fragment shader actually needs, we don't have
to do this anymore.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, the SF/SBE setup code delivered varying inputs to the FS
in the order in which they appear in the gl_program::InputsRead
bitfield, since that's what the FS expects.
When we add support for more than 64 varying components, this will no
longer always be the case, because the Gen6+ SF/SBE stage is only
capable of performing arbitrary reorderings of 16 varying slots. So,
when there are more than 16 vec4's worth of varying inputs, the FS
will have to adjust the order its input varyings in order to partially
match the order of outputs from the geometry or vertex shader.
To allow extra flexibility in the ordering of FS varyings, this patch
causes the SF/SBE to deliver varying inputs to the FS in exactly the
order that the FS requests, by consulting brw_wm_prog_data::urb_setup
and brw_wm_prog_data::num_varying_inputs.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
| |
We always program the SF unit to start reading the vertex URB entry at
offset 1. In upcoming patches, we'll be adding FS code that relies on
this. So consistently use the constant BRW_SF_URB_ENTRY_READ_OFFSET
rather than hardcoding a 1.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
| |
This makes brw_context inherit directly from gl_context; that was the
only thing left in intel_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
|
|
|
|
|
|
|
|
|
| |
ctx->DrawBuffer is much more sensible than brw->intel.ctx.DrawBuffer.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch modifies post-GS pipeline stages (transform feedback, clip,
sf, fs) to refer to the VUE map through brw->vue_map_geom_out rather
than brw->vs.prog_data->vue_map. This ensures that when geometry
shader support is added, these pipeline stages will consult the
geometry shader output VUE map when appropriate, rather than the
vertex shader output VUE map.
v2: Fixed some stale "CACHE_NEW_VS_PROG" comments.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch removes the terminology "vert_result" from the i965 driver,
replacing it with "varying". The old terminology, "vert_result", was
confusing because (a) it referred to the enum gl_vert_result, which no
longer exists (it was replaced with gl_varying_slot), and (b) it
implied a vertex output, but with the advent of geometry shaders, it
could be either a vertex or a geometry output, depending what shaders
are in use. The generic term "varying" is less confusing.
No functional change.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
v2: Whitespace fixes.
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch makes the following search-and-replace changes:
gl_frag_attrib -> gl_varying_slot
FRAG_ATTRIB_* -> VARYING_SLOT_*
FRAG_BIT_* -> VARYING_BIT_*
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
|
|
|
|
|
|
|
|
|
|
| |
Now that there is no difference between the enums that represent
vertex outputs and fragment inputs, there's no need for a conversion
function.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
|
|
|
|
|
|
|
|
|
|
|
| |
This patch makes the following search-and-replace changes:
gl_vert_result -> gl_varying_slot
VERT_RESULT_* -> VARYING_SLOT_*
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Pre-Gen6, the SF thread requires exact matching between VS output
slots (aka VUE slots) and FS input slots, even when the corresponding
VS output slot is unused due to being overwritten by point coordinate
replacement (glTexEnvi(GL_POINT_SPRITE, GL_COORD_REPLACE, GL_TRUE)).
As a result, we have a special hack in the VS to ensure when any
texture coordinate is subject to point coordinate replacement, it is
always allocated space in the VUE, even if it isn't written to by the
VS.
This hack isn't needed from Gen6 onwards, since SF (Gen7: SBE)
swizzling has the ability to insert the point coordinate into
gl_TexCoord[] without needing a corresponding unused VUE slot.
Note that no modification of SF setup code is required for this
patch--get_attr_override() already does the right thing. However, we
make a slight comment change to clarify why this works.
In addition to eliminating unnecessary VS recompiles and saving
precious URB space on Gen6+, this will save us the trouble of having
to adjust this hack when we implement geometry shaders.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(This commit message was primarily written by Paul Berry, who explained
what's going on far better than I would have.)
Previous to this patch, we thought that the only restrictions on
3DSTATE_SF's URB read length were (a) it needs to be large enough to
read all the VUE data that the SF needs, and (b) it can't be so large
that it tries to read VUE data that doesn't exist. Since the VUE map
already tells us how much VUE data exists, we didn't bother worrying
about restriction (a); we just did the easy thing and programmed the
read length to satisfy restriction (b).
However, we didn't notice this erratum in the hardware docs: "[errata]
Corruption/Hang possible if length programmed larger than recommended".
Judging by the context surrounding this erratum, it's pretty clear that
it means "URB read length must be exactly the size necessary to read all
the VUE data that the SF needs, and no larger". Which means that we
can't program the read length based on restriction (b)--we have to
program it based on restriction (a).
The URB read size needs to precisely match the amount of data that the
SF consumes; it doesn't work to simply base it on the size of the VUE.
Thankfully, the PRM contains the precise formula the hardware expects.
Fixes random UI corruption in Steam's "Big Picture Mode", random terrain
corruption in PlaneShift, and Piglit's fbo-5-varyings test.
NOTE: This is a candidate for all stable branches.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56920
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60172
Tested-by: Jordan Justen <jordan.l.justen@intel.com> (v1/Piglit)
Tested-by: Martin Steigerwald <martin@lichtvoll.de> (PlaneShift)
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
| |
The maximum SF source attribute is necessary to compute the Vertex URB
read length properly, which will be done in the next commit.
NOTE: This is a candidate for all stable branches.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Tested-by: Martin Steigerwald <martin@lichtvoll.de>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
|