| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
Just say no to:
- brw->wm.base.prog_data = &brw->wm.prog_data->base.base;
We'll just use the brw_stage_prog_data pointer in brw_stage_state
and downcast it to brw_wm_prog_data as needed.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arcero@collabora.com>
|
|
|
|
|
|
|
|
| |
State upload code should use prog_data rather than poking at core
Mesa shader data structures wherever possible.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
|
|
|
|
|
|
|
|
|
|
| |
calculate_attr_overrides() uses is_drawing_points(), which depends
on tessellation and geometry program state, as well as polygon state.
v2: Add missing _NEW_POLYGON as well. Caught by Iago Toral.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
|
|
|
|
|
| |
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Our previous code worked for desktop GL, and ES without geometry or
tessellation shaders. But those features require fancier point size
handling. Fortunately, we can use one rule for all APIs.
Fixes a number of dEQP tests with EXT_tessellation_shader enabled:
dEQP-GLES31.functional.tessellation_geometry_interaction.point_size.*
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
|
|
|
|
|
| |
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com
|
|
|
|
|
|
|
|
|
|
| |
Previously, we were walking over the shader source to figure out which
inputs should be marked flat. Now, we can just pull it out of prog_data.
This is needed for properly setting up 3DSTATE_SF/SBE for Vulkan and it
also means that it will get properly cached.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We're about to start allowing wide points/lines whose vertices are
outside the viewport past the clipper. This scissoring hack ensures
that any fragments generated are still restricted to the viewport.
It is not necessary on Gen8+ as those platforms already discard
fragments which are outside the viewport.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94453
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94454
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Traditionally, we've hardcoded "URB Entry Read Offset" to 1 (which
represents 2 vec4 varying slots) to skip over the 8 DWord VUE header.
In order to support ARB_fragment_layer_viewport, we'll need to read
from that header. This patch adds the basic plumbing necessary to
calculate a value dynamically and hook it up in the SBE packets.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Literals without an f/F suffix are of type double, and implicit
conversion rules specify that the float in (float op double) be
converted to a double before the operation is performed. I believe float
execution was intended (in nearly all cases) or is sufficient (in the
case of gen7_urb.c).
Removes a lot of float <-> double conversion instructions and replaces
many double instructions with float instructions which are cheaper.
text data bss dec hex filename
4928659 195160 26192 5150011 4e953b i965_dri.so before
4928315 195152 26192 5149659 4e93db i965_dri.so after
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change references to gl_framebuffer::Width, Height, MaxNumLayers
and Visual::samples to use the _mesa_geometry_ convenience functions
for those places where the geometry of the gl_framebuffer is needed
(in contrast to the geometry of the intersection of the attachments
of the gl_framebuffer).
This patch is to pave the way to enable GL_ARB_framebuffer_no_attachments
on Gen7 and higher in i965.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
|
|
|
|
|
|
|
|
|
|
|
| |
The same fix Marius implemented for gen6 (commit a9b04d8a) and
gen7 (commit 24ecf37a).
Also, we need the same code to handle special cases of line width
in gen6, gen7 and now gen8, so put that in the helper function
we use to compute the line width.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In commit fe74fee8fa721a we rounded the line width to the nearest integer to
match the GLES3 spec requirements stated in section 13.4.2.1, but that seems
to break a dEQP test that renders wide lines in some multisampling scenarios.
Ian noted that the Open 4.4 spec has the following similar text:
"The actual width of non-antialiased lines is determined by rounding the
supplied width to the nearest integer, then clamping it to the
implementation-dependent maximum non-antialiased line width."
and suggested that when ES removed antialiased lines, they removed
"non-antialised" from that paragraph but probably should not have.
Going by that note, this patch restricts the quantization implemented in
fe74fee8fa721a only to regular aliased lines. This seems to keep the
tests fixed with that commit passing while fixing the broken test.
v2:
- Drop one of the clamps (Ken, Marius)
- Add a rule to prevent advertising line widths that when rounded go beyond
the limits allowed by the hardware (Ken)
- Update comments in the code accordingly (Ian)
- Put the code in a utility function (Ian)
Fixes:
dEQP-GLES3.functional.rasterization.fbo.rbo_multisample_max.primitives.lines_wide
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90749
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On SNB and IVB hw, for 1 pixel line thickness or less,
the general anti-aliasing algorithm give up - garbage line is generated.
Setting a Line Width of 0.0 specifies the rasterization of
the “thinnest” (one-pixel-wide), non-antialiased lines.
Lines rendered with zero Line Width are rasterized using
Grid Intersection Quantization rules as specified
by bspec section 6.3.12.1 Zero-Width (Cosmetic) Line Rasterization.
v2: Daniel Stone: Fix = used instead of == in an if-statement.
v3: Ian Romanick: Use "._Enabled" flag insteed ".Enabled".
Add code comments. re-word wrap the commit message.
Add a complete bugzillia list.
Improve the hardcoded values to produce better results.
v4: Matt Turner: typo fixes and adjust <= 1.49 to become < 1.5
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=28832
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=9951
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=27007
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60797
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=15006
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Marius Predut <marius.predut@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Switch between the two clip space definitions already available
in hardware. Update winding order dependent state according
to the clip control state.
This change did not introduce new piglit quick.test regressions on
an Ivybridge Mobile and a GM45 Express chipset.
Also it enables and passes the clip-control and clip-control-depth-precision
tests on these two chipsets.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
"(...)Let w be the width rounded to the nearest integer (...). If the
line segment has endpoints given by (x0,y0) and (x1,y1) in window
coordinates, the segment with endpoints (x0,y0-(w-1)/2) and
(x1,y1-(w-1/2)) is rasterized, (...)"
The hardware it not rounding the line width, so we should do it.
Also, we should be careful not to go beyond the hardware limits
for the line width after it gets rounded. Gen6-7 define a maximum line
width slightly below 8.0, so we should advertise a maximum line
width lower than 7.5 to make sure that 7.0 is the maximum integer
line width that we can select. Since the line width granularity in these
platforms is 0.125, we choose 7.375. Other platforms advertise rounded
maximum line widths, so those are fine.
Fixes the following 3 dEQP tests:
dEQP-GLES3.functional.rasterization.primitives.lines_wide
dEQP-GLES3.functional.rasterization.fbo.texture_2d.primitives.lines_wide
dEQP-GLES3.functional.rasterization.fbo.rbo_singlesample.primitives.lines_wide
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
| |
Replace the hard-coded 0's with the context clamp value.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
| |
It's been merged into brw_state_flags::brw for simplicity and
efficiency.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I put the BRW_NEW_*_PROG_DATA flags at the beginning so that
brw_state_cache.c can still continue using 1 << brw_cache_id.
I also added a comment explaining the difference between
BRW_NEW_*_PROG_DATA and BRW_NEW_*_PROGRAM, as it took me a long time
to remember it.
Non-mechanical changes:
- brw_state_cache.c and brw_ff_gs.c now signal .brw, not .cache.
- brw_state_upload.c - INTEL_DEBUG=state changes.
- brw_context.h - bit definition merging.
v2: Correct the explanation of BRW_NEW_*_PROG_DATA to mention
state-based recompiles, and nix the "proper subset" claim,
as it's false. (Caught by Kristian Høgsberg).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that we've moved a bunch of CACHE_NEW_* bits to BRW_NEW_*, the only
ones that are left are legitimately related to the program cache. Yet,
it seems a bit wasteful to have an entire bitfield for only 7 bits.
State upload is one of the hottest paths in the driver. For each atom
in the list, we call check_state() to see if it needs to be emitted.
Currently, this involves comparing three separate bitfields (mesa, brw,
and cache). Consolidating the brw and cache bitfields would save a
small amount of CPU overhead per atom. Broadwell, for example, has
57 state atoms, so this small savings can add up.
CACHE_NEW_*_PROG covers the brw_*_prog_data structures, as well as the
offset into the program cache BO (prog_offset). Since most uses refer
to brw_*_prog_data, I decided to use BRW_NEW_*_PROG_DATA as the name.
Removing "cache" completely is a bit painful, so I decided to do it in
several patches for easier review, and to separate mechanical changes
from manual ones. This one simply renames things, and was made via:
$ for file in *.[ch]; do
sed -i -e 's/CACHE_NEW_\([A-Z_\*]*\)_PROG/BRW_NEW_\1_PROG_DATA/g' \
-e 's/BRW_NEW_WM_PROG_DATA/BRW_NEW_FS_PROG_DATA/g' $file
done
Note that BRW_NEW_*_PROG_DATA is still in .cache, not .brw!
The next patch will remedy this flaw. It will also fix the
alphabetization issues.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Acked-by: Matt Turner <mattst88@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Most of the dirty flags were listed in some arbitrary order. Some used
bonus parenthesis. Some put multiple flags on one line, others put one
per line. Some used tabs instead of spaces...but only on some lines.
This patch settles on one flag per line, in alphabetical order, using
spaces instead of tabs, and sheds the unnecessary parentheses.
Sorting was mostly done with vim's visual block feature and !sort,
although I alphabetized short lists by hand; it was pretty manual.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes broken rendering in Windows-based QtQuick2 apps run through Wine.
This library sets all texture units' GL_COORD_REPLACE, leaves point
sprite mode enabled, and then draws a triangle fan.
Will need a slightly different fix for Gen4-5, but I don't have my old
machines in a usable state currently.
V2: - Simplify patch -- the real changes are no longer duplicated across
the Gen6 and Gen7 atoms.
- Also don't clobber attr overrides -- which matters on Haswell too,
and fixes the other half of the problem
- Fix newly-introduced warnings
V3: - Use BRW_NEW_GEOMETRY_PROGRAM and brw->geometry_program rather than
core flag and state; keep the state flags in order.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84651
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rather than hardcoding platform values in every code path, just use the
maximum value we set.
Currently, ctx->Const.LineWidth == 5, which is smaller than the hardware
limit. But applications shouldn't be using a value larger than we
support anyway.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
|
|
|
|
|
|
|
|
|
| |
I believe when I wrote this code, gen6_sf_state used CACHE_NEW_VS_PROG,
which has since been replaced by BRW_NEW_VUE_MAP_GEOM_OUT. It's not
needed here anyway - only SBE needs it. Just a copy and paste mistake.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
|
|
|
|
|
|
|
|
|
| |
I had to dig a bit to figure out why this was necessary.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
|
|
|
|
|
|
|
|
| |
This lets us disable the viewport transform, which will be useful for
emitting 3DPRIM_RECTLIST.
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
| |
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Update Mesa and drivers to access updated gl_scissor_attrib.
Now have an enable bitfield and array of gl_scissor_rects.
Drivers have been updated to the new scissor enable state
attribute (gl_context.scissor.EnableFlags) but still treat it
as a single boolean which is okay as mesa will only use
bit 0 when communicating with a driver that does not support
ARB_viewport_array.
v2 (idr): Rebase fixes.
v3 (idr): Small code formatting fix suggsted by Ken.
Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
|
| |
Calling the local variables flat_enable and point_sprite_enable is
clearer than dw16 and such. It also matches the names used in
calculate_attr_overrides, which computes them.
v2: Add /* dw16 */ and /* dw10 */ comments, requested by Jordan.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
calculate_attr_overrides is responsible for computing the point sprite
and flat-shading enable bitfields. It does so by OR'ing in a bunch of
bits. However, it relied on the caller to set the initial value to
zero. This is pretty fragile - if the caller neglects to zero out those
variables, then the enable bitfields end up full of garbage, which shows
up as random things being flat-shaded.
This patch moves the zero-initialization into calculate_attr_overrides,
so that the computation is completely in one place.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, the SF/SBE setup code delivered varying inputs to the FS
in the order in which they appear in the gl_program::InputsRead
bitfield, since that's what the FS expects.
When we add support for more than 64 varying components, this will no
longer always be the case, because the Gen6+ SF/SBE stage is only
capable of performing arbitrary reorderings of 16 varying slots. So,
when there are more than 16 vec4's worth of varying inputs, the FS
will have to adjust the order its input varyings in order to partially
match the order of outputs from the geometry or vertex shader.
To allow extra flexibility in the ordering of FS varyings, this patch
causes the SF/SBE to deliver varying inputs to the FS in exactly the
order that the FS requests, by consulting brw_wm_prog_data::urb_setup
and brw_wm_prog_data::num_varying_inputs.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
| |
We always program the SF unit to start reading the vertex URB entry at
offset 1. In upcoming patches, we'll be adding FS code that relies on
this. So consistently use the constant BRW_SF_URB_ENTRY_READ_OFFSET
rather than hardcoding a 1.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
| |
This makes brw_context inherit directly from gl_context; that was the
only thing left in intel_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
|
|
|
|
|
|
|
| |
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
|
|
|
|
|
|
|
|
|
| |
ctx->DrawBuffer is much more sensible than brw->intel.ctx.DrawBuffer.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch modifies post-GS pipeline stages (transform feedback, clip,
sf, fs) to refer to the VUE map through brw->vue_map_geom_out rather
than brw->vs.prog_data->vue_map. This ensures that when geometry
shader support is added, these pipeline stages will consult the
geometry shader output VUE map when appropriate, rather than the
vertex shader output VUE map.
v2: Fixed some stale "CACHE_NEW_VS_PROG" comments.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch makes the following search-and-replace changes:
gl_frag_attrib -> gl_varying_slot
FRAG_ATTRIB_* -> VARYING_SLOT_*
FRAG_BIT_* -> VARYING_BIT_*
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ivybridge doesn't appear to have the same errata as Sandybridge; no
corruption was observed by setting it to more than the minimal correct
value. It's possible that we were simply lucky, since the URB entries
are 1024-bit on Ivybridge vs. 512-bit Sandybridge. Or perhaps the
underlying hardware issue is fixed.
Either way, we may as well program the minimum value since it's now
readily available, likely to be more efficient, and possibly more
correct.
v2: Use GEN7_SBE_* defines rather than GEN6_SF_*. (A copy and paste
mistake.) They're the same, but using the right names is better.
NOTE: This is a candidate for all stable branches.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
| |
The maximum SF source attribute is necessary to compute the Vertex URB
read length properly, which will be done in the next commit.
NOTE: This is a candidate for all stable branches.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Tested-by: Martin Steigerwald <martin@lichtvoll.de>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
| |
This is needed to compute render_to_fbo. It even has the comment.
NOTE: This is a candidate for stable branches.
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were accidentally setting bit 14 in DWord 2 (which is Reserved/MBZ)
rather than bit 14 in DWord 3 (which is AA Line Distance Mode).
There's also no reason to ever set it to legacy mode; the bit is only
used when drawing antialiased lines anyway. Set it unconditionally.
NOTE: This is a candidate for stable branches.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
EXT_framebuffer_multisample is a required subpart of
ARB_framebuffer_object, which means that we must support it even on
platforms that don't support MSAA. Fortunately
EXT_framebuffer_multisample allows for this by allowing GL_MAX_SAMPLES
to be set to 1.
This leads to a tricky quirk in the GL spec: since
GlRenderbufferStorageMultisamples() accepts any value for its
"samples" parameter up to and including GL_MAX_SAMPLES, that means
that on platforms that don't support MSAA, GL_SAMPLES is allowed to be
set to either 0 or 1. On platforms that do support MSAA, GL_SAMPLES=1
is not used; 0 means no MSAA, and 2 or higher means MSAA.
In other words, GL_SAMPLES needs to be interpreted as follows:
=0 no MSAA (possible on all platforms)
=1 no MSAA (only possible on platforms where MSAA unsupported)
>1 MSAA (only possible on platforms where MSAA supported)
This patch modifies all MSAA-related code to choose between
multisampling and single-sampling based on the condition (GL_SAMPLES >
1) instead of (GL_SAMPLES > 0) so that GL_SAMPLES=1 will be treated as
"no MSAA".
Note that since GL_SAMPLES=1 implies GL_SAMPLE_BUFFERS=1, we can no
longer use GL_SAMPLE_BUFFERS to distinguish between MSAA and non-MSAA
rendering.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we used the number of samples in draw buffer 0 to
determine whether to set up the 3D pipeline for multisampling. Using
the visual is cleaner, and has the benefit of working properly when
there is no color buffer.
Fixes all piglit tests "EXT_framebuffer_multisample/no-color" on Gen7.
On Gen6, the "depth-computed" variants of these tests still fail; this
will be addresed in a later patch.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
From the GL 3.0 spec (p.116):
"Multisample rasterization is enabled or disabled by calling
Enable or Disable with the symbolic constant MULTISAMPLE."
Elsewhere in the spec, where multisample rasterization is described
(sections 3.4.3, 3.5.4, and 3.6.6), the following text is consistently
used:
"If MULTISAMPLE is enabled, and the value of SAMPLE_BUFFERS is
one, then..."
So, in other words, disabling GL_MULTISAMPLE should prevent
multisample rasterization from occurring, even if the draw framebuffer
is multisampled. This patch implements that behaviour by setting the
WM and SF stage's "multisample rasterization mode" to
MSRAST_ON_PATTERN only when the draw framebuffer is multisampled *and*
GL_MULTISAMPLE is enabled.
Fixes piglit test spec/EXT_framebuffer_multisample/enable-flag.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch enables MSAA for Gen6, by modifying intel_mipmap_tree to
understand multisampled buffers, adapting the rendering pipeline setup
to enable multisampled rendering, and adding multisample resolve
operations to brw_blorp_blit.cpp. Some preparation work is also
included for Gen7, but it is not yet enabled.
MSAA support is still fairly preliminary. In particular, the
following are not yet supported:
- Fully general blits between MSAA and non-MSAA buffers.
- Formats other than RGBA8, DEPTH24, and STENCIL8.
- Centroid interpolation.
- Coverage parameters (glSampleCoverage, GL_SAMPLE_ALPHA_TO_COVERAGE,
GL_SAMPLE_ALPHA_TO_ONE, GL_SAMPLE_COVERAGE, GL_SAMPLE_COVERAGE_VALUE,
GL_SAMPLE_COVERAGE_INVERT).
Fixes piglit tests "EXT_framebuffer_multisample/accuracy" on
i965/Gen6.
v2:
- In intel_alloc_renderbuffer_storage(), quantize the requested number
of samples to the next higher sample count supported by the
hardware. This ensures that a query of GL_SAMPLES will return the
correct value. It also ensures that MSAA is fully disabled on Gen7
for now (since Gen7 MSAA support doesn't work yet).
- When reading from a non-MSAA surface, ensure that s_is_zero is true
so that we won't try to read from a nonexistent sample.
|
|
|
|
| |
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
|
|
|
|
|
| |
Apparently this needs to be the same as in 3DSTATE_WM.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
|
|
|
|
|
|
|
|
| |
With this and the previous patch, 640x480 nexuiz is running 0.169118%
+/- 0.0863696% faster (n=121). On a VS state change microbenchmark,
performance is increased 8.28645% +/- 0.460478% (n=52).
v2: Fix CACHE_NEW_VS comment.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|