| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
This is required following the change in 8X sample positions.
Fixes the recently modified multisample-scaled-blit piglit tests.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com
|
|
|
|
|
|
|
|
|
|
|
|
| |
_mesa_is_multisample_enabled.
This removes any dependency on driver validation of the number of
framebuffer samples.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Brian Paul <brianp@vmware.com>
|
|
|
|
|
|
| |
v2: Fix the x_scale in the shader. Remove the doubts in the commit
message.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the standard pattern used by the other 3D graphics API.
BDW has slots for these values, but they aren't actually used until
SKL. Even though the documentation for BDW says they must be zero, it
doesn't seem to cause any harm to program them anyway.
The comment above for the 8x sample positions says that the hardware
implements centroid interpolation by picking the centre-most sample
that is inside the primitive. That implies that it might be worthwhile
to pick a pattern that includes 0.5,0.5. However by experimentation
this doesn't seem to actually be the case. With the sample positions
in this patch, if I modify the piglit test below so that it instead
reports the centroid position, it reports 0.492188,0.421875 which
doesn't match any of the positions. If I modify the sample positions
so that they include one at exactly 0.5,0.5 it doesn't help and it
reports another position which is even further from the center for
some reason.
arb_gpu_shader5-interpolateAtSample-different
Kenneth Graunke experimented with some other patterns that have a
higher standard deviation but I think after some discussion it was
decided that it would be better to pick the same pattern as the other
graphics API in case there are games that rely on this pattern.
(Based on a patch by Kenneth Graunke)
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben at bwidawsk.net>
|
|
|
|
|
|
| |
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Literals without an f/F suffix are of type double, and implicit
conversion rules specify that the float in (float op double) be
converted to a double before the operation is performed. I believe float
execution was intended (in nearly all cases) or is sufficient (in the
case of gen7_urb.c).
Removes a lot of float <-> double conversion instructions and replaces
many double instructions with float instructions which are cheaper.
text data bss dec hex filename
4928659 195160 26192 5150011 4e953b i965_dri.so before
4928315 195152 26192 5149659 4e93db i965_dri.so after
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change references to gl_framebuffer::Width, Height, MaxNumLayers
and Visual::samples to use the _mesa_geometry_ convenience functions
for those places where the geometry of the gl_framebuffer is needed
(in contrast to the geometry of the intersection of the attachments
of the gl_framebuffer).
This patch is to pave the way to enable GL_ARB_framebuffer_no_attachments
on Gen7 and higher in i965.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Sandybridge requires the post-sync non-zero workaround in a ton of
places, and if you ever miss one, the GPU usually hangs.
Currently, we try to track exactly when a workaround flush is
necessary (via the brw->batch.need_workaround_flush flag). This is
tricky to get right, and we've botched it several times in the past.
This patch unconditionally performs the post-sync non-zero flush at the
start of each primitive's state upload (including BLORP). We drop the
needs_workaround_flush flag, and drop all the other callers, as the
flush has already been performed.
We have no data to indicate that simply flushing all the time will
hurt performance, and it has the potential to help stability.
v2: Add post-sync workaround to initial GPU state upload to be extra
cautious (suggested by Chad Versace).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
|
|
|
|
|
|
|
|
|
| |
It's been merged into brw_state_flags::brw for simplicity and
efficiency.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Most of the dirty flags were listed in some arbitrary order. Some used
bonus parenthesis. Some put multiple flags on one line, others put one
per line. Some used tabs instead of spaces...but only on some lines.
This patch settles on one flag per line, in alphabetical order, using
spaces instead of tabs, and sheds the unnecessary parentheses.
Sorting was mostly done with vim's visual block feature and !sort,
although I alphabetized short lists by hand; it was pretty manual.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
with values specific to Intel hardware.
V2: Define and use gen6_get_sample_map() function to initialize
the variables.
V3: Change the function name to gen6_set_sample_maps() and use
memcpy() to fill in the data.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
|
|
| |
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
|
|
|
|
|
|
|
|
|
| |
This keeps us from having to emit the nonpipelined state packet on every
FBO binding.
-4.42003% +/- 1.09961% effect on cairo-perf-trace runtime on glamor (n=110).
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
| |
There are only two sensible placements for 2x MSAA samples - and one is
the mirror image of the other. I chose (0.25, 0.25) and (0.75, 0.75).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
|
|
|
|
|
|
|
|
| |
Storing a single value in an array is rather pointless.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The 4x and 8x cases contained identical code for extracting the X and
Y sample offset values and converting them from U0.4 back to float.
Without this refactoring, we'd have to duplicate it a third time in
order to support 2x MSAA.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On previous platforms, 3DSTATE_MULTISAMPLE contained the number of
samples, pixel location, and the positions of each sample within a pixel
for each multisampling mode (4x and 8x). It was also a non-pipelined
command, presumably since changing the sample positions is fairly
drastic.
Broadwell improves upon this by splitting the sample positions out into
a separate non-pipelined state packet, 3DSTATE_SAMPLE_PATTERN. With
that removed, 3DSTATE_MULTISAMPLE becomes a pipelined state packet.
Broadwell also supports 2x and 16x multisampling, in addition to the 4x
and 8x supported by Gen7. This patch, however, does not implement 2x
and 16x.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
|
|
|
|
|
|
|
| |
Haswell needs a copy of the sample mask in 3DSTATE_PS; this makes that
convenient.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
|
| |
For some reason, we put the flush in the caller, rather than just before
emitting the packet. This is more than a cosmetic problem: BLORP calls
gen6_emit_3dstate_multisample() directly, and so it missed the flush.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Xinkai Chen <yeled.nova@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
|
|
|
|
|
|
|
|
| |
Move the arrays to the new header brw_multisample_state.h, which will be
shared with Broadwell code.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Place each array in the brw namespace by renaming it:
sample_positions_4x -> brw_multisample_positions_4x
sample_positions_8x -> brw_multisample_positions_8x
This prepares for moving the arrays to a header shared by gen6 and gen8.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
|
|
|
|
|
|
|
|
|
|
| |
This makes brw_context inherit directly from gl_context; that was the
only thing left in intel_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
|
|
|
|
|
|
|
|
|
|
| |
Most functions no longer use intel_context, so this patch additionally
removes the local "intel" variables to avoid compiler warnings.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes brw_context available in every function that used
intel_context. This makes it possible to start migrating fields from
intel_context to brw_context.
Surprisingly, this actually removes some code, as functions that use
OUT_BATCH don't need to declare "intel"; they just use "brw."
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
|
|
|
|
|
|
|
|
|
| |
From low to high bits, the sample positions are packed y0,x0,y1,x1...
Fixes arb_texture_multisample-sample-position piglit.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Moves the definition of the sample positions out of
gen6_emit_3dstate_multisample, and unpacks them in
gen6_get_sample_position.
V2: Be consistent about `sample position` rather than `location`.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
|
|
|
|
|
|
|
| |
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
EXT_framebuffer_multisample is a required subpart of
ARB_framebuffer_object, which means that we must support it even on
platforms that don't support MSAA. Fortunately
EXT_framebuffer_multisample allows for this by allowing GL_MAX_SAMPLES
to be set to 1.
This leads to a tricky quirk in the GL spec: since
GlRenderbufferStorageMultisamples() accepts any value for its
"samples" parameter up to and including GL_MAX_SAMPLES, that means
that on platforms that don't support MSAA, GL_SAMPLES is allowed to be
set to either 0 or 1. On platforms that do support MSAA, GL_SAMPLES=1
is not used; 0 means no MSAA, and 2 or higher means MSAA.
In other words, GL_SAMPLES needs to be interpreted as follows:
=0 no MSAA (possible on all platforms)
=1 no MSAA (only possible on platforms where MSAA unsupported)
>1 MSAA (only possible on platforms where MSAA supported)
This patch modifies all MSAA-related code to choose between
multisampling and single-sampling based on the condition (GL_SAMPLES >
1) instead of (GL_SAMPLES > 0) so that GL_SAMPLES=1 will be treated as
"no MSAA".
Note that since GL_SAMPLES=1 implies GL_SAMPLE_BUFFERS=1, we can no
longer use GL_SAMPLE_BUFFERS to distinguish between MSAA and non-MSAA
rendering.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
|
|
|
|
|
|
|
|
| |
The code to emit 3DSTATE_SAMPLE_MASK was already correct for 8x
MSAA--this patch just removes an assertion that would have prevented
it from being used for 8x MSAA.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we used the number of samples in draw buffer 0 to
determine whether to set up the 3D pipeline for multisampling. Using
the visual is cleaner, and has the benefit of working properly when
there is no color buffer.
Fixes all piglit tests "EXT_framebuffer_multisample/no-color" on Gen7.
On Gen6, the "depth-computed" variants of these tests still fail; this
will be addresed in a later patch.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch enables glSampleCoverage() functionality, which allows the
client program to specify that only a portion of the samples be lit up
when performing multisampled rendering. i965 supports
glSampleCoverage() through the 3DSTATE_SAMPLE_MASK command packet,
which allows the driver to specify a bitfield indicating which samples
to light up.
Fixes piglit tests "EXT_framebuffer_multisample/sample-coverage {2,4}
{inverted,non-inverted}".
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Basic 4x MSAA support now works on Gen7. This patch enables it.
As with Gen6, MSAA support is still fairly preliminary. In
particular, the following are not yet supported:
- 8x oversampling (Gen7 has hardware support for this, but we do not
yet expose it).
- Fully general blits between MSAA and non-MSAA buffers.
- Formats other than RGBA8, DEPTH24, and STENCIL8.
- Centrold interpolation.
- Coverage parameters (glSampleCoverage, GL_SAMPLE_ALPHA_TO_COVERAGE,
GL_SAMPLE_ALPHA_TO_ONE, GL_SAMPLE_COVERAGE, GL_SAMPLE_COVERAGE_VALUE,
GL_SAMPLE_COVERAGE_INVERT).
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
This patch enables MSAA for Gen6, by modifying intel_mipmap_tree to
understand multisampled buffers, adapting the rendering pipeline setup
to enable multisampled rendering, and adding multisample resolve
operations to brw_blorp_blit.cpp. Some preparation work is also
included for Gen7, but it is not yet enabled.
MSAA support is still fairly preliminary. In particular, the
following are not yet supported:
- Fully general blits between MSAA and non-MSAA buffers.
- Formats other than RGBA8, DEPTH24, and STENCIL8.
- Centroid interpolation.
- Coverage parameters (glSampleCoverage, GL_SAMPLE_ALPHA_TO_COVERAGE,
GL_SAMPLE_ALPHA_TO_ONE, GL_SAMPLE_COVERAGE, GL_SAMPLE_COVERAGE_VALUE,
GL_SAMPLE_COVERAGE_INVERT).
Fixes piglit tests "EXT_framebuffer_multisample/accuracy" on
i965/Gen6.
v2:
- In intel_alloc_renderbuffer_storage(), quantize the requested number
of samples to the next higher sample count supported by the
hardware. This ensures that a query of GL_SAMPLES will return the
correct value. It also ensures that MSAA is fully disabled on Gen7
for now (since Gen7 MSAA support doesn't work yet).
- When reading from a non-MSAA surface, ensure that s_is_zero is true
so that we won't try to read from a nonexistent sample.
|