path: root/src/mesa/drivers/dri/i965/gen6_multisample_state.c
Commit message | Author | Age | Files | Lines
* i965: Change 8X MSAA sample mapping | Anuj Phogat | 2016-08-12 | 1 | -1/+1
  This is required following the change in 8X sample positions. Fixes the recently modified multisample-scaled-blit piglit tests.
  Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
  Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
* i965: Make all atoms to track BRW_NEW_BLORP by default | Kenneth Graunke | 2016-04-23 | 1 | -1/+2
  Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
* mesa: replace gl_context->Multisample._Enabled with _mesa_is_multisample_enabled. | Bas Nieuwenhuizen | 2016-03-24 | 1 | -1/+1
  This removes any dependency on driver validation of the number of framebuffer samples.
  Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
  Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
  Reviewed-by: Marek Olšák <marek.olsak@amd.com>
  Tested-by: Brian Paul <brianp@vmware.com>
* meta: Support 16x MSAA in the multisample scaled blit shader | Neil Roberts | 2015-11-05 | 1 | -0/+14
  v2: Fix the x_scale in the shader. Remove the doubts in the commit message.
  Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
* i965: Program 16x MSAA sample positions. | Neil Roberts | 2015-11-05 | 1 | -0/+3
  This is the standard pattern used by the other 3D graphics API. BDW has slots for these values, but they aren't actually used until SKL. Even though the documentation for BDW says they must be zero, it doesn't seem to cause any harm to program them anyway.
  The comment above for the 8x sample positions says that the hardware implements centroid interpolation by picking the centre-most sample that is inside the primitive. That implies that it might be worthwhile to pick a pattern that includes 0.5,0.5. However by experimentation this doesn't seem to actually be the case. With the sample positions in this patch, if I modify the piglit test below so that it instead reports the centroid position, it reports 0.492188,0.421875 which doesn't match any of the positions. If I modify the sample positions so that they include one at exactly 0.5,0.5 it doesn't help and it reports another position which is even further from the center for some reason.
    arb_gpu_shader5-interpolateAtSample-different
  Kenneth Graunke experimented with some other patterns that have a higher standard deviation but I think after some discussion it was decided that it would be better to pick the same pattern as the other graphics API in case there are games that rely on this pattern.
  (Based on a patch by Kenneth Graunke)
  Cc: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Ben Widawsky <ben at bwidawsk.net>
* i965: Trivial formatting changes in gen6_multisample_state.c | Ian Romanick | 2015-08-03 | 1 | -5/+2
  Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
  Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
  Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
* i965: Use float calculations when double is unnecessary. | Matt Turner | 2015-07-29 | 1 | -2/+2
  Literals without an f/F suffix are of type double, and implicit conversion rules specify that the float in (float op double) be converted to a double before the operation is performed. I believe float execution was intended (in nearly all cases) or is sufficient (in the case of gen7_urb.c).
  Removes a lot of float <-> double conversion instructions and replaces many double instructions with float instructions which are cheaper.
       text     data     bss      dec      hex  filename
    4928659   195160   26192  5150011   4e953b  i965_dri.so before
    4928315   195152   26192  5149659   4e93db  i965_dri.so after
  Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
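The promotion rule this commit message relies on can be checked with a small C11 sketch. The macro and function names below are hypothetical and are not part of Mesa; `_Generic` is used only to report whether an expression has type double.

```c
#include <stdbool.h>

/* Hypothetical helper: evaluates to 1 if the expression has type double.
 * The controlling expression of _Generic is not evaluated. */
#define IS_DOUBLE(expr) _Generic((expr), double: 1, default: 0)

/* float * unsuffixed literal: the float operand is converted to double,
 * so the whole product has type double. */
bool
unsuffixed_product_is_double(float f)
{
   return IS_DOUBLE(f * 0.5);
}

/* float * f-suffixed literal: both operands are float, so the product
 * stays in single precision. */
bool
suffixed_product_is_double(float f)
{
   return IS_DOUBLE(f * 0.5f);
}
```

The first function returns true and the second returns false, which is why adding the `f` suffix lets the compiler drop the float <-> double conversion instructions.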
* i965: Use _mesa_geometric_ functions appropriately | Kevin Rogovin | 2015-06-17 | 1 | -1/+2
  Change references to gl_framebuffer::Width, Height, MaxNumLayers and Visual::samples to use the _mesa_geometric_ convenience functions for those places where the geometry of the gl_framebuffer is needed (in contrast to the geometry of the intersection of the attachments of the gl_framebuffer).
  This patch is to pave the way to enable GL_ARB_framebuffer_no_attachments on Gen7 and higher in i965.
  Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
  Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
* i965: Do Sandybridge workaround flushes before each primitive. | Kenneth Graunke | 2015-02-17 | 1 | -3/+0
  Sandybridge requires the post-sync non-zero workaround in a ton of places, and if you ever miss one, the GPU usually hangs.
  Currently, we try to track exactly when a workaround flush is necessary (via the brw->batch.need_workaround_flush flag). This is tricky to get right, and we've botched it several times in the past.
  This patch unconditionally performs the post-sync non-zero flush at the start of each primitive's state upload (including BLORP). We drop the needs_workaround_flush flag, and drop all the other callers, as the flush has already been performed.
  We have no data to indicate that simply flushing all the time will hurt performance, and it has the potential to help stability.
  v2: Add post-sync workaround to initial GPU state upload to be extra cautious (suggested by Chad Versace).
  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
* i965: Delete brw_state_flags::cache and related code. | Kenneth Graunke | 2014-12-02 | 1 | -1/+0
  It's been merged into brw_state_flags::brw for simplicity and efficiency.
  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
  Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965: Alphabetize brw_tracked_state flags and use a consistent style. | Kenneth Graunke | 2014-11-29 | 1 | -2/+2
  Most of the dirty flags were listed in some arbitrary order. Some used bonus parentheses. Some put multiple flags on one line, others put one per line. Some used tabs instead of spaces...but only on some lines.
  This patch settles on one flag per line, in alphabetical order, using spaces instead of tabs, and sheds the unnecessary parentheses.
  Sorting was mostly done with vim's visual block feature and !sort, although I alphabetized short lists by hand; it was pretty manual.
  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
  Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965: Initialize the SampleMap{2,4,8}x variables | Anuj Phogat | 2014-10-01 | 1 | -0/+45
  with values specific to Intel hardware.
  V2: Define and use gen6_get_sample_map() function to initialize the variables.
  V3: Change the function name to gen6_set_sample_maps() and use memcpy() to fill in the data.
  Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
  Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
* i965: Use unreachable() instead of unconditional assert(). | Matt Turner | 2014-07-01 | 1 | -4/+2
  Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
* i965: Track the number of samples in the drawbuffer. | Eric Anholt | 2014-04-30 | 1 | -10/+7
  This keeps us from having to emit the nonpipelined state packet on every FBO binding.
  -4.42003% +/- 1.09961% effect on cairo-perf-trace runtime on glamor (n=110).
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Program 2x MSAA sample positions. | Kenneth Graunke | 2014-02-10 | 1 | -0/+3
  There are only two sensible placements for 2x MSAA samples - and one is the mirror image of the other. I chose (0.25, 0.25) and (0.75, 0.75).
  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
  Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
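As an illustration of how the two positions chosen here could be encoded, the sketch below packs them using the U0.4 fixed-point format and the y-then-x nibble order described in other commit messages in this log. The function names are hypothetical, and packing the two samples into the low 16 bits of a word is an assumption made for the example, not a statement about the driver's actual tables.

```c
#include <assert.h>
#include <stdint.h>

/* Encode one coordinate in [0, 1) as U0.4: four fractional bits and no
 * integer bits, i.e. the coordinate in units of 1/16 pixel. */
static uint8_t
to_u0_4(float coord)
{
   assert(coord >= 0.0f && coord < 1.0f);
   return (uint8_t)(coord * 16.0f);
}

/* Pack one sample position into a byte: Y in the low nibble, X in the
 * high nibble. */
uint8_t
pack_sample_position(float x, float y)
{
   return (uint8_t)((to_u0_4(x) << 4) | to_u0_4(y));
}

/* Assumed layout: sample 0 in the lowest byte, sample 1 in the next. */
uint32_t
pack_2x_positions(void)
{
   return (uint32_t)pack_sample_position(0.25f, 0.25f) |
          ((uint32_t)pack_sample_position(0.75f, 0.75f) << 8);
}
```

Under these assumptions (0.25, 0.25) encodes to the byte 0x44 and (0.75, 0.75) to 0xcc, so the packed value is 0xcc44.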
* i965: Store 4x MSAA sample positions in a scalar value, not an array. | Kenneth Graunke | 2014-02-10 | 1 | -2/+2
  Storing a single value in an array is rather pointless.
  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
  Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
* i965: Duplicate less code in GetSamplePositions driver hook. | Kenneth Graunke | 2014-02-10 | 1 | -11/+12
  The 4x and 8x cases contained identical code for extracting the X and Y sample offset values and converting them from U0.4 back to float.
  Without this refactoring, we'd have to duplicate it a third time in order to support 2x MSAA.
  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
  Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
* i965: Update multisampling state for Broadwell. | Kenneth Graunke | 2014-01-31 | 1 | -0/+2
  On previous platforms, 3DSTATE_MULTISAMPLE contained the number of samples, pixel location, and the positions of each sample within a pixel for each multisampling mode (4x and 8x). It was also a non-pipelined command, presumably since changing the sample positions is fairly drastic.
  Broadwell improves upon this by splitting the sample positions out into a separate non-pipelined state packet, 3DSTATE_SAMPLE_PATTERN. With that removed, 3DSTATE_MULTISAMPLE becomes a pipelined state packet.
  Broadwell also supports 2x and 16x multisampling, in addition to the 4x and 8x supported by Gen7. This patch, however, does not implement 2x and 16x.
  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
  Reviewed-by: Eric Anholt <eric@anholt.net>
* i965: refactor sample mask calculation | Chris Forbes | 2013-12-07 | 1 | -28/+35
  Haswell needs a copy of the sample mask in 3DSTATE_PS; this makes that convenient.
  Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
  Reviewed-by: Paul Berry <stereotype441@gmail.com>
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Move post-sync non-zero flush for 3DSTATE_MULTISAMPLE. | Kenneth Graunke | 2013-10-28 | 1 | -3/+3
  For some reason, we put the flush in the caller, rather than just before emitting the packet. This is more than a cosmetic problem: BLORP calls gen6_emit_3dstate_multisample() directly, and so it missed the flush.
  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Tested-by: Xinkai Chen <yeled.nova@gmail.com>
  Reviewed-by: Eric Anholt <eric@anholt.net>
  Cc: "9.2" <mesa-stable@lists.freedesktop.org>
* i965: Move arrays brw_multisample_positions* to new header | Chad Versace | 2013-08-13 | 1 | -46/+1
  Move the arrays to the new header brw_multisample_state.h, which will be shared with Broadwell code.
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
* i965: Refactor names of sample_positions_8/4x arrays | Chad Versace | 2013-08-13 | 1 | -7/+7
  Place each array in the brw namespace by renaming it:
    sample_positions_4x -> brw_multisample_positions_4x
    sample_positions_8x -> brw_multisample_positions_8x
  This prepares for moving the arrays to a header shared by gen6 and gen8.
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
* i965: Delete intel_context entirely. | Kenneth Graunke | 2013-07-09 | 1 | -1/+1
  This makes brw_context inherit directly from gl_context; that was the only thing left in intel_context.
  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Acked-by: Chris Forbes <chrisf@ijw.co.nz>
  Acked-by: Paul Berry <stereotype441@gmail.com>
  Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
* i965: Move intel_context::gen and gt fields to brw_context. | Kenneth Graunke | 2013-07-09 | 1 | -6/+3
  Most functions no longer use intel_context, so this patch additionally removes the local "intel" variables to avoid compiler warnings.
  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Acked-by: Chris Forbes <chrisf@ijw.co.nz>
  Acked-by: Paul Berry <stereotype441@gmail.com>
  Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
* i965: Pass brw_context to functions rather than intel_context. | Kenneth Graunke | 2013-07-09 | 1 | -3/+1
  This makes brw_context available in every function that used intel_context. This makes it possible to start migrating fields from intel_context to brw_context.
  Surprisingly, this actually removes some code, as functions that use OUT_BATCH don't need to declare "intel"; they just use "brw."
  Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
  Acked-by: Chris Forbes <chrisf@ijw.co.nz>
  Acked-by: Paul Berry <stereotype441@gmail.com>
  Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
* i965: report correct sample positions | Chris Forbes | 2013-04-25 | 1 | -4/+4
  From low to high bits, the sample positions are packed y0,x0,y1,x1...
  Fixes arb_texture_multisample-sample-position piglit.
  Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
  Reviewed-by: Paul Berry <stereotype441@gmail.com>
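The packing order stated in this commit message, combined with the U0.4 encoding mentioned elsewhere in this log, suggests an unpacking routine along the following lines. This is a sketch under those assumptions, not the driver's code: the function name, the four-samples-per-word limit, and the test values are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Unpack sample `index` from a word holding up to four positions.
 * Each sample occupies one byte; from low to high bits the nibbles are
 * y0, x0, y1, x1, ... and each nibble is a U0.4 fixed-point fraction. */
void
unpack_sample_position(uint32_t packed, unsigned index, float result[2])
{
   assert(index < 4);
   uint32_t bits = packed >> (8 * index);

   result[0] = ((bits >> 4) & 0xf) / 16.0f;  /* X: high nibble */
   result[1] = (bits & 0xf) / 16.0f;         /* Y: low nibble */
}
```

For example, with the made-up word 0x0000cc44, sample 1 unpacks to (0.75, 0.75). Swapping the two nibbles would exchange X and Y, which is the kind of mistake this commit corrects.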
* i965: expose sample positions | Chris Forbes | 2013-03-02 | 1 | -43/+74
  Moves the definition of the sample positions out of gen6_emit_3dstate_multisample, and unpacks them in gen6_get_sample_position.
  V2: Be consistent about `sample position` rather than `location`.
  Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
  Reviewed-by: Paul Berry <stereotype441@gmail.com>
  Acked-by: Ian Romanick <ian.d.romanick@intel.com>
* i965: add support for sample mask on Gen6+ | Chris Forbes | 2013-03-02 | 1 | -6/+13
  Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
  Reviewed-by: Eric Anholt <eric@anholt.net>
  Reviewed-by: Paul Berry <stereotype441@gmail.com>
  Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
* i965/msaa: Treat GL_SAMPLES=1 as equivalent to GL_SAMPLES=0. | Paul Berry | 2012-08-01 | 1 | -1/+2
  EXT_framebuffer_multisample is a required subpart of ARB_framebuffer_object, which means that we must support it even on platforms that don't support MSAA. Fortunately EXT_framebuffer_multisample allows for this by allowing GL_MAX_SAMPLES to be set to 1.
  This leads to a tricky quirk in the GL spec: since glRenderbufferStorageMultisample() accepts any value for its "samples" parameter up to and including GL_MAX_SAMPLES, that means that on platforms that don't support MSAA, GL_SAMPLES is allowed to be set to either 0 or 1. On platforms that do support MSAA, GL_SAMPLES=1 is not used; 0 means no MSAA, and 2 or higher means MSAA.
  In other words, GL_SAMPLES needs to be interpreted as follows:
    =0  no MSAA (possible on all platforms)
    =1  no MSAA (only possible on platforms where MSAA unsupported)
    >1  MSAA (only possible on platforms where MSAA supported)
  This patch modifies all MSAA-related code to choose between multisampling and single-sampling based on the condition (GL_SAMPLES > 1) instead of (GL_SAMPLES > 0) so that GL_SAMPLES=1 will be treated as "no MSAA".
  Note that since GL_SAMPLES=1 implies GL_SAMPLE_BUFFERS=1, we can no longer use GL_SAMPLE_BUFFERS to distinguish between MSAA and non-MSAA rendering.
  Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
  Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
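The interpretation of GL_SAMPLES described in this commit message reduces to a single comparison. A minimal sketch, with a hypothetical function name that does not appear in the driver:

```c
#include <stdbool.h>

/* GL_SAMPLES of 0 or 1 both mean single-sampled rendering; only values
 * greater than 1 select multisampling. */
bool
is_multisampled(unsigned gl_samples)
{
   return gl_samples > 1;
}
```

Testing `gl_samples > 0` instead would wrongly treat GL_SAMPLES=1 as MSAA on platforms where GL_MAX_SAMPLES is 1.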
* i965/msaa: Remove assertion in 3DSTATE_SAMPLE_MASK to allow 8x MSAA. | Paul Berry | 2012-07-24 | 1 | -3/+0
  The code to emit 3DSTATE_SAMPLE_MASK was already correct for 8x MSAA--this patch just removes an assertion that would have prevented it from being used for 8x MSAA.
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965/msaa: Adjust 3DSTATE_MULTISAMPLE packet for 8x MSAA. | Paul Berry | 2012-07-24 | 1 | -6/+64
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965/msaa: Control multisampling behaviour via the visual. | Paul Berry | 2012-07-24 | 1 | -5/+3
  Previously, we used the number of samples in draw buffer 0 to determine whether to set up the 3D pipeline for multisampling. Using the visual is cleaner, and has the benefit of working properly when there is no color buffer.
  Fixes all piglit tests "EXT_framebuffer_multisample/no-color" on Gen7. On Gen6, the "depth-computed" variants of these tests still fail; this will be addressed in a later patch.
  Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
* i965/msaa: Implement glSampleCoverage. | Paul Berry | 2012-06-26 | 1 | -4/+22
  This patch enables glSampleCoverage() functionality, which allows the client program to specify that only a portion of the samples be lit up when performing multisampled rendering.
  i965 supports glSampleCoverage() through the 3DSTATE_SAMPLE_MASK command packet, which allows the driver to specify a bitfield indicating which samples to light up.
  Fixes piglit tests "EXT_framebuffer_multisample/sample-coverage {2,4} {inverted,non-inverted}".
  Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
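One way a coverage fraction could be turned into the sample bitfield described above is sketched here. The function name, the round-to-nearest policy, and the choice to light the lowest-numbered samples first are assumptions made for illustration; the commit message does not spell out those details.

```c
#include <stdbool.h>
#include <stdint.h>

/* Build a sample mask from a coverage value in [0, 1].
 * Single-sampled rendering always lights its one sample. */
uint32_t
sample_coverage_mask(unsigned num_samples, float coverage, bool invert)
{
   if (num_samples <= 1)
      return 1;

   /* All valid sample bits for this sample count. */
   uint32_t all_samples = (1u << num_samples) - 1;

   /* Number of samples to light, rounded to the nearest integer. */
   unsigned covered = (unsigned)(num_samples * coverage + 0.5f);

   /* Light the lowest `covered` samples (assumed ordering). */
   uint32_t bits = (1u << covered) - 1;

   return invert ? (bits ^ all_samples) : bits;
}
```

With 4 samples and a coverage of 0.5, this yields 0x3, or 0xc when inverted.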
* i965/msaa: Enable 4x MSAA on Gen7. | Paul Berry | 2012-05-25 | 1 | -8/+4
  Basic 4x MSAA support now works on Gen7. This patch enables it.
  As with Gen6, MSAA support is still fairly preliminary. In particular, the following are not yet supported:
  - 8x oversampling (Gen7 has hardware support for this, but we do not yet expose it).
  - Fully general blits between MSAA and non-MSAA buffers.
  - Formats other than RGBA8, DEPTH24, and STENCIL8.
  - Centroid interpolation.
  - Coverage parameters (glSampleCoverage, GL_SAMPLE_ALPHA_TO_COVERAGE, GL_SAMPLE_ALPHA_TO_ONE, GL_SAMPLE_COVERAGE, GL_SAMPLE_COVERAGE_VALUE, GL_SAMPLE_COVERAGE_INVERT).
  Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965/gen6: Initial implementation of MSAA. | Paul Berry | 2012-05-15 | 1 | -0/+102
  This patch enables MSAA for Gen6, by modifying intel_mipmap_tree to understand multisampled buffers, adapting the rendering pipeline setup to enable multisampled rendering, and adding multisample resolve operations to brw_blorp_blit.cpp. Some preparation work is also included for Gen7, but it is not yet enabled.
  MSAA support is still fairly preliminary. In particular, the following are not yet supported:
  - Fully general blits between MSAA and non-MSAA buffers.
  - Formats other than RGBA8, DEPTH24, and STENCIL8.
  - Centroid interpolation.
  - Coverage parameters (glSampleCoverage, GL_SAMPLE_ALPHA_TO_COVERAGE, GL_SAMPLE_ALPHA_TO_ONE, GL_SAMPLE_COVERAGE, GL_SAMPLE_COVERAGE_VALUE, GL_SAMPLE_COVERAGE_INVERT).
  Fixes piglit tests "EXT_framebuffer_multisample/accuracy" on i965/Gen6.
  v2:
  - In intel_alloc_renderbuffer_storage(), quantize the requested number of samples to the next higher sample count supported by the hardware. This ensures that a query of GL_SAMPLES will return the correct value. It also ensures that MSAA is fully disabled on Gen7 for now (since Gen7 MSAA support doesn't work yet).
  - When reading from a non-MSAA surface, ensure that s_is_zero is true so that we won't try to read from a nonexistent sample.
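The quantization step described in the v2 notes can be sketched as a lookup over the sample counts a platform supports. The function name and the table-driven interface are hypothetical; the driver's actual helper may be structured differently.

```c
#include <stddef.h>

/* Round a requested sample count up to the next count the hardware
 * supports. `supported` must be sorted in ascending order.
 * Returns 0 (single-sampled) when no samples were requested or when no
 * supported count is large enough; callers are assumed to have rejected
 * requests above GL_MAX_SAMPLES already. */
unsigned
quantize_num_samples(unsigned requested, const unsigned *supported,
                     size_t count)
{
   if (requested == 0)
      return 0;

   for (size_t i = 0; i < count; i++) {
      if (supported[i] >= requested)
         return supported[i];
   }

   return 0;
}
```

With a table containing only 4, any non-zero request becomes 4, so a later query of GL_SAMPLES reports what was really allocated. Passing an empty table disables MSAA altogether, which corresponds to the Gen7 behaviour described in the v2 notes.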