external_mesa3d.git - Unnamed repository; edit this file 'description' to name the repository.

	Commit message (Collapse)	Author	Age	Files	Lines
*	i965: Eliminate brw->wm.prog_data pointer.	Kenneth Graunke	2016-10-05	1	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \|	Just say no to: - brw->wm.base.prog_data = &brw->wm.prog_data->base.base; We'll just use the brw_stage_prog_data pointer in brw_stage_state and downcast it to brw_wm_prog_data as needed. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arcero@collabora.com>
*	i965: get rid of duplicated values from gen_device_info	Lionel Landwerlin	2016-09-23	1	-1/+2
\| \| \| \| \| \| \| \|	Now that we have gen_device_info mutable, we can update its values and drop all copies we had in brw_context. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965: use new subroutine index uploader.	Dave Airlie	2016-08-23	1	-0/+3
\| \| \| \| \| \| \| \|	This plugs the subroutine index updates into the i965 backend, where it loads constants. Signed-off-by: Dave Airlie <airlied@redhat.com> Acked-by: Andres Gomez <agomez@igalia.com>
*	i965: Fix cross-primitive scratch corruption when changing the per-thread ↵	Francisco Jerez	2016-06-13	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	allocation. I haven't found any mention of this in the hardware docs, but experimentally what seems to be going on is that when the per-thread scratch slot size is changed between two pipelined draw calls, shader invocations using the old and new scratch size setting may end up being executed in parallel, causing their scratch offset calculations to be based in a different partitioning of the scratch space, which can cause their thread-local scratch space to overlap leading to cross-thread scratch corruption. I've been experimenting with alternative workarounds, like emitting a PIPE_CONTROL with DC flush and CS stall between draw (or dispatch compute) calls using different per-thread scratch allocation settings, or avoiding reuse of the scratch BO if the per-thread scratch allocation doesn't exactly match the original. Both seem to be as effective as this workaround, but they have potential performance implications, while this should be basically for free. Fixes over 40 failures in our CI system with spilling forced on (including CTS, dEQP and Piglit failures) on a number of different platforms from Gen4 to Gen9. The 'glsl-max-varyings' piglit test seems to be able to reproduce this bug consistently in the vertex shader on at least Gen4, Gen8 and Gen9 with spilling forced on. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965/fs: Organize prog_data by ksp number rather than SIMD width	Jason Ekstrand	2016-05-14	1	-52/+11
\| \| \| \| \| \| \| \| \| \|	The hardware packets organize kernel pointers and GRF start by slots that don't map directly to dispatch width. This means that all of the state setup code has to re-arrange the data from prog_data into these slots. This logic has been duplicated 4 times in the GL driver and one more time in the Vulkan driver. Let's just put it all in brw_fs.cpp. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965/state: Clean up WM/PS state to pull more things out of prog_data	Jason Ekstrand	2016-05-14	1	-20/+8
\| \| \| \| \| \| \| \| \|	Now that we have a persample_shading bit in prog_data we can reduce the amount the state setup code needs to be looking at the GL state. In particular, it no longer pulls anything directly out of the gl_fragment_program and no longer depends on NEW_FRAGMENT_PROGRAM. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965/fs: Rework the persample shading key/prog_data bits	Jason Ekstrand	2016-05-14	1	-4/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit reworks and simplifies the way we handle persample shading in the shader key and prog_data. The previous approach had three different key bits that had slightly different and hard-to-decern meanings while the new bits are far more clear. This commit changes it to two easily understood bits that communicate everything we need: 1) key->persample_interp: means that the user has requested persample interpolation through the API. This is equivalent to having SAMPLE_SHADING enabled and having MIN_SAMPLE_SHADING_VALUE set high enough that you actually get multiple per-sample invocations. 2) key->multisample_fbo: means that the shader will be running on an actual multi-sampled framebuffer. This commit also adds a new "persample_dispatch" bit to prog_data which indicates that the shader should be run in persample mode. This way the state setup code doesn't have to look at the fragment program or GL state and can just pull that data out of the prog_data. In theory, this shuffle could mean more recompiles. However, in practice, we were shoving enough state into the key before that we were probably hitting a recompile on every per-sample shader anyway. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965: Make all atoms to track BRW_NEW_BLORP by default	Kenneth Graunke	2016-04-23	1	-0/+2
\| \| \| \|	Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com
*	i965: Use _mesa_geometric_ functions appropriately	Kevin Rogovin	2015-06-17	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Change references to gl_framebuffer::Width, Height, MaxNumLayers and Visual::samples to use the _mesa_geometry_ convenience functions for those places where the geometry of the gl_framebuffer is needed (in contrast to the geometry of the intersection of the attachments of the gl_framebuffer). This patch is to pave the way to enable GL_ARB_framebuffer_no_attachments on Gen7 and higher in i965. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
*	i965/wm/gen6: Add option for disabling statistics collection	Topi Pohjolainen	2015-05-07	1	-3/+11
\| \| \| \| \| \| \| \|	Normally this is always needed but for internal blits and clears we need to be able to disable it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
*	i965/wm/gen6: Refactor state setup	Topi Pohjolainen	2015-05-07	1	-45/+66
\| \| \| \| \|	Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
*	i965: Respect the no_8 flag on Gen6, not just Gen7+.	Kenneth Graunke	2015-01-12	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When doing repclears, we only want to use the SIMD16 program, not the SIMD8 one. Kristian added this to the Gen7+ code, but apparently we missed it in the Gen6 code. This patch copies that code over. Approximately doubles the performance in a clear microbenchmark from mesa-demos (clearspd -width 500 -height 500 +color) on Sandybridge. Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com> References: https://code.google.com/p/chrome-os-partner/issues/detail?id=34681
*	i965: Store floating point mode choice in brw_stage_prog_data.	Kenneth Graunke	2014-12-04	1	-6/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We use IEEE mode for GLSL programs, but need to use ALT mode for ARB programs so that 0^0 == 1. The choice is based entirely on the shader source language. Previously, our code to determine which mode we wanted was duplicated in 8 different places (VS and FS for Gen4-5, Gen6, Gen7, and Gen8). The ctx->_Shader->CurrentProgram[stage] == NULL check was confusing as well - we use CurrentProgram (non-derived state), but _Shader (derived state). It also relies on knowing that ARB programs don't use gl_shader_program structures today. The compiler already makes this assumption in a few places, but I'd rather keep that assumption out of the state upload code. With this patch, we select the mode at compile time, and store that choice in prog_data. The state upload code simply uses that decision. This eliminates a BRW_NEW_*_PROGRAM dependency in the state upload code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
*	i965: Move BRW_NEW_*_PROG_DATA flags to .brw (not .cache).	Kenneth Graunke	2014-12-02	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I put the BRW_NEW__PROG_DATA flags at the beginning so that brw_state_cache.c can still continue using 1 << brw_cache_id. I also added a comment explaining the difference between BRW_NEW__PROG_DATA and BRW_NEW__PROGRAM, as it took me a long time to remember it. Non-mechanical changes: - brw_state_cache.c and brw_ff_gs.c now signal .brw, not .cache. - brw_state_upload.c - INTEL_DEBUG=state changes. - brw_context.h - bit definition merging. v2: Correct the explanation of BRW_NEW__PROG_DATA to mention state-based recompiles, and nix the "proper subset" claim, as it's false. (Caught by Kristian Høgsberg). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com>
*	i965: Rename CACHE_NEW__PROG to BRW_NEW__PROG_DATA.	Kenneth Graunke	2014-12-02	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that we've moved a bunch of CACHE_NEW_* bits to BRW_NEW_, the only ones that are left are legitimately related to the program cache. Yet, it seems a bit wasteful to have an entire bitfield for only 7 bits. State upload is one of the hottest paths in the driver. For each atom in the list, we call check_state() to see if it needs to be emitted. Currently, this involves comparing three separate bitfields (mesa, brw, and cache). Consolidating the brw and cache bitfields would save a small amount of CPU overhead per atom. Broadwell, for example, has 57 state atoms, so this small savings can add up. CACHE_NEW__PROG covers the brw__prog_data structures, as well as the offset into the program cache BO (prog_offset). Since most uses refer to brw__prog_data, I decided to use BRW_NEW__PROG_DATA as the name. Removing "cache" completely is a bit painful, so I decided to do it in several patches for easier review, and to separate mechanical changes from manual ones. This one simply renames things, and was made via: $ for file in .[ch]; do sed -i -e 's/CACHE_NEW_$[A-Z_\]$_PROG/BRW_NEW_\1_PROG_DATA/g' \ -e 's/BRW_NEW_WM_PROG_DATA/BRW_NEW_FS_PROG_DATA/g' $file done Note that BRW_NEW_*_PROG_DATA is still in .cache, not .brw! The next patch will remedy this flaw. It will also fix the alphabetization issues. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Acked-by: Matt Turner <mattst88@gmail.com>
*	i965: Alphabetize brw_tracked_state flags and use a consistent style.	Kenneth Graunke	2014-11-29	1	-13/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Most of the dirty flags were listed in some arbitrary order. Some used bonus parenthesis. Some put multiple flags on one line, others put one per line. Some used tabs instead of spaces...but only on some lines. This patch settles on one flag per line, in alphabetical order, using spaces instead of tabs, and sheds the unnecessary parentheses. Sorting was mostly done with vim's visual block feature and !sort, although I alphabetized short lists by hand; it was pretty manual. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com>
*	i965: Use brw_wm_prog_data::uses_kill, not gl_fragment_program::UsesKill	Kenneth Graunke	2014-11-27	1	-1/+1
\| \| \| \| \| \|	Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
*	i965: Create prog_data temporary variables in PS state upload code.	Kenneth Graunke	2014-11-27	1	-25/+21
\| \| \| \| \| \| \| \| \|	prog_data->foo is a bit more readable than brw->wm.prog_data->foo. The local variable definition is also a great location to put the obligatory /* CACHE_NEW_WM_PROG */ comment. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
*	i965: Move curb_read_length/total_scratch to brw_stage_prog_data.	Kenneth Graunke	2014-09-03	1	-2/+2
\| \| \| \| \| \| \| \|	All shader stages have these fields, so it makes sense to store them in the common base structure, rather than duplicating them in each. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
*	i965: Assign PS kernel start pointers when we decide which kernels to use	Kristian Høgsberg	2014-08-14	1	-9/+10
\| \| \| \| \| \| \| \| \| \| \|	Right now we decide which kernels to use and the GRF start offsets in one place and emit the kernel pointers later. The logic of how to map 8, 16 and 32 kernels to kernel start pointers follows the same logic as which GRF start offsets to use, so lets figure out these two things in one place. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
*	i965/gen6: Add a spec citation about push constant packet requirements.	Eric Anholt	2014-07-02	1	-1/+8
\| \| \| \|	Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965/gen6+: Merge VS/GS and WM push constant buffer upload paths.	Eric Anholt	2014-07-02	1	-38/+3
\| \| \| \|	Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965: Move dispatch_grf_start_reg and first_curbe_grf into stage_prog_data.	Eric Anholt	2014-07-02	1	-5/+6
\| \| \| \| \| \| \|	I wanted to access this value from stage-generic code, so stop storing it under two different names. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965: Move push constant state packets to push constant update time.	Eric Anholt	2014-05-02	1	-1/+7
\| \| \| \| \| \| \|	-0.553779% +/- 0.423394% effect on cairo-perf-trace runtime on glamor (n=612) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965/gen6: Don't update unit state when samplers change.	Eric Anholt	2014-05-02	1	-2/+1
\| \| \| \| \| \|	There's no remaining dependency between these two packets that I can find. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965: Drop a NEW_SAMPLER annotation for use of sampler_count.	Eric Anholt	2014-05-02	1	-1/+0
\| \| \| \| \| \| \|	The sampler count is set up from the gl_program at draw time, not at sampler change time. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965: Fix state flag comments on color_buffer_write_enabled() calls.	Eric Anholt	2014-04-30	1	-0/+1
\| \| \| \|	Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965: Drop bogus state flag comment.	Eric Anholt	2014-04-30	1	-1/+0
\| \| \| \| \| \| \|	This was introduced with the comment and code below it, though the code only touches prog_data (CACHE_NEW_WM_PROG). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	mesa/sso: rename Shader to the pointer _Shader	Gregory Hainaut	2014-03-25	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Basically a sed but shaderapi.c and get.c. get.c => GL_CURRENT_PROGAM always refer to the "old" UseProgram behavior shaderapi.c => the old api stil update the Shader object directly V2: formatting improvement V3 (idr): * Rebase fixes after a block of code was moved from ir_to_mesa.cpp to shaderapi.c. * Trivial reformatting. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>
*	i965: Move the remaining driver debug over to stderr.	Eric Anholt	2014-02-22	1	-6/+6
\| \| \| \| \| \|	Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
*	i965: Move up duplicated fields from stage-specific prog_data to ↵	Francisco Jerez	2014-02-19	1	-7/+7
\| \| \| \| \| \| \| \| \| \| \| \| \|	brw_stage_prog_data. There doesn't seem to be any reason for nr_params, nr_pull_params, param, and pull_param to be duplicated in the stage-specific subclasses of brw_stage_prog_data. Moving their definition to the common base class will allow some code sharing in a future commit, the removal of brw_vec4_prog_data_compare and brw__prog_data_free, and the simplification of the stage-specific brw__prog_data_compare. Reviewed-by: Paul Berry <stereotype441@gmail.com>
*	i965: Fix comments to refer to the new ctx->Shader.CurrentProgram array.	Paul Berry	2014-01-21	1	-2/+2
\| \| \| \| \|	Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com>
*	mesa: Replace ctx->Shader.Current{Vertex,Fragment,Geometry}Program with an ↵	Paul Berry	2014-01-21	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	array. These are replaced with ctx->Shader.CurrentProgram[MESA_SHADER_{VERTEX,FRAGMENT,GEOMETRY}]. In patches to follow, this will allow us to replace a lot of ad-hoc logic with a variable index into the array. With the exception of the changes to mtypes.h, this patch was generated entirely by the command: find src -type f '(' -iname '.c' -o -iname '.cpp' ')' \ -print0 \| xargs -0 sed -i \ -e 's/\.CurrentVertexProgram/.CurrentProgram[MESA_SHADER_VERTEX]/g' \ -e 's/\.CurrentGeometryProgram/.CurrentProgram[MESA_SHADER_GEOMETRY]/g' \ -e 's/\.CurrentFragmentProgram/.CurrentProgram[MESA_SHADER_FRAGMENT]/g' Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com>
*	i965: Add an option to ignore sample qualifier	Anuj Phogat	2014-01-21	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	This will be useful in my next patch which depends on a functionality of _mesa_get_min_invocations_per_fragment() to ignore the sample qualifier (prog->IsSample) based on a flag passed to it. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
*	i965: Fix 'SIMD16 only' dispatch of fragment shader in case of sample shading	Anuj Phogat	2013-11-07	1	-7/+13
\| \| \| \| \| \| \| \| \| \| \|	This patch make changes to correctly set up the Dispatch GRF Start Register in case of 'SIMD16 only' FS dispatch. This fixes an issue of incorrect rendering on dolphin emulator with GL_SAMPLE_SHADING enabled. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>
*	i965/gen6: Don't allow SIMD16 dispatch in 4x PERPIXEL mode with computed depth.	Paul Berry	2013-11-06	1	-1/+33
\| \| \| \| \| \| \|	Hardware docs say we can only use SIMD8 dispatch in this condition. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
*	i965: Tell the unit states how many binding table entries we have.	Eric Anholt	2013-11-05	1	-0/+5
\| \| \| \| \| \| \| \| \| \|	Before the series with 3c9dc2d31b80fc73bffa1f40a91443a53229c8e2 to dynamically assign our binding table indices, we didn't really track our binding table count per shader, so we never filled in these fields. Affects cairo-gl trace runtime by -2.47953% +/- 1.07281% (n=20) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965/gen6: Enable the features required for GL_ARB_sample_shading	Anuj Phogat	2013-11-01	1	-5/+56
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- Enable GEN6_WM_MSDISPMODE_PERSAMPLE, GEN6_WM_POSOFFSET_SAMPLE, GEN6_WM_OMASK_TO_RENDER_TARGET as per extension's specification. - Only enable one of GEN6_WM_8_DISPATCH_ENABLE or GEN6_WM_16_DISPATCH_ENABLE when GEN6_WM_MSDISPMODE_PERSAMPLE is enabled. Refer SNB PRM Vol. 2, Part 1, Page 279 for details. V2: - Use shared function _mesa_get_min_invocations_per_fragment(). - Use brw_wm_prog_data variables: uses_pos_offset, uses_omask. V3: - Enable simd16 dispatch with per sample shading. - Make changes to give preference to 'simd16 only' mode over 'simd8 only' mode in case of non 1x per sample shading. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>
*	i965/fs: Remove bogus field prog_data->dispatch_width.	Paul Berry	2013-10-15	1	-4/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Despite the name, this field wasn't being set to the dispatch width at all; it was always 8. The only place it was used was that the constant buffer read length was aligned to it, and as far as I can tell from the docs, there is no need to align this value to the dispatch width; aligning it to a multiple of 8 is sufficient. So I've just replaced it with a hardcoded 8. v2: In gen6_wm_state, use brw->wm.base.push_const_size for consistency with VS and GS state upload. Reviewed-by: Eric Anholt <eric@anholt.net>
*	i965: Set brw_stage_state::push_const_size for PS constants.	Kenneth Graunke	2013-09-16	1	-1/+6
\| \| \| \| \| \| \| \| \|	This paves the way for using gen7_upload_constant_state for PS data. The formula is copied from gen7_wm_state.c. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>
*	i965: Introduce a prog_data temporary in gen6_upload_wm_push_constants.	Kenneth Graunke	2013-09-16	1	-8/+8
\| \| \| \| \| \| \|	This saves a bit of typing and shortens a few lines. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>
*	i965/fs: Consult brw_wm_prog_data::num_varying_inputs when setting up WM state.	Paul Berry	2013-09-16	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, we assumed that the number of varying inputs consumed by the fragment shader was equal to the number of bits set in gl_program::InputsRead. However, we'll soon be making two changes that will cause that not to be true: - We'll stop wasting varying input space for gl_FragCoord and gl_FrontFacing, which aren't varyings. - For fragment shaders that have more than 16 varying inputs, we'll adjust the layout of the inputs to account for the fact that the SF/SBE pipeline stage can't reorder inputs beyond the first 16; if there are GS outputs that the FS doens't use (or vice versa) this may cause the number of FS varying inputs to change. So, instead of trying to guess the number of FS inputs from gl_program::InputsRead, simply read it from brw_wm_prog_data:num_varying_inputs, which is guaranteed to be correct since it's populated by fs_visitor::calculate_urb_setup(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965: Use brw_stage_state for WM data as well.	Kenneth Graunke	2013-09-13	1	-6/+8
\| \| \| \| \| \| \| \|	This gets the VS, GS, and PS all using the same data structure. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Paul Berry <stereotype441@gmail.com>
*	i965: Make sure constants re-sent after constant buffer reallocation.	Paul Berry	2013-08-31	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The hardware requires that after constant buffers for a stage are allocated using a 3DSTATE_PUSH_CONSTANT_ALLOC_{VS,HS,DS,GS,PS} command, and prior to execution of a 3DPRIMITIVE, the corresponding stage's constant buffers must be reprogrammed using a 3DSTATE_CONSTANT_{VS,HS,DS,GS,PS} command. Previously we didn't need to worry about this, because we only programmed 3DSTATE_PUSH_CONSTANT_ALLOC_{VS,HS,DS,GS,PS} once on startup (or, previous to that, whenever BRW_NEW_CONTEXT was flagged). But now that we reallocate the constant buffers whenever geometry shaders are switched on and off, we need to make sure the constant buffers are reprogrammed. We do this by adding a new bit, BRW_NEW_PUSH_CONSTANT_ALLOCATION, to brw->state.dirty.brw. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965: Split sampler count variable to be per-stage.	Kenneth Graunke	2013-08-19	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	Currently, we only have a single sampler state table shared among all stages, so we just copy wm.sampler_count into vs.sampler_count. In the future, each shader stage will have its own SAMPLER_STATE table, at which point we'll need these separate sampler counts. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>
*	i965: Delete intel_context entirely.	Kenneth Graunke	2013-07-09	1	-4/+2
\| \| \| \| \| \| \| \| \| \|	This makes brw_context inherit directly from gl_context; that was the only thing left in intel_context. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
*	Replace gl_frag_attrib enum with gl_varying_slot.	Paul Berry	2013-03-15	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	This patch makes the following search-and-replace changes: gl_frag_attrib -> gl_varying_slot FRAG_ATTRIB_* -> VARYING_SLOT_* FRAG_BIT_* -> VARYING_BIT_* Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Brian Paul <brianp@vmware.com>
*	i965: Replace brw_wm_* with dumping code into the fs_visitor.	Eric Anholt	2012-10-08	1	-6/+2
\| \| \| \| \| \| \| \| \| \| \|	This makes a giant pile of code newly dead. It also fixes TXB on newer chipsets, which has been totally broken (I now have a piglit test for that). It passes the same set of Ian's ARB_fragment_program tests. It also improves high-settings ETQW performance by 3.2 +/- 1.9% (n=3), thanks to better optimization and having 8-wide along with 16-wide shaders. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=24355 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965/msaa: Treat GL_SAMPLES=1 as equivalent to GL_SAMPLES=0.	Paul Berry	2012-08-01	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	EXT_framebuffer_multisample is a required subpart of ARB_framebuffer_object, which means that we must support it even on platforms that don't support MSAA. Fortunately EXT_framebuffer_multisample allows for this by allowing GL_MAX_SAMPLES to be set to 1. This leads to a tricky quirk in the GL spec: since GlRenderbufferStorageMultisamples() accepts any value for its "samples" parameter up to and including GL_MAX_SAMPLES, that means that on platforms that don't support MSAA, GL_SAMPLES is allowed to be set to either 0 or 1. On platforms that do support MSAA, GL_SAMPLES=1 is not used; 0 means no MSAA, and 2 or higher means MSAA. In other words, GL_SAMPLES needs to be interpreted as follows: =0 no MSAA (possible on all platforms) =1 no MSAA (only possible on platforms where MSAA unsupported) >1 MSAA (only possible on platforms where MSAA supported) This patch modifies all MSAA-related code to choose between multisampling and single-sampling based on the condition (GL_SAMPLES > 1) instead of (GL_SAMPLES > 0) so that GL_SAMPLES=1 will be treated as "no MSAA". Note that since GL_SAMPLES=1 implies GL_SAMPLE_BUFFERS=1, we can no longer use GL_SAMPLE_BUFFERS to distinguish between MSAA and non-MSAA rendering. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
*	i965: Remove unused param conversion code.	Eric Anholt	2012-07-25	1	-2/+1
\| \| \| \| \| \| \| \|	Ever since ctx->NativeIntegers was set, the conversion flag has been PARAM_NO_CONVERT. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>