summaryrefslogtreecommitdiffstats
path: root/src/mesa/drivers/dri/i965/brw_curbe.c
Commit message (Collapse)AuthorAgeFilesLines
* i965: get inputs read from nir infoTimothy Arceri2016-10-061-1/+3
| | | | | | | | | | This is a step towards dropping the GLSL IR version of do_set_program_inouts() in i965 and moving towards native nir support. This is important because we want to eventually convert to nir and use its optimisations passes before we can call this GLSL IR pass. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Eliminate brw->wm.prog_data pointer.Kenneth Graunke2016-10-051-3/+3
| | | | | | | | | | | | Just say no to: - brw->wm.base.prog_data = &brw->wm.prog_data->base.base; We'll just use the brw_stage_prog_data pointer in brw_stage_state and downcast it to brw_wm_prog_data as needed. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arcero@collabora.com>
* i965: Eliminate brw->vs.prog_data pointer.Kenneth Graunke2016-10-051-3/+3
| | | | | | | | | | | | Just say no to: - brw->vs.base.prog_data = &brw->vs.prog_data->base.base; We'll just use the brw_stage_prog_data pointer in brw_stage_state and downcast it to brw_vs_prog_data as needed. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arcero@collabora.com>
* i965: Use bitmask/ffs to iterate enabled clip planes.Mathias Fröhlich2016-06-161-10/+11
| | | | | | | | | | | Replaces an iterate and test bit in a bitmask loop by a loop only iterating over the bits set in the bitmask. v2: Use _mesa_bit_scan{,64} instead of open coding. v3: Use u_bit_scan{,64} instead of _mesa_bit_scan{,64}. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
* i965: Make all atoms to track BRW_NEW_BLORP by defaultKenneth Graunke2016-04-231-0/+2
| | | | Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com
* i965: Drop #include of main/glheader.h.Matt Turner2015-11-241-1/+0
| | | | | | It's never used. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
* i965: Mark constant static data as const.Matt Turner2015-07-141-1/+1
| | | | Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
* i965: Implement proper workaround for Gen4 GPU CONSTANT_BUFFER hangs.Kenneth Graunke2015-04-141-13/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | I finally managed to dig up some information on our mysterious GPU hangs. A wiki page from the Crestline validation team mentions that they found a GPU hang in "Serious Sam 2" (on Windows) with remarkably similar conditions to the ones we've seen in Google Chrome and glmark2. Apparently, if WM_STATE has "PS Use Source Depth" enabled, CC_STATE has most depth state disabled, and you issue a CONSTANT_BUFFER command and immediately draw, the depth interpolator makes a small mistake that leads to hangs. Most of the traces I looked at contained a CONSTANT_BUFFER packet immediately followed by 3DPRIMITIVE, or at least very few packets. It appears they also have "PS Use Source Depth" enabled - either at the hang, or a little before it. So I think this is our bug. The workaround is to emit a non-pipelined state packet after issuing a CONSTANT_BUFFER packet. This is really similar to the workaround I developed in commit c4fd0c9052dd391d6f2e9bb8e6da209dfc7ef35b. v2: Fix word-wrapping issues. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965/state: Don't use brw->state.dirty.brwJordan Justen2015-03-311-2/+2
| | | | | | | | | | | | | | | Now, we only use ctx->NewDriverState. I used this bash & sed command in the i965 directory: for file in *.[ch] *.[ch]pp; do sed -i -e 's/state\.dirty\.brw/ctx.NewDriverState/g' $file done Followed by manual changes to brw_state_upload.c. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Work around mysterious Gen4 GPU hangs with minimal state changes.Kenneth Graunke2015-01-191-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Gen4 hardware appears to GPU hang frequently when using Chromium, and also when running 'glmark2 -b ideas'. Most of the error states contain 3DPRIMITIVE commands in quick succession, with very few state packets between them - usually VERTEX_BUFFERS/ELEMENTS and CONSTANT_BUFFER. I trimmed an apitrace of the glmark2 hang down to two draw calls with a glUniformMatrix4fv call between the two. Either draw by itself works fine, but together, they hang the GPU. Removing the glUniform call makes the hangs disappear. In the hardware state, this translates to removing the CONSTANT_BUFFER packet between the two 3DPRIMITIVE packets. Flushing before emitting CONSTANT_BUFFER packets also appears to make the hangs disappear. I observed a slowdown in glxgears by doing it all the time, so I've chosen to only do it when BRW_NEW_BATCH and BRW_NEW_PSP are unset (i.e. we haven't done a CS_URB_STATE change or already flushed the whole pipeline). I'd much rather understand the problem, but at this point, I don't see how we'd ever be able to track it down further. We have no real tools, and the hardware people moved on years ago. I've analyzed 20+ error states and read every scrap of documentation I could find. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80568 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85367 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Matt Turner <mattst88@gmail.com> Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
* i965: Move BRW_NEW_*_PROG_DATA flags to .brw (not .cache).Kenneth Graunke2014-12-021-6/+6
| | | | | | | | | | | | | | | | | | | | | | I put the BRW_NEW_*_PROG_DATA flags at the beginning so that brw_state_cache.c can still continue using 1 << brw_cache_id. I also added a comment explaining the difference between BRW_NEW_*_PROG_DATA and BRW_NEW_*_PROGRAM, as it took me a long time to remember it. Non-mechanical changes: - brw_state_cache.c and brw_ff_gs.c now signal .brw, not .cache. - brw_state_upload.c - INTEL_DEBUG=state changes. - brw_context.h - bit definition merging. v2: Correct the explanation of BRW_NEW_*_PROG_DATA to mention state-based recompiles, and nix the "proper subset" claim, as it's false. (Caught by Kristian Høgsberg). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965: Rename CACHE_NEW_*_PROG to BRW_NEW_*_PROG_DATA.Kenneth Graunke2014-12-021-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that we've moved a bunch of CACHE_NEW_* bits to BRW_NEW_*, the only ones that are left are legitimately related to the program cache. Yet, it seems a bit wasteful to have an entire bitfield for only 7 bits. State upload is one of the hottest paths in the driver. For each atom in the list, we call check_state() to see if it needs to be emitted. Currently, this involves comparing three separate bitfields (mesa, brw, and cache). Consolidating the brw and cache bitfields would save a small amount of CPU overhead per atom. Broadwell, for example, has 57 state atoms, so this small savings can add up. CACHE_NEW_*_PROG covers the brw_*_prog_data structures, as well as the offset into the program cache BO (prog_offset). Since most uses refer to brw_*_prog_data, I decided to use BRW_NEW_*_PROG_DATA as the name. Removing "cache" completely is a bit painful, so I decided to do it in several patches for easier review, and to separate mechanical changes from manual ones. This one simply renames things, and was made via: $ for file in *.[ch]; do sed -i -e 's/CACHE_NEW_\([A-Z_\*]*\)_PROG/BRW_NEW_\1_PROG_DATA/g' \ -e 's/BRW_NEW_WM_PROG_DATA/BRW_NEW_FS_PROG_DATA/g' $file done Note that BRW_NEW_*_PROG_DATA is still in .cache, not .brw! The next patch will remedy this flaw. It will also fix the alphabetization issues. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Acked-by: Matt Turner <mattst88@gmail.com>
* i965: Alphabetize brw_tracked_state flags and use a consistent style.Kenneth Graunke2014-11-291-7/+8
| | | | | | | | | | | | | | | | Most of the dirty flags were listed in some arbitrary order. Some used bonus parenthesis. Some put multiple flags on one line, others put one per line. Some used tabs instead of spaces...but only on some lines. This patch settles on one flag per line, in alphabetical order, using spaces instead of tabs, and sheds the unnecessary parentheses. Sorting was mostly done with vim's visual block feature and !sort, although I alphabetized short lists by hand; it was pretty manual. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965: Make Gen4-5 push constants call _mesa_load_state_parameters too.Kenneth Graunke2014-11-211-0/+4
| | | | | | | | | | | | | | | | | | In commit 5e37a2a4a8a, I made the pull constant code stop calling _mesa_load_state_parameters() when there were no pull parameters. This worked fine on Gen6+ because the push constant code also called it if there were any push constants. However, the Gen4-5 push constant code wasn't doing this. This patch makes it do so, like the Gen6+ code. A better long term solution would be to make core Mesa just handle this for us when necessary. Fixes around 8766 Piglit tests on Ironlake, and probably Gen4 as well. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Tested-by: Mark Janes <mark.a.janes@intel.com>
* Revert 5 i965 patches: 8e27a4d2, 373143ed, c5bdf9be, 6f56e142, 88e3d404Jordan Justen2014-09-041-1/+1
| | | | | | | | | | | | | | | | | | | | | | Reverts * "i965: Modify state upload to allow 2 different sets of state atoms." 8e27a4d2b3e4e74e9a77446bce49607433d86be3 * "i965: Modify dirty bit handling to support 2 pipelines." 373143ed9187c4d4ce1e3c486b5dd0880d18ec8b * "i965: Create a macro for checking a dirty bit." c5bdf9be1eca190417998d548fd140c1eca37a54 Conflicts: src/mesa/drivers/dri/i965/brw_context.h * "i965: Create a macro for setting all dirty bits." 6f56e1424d923fd80c84090fbf4506c9eaaffea1 Conflicts: src/mesa/drivers/dri/i965/brw_blorp.cpp src/mesa/drivers/dri/i965/brw_state_cache.c src/mesa/drivers/dri/i965/brw_state_upload.c * "i965: Create a macro for setting a dirty bit." 88e3d404dad009d8cff5124cf8acee7daeaceb64 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
* i965: Create a macro for setting a dirty bit.Paul Berry2014-09-011-1/+1
| | | | | | | This will make it easier to extend dirty bit handling to support compute shaders. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
* i965: Store uniform constant values in a gl_constant_value instead of floatNeil Roberts2014-08-141-10/+12
| | | | | | | | | | | | | | | | | | | | | | The brw_stage_prog_data struct previously contained an array of float pointers to the values of parameters. These were then copied into a batch buffer to upload the values using a regular assignment. However the float values were also being overloaded to store integer values for integer uniforms. This can break if x87 floating-point registers are used to do the assignment because the fst instruction tries to fix up invalid float values. If an integer constant happened to look like an invalid float value then it would get altered when it was copied into the batch buffer. This patch changes the pointers to be gl_constant_value instead so that the assignment should end up copying without any alteration. This also makes it more obvious that the values being stored here are overloaded for multiple types. There are some static asserts where the values are uploaded to ensure that the size of gl_constant_value is the same as a float. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81150 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Update a ton of comments about constant buffers.Eric Anholt2014-07-021-32/+54
| | | | Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Fix state flags for gen4/5 CURBE.Eric Anholt2014-07-021-8/+8
| | | | | | | | If we had some NOS affecting VS compilation that resulted in optimization changing the set of constants to be uploaded, we might not have reuploaded the constants. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Drop the memcmp for finding duplicated CURBE uploads.Eric Anholt2014-07-021-23/+2
| | | | | | | | | | | | | | At this point, the extra copy of the data and memcmp are as expensive as just re-uploading. Note: now that we'll always upload, and brw_constant_buffer watches BRW_NEW_BATCH anyway, we don't need to explicitly unref the old curbe_bo at batch reset time. No significant performance difference on glamor copywinwin10 (n=55), despite that test having a 98% hit rate on the cache. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Reuse intel_upload.c for gen4/5 constant buffers.Eric Anholt2014-07-021-28/+3
| | | | | | No performance difference on glamor with copywinwin10 (n=40) on my gm45. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Delete the intel_regions.c code.Eric Anholt2014-05-011-1/+0
| | | | | Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
* i965: Move the remaining driver debug over to stderr.Eric Anholt2014-02-221-13/+13
| | | | | | Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965: Move up duplicated fields from stage-specific prog_data to ↵Francisco Jerez2014-02-191-6/+6
| | | | | | | | | | | | | brw_stage_prog_data. There doesn't seem to be any reason for nr_params, nr_pull_params, param, and pull_param to be duplicated in the stage-specific subclasses of brw_stage_prog_data. Moving their definition to the common base class will allow some code sharing in a future commit, the removal of brw_vec4_prog_data_compare and brw_*_prog_data_free, and the simplification of the stage-specific brw_*_prog_data_compare. Reviewed-by: Paul Berry <stereotype441@gmail.com>
* i965: Remove CACHED_BATCH support altogether.Kenneth Graunke2014-01-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | Using an unoptimized variant of glamor spending 50% of its CPU time in brw_draw_prims() (and hitting the cache *very* frequently): N Min Max Median Avg Stddev x 200 29200 40500 34900 34750 958.43256 + 200 31000 40300 34700 34622 916.35941 No difference proven at 95.0% confidence Similarly, no difference on GLB2.7: N Min Max Median Avg Stddev x 63 64.1 71.36 70.69 70.113175 1.6782026 + 63 63.6 71.18 70.75 70.223651 1.6044186 No difference proven at 95.0% confidence v2: Rebase on master (by anholt) v3: Add a missing BEGIN_BATCH(3) to aa_line_parameters -- CACHED_BATCH didn't have the asserts about batchbuffer usage that ADVANCE_BATCH does, so we started assertion failing. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Anholt <eric@anholt.net>
* s/Tungsten Graphics/VMware/José Fonseca2014-01-171-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Tungsten Graphics Inc. was acquired by VMware Inc. in 2008. Leaving the old copyright name is creating unnecessary confusion, hence this change. This was the sed script I used: $ cat tg2vmw.sed # Run as: # # git reset --hard HEAD && find include scons src -type f -not -name 'sed*' -print0 | xargs -0 sed -i -f tg2vmw.sed # # Rename copyrights s/Tungsten Gra\(ph\|hp\)ics,\? [iI]nc\.\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./g /Copyright/s/Tungsten Graphics\(,\? [iI]nc\.\)\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./ s/TUNGSTEN GRAPHICS/VMWARE/g # Rename emails s/alanh@tungstengraphics.com/alanh@vmware.com/ s/jens@tungstengraphics.com/jowen@vmware.com/g s/jrfonseca-at-tungstengraphics-dot-com/jfonseca-at-vmware-dot-com/ s/jrfonseca\?@tungstengraphics.com/jfonseca@vmware.com/g s/keithw\?@tungstengraphics.com/keithw@vmware.com/g s/michel@tungstengraphics.com/daenzer@vmware.com/g s/thomas-at-tungstengraphics-dot-com/thellstom-at-vmware-dot-com/ s/zack@tungstengraphics.com/zackr@vmware.com/ # Remove dead links s@Tungsten Graphics (http://www.tungstengraphics.com)@Tungsten Graphics@g # C string src/gallium/state_trackers/vega/api_misc.c s/"Tungsten Graphics, Inc"/"VMware, Inc"/ Reviewed-by: Brian Paul <brianp@vmware.com>
* i965: Drop trailing whitespace from the rest of the driver.Kenneth Graunke2013-12-051-9/+9
| | | | | | | Performed via: $ for file in *; do sed -i 's/ *//g'; done Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Delete intel_context entirely.Kenneth Graunke2013-07-091-3/+2
| | | | | | | | | | This makes brw_context inherit directly from gl_context; that was the only thing left in intel_context. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
* i965: Move intel_context::bufmgr to brw_context.Kenneth Graunke2013-07-091-1/+1
| | | | | | | Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
* i965: Pass brw_context to functions rather than intel_context.Kenneth Graunke2013-07-091-2/+0
| | | | | | | | | | | | | | This makes brw_context available in every function that used intel_context. This makes it possible to start migrating fields from intel_context to brw_context. Surprisingly, this actually removes some code, as functions that use OUT_BATCH don't need to declare "intel"; they just use "brw." Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
* i965/vs: split brw_vs_prog_data into generic and VS-specific parts.Paul Berry2013-04-111-3/+3
| | | | | | | | | | | This will allow the generic parts to be re-used for geometry shaders. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> v2: Put urb_read_length and urb_entry_size in the generic struct. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Remove some stale comments about the brw_constant_buffer atom.Eric Anholt2013-02-111-6/+0
| | | | | | | These have been wrong since f428255bde93a452a7cdd48fba21839c99beb6cb back in 2009! Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Update comment about clipper constants.Kenneth Graunke2012-11-011-9/+1
| | | | | | | The old VS backend doesn't exist, but I believe these still need to be delivered to the clipper thread. Reviewed-by: Eric Anholt <eric@anholt.net>
* i965/vs: Remove support for the old parameter layout.Kenneth Graunke2012-11-011-19/+2
| | | | | | Only the old backend used it. Reviewed-by: Eric Anholt <eric@anholt.net>
* i965: Remove unused param conversion code.Eric Anholt2012-07-251-2/+1
| | | | | | | | Ever since ctx->NativeIntegers was set, the conversion flag has been PARAM_NO_CONVERT. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965/gen4: Move CURBE offset calculation to emit() time.Eric Anholt2011-10-291-1/+1
| | | | | | | This is consumed by the unit state. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Paul Berry <stereotype441@gmail.com>
* i965/gen4: Fold push constant prepare()/emit() together.Eric Anholt2011-10-291-13/+9
| | | | | | | | While other units need to know about our constant buffer offsets, nothing else cared about which particular BO other than the emit() half. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Paul Berry <stereotype441@gmail.com>
* i965: Remove the validated BO list, now that it's unused.Eric Anholt2011-10-291-2/+0
| | | | | Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Paul Berry <stereotype441@gmail.com>
* mesa: Create _mesa_bitcount_64() to replace i965's brw_count_bits()Paul Berry2011-10-061-1/+1
| | | | | | | | | | | | | | | | The i965 driver already had a function to count bits in a 64-bit uint (brw_count_bits()), but it was buggy (it only counted the bottom 32 bits) and it was clumsy (it had a strange and broken fallback for non-GCC-like compilers, which fortunately was never used). Since Mesa already has a _mesa_bitcount() function, it seems better to just create a _mesa_bitcount_64() function rather than special-case this in the i965 driver. This patch creates the new _mesa_bitcount_64() function and rewrites all of the old brw_count_bits() calls to refer to it. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>
* i965 Gen6: Implement gl_ClipVertex.Paul Berry2011-10-051-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements proper support for gl_ClipVertex by causing the new VS backend to populate the clip distance VUE slots using VERT_RESULT_CLIP_VERTEX when appropriate, and by using the untransformed clip planes in ctx->Transform.EyeUserPlane rather than the transformed clip planes in ctx->Transform._ClipUserPlane when a GLSL-based vertex shader is in use. When not using a GLSL-based vertex shader, we use ctx->Transform._ClipUserPlane (which is what we used prior to this patch). This ensures that clipping is still performed correctly for fixed function and ARB vertex programs. A new function, brw_select_clip_planes() is used to determine whether to use _ClipUserPlane or EyeUserPlane, so that the logic for making this decision is shared between the new and old vertex shaders. Fixes the following Piglit tests on i965 Gen6: - vs-clip-vertex-const-accept - vs-clip-vertex-const-reject - vs-clip-vertex-different-from-position - vs-clip-vertex-equal-to-position - vs-clip-vertex-homogeneity - vs-clip-based-on-position - vs-clip-based-on-position-homogeneity - clip-plane-transformation clipvert_pos - clip-plane-transformation pos_clipvert - clip-plane-transformation pos Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad@chad-versace.us>
* i965 new VS: don't share clip plane constants in pre-GEN6Paul Berry2011-09-281-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | In pre-GEN6, when using clip planes, both the vertex shader and the clipper need access to the client-supplied clip planes, since the vertex shader needs them to set the clip flags, and the clipper needs them to determine where to insert new vertices. With the old VS backend, we used a clever optimization to avoid placing duplicate copies of these planes in the CURBE: we used the same block of memory for both the clipper and vertex shader constants, with the clip planes at the front of it, and then we instructed the clipper to read just the initial part of this block containing the clip planes. This optimization was tricky, of dubious value, and not completely working in the new VS backend, so I've removed it. Now, when using the new VS backend, separate parts of the CURBE are used for the clipper and the vertex shader. Note that this doesn't affect the number of push constants available to the vertex shader, it simply causes the CURBE to occupy a few more bytes of URB memory. The old VS backend is unaffected. GEN6+, which does clipping entirely in hardware, is also unaffected. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Remove bogus assertion on MAX_CLIP_PLANES.Paul Berry2011-09-201-1/+0
| | | | | | | | | This patch removes the assertion "MAX_CLIP_PLANES == 6" from the i965 driver. This assertion is unnecessary; nothing in the driver requires MAX_CLIP_PLANES to be 6. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
* i965: Use native integer uniforms when the new VS backend is in use.Eric Anholt2011-08-301-2/+1
| | | | Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965/vs: Start adding support for uniformsEric Anholt2011-08-161-10/+17
| | | | There's no clever packing here, no pull constants, and no array support.
* i965: Move repeat-instruction-suppression to batchbuffer coreChris Wilson2011-02-211-9/+11
| | | | | | | | Move the tracking of the last emitted instructions into the core batchbuffer routines and take advantage of the shadow batch copy to avoid extra memory allocations and copies. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* i965: Drop push-mode reladdr constant loading and always use constant_map.Eric Anholt2010-12-081-15/+7
| | | | | | | | This eases the gen6 implementation, which can only handle up to 32 registers of constants, while likely not penalizing real apps using reladdr since all of those I've seen also end up hitting the pull constant buffer. On gen6, the constant map means that simple NV VPs fit under the 32-reg limit and now succeed. Fixes around 10 testcases.
* i965: Make FS uniforms be the actual type of the uniform at upload time.Eric Anholt2010-10-271-2/+4
| | | | | | | | This fixes some insanity that would otherwise be required for GLSL 1.30 bit ops or gen6 integer uniform operations in general, at the cost of upload-time pain. Given that we only have that pain because mesa's mangling our integer uniforms to be floats, this something that should be fixed outside of the shader codegen.
* Drop GLcontext typedef and use struct gl_context insteadKristian Høgsberg2010-10-131-2/+2
|
* Merge branch 'shader-file-reorg'Brian Paul2010-06-231-3/+3
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1. Move all GL entrypoint functions and files into src/mesa/main/ This includes the ARB vp/vp, NV vp/fp, ATI fragshader and GLSL bits that were in src/mesa/shader/ 2. Move src/mesa/shader/slang/ to src/mesa/slang/ to reduce the tree depth 3. Rename src/mesa/shader/ to src/mesa/program/ since all the remaining files are concerned with GPU programs. 4. Misc code refactoring. In particular, I got rid of most of the GLSL-related ctx->Driver hook functions. None of the drivers used them. Conflicts: src/mesa/drivers/dri/i965/brw_context.c
| * mesa: rename src/mesa/shader/ to src/mesa/program/Brian Paul2010-06-101-3/+3
| |