summaryrefslogtreecommitdiffstats
path: root/src/mesa/drivers/dri/i965/brw_vs_surface_state.c
Commit message (Collapse)AuthorAgeFilesLines
* i965: use new subroutine index uploader.Dave Airlie2016-08-231-0/+2
| | | | | | | | This plugs the subroutine index updates into the i965 backend, where it loads constants. Signed-off-by: Dave Airlie <airlied@redhat.com> Acked-by: Andres Gomez <agomez@igalia.com>
* i965: Make all atoms to track BRW_NEW_BLORP by defaultKenneth Graunke2016-04-231-0/+4
| | | | Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com
* i965/state: Get rid of dword_pitch arguments to buffer functionsJason Ekstrand2015-12-071-12/+4
| | | | | Cc: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* glsl: keep track of intra-stage indices for atomicsTimothy Arceri2015-10-271-2/+2
| | | | | | | | | | | | | | | This is more optimal as it means we no longer have to upload the same set of ABO surfaces to all stages in the program. This also fixes a bug where since commit c0cd5b var->data.binding was being used as a replacement for atomic buffer index, but they don't have to be the same value they just happened to end up the same when binding is 0. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Cc: Ilia Mirkin <imirkin@alum.mit.edu> Cc: Alejandro Piñeiro <apinheiro@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90175
* i965: Use _mesa_is_image_unit_valid() instead of gl_image_unit::_Valid.Francisco Jerez2015-10-091-1/+2
| | | | | | | | gl_image_unit::_Valid will be removed in a future commit. Tested-by: Ye Tian <yex.tian@intel.com> CC: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
* i965: Add 64-bit dirty flag handling to brw_upload_pull_constantsChris Forbes2015-09-081-1/+1
| | | | | Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Resolve GCC sign-compare warning.Rhys Kidd2015-08-181-1/+1
| | | | | | | | | | | | | mesa/src/mesa/drivers/dri/i965/brw_vs_surface_state.c: In function 'brw_upload_pull_constants': mesa/src/mesa/drivers/dri/i965/brw_vs_surface_state.c:84:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = 0; i < prog_data->nr_pull_params; i++) { ^ mesa/src/mesa/drivers/dri/i965/brw_vs_surface_state.c:89:21: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = 0; i < ALIGN(prog_data->nr_pull_params, 4) / 4; i++) { ^ Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
* i965: Hook up image state upload.Francisco Jerez2015-08-111-0/+25
| | | | | | | | v2: Add CS support. Move the image_params array back to brw_stage_prog_data. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>
* i965: Create a shader_dispatch_mode enum to replace VS/GS fields.Kenneth Graunke2015-06-011-2/+2
| | | | | | | | | | | | | We used to store the GS dispatch mode in brw_gs_prog_data while separately storing the VS dispatch mode in brw_vue_prog_data::simd8. This patch introduces an enum to represent all possible dispatch modes, and stores it in brw_vue_prog_data::dispatch_mode, unifying the two. Based on a suggestion by Matt Turner. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
* i965/state: Don't use brw->state.dirty.brwJordan Justen2015-03-311-2/+2
| | | | | | | | | | | | | | | Now, we only use ctx->NewDriverState. I used this bash & sed command in the i965 directory: for file in *.[ch] *.[ch]pp; do sed -i -e 's/state\.dirty\.brw/ctx.NewDriverState/g' $file done Followed by manual changes to brw_state_upload.c. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Add new SIMD8 VS prog data flagKristian Høgsberg2014-12-101-2/+8
| | | | | | | | | | This flag signals that we have a SIMD8 VS shader so we can set up the corresponding state accordingly. This boils down to setting the BDW+ SIMD8 enable bit in 3DSTATE_VS and making UBO and pull constant buffers use dword pitch. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Move BRW_NEW_*_PROG_DATA flags to .brw (not .cache).Kenneth Graunke2014-12-021-6/+6
| | | | | | | | | | | | | | | | | | | | | | I put the BRW_NEW_*_PROG_DATA flags at the beginning so that brw_state_cache.c can still continue using 1 << brw_cache_id. I also added a comment explaining the difference between BRW_NEW_*_PROG_DATA and BRW_NEW_*_PROGRAM, as it took me a long time to remember it. Non-mechanical changes: - brw_state_cache.c and brw_ff_gs.c now signal .brw, not .cache. - brw_state_upload.c - INTEL_DEBUG=state changes. - brw_context.h - bit definition merging. v2: Correct the explanation of BRW_NEW_*_PROG_DATA to mention state-based recompiles, and nix the "proper subset" claim, as it's false. (Caught by Kristian Høgsberg). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965: Rename CACHE_NEW_*_PROG to BRW_NEW_*_PROG_DATA.Kenneth Graunke2014-12-021-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that we've moved a bunch of CACHE_NEW_* bits to BRW_NEW_*, the only ones that are left are legitimately related to the program cache. Yet, it seems a bit wasteful to have an entire bitfield for only 7 bits. State upload is one of the hottest paths in the driver. For each atom in the list, we call check_state() to see if it needs to be emitted. Currently, this involves comparing three separate bitfields (mesa, brw, and cache). Consolidating the brw and cache bitfields would save a small amount of CPU overhead per atom. Broadwell, for example, has 57 state atoms, so this small savings can add up. CACHE_NEW_*_PROG covers the brw_*_prog_data structures, as well as the offset into the program cache BO (prog_offset). Since most uses refer to brw_*_prog_data, I decided to use BRW_NEW_*_PROG_DATA as the name. Removing "cache" completely is a bit painful, so I decided to do it in several patches for easier review, and to separate mechanical changes from manual ones. This one simply renames things, and was made via: $ for file in *.[ch]; do sed -i -e 's/CACHE_NEW_\([A-Z_\*]*\)_PROG/BRW_NEW_\1_PROG_DATA/g' \ -e 's/BRW_NEW_WM_PROG_DATA/BRW_NEW_FS_PROG_DATA/g' $file done Note that BRW_NEW_*_PROG_DATA is still in .cache, not .brw! The next patch will remedy this flaw. It will also fix the alphabetization issues. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Acked-by: Matt Turner <mattst88@gmail.com>
* i965: Alphabetize brw_tracked_state flags and use a consistent style.Kenneth Graunke2014-11-291-4/+7
| | | | | | | | | | | | | | | | Most of the dirty flags were listed in some arbitrary order. Some used bonus parenthesis. Some put multiple flags on one line, others put one per line. Some used tabs instead of spaces...but only on some lines. This patch settles on one flag per line, in alphabetical order, using spaces instead of tabs, and sheds the unnecessary parentheses. Sorting was mostly done with vim's visual block feature and !sort, although I alphabetized short lists by hand; it was pretty manual. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965: Skip _mesa_load_state_parameters when there are zero parameters.Kenneth Graunke2014-11-201-5/+5
| | | | | | | | Saves a tiny bit of CPU overhead. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Eric Anholt <eric@anholt.net>
* Revert 5 i965 patches: 8e27a4d2, 373143ed, c5bdf9be, 6f56e142, 88e3d404Jordan Justen2014-09-041-2/+2
| | | | | | | | | | | | | | | | | | | | | | Reverts * "i965: Modify state upload to allow 2 different sets of state atoms." 8e27a4d2b3e4e74e9a77446bce49607433d86be3 * "i965: Modify dirty bit handling to support 2 pipelines." 373143ed9187c4d4ce1e3c486b5dd0880d18ec8b * "i965: Create a macro for checking a dirty bit." c5bdf9be1eca190417998d548fd140c1eca37a54 Conflicts: src/mesa/drivers/dri/i965/brw_context.h * "i965: Create a macro for setting all dirty bits." 6f56e1424d923fd80c84090fbf4506c9eaaffea1 Conflicts: src/mesa/drivers/dri/i965/brw_blorp.cpp src/mesa/drivers/dri/i965/brw_state_cache.c src/mesa/drivers/dri/i965/brw_state_upload.c * "i965: Create a macro for setting a dirty bit." 88e3d404dad009d8cff5124cf8acee7daeaceb64 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
* i965: Create a macro for setting a dirty bit.Paul Berry2014-09-011-2/+2
| | | | | | | This will make it easier to extend dirty bit handling to support compute shaders. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
* i965: Store uniform constant values in a gl_constant_value instead of floatNeil Roberts2014-08-141-4/+6
| | | | | | | | | | | | | | | | | | | | | | The brw_stage_prog_data struct previously contained an array of float pointers to the values of parameters. These were then copied into a batch buffer to upload the values using a regular assignment. However the float values were also being overloaded to store integer values for integer uniforms. This can break if x87 floating-point registers are used to do the assignment because the fst instruction tries to fix up invalid float values. If an integer constant happened to look like an invalid float value then it would get altered when it was copied into the batch buffer. This patch changes the pointers to be gl_constant_value instead so that the assignment should end up copying without any alteration. This also makes it more obvious that the values being stored here are overloaded for multiple types. There are some static asserts where the values are uploaded to ensure that the size of gl_constant_value is the same as a float. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81150 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Merge VS/GS and WM pull constant buffer upload paths.Eric Anholt2014-07-021-16/+29
| | | | Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Use intel_upload_space() for pull constant uploads.Eric Anholt2014-03-261-16/+10
| | | | | | | | | | | | | This also happens to fix a leak of the current GS pull constant BO on context destroy, by just not holding on to the pull const bos after the surface state is generated. No statistically significant performance difference on GLB2.7 on HSW at 1024x768 (n=40) or 320x240 (n=44), or on BYT at 320x240 (n=47). v2: Rebase on intel_upload simplification. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* mesa/sso: rename Shader to the pointer _ShaderGregory Hainaut2014-03-251-2/+2
| | | | | | | | | | | | | | | | Basically a sed but shaderapi.c and get.c. get.c => GL_CURRENT_PROGAM always refer to the "old" UseProgram behavior shaderapi.c => the old api stil update the Shader object directly V2: formatting improvement V3 (idr): * Rebase fixes after a block of code was moved from ir_to_mesa.cpp to shaderapi.c. * Trivial reformatting. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>
* i965: Move the remaining driver debug over to stderr.Eric Anholt2014-02-221-2/+2
| | | | | | Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965: Move up duplicated fields from stage-specific prog_data to ↵Francisco Jerez2014-02-191-5/+5
| | | | | | | | | | | | | brw_stage_prog_data. There doesn't seem to be any reason for nr_params, nr_pull_params, param, and pull_param to be duplicated in the stage-specific subclasses of brw_stage_prog_data. Moving their definition to the common base class will allow some code sharing in a future commit, the removal of brw_vec4_prog_data_compare and brw_*_prog_data_free, and the simplification of the stage-specific brw_*_prog_data_compare. Reviewed-by: Paul Berry <stereotype441@gmail.com>
* mesa: Fold long lines introduced by the previous patch.Paul Berry2014-01-211-2/+4
| | | | | Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com>
* mesa: Replace ctx->Shader.Current{Vertex,Fragment,Geometry}Program with an ↵Paul Berry2014-01-211-2/+2
| | | | | | | | | | | | | | | | | | | | | array. These are replaced with ctx->Shader.CurrentProgram[MESA_SHADER_{VERTEX,FRAGMENT,GEOMETRY}]. In patches to follow, this will allow us to replace a lot of ad-hoc logic with a variable index into the array. With the exception of the changes to mtypes.h, this patch was generated entirely by the command: find src -type f '(' -iname '*.c' -o -iname '*.cpp' ')' \ -print0 | xargs -0 sed -i \ -e 's/\.CurrentVertexProgram/.CurrentProgram[MESA_SHADER_VERTEX]/g' \ -e 's/\.CurrentGeometryProgram/.CurrentProgram[MESA_SHADER_GEOMETRY]/g' \ -e 's/\.CurrentFragmentProgram/.CurrentProgram[MESA_SHADER_FRAGMENT]/g' Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com>
* s/Tungsten Graphics/VMware/José Fonseca2014-01-171-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Tungsten Graphics Inc. was acquired by VMware Inc. in 2008. Leaving the old copyright name is creating unnecessary confusion, hence this change. This was the sed script I used: $ cat tg2vmw.sed # Run as: # # git reset --hard HEAD && find include scons src -type f -not -name 'sed*' -print0 | xargs -0 sed -i -f tg2vmw.sed # # Rename copyrights s/Tungsten Gra\(ph\|hp\)ics,\? [iI]nc\.\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./g /Copyright/s/Tungsten Graphics\(,\? [iI]nc\.\)\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./ s/TUNGSTEN GRAPHICS/VMWARE/g # Rename emails s/alanh@tungstengraphics.com/alanh@vmware.com/ s/jens@tungstengraphics.com/jowen@vmware.com/g s/jrfonseca-at-tungstengraphics-dot-com/jfonseca-at-vmware-dot-com/ s/jrfonseca\?@tungstengraphics.com/jfonseca@vmware.com/g s/keithw\?@tungstengraphics.com/keithw@vmware.com/g s/michel@tungstengraphics.com/daenzer@vmware.com/g s/thomas-at-tungstengraphics-dot-com/thellstom-at-vmware-dot-com/ s/zack@tungstengraphics.com/zackr@vmware.com/ # Remove dead links s@Tungsten Graphics (http://www.tungstengraphics.com)@Tungsten Graphics@g # C string src/gallium/state_trackers/vega/api_misc.c s/"Tungsten Graphics, Inc"/"VMware, Inc"/ Reviewed-by: Brian Paul <brianp@vmware.com>
* i965: Unvirtualize brw_create_constant_surface; delete Gen7+ variant.Kenneth Graunke2013-11-051-3/+3
| | | | | | | | | Now that brw_create_constant_surface uses a virtual function internally, it doesn't need to be virtual itself. We can delete the Gen7+ variant and simplify things. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>
* i965: Implement ABO surface state emission.Francisco Jerez2013-10-291-0/+23
| | | | | | | | | | | | The maximum number of atomic buffer objects is somewhat arbitrary, we can change it in the future easily if it turns out it's not enough... v2: Add comments with the relevant mesa dirty bits. Fix usage of BRW_NEW_UNIFORM_BUFFER in the GS ABO state atom. v3: Update binding table layout diagrams. v4: Resolve conflicts with the recent dynamic surface index assignment changes. Reviewed-by: Paul Berry <stereotype441@gmail.com>
* i965: Make a brw_stage_prog_data for storing the SURF_INDEX information.Eric Anholt2013-10-151-7/+7
| | | | | | | | | | | It would be nice to be able to pack our binding table so that programs that use 1 render target don't upload an extra BRW_MAX_DRAW_BUFFERS - 1 binding table entries. To do that, we need the compiled program to have information on where its surfaces go. v2: Rename size to size_bytes to be more explicit. Reviewed-by: Paul Berry <stereotype441@gmail.com>
* i965: Move binding table code to a new file, brw_binding_tables.c.Kenneth Graunke2013-09-191-62/+0
| | | | | | | | | | | | The code to upload the binding tables for each stage was scattered across brw_{vs,gs,wm}_surface_state.c and brw_misc_state.c, which also contain a lot of code to populate individual SURFACE_STATE structures. This patch brings all the binding table upload code together, and splits it out from the code which fills in SURFACE_STATE entries. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
* i965: Generalize brw_vec4_upload_binding_table() beyond vec4 stages.Kenneth Graunke2013-09-191-10/+11
| | | | | | | | | | | | | | Instead of passing in a brw_vec4_prog_data structure, we can simply pass the one field it needs: the number of entries in the binding table. We also need to pass in the shader time surface index rather than hardcoding SURF_INDEX_VEC4_SHADER_TIME. Since the resulting function is stage-agnostic, this patch removes "vec4_" from the name. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
* i965: Convert loop to memcpy in brw_vec4_upload_binding_table().Kenneth Graunke2013-09-191-9/+6
| | | | | | | This is probably more efficient. At any rate, it's less code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
* i965: Update comments in brw_vec4_upload_binding_table().Kenneth Graunke2013-09-191-6/+1
| | | | | | | | | | The first comment was a bit stale; there are more kinds of surfaces than textures and pull constants. The second was a leftover "to do" comment for something I already did. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
* i965/vs: generalize brw_vs_binding_table in preparation for GS.Paul Berry2013-08-311-13/+29
| | | | | | | v2: Use GLbitfield instead of GLbitfield64 in brw_vec4_upload_binding_table. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: generalize brw_vs_pull_constants in preparation for GS.Paul Berry2013-08-311-26/+43
| | | | | | | v2: Use GLbitfield instead of GLbitfield64 in brw_upload_vec4_pull_constants. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Move data from brw->vs into a base class if gs will also need it.Paul Berry2013-08-311-19/+24
| | | | | | | | | This paves the way for sharing the code that will set up the vertex and geometry shader pipeline state. v2: Rename the base class to brw_stage_state. Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
* i965/gs: Update defines related to GS surface organization.Paul Berry2013-08-311-4/+4
| | | | | | | | | | | | | | | | Defines that previously referred to VS now refer to VEC4, since they will be shared by the user-programmable vertex shader and geometry shader stages. Defines that previously referred to the Gen6 geometry shader stage (which is only used for transform feedback) are now renamed to explicitly refer to Gen6, to avoid confusion with the Gen7 user-programmable geometry shader stage. Based on work by Eric Anholt <eric@anholt.net>. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
* i965: Make the VS binding table as small as possible.Kenneth Graunke2013-08-191-3/+4
| | | | | | | | For some reason, we didn't use this information even though the VS backend has computed it (albeit poorly) for ages. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>
* i965/vs: Rework binding table size calculation.Kenneth Graunke2013-08-191-5/+1
| | | | | | | | | | | | | | | | | | | | | | Unlike the FS, the VS backend already computed the binding table size. However, it did so poorly: after compilation, it looked to see if any pull constants/textures/UBOs were in use, and set num_surfaces to the maximum surface index for that category. If the VS only used a single texture or UBO, this overcounted by quite a bit. The shader time surface was also noted at state upload time (during drawing), not at compile time, which is inefficient. I believe it also had an off by one error. This patch computes it accurately, while also simplifying the code. It also renames num_surfaces to binding_table_size, since num_surfaces wasn't actually the number of surfaces used. For example, a VS that used one UBO and no other surfaces would have set num_surfaces to SURF_INDEX_VS_UBO(1) == 18, rather than 1. A bit of a misnomer there. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>
* i965: Delete intel_context entirely.Kenneth Graunke2013-07-091-2/+2
| | | | | | | | | | This makes brw_context inherit directly from gl_context; that was the only thing left in intel_context. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
* i965: Move intel_context::bufmgr to brw_context.Kenneth Graunke2013-07-091-2/+1
| | | | | | | Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
* i965: Move intel_context::vtbl to brw_context.Kenneth Graunke2013-07-091-2/+2
| | | | | | | Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
* mesa: add & use a new driver flag for UBO updates instead of _NEW_BUFFER_OBJECTMarek Olšák2013-05-111-3/+2
| | | | | | | v2: move the flagging from intel_bufferobj_data to intel_bufferobj_alloc_buffer Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>
* i965/vs: split brw_vs_prog_data into generic and VS-specific parts.Paul Berry2013-04-111-8/+10
| | | | | | | | | | | This will allow the generic parts to be re-used for geometry shaders. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> v2: Put urb_read_length and urb_entry_size in the generic struct. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Make the fragment shader pull constants index by dwords, not vec4s.Eric Anholt2013-04-011-1/+1
| | | | | | | | | | | | | | | | | We want to load vec4s, since loading a vec4 instead of a dword is basically no increased latency. But for variable indexed access, the previous requirement of aligned vec4s for a sampler LD was hard to implement. Note that this change only affects those messages that use the surface format, like sampler LDs, but not to the untyped data cache loads we've used in other cases. No significant performance difference on my GLSL demo with uniforms forced to take the varying pull constants path (n=4). NOTE: This is a candidate for the 9.1 branch. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Make the constant surface interface take a normal byte size.Eric Anholt2013-04-011-4/+3
| | | | | | | | | This puts the rounding-up logic into the function itself instead of all the callers having to manage it. Also drop an "unused" comment in gen4, as the stride *is* used for texbos (and will be for uniforms soon). NOTE: This is a candidate for the 9.1 branch. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Specialize SURFACE_STATE creation for shader time.Kenneth Graunke2013-03-141-4/+1
| | | | | | | | | | | | | | | | | | This is basically a copy and paste of gen7_create_constant_surface, but with the parameters filled in to offer a simpler interface. It will diverge shortly. I didn't bother adding it to the vtable for now since shader time is only exposed on Gen7+. v2: Replace tabs in the new code (by anholt) Add back dropped memset() and add a comment about HSW channel selects. NOTE: This is a candidate for the 9.1 branch. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Paul Berry <stereotype441@gmail.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Consistently use nr_pull_params instead of NumParameters.Eric Anholt2012-12-281-3/+2
| | | | | | | | | | NumParameters used to be an upper bound on the number of vec4s to be uploaded, which was basically safe (unless your buffer was bound near the top of address space *and* you array indexed outside the buffer, in which case I think you might GPU hang). As I migrate the driver away from ParameterValues[], this is no longer true. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Add a debug flag for counting cycles spent in each compiled shader.Eric Anholt2012-12-051-0/+10
| | | | | | | | | | | | | | | | | | | | | | This can be used for two purposes: Using hand-coded shaders to determine per-instruction timings, or figuring out which shader to optimize in a whole application. Note that this doesn't cover the instructions that set up the message to the URB/FB write -- we'd need to convert the MRF usage in these instructions to GRFs so that our offsets/times don't overwrite our shader outputs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1) v2: Check the timestamp reset flag in the VS, which is apparently getting set fairly regularly in the range we watch, resulting in negative numbers getting added to our 32-bit counter, and thus large values added to our uint64_t. v3: Rebase on reladdr changes, removing a new safety check that proved impossible to satisfy. Add a comment to the AOP defs from Ken's review, and put them in a slightly more sensible spot. v4: Check timestamp reset in the FS as well.
* intel: Remove NV_vertex_program support.Eric Anholt2012-10-151-4/+0
| | | | | | | | | | | | We were holding on to this code because we were aware that NWN 1 had some support for vertex programs -- no other linux programs I've come across would use it (since other software also has ARB_vp or GLSL support). Only, it turns out that NWN doesn't even give us any vertex programs. Given that we have known issues where the extension has never been fully supported, just give up on it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=46795 Reviewed-by: Brian Paul <brianp@vmware.com>