summaryrefslogtreecommitdiffstats
path: root/src/mesa/drivers/dri/i965/brw_misc_state.c
Commit message (Collapse)AuthorAgeFilesLines
* i965/miptree: Remove the stencil_as_y_tiled parameter from get_tile_masksJason Ekstrand2016-08-171-3/+3
| | | | | | | It's only used to stomp the tiling to Y and it's only used by blorp so there's no reason why blorp can't do it itself. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
* i965: Emit SNB write cache flush W/A from brw_emit_pipe_control_flush.Francisco Jerez2016-07-071-9/+0
| | | | | | | | Shouldn't cause any functional changes at this point, but we have forgotten to apply this workaround several times in the past, make sure it doesn't happen again. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
* i965: Assert that a depth_mt exists when using HiZ.Matt Turner2016-05-251-0/+1
| | | | | Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
* i965: Send the minimal number of STATE_BASE_ADDRESS packets.Kenneth Graunke2016-05-161-9/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | STATE_BASE_ADDRESS stalls the whole pipeline, and the documentation cautions us to emit it as little as possible for better performance. We recently put some hacks in BLORP to try and avoid emitting it if it was already set correctly. However, this wasn't quite minimal: if BLORP is the first operation (i.e. glClear()), then it would emit it, and subsequent draw calls would emit it again. This caused a small drop in performance in GPUTest Triangle when switching from Meta to BLORP. Unlike most packets, STATE_BASE_ADDRESS isn't influenced by GL state: it needs to be emitted once per batch, before most other commands, or whenever we change the program cache BO. It's also valid in both the 3D and compute pipelines, which makes it even more unique. This patch removes it from the atom mechanism and instead directly calls it as part of every draw, compute dispatch, or BLORP operation. We introduce a new flag indicating that STATE_BASE_ADDRESS has already been emitted this batch, and if so, skip doing it again. When we make a new program cache BO, we simply reset the flag, so the next operation will emit it again. When we flush/reset the batch, we reset the flag. This guarantees that we'll emit STATE_BASE_ADDRESS only when we have to. It's also less code than the old atom mechanism. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
* i965: Combine Gen4-7 and Gen8+ state base address emitters.Kenneth Graunke2016-05-161-4/+42
| | | | | | | | We're about to start calling it directly, and this means the callers won't have to think about generations. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
* i965: Drop BRW_NEW_BLORP from stipple and line parameter packets.Kenneth Graunke2016-05-121-8/+4
| | | | | | | | | | | | BLORP never touches these, and they're all non-pipelined. Some are fairly large packets as well. I haven't tried to benchmark this; the effect is likely to be small. However, we may as well stop the pointless papercuts; maybe they'll add up someday. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
* i965/blorp: Do not trigger re-emission of base state addressTopi Pohjolainen2016-04-231-1/+0
| | | | | | | | In case blorp needs to configure it will be just as if render or compute pipeline had configured it. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Make all atoms to track BRW_NEW_BLORP by defaultKenneth Graunke2016-04-231-7/+16
| | | | Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com
* i965: Rename define for the PIPE_CONTROL DC flush bit.Francisco Jerez2016-02-081-1/+1
| | | | | | | Its previous name was somewhat misleading, this really behaves like a RW cache flush rather than an invalidation. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965/gen7.5+: Disable resource streamer during GPGPU workloads.Francisco Jerez2016-01-141-0/+40
| | | | | | | | | | The RS and hardware binding tables are only supported on the 3D pipeline and can lead to corruption if left enabled during a GPGPU workload. Disable it when switching to the GPGPU (or media) pipeline and re-enable it when switching back to the 3D pipeline. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
* i965/gen7: Emit stall and dummy primitive draw after switching to the 3D ↵Francisco Jerez2016-01-141-0/+24
| | | | | | | | | pipeline. This hardware bug can supposedly lead to a hang on IVB and VLV. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965/gen4-5: Emit MI_FLUSH as required prior to switching pipelines.Francisco Jerez2016-01-141-0/+13
| | | | | | | | | AFAIK brw_emit_select_pipeline() is only called once during context init on Gen4-5, at which point the pipeline is likely to be already idle so it may just happen to work by luck regardless of the MI_FLUSH. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965/gen6-7: Implement stall and flushes required prior to switching pipelines.Francisco Jerez2016-01-141-0/+37
| | | | | | | | | | | | | | | | | Switching the current pipeline while it's not completely idle or the read and write caches aren't flushed can lead to corruption. Fixes misrendering of at least the following Khronos CTS test: ES31-CTS.shader_image_load_store.basic-allTargets-store-fs The stall and flushes are no longer required on Gen8+. v2: Emit PIPE_CONTROL with non-zero post-sync op before the write cache flush on SNB due to hardware bug. (Ken) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93323 Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965/gen8+: Invalidate color calc state when switching to the GPGPU pipeline.Francisco Jerez2016-01-141-0/+20
| | | | | | | | | | | This hardware bug can cause a hang on context restore while the current pipeline is set to GPGPU (BDWGFX HSD 1909593). In addition to clearing the valid bit, mark the CC state as dirty to make sure that the CC indirect state pointer is re-emitted when we switch back to the 3D pipeline. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: add EXT_polygon_offset_clamp support to gen4/gen5Ilia Mirkin2015-10-051-8/+0
| | | | | Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
* i965: Use intel_get_tile_dims() to get tile masksAnuj Phogat2015-09-281-7/+13
| | | | | | | | | | | | | | | This will require change in the parameters passed to intel_miptree_get_tile_masks(). V2: Rearrange the order of parameters. (Ben) Change the name to intel_get_tile_masks(). (Topi) V3: Use temporary variables in intel_get_tile_masks() for clarity. Fix mask_y computation. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>
* i965: Always re-emit the pipeline select during invariant state emissionChris Wilson2015-08-241-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | On the older platforms where we don't have logical contexts preserving state across batches, we emit the invariant state setup on every batch using the brw_invariant_state atom. This includes the pipeline selection which is cached with the introduction of commit 0e0e23ef537c9add672ff322f34e129a07edc55e Author: Jordan Justen <jordan.l.justen@intel.com> Date: Wed Apr 22 11:43:50 2015 -0700 i965/state: Emit pipeline select when changing pipelines However, we do not reset the cache between batches on context-less platforms resulting in us not setting the pipeline selection and can cause GPU hangs if a media pipelined was loaded in the meantime (e.g. mixing mplayer/gstreamer using libva and gnome-shell). A simple solution is to just forcibly re-emit the pipeline select along with the invariant state and reset the cache at that point. Reported-and-tested-by: Tomasz C. <tomaszc@o2.pl> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91254 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jordan Justen <jordan.l.justen@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
* i965: Trivial formatting changes in brw_misc_state.cIan Romanick2015-08-031-26/+23
| | | | | | Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
* i965: Use float calculations when double is unnecessary.Matt Turner2015-07-291-2/+2
| | | | | | | | | | | | | | | | | Literals without an f/F suffix are of type double, and implicit conversion rules specify that the float in (float op double) be converted to a double before the operation is performed. I believe float execution was intended (in nearly all cases) or is sufficient (in the case of gen7_urb.c). Removes a lot of float <-> double conversion instructions and replaces many double instructions with float instructions which are cheaper. text data bss dec hex filename 4928659 195160 26192 5150011 4e953b i965_dri.so before 4928315 195152 26192 5149659 4e93db i965_dri.so after Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
* i965: Rename intel_emit* to reflect their new location in brw_pipe_controlChris Wilson2015-06-241-1/+1
| | | | | Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Use _mesa_geometric_ functions appropriatelyKevin Rogovin2015-06-171-3/+6
| | | | | | | | | | | | | | Change references to gl_framebuffer::Width, Height, MaxNumLayers and Visual::samples to use the _mesa_geometry_ convenience functions for those places where the geometry of the gl_framebuffer is needed (in contrast to the geometry of the intersection of the attachments of the gl_framebuffer). This patch is to pave the way to enable GL_ARB_framebuffer_no_attachments on Gen7 and higher in i965. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
* i965/state: Emit pipeline select when changing pipelinesJordan Justen2015-05-021-6/+17
| | | | | Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965/state: Don't use brw->state.dirty.brwJordan Justen2015-03-311-2/+2
| | | | | | | | | | | | | | | Now, we only use ctx->NewDriverState. I used this bash & sed command in the i965 directory: for file in *.[ch] *.[ch]pp; do sed -i -e 's/state\.dirty\.brw/ctx.NewDriverState/g' $file done Followed by manual changes to brw_state_upload.c. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965/hiz: Start to separate miptree out from hiz buffersJordan Justen2015-03-091-2/+2
| | | | | | | | | | | | | | | | | | | | | | | Today we allocate a miptree's for the hiz buffer. We needed this in the past because we would point the hardware at offsets of the hiz buffer. Since the hiz format is not documented, this is not a good idea. Since moving to support layered rendering on Gen7+, we no longer point at an offset into the buffer on Gen7+. Therefore, to support hiz on Gen7+, we don't need a full miptree structure allocated. This patch starts to create a new auxiliary buffer structure (intel_miptree_aux_buffer) that can be a more simplistic miptree side-band buffer associated with a miptree. (For example, to serve the needs of the hiz buffer.) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
* i965: Do Sandybridge workaround flushes before each primitive.Kenneth Graunke2015-02-171-27/+0
| | | | | | | | | | | | | | | | | | | | | | | Sandybridge requires the post-sync non-zero workaround in a ton of places, and if you ever miss one, the GPU usually hangs. Currently, we try to track exactly when a workaround flush is necessary (via the brw->batch.need_workaround_flush flag). This is tricky to get right, and we've botched it several times in the past. This patch unconditionally performs the post-sync non-zero flush at the start of each primitive's state upload (including BLORP). We drop the needs_workaround_flush flag, and drop all the other callers, as the flush has already been performed. We have no data to indicate that simply flushing all the time will hurt performance, and it has the potential to help stability. v2: Add post-sync workaround to initial GPU state upload to be extra cautious (suggested by Chad Versace). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
* i965: Delete brw_state_flags::cache and related code.Kenneth Graunke2014-12-021-8/+0
| | | | | | | | | It's been merged into brw_state_flags::brw for simplicity and efficiency. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965: Move BRW_NEW_*_PROG_DATA flags to .brw (not .cache).Kenneth Graunke2014-12-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | I put the BRW_NEW_*_PROG_DATA flags at the beginning so that brw_state_cache.c can still continue using 1 << brw_cache_id. I also added a comment explaining the difference between BRW_NEW_*_PROG_DATA and BRW_NEW_*_PROGRAM, as it took me a long time to remember it. Non-mechanical changes: - brw_state_cache.c and brw_ff_gs.c now signal .brw, not .cache. - brw_state_upload.c - INTEL_DEBUG=state changes. - brw_context.h - bit definition merging. v2: Correct the explanation of BRW_NEW_*_PROG_DATA to mention state-based recompiles, and nix the "proper subset" claim, as it's false. (Caught by Kristian Høgsberg). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965: Rename CACHE_NEW_*_PROG to BRW_NEW_*_PROG_DATA.Kenneth Graunke2014-12-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that we've moved a bunch of CACHE_NEW_* bits to BRW_NEW_*, the only ones that are left are legitimately related to the program cache. Yet, it seems a bit wasteful to have an entire bitfield for only 7 bits. State upload is one of the hottest paths in the driver. For each atom in the list, we call check_state() to see if it needs to be emitted. Currently, this involves comparing three separate bitfields (mesa, brw, and cache). Consolidating the brw and cache bitfields would save a small amount of CPU overhead per atom. Broadwell, for example, has 57 state atoms, so this small savings can add up. CACHE_NEW_*_PROG covers the brw_*_prog_data structures, as well as the offset into the program cache BO (prog_offset). Since most uses refer to brw_*_prog_data, I decided to use BRW_NEW_*_PROG_DATA as the name. Removing "cache" completely is a bit painful, so I decided to do it in several patches for easier review, and to separate mechanical changes from manual ones. This one simply renames things, and was made via: $ for file in *.[ch]; do sed -i -e 's/CACHE_NEW_\([A-Z_\*]*\)_PROG/BRW_NEW_\1_PROG_DATA/g' \ -e 's/BRW_NEW_WM_PROG_DATA/BRW_NEW_FS_PROG_DATA/g' $file done Note that BRW_NEW_*_PROG_DATA is still in .cache, not .brw! The next patch will remedy this flaw. It will also fix the alphabetization issues. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Acked-by: Matt Turner <mattst88@gmail.com>
* i965: Combine CACHE_NEW_*_UNIT into BRW_NEW_GEN4_UNIT_STATE.Kenneth Graunke2014-11-291-7/+2
| | | | | | | | | | | | | | | | | | | | | On Gen4-5, unit state is specified as indirect state, rather than commands. If any unit state changes, we upload it via brw_state_batch and arrange for 3DSTATE_PIPELINED_POINTERS to be re-emitted, which updates pointers to all unit state at once. Since there's only one command and state atom (brw_psp_urb_cs) that needs to know about this, there's no benefit to having six separate flags. We can combine CACHE_NEW_*_UNIT into a single flag. We also haven't cached these in a long time, so it doesn't make sense to use the "CACHE_NEW_" prefix. Instead, use the "BRW_NEW_" prefix. This also saves 12 * sizeof(void *) bytes of memory per context, as we remove useless aux_compare/aux_free functions for each CACHE bit. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965: Alphabetize brw_tracked_state flags and use a consistent style.Kenneth Graunke2014-11-291-16/+16
| | | | | | | | | | | | | | | | Most of the dirty flags were listed in some arbitrary order. Some used bonus parenthesis. Some put multiple flags on one line, others put one per line. Some used tabs instead of spaces...but only on some lines. This patch settles on one flag per line, in alphabetical order, using spaces instead of tabs, and sheds the unnecessary parentheses. Sorting was mostly done with vim's visual block feature and !sort, although I alphabetized short lists by hand; it was pretty manual. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965: Always enable VF statisticsBen Widawsky2014-11-131-2/+1
| | | | | | | | | | | | | | | Every other unit in the geometry pipeline automatically enables statistics gathering. This part of the pipe has been controlled by the DEBUG_STATS variable, but this is asymmetric. This dates back to the original implementation, and I am not sure if there is a reason for it. I need access to these stats to implement ARB_pipeline_statistics_query. Eric wrote it, and Ken touched it last. Do you have any opposition? Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86145 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
* i965/skl: Set mask bits in PIPELINE_SELECT on Skylake.Kenneth Graunke2014-11-031-1/+1
| | | | | | | | | | | Skylake has some extra bits in PIPELINE_SELECT, none of which are interesting for a 3D driver. In order to selectively change them, it also introduces new "mask bits" in 15:8. We care about the "Pipeline Selection" bits (1:0), so set the mask to 0x3. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
* Revert 5 i965 patches: 8e27a4d2, 373143ed, c5bdf9be, 6f56e142, 88e3d404Jordan Justen2014-09-041-2/+2
| | | | | | | | | | | | | | | | | | | | | | Reverts * "i965: Modify state upload to allow 2 different sets of state atoms." 8e27a4d2b3e4e74e9a77446bce49607433d86be3 * "i965: Modify dirty bit handling to support 2 pipelines." 373143ed9187c4d4ce1e3c486b5dd0880d18ec8b * "i965: Create a macro for checking a dirty bit." c5bdf9be1eca190417998d548fd140c1eca37a54 Conflicts: src/mesa/drivers/dri/i965/brw_context.h * "i965: Create a macro for setting all dirty bits." 6f56e1424d923fd80c84090fbf4506c9eaaffea1 Conflicts: src/mesa/drivers/dri/i965/brw_blorp.cpp src/mesa/drivers/dri/i965/brw_state_cache.c src/mesa/drivers/dri/i965/brw_state_upload.c * "i965: Create a macro for setting a dirty bit." 88e3d404dad009d8cff5124cf8acee7daeaceb64 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
* i965: Create a macro for setting a dirty bit.Paul Berry2014-09-011-2/+2
| | | | | | | This will make it easier to extend dirty bit handling to support compute shaders. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
* i965/gen6 depth surface: program 3DSTATE_DEPTH_BUFFER to top of surfaceJordan Justen2014-08-151-2/+2
| | | | | | | | | | | | | | | | | | | | | | (bf25ee2 for gen6) Previously we would always find the 2D sub-surface of interest, and then program the surface to this location. Now we always program the 3DSTATE_DEPTH_BUFFER at the start of the surface. To select the lod/slice, we utilize the lod & minimum array element fields. We also must disable brw_workaround_depthstencil_alignment for gen >= 6. Now the hardware will handle alignment when rendering to additional slices/LODs. v3: * Set depth_mt bo RELOC offset to 0, as was done in bf25ee2 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56127 Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Move has_hiz from the slice to the level.Eric Anholt2014-05-121-1/+1
| | | | | | | | The value depends only on the level, so no need to store the bool per slice. Shrinks intel_mipmap_slice from 24 bytes to 16, while slotting into an existing hole in intel_mipmap_level. Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
* i965: Delete the intel_regions.c code.Eric Anholt2014-05-011-1/+0
| | | | | Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
* i965: Drop use of intel_region from miptrees.Eric Anholt2014-05-011-14/+12
| | | | | | | | | | | | Note: region->width/height used to reflect the total_width/height padding of separate stencil, though mt->total_width didn't. region->width/height was being used in EGL images, where the padded value would have been the wrong one, so I converted them to use rb->Width/Height. v2: Drop debug printf that slipped in (caught by Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
* i965: Move intel_region_get_aligned_offset() to be a miptree function.Eric Anholt2014-05-011-9/+8
| | | | | | | | | | All the consumers are doing it on a miptree. v2: fix a silly duplicated dereference (review by Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> (v1) Reviewed-by: Chad Versace <chad.versace@linux.intel.com> (v1)
* i965: Move intel_region_get_tile_masks() to be a miptree function.Eric Anholt2014-05-011-7/+7
| | | | | | | | All the consumers are doing it on a miptree. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
* i965: Actually emit PIPELINE_SELECT and 3DSTATE_VF_STATISTICS.Kenneth Graunke2014-05-011-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | For platforms using hardware contexts (currently Gen6+), we failed to emit PIPELINE_SELECT and 3DSTATE_VF_STATISTICS, instead emitting MI_NOOP for both. During one of the context initialization reordering patches, we accidentally moved brw_init_state before we set brw->CMD_PIPELINE_SELECT and brw->CMD_VF_STATISTICS. So, when brw_init_state uploaded initial GPU state (brw_init_state -> brw_upload_initial_gpu_state -> brw_upload_invariant_state), these would be 0 (MI_NOOP). Storing the commands in the context is not worthwhile. We have many generation checks in our state upload code, and for platforms with hardware contexts, this only gets called once per GL context anyway. The cost is negligable, and it's easy to botch context creation ordering. This may fix hangs on Gen6+ when using the media pipeline. Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
* i965: Fix render-to-texture in non-FinishRenderTexture cases.Eric Anholt2014-03-061-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We've had several problems now with FinishRenderTexture not getting called enough, and we're ready to just give up on it ever doing what we need. In particular, an upcoming Steam title had rendering bugs that could be fixed by always_flush_cache=true. Instead of hoping Mesa core can figure out when we need to flush our caches, just track what BOs we've rendered to in a set, and when we render from a BO in that set, emit a flush and clear the set. There's some overhead to keeping this set, but most of that is just hashing the pointer -- it turns out our set never even gets very large, because cache flushes are so common (even on cairo-gl). No statistically significant performance difference in cairo-gl (n=100), despite spending ~.5% CPU in these set operations. v1: (Original patch by Eric Anholt.) v2: (Changes by Ken Graunke.) - Rebase forward from May 7th 2013 -> March 4th 2014. - Drop the FinishRenderTexture hook entirely; after rebasing the patch, the hook was just an empty function. - Move the brw_render_cache_set_clear() call from intel_batchbuffer_emit_flush() to brw_emit_pipe_control_flush(). In theory, this could catch more cases where we've flushed. - Consider stencil as a possible texturing source. v3: (changes by anholt): - Move set_clear() back to emit_mi_flush() -- it means we can drop more forced flushes from the code. In the previous location, it wouldn't have been called when we wanted pre-gen6. - Move the set clear from batch init to reset -- it should be empty at the start of every batch, since the kernel handled any inter-batch flush for us. v4: Drop the debug code in set.c that I accidentally committed. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Dylan Baker <baker.dylan.c@gmail.com> [v2]
* i965: Pull format conversion logic out of brw_depthbuffer_format.Kenneth Graunke2014-02-191-32/+1
| | | | | | | | | | | | brw_depthbuffer_format is not very reusable at the moment, since it uses global state (ctx->DrawBuffer) to access a particular depth buffer. For HiZ on Broadwell, I need a function which simply converts the formats. However, at least one existing user of brw_depthbuffer_format really wants the existing interface. So, I've created a new function. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
* mesa: Fix MESA_FORMAT_Z24_UNORM_S8_UINT vs. X8_UINT mix-up.Kenneth Graunke2014-02-091-3/+3
| | | | | | | | | | | | | | | | | | | | In commit eeed49f5f290793870c60b5b635b977a732a1eb4, Mark accidentally renamed MESA_FORMAT_S8_Z24 to MESA_FORMAT_Z24_UNORM_X8_UINT and MESA_FORMAT_X8_Z24 to MESA_FORMAT_Z24_UNORM_S8_UINT, reversing their sense. The commit message was correct, but what sed commands actually got run didn't match that. This patch swaps the two enum names, reversing them. This should undo the damage, but might break things if people have manually fixed a few instances in the meantime... Mark's commit also failed to mention renames: s/MESA_FORMAT_ARGB2101010_UINT\b/MESA_FORMAT_B10G10R10A2_UINT/g s/MESA_FORMAT_ABGR2101010\b/MESA_FORMAT_R10G10B10A2_UNORM/g but those seem okay. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>
* mesa: Change many Type P MESA_FORMATs to meet naming specMark Mueller2014-01-271-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conversion of Type P formats as follows (w/related comment fixes): s/MESA_FORMAT_RGB565\b/MESA_FORMAT_B5G6R5_UNORM/g s/MESA_FORMAT_RGB565_REV\b/MESA_FORMAT_R5G6B5_UNORM/g s/MESA_FORMAT_ARGB4444\b/MESA_FORMAT_B4G4R4A4_UNORM/g s/MESA_FORMAT_ARGB4444_REV\b/MESA_FORMAT_A4R4G4B4_UNORM/g s/MESA_FORMAT_RGBA5551\b/MESA_FORMAT_A1B5G5R5_UNORM/g s/MESA_FORMAT_XBGR8888_SNORM\b/MESA_FORMAT_R8G8B8X8_SNORM/g s/MESA_FORMAT_XBGR8888_SRGB\b/MESA_FORMAT_R8G8B8X8_SRGB/g s/MESA_FORMAT_ARGB1555\b/MESA_FORMAT_B5G5R5A1_UNORM/g s/MESA_FORMAT_ARGB1555_REV\b/MESA_FORMAT_A1R5G5B5_UNORM/g s/MESA_FORMAT_AL44\b/MESA_FORMAT_L4A4_UNORM/g s/MESA_FORMAT_RGB332\b/MESA_FORMAT_B2G3R3_UNORM/g s/MESA_FORMAT_ARGB2101010\b/MESA_FORMAT_B10G10R10A2_UNORM/g s/MESA_FORMAT_Z24_S8\b/MESA_FORMAT_S8_UINT_Z24_UNORM/g s/MESA_FORMAT_S8_Z24\b/MESA_FORMAT_Z24_UNORM_S8_UINT/g s/MESA_FORMAT_X8_Z24\b/MESA_FORMAT_Z24_UNORM_X8_UINT/g s/MESA_FORMAT_Z24_X8\b/MESA_FORMAT_X8Z24_UNORM/g s/MESA_FORMAT_RGB9_E5_FLOAT\b/MESA_FORMAT_R9G9B9E5_FLOAT/g s/MESA_FORMAT_R11_G11_B10_FLOAT\b/MESA_FORMAT_R11G11B10_FLOAT/g s/MESA_FORMAT_Z32_FLOAT_X24S8\b/MESA_FORMAT_Z32_FLOAT_S8X24_UINT/g s/MESA_FORMAT_ABGR2101010_UINT\b/MESA_FORMAT_R10G10B10A2_UINT/g s/MESA_FORMAT_XRGB4444_UNORM\b/MESA_FORMAT_B4G4R4X4_UNORM/g s/MESA_FORMAT_XRGB1555_UNORM\b/MESA_FORMAT_B5G5R5X1_UNORM/g s/MESA_FORMAT_XRGB2101010_UNORM\b/MESA_FORMAT_B10G10R10X2_UNORM/g s/MESA_FORMAT_AL88\b/MESA_FORMAT_L8A8_UNORM/g s/MESA_FORMAT_AL88_REV\b/MESA_FORMAT_A8L8_UNORM/g s/MESA_FORMAT_AL1616\b/MESA_FORMAT_L16A16_UNORM/g s/MESA_FORMAT_AL1616_REV\b/MESA_FORMAT_A16L16_UNORM/g s/MESA_FORMAT_RG88\b/MESA_FORMAT_G8R8_UNORM/g s/MESA_FORMAT_GR88\b/MESA_FORMAT_R8G8_UNORM/g s/MESA_FORMAT_GR1616\b/MESA_FORMAT_R16G16_UNORM/g s/MESA_FORMAT_RG1616\b/MESA_FORMAT_G16R16_UNORM/g s/MESA_FORMAT_SRGBA8\b/MESA_FORMAT_A8B8G8R8_SRGB/g s/MESA_FORMAT_SARGB8\b/MESA_FORMAT_B8G8R8A8_SRGB/g s/MESA_FORMAT_SLA8\b/MESA_FORMAT_L8A8_SRGB/g Conflicts: src/mesa/drivers/dri/i965/brw_surface_formats.c src/mesa/main/format_pack.c src/mesa/main/format_unpack.c src/mesa/main/formats.c src/mesa/main/texformat.c src/mesa/main/texstore.c
* mesa: Change many Type A MESA_FORMATs to meet naming standardMark Mueller2014-01-271-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update comments. Conversion of the following Type A formats: s/MESA_FORMAT_RGB888\b/MESA_FORMAT_BGR_UNORM8/g s/MESA_FORMAT_BGR888\b/MESA_FORMAT_RGB_UNORM8/g s/MESA_FORMAT_A8\b/MESA_FORMAT_A_UNORM8/g s/MESA_FORMAT_A16\b/MESA_FORMAT_A_UNORM16/g s/MESA_FORMAT_L8\b/MESA_FORMAT_L_UNORM8/g s/MESA_FORMAT_L16\b/MESA_FORMAT_L_UNORM16/g s/MESA_FORMAT_I8\b/MESA_FORMAT_I_UNORM8/g s/MESA_FORMAT_I16\b/MESA_FORMAT_I_UNORM16/g s/MESA_FORMAT_R8\b/MESA_FORMAT_R_UNORM8/g s/MESA_FORMAT_R16\b/MESA_FORMAT_R_UNORM16/g s/MESA_FORMAT_Z16\b/MESA_FORMAT_Z_UNORM16/g s/MESA_FORMAT_Z32\b/MESA_FORMAT_Z_UNORM32/g s/MESA_FORMAT_S8\b/MESA_FORMAT_S_UINT8/g s/MESA_FORMAT_SRGB8\b/MESA_FORMAT_BGR_SRGB8/g s/MESA_FORMAT_RGBA_16\b/MESA_FORMAT_RGBA_UNORM16/g s/MESA_FORMAT_SL8\b/MESA_FORMAT_L_SRGB8/g s/MESA_FORMAT_Z32_FLOAT\b/MESA_FORMAT_Z_FLOAT32/g s/MESA_FORMAT_XBGR16161616_UNORM\b/MESA_FORMAT_RGBX_UNORM16/g s/MESA_FORMAT_XBGR16161616_SNORM\b/MESA_FORMAT_RGBX_SNORM16/g s/MESA_FORMAT_XBGR16161616_FLOAT\b/MESA_FORMAT_RGBX_FLOAT16/g s/MESA_FORMAT_XBGR16161616_UINT\b/MESA_FORMAT_RGBX_UINT16/g s/MESA_FORMAT_XBGR16161616_SINT\b/MESA_FORMAT_RGBX_SINT16/g s/MESA_FORMAT_XBGR32323232_FLOAT\b/MESA_FORMAT_RGBX_FLOAT32/g s/MESA_FORMAT_XBGR32323232_UINT\b/MESA_FORMAT_RGBX_UINT32/g s/MESA_FORMAT_XBGR32323232_SINT\b/MESA_FORMAT_RGBX_SINT32/g s/MESA_FORMAT_XBGR8888_UINT\b/MESA_FORMAT_RGBX_UINT8/g s/MESA_FORMAT_XBGR8888_SINT\b/MESA_FORMAT_RGBX_SINT8/g
* i965: Update invariant state for Broadwell.Kenneth Graunke2014-01-181-4/+12
| | | | | | | | The only difference is that STATE_SIP takes a 48-bit address, so we need to output two zeroes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
* i965: Remove CACHED_BATCH support altogether.Kenneth Graunke2014-01-171-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | Using an unoptimized variant of glamor spending 50% of its CPU time in brw_draw_prims() (and hitting the cache *very* frequently): N Min Max Median Avg Stddev x 200 29200 40500 34900 34750 958.43256 + 200 31000 40300 34700 34622 916.35941 No difference proven at 95.0% confidence Similarly, no difference on GLB2.7: N Min Max Median Avg Stddev x 63 64.1 71.36 70.69 70.113175 1.6782026 + 63 63.6 71.18 70.75 70.223651 1.6044186 No difference proven at 95.0% confidence v2: Rebase on master (by anholt) v3: Add a missing BEGIN_BATCH(3) to aa_line_parameters -- CACHED_BATCH didn't have the asserts about batchbuffer usage that ADVANCE_BATCH does, so we started assertion failing. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Anholt <eric@anholt.net>
* s/Tungsten Graphics/VMware/José Fonseca2014-01-171-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Tungsten Graphics Inc. was acquired by VMware Inc. in 2008. Leaving the old copyright name is creating unnecessary confusion, hence this change. This was the sed script I used: $ cat tg2vmw.sed # Run as: # # git reset --hard HEAD && find include scons src -type f -not -name 'sed*' -print0 | xargs -0 sed -i -f tg2vmw.sed # # Rename copyrights s/Tungsten Gra\(ph\|hp\)ics,\? [iI]nc\.\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./g /Copyright/s/Tungsten Graphics\(,\? [iI]nc\.\)\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./ s/TUNGSTEN GRAPHICS/VMWARE/g # Rename emails s/alanh@tungstengraphics.com/alanh@vmware.com/ s/jens@tungstengraphics.com/jowen@vmware.com/g s/jrfonseca-at-tungstengraphics-dot-com/jfonseca-at-vmware-dot-com/ s/jrfonseca\?@tungstengraphics.com/jfonseca@vmware.com/g s/keithw\?@tungstengraphics.com/keithw@vmware.com/g s/michel@tungstengraphics.com/daenzer@vmware.com/g s/thomas-at-tungstengraphics-dot-com/thellstom-at-vmware-dot-com/ s/zack@tungstengraphics.com/zackr@vmware.com/ # Remove dead links s@Tungsten Graphics (http://www.tungstengraphics.com)@Tungsten Graphics@g # C string src/gallium/state_trackers/vega/api_misc.c s/"Tungsten Graphics, Inc"/"VMware, Inc"/ Reviewed-by: Brian Paul <brianp@vmware.com>
* i965: Drop trailing whitespace from the rest of the driver.Kenneth Graunke2013-12-051-11/+11
| | | | | | | Performed via: $ for file in *; do sed -i 's/ *//g'; done Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>