path: root/src/gallium/drivers/radeonsi/si_hw_context.c
Commit message | Author | Age | Files | Lines
* gallium/radeon: implement set_device_reset_callback | Nicolai Hähnle | 2016-10-05 | 1 | -0/+3
  Check for device reset on flush. It would be nicer if the kernel just reported this as an error on the submit ioctl (and similarly for fences), but this will do for now.
  Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
  Reviewed-by: Marek Olšák <marek.olsak@amd.com>
* radeonsi: move VGT_LS_HS_CONFIG to derived tess_state | Marek Olšák | 2016-10-04 | 1 | -1/+0
  Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
  Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
* radeonsi: remove the cache_flush atom | Marek Olšák | 2016-09-09 | 1 | -1/+1
  Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
* radeonsi: skip redundant INDEX_TYPE writes | Marek Olšák | 2016-09-07 | 1 | -0/+1
  Ported from Vulkan.
  Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
* gallium/radeon: set VPORT_ZMIN/MAX registers correctly | Marek Olšák | 2016-09-05 | 1 | -0/+1
  Calculate depth ranges from viewport states and pipe_rasterizer_state::clip_halfz.
  The evergreend.h change is required to silence a warning.
  This fixes this recently updated piglit: arb_depth_clamp/depth-clamp-range
  Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
  Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
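The depth-range calculation mentioned in this commit can be sketched roughly as follows. This is a minimal illustration, not the driver's code: the `viewport_z` struct and `viewport_depth_range` function are hypothetical names, and the formula assumes the conventional viewport transform in which clip-space Z spans [-1, 1] by default and [0, 1] when `clip_halfz` is set.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a viewport state: only the
 * Z components of the viewport transform matter here. */
struct viewport_z {
    float scale_z;
    float translate_z;
};

/* Compute the window-space depth range covered by the viewport.
 * Assumption: with clip_halfz, clip-space Z spans [0, 1];
 * otherwise it spans [-1, 1]. */
static void viewport_depth_range(const struct viewport_z *vp, bool clip_halfz,
                                 float *zmin, float *zmax)
{
    assert(vp && zmin && zmax);

    float z_near = clip_halfz ? vp->translate_z
                              : vp->translate_z - vp->scale_z;
    float z_far = vp->translate_z + vp->scale_z;

    /* The scale may be negative (inverted range), so order explicitly. */
    *zmin = z_near < z_far ? z_near : z_far;
    *zmax = z_near < z_far ? z_far : z_near;
}
```

Under these assumptions, a GL-style viewport with scale 0.5 and translate 0.5 and a half-Z viewport with scale 1 and translate 0 both map to the range [0, 1].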
* gallium/radeon: unify and simplify checking for an empty gfx IB | Marek Olšák | 2016-08-25 | 1 | -10/+3
  We can take advantage of the fact that multi_fence does the obvious thing with NULL fences.
  This fixes unflushed fences that can get stuck due to empty IBs.
* gallium/radeon: count gfx IB flushes | Marek Olšák | 2016-08-06 | 1 | -1/+1
  This will be used as a counter for whether fence_finish needs to flush the IB.
  Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
* gallium/radeon: move radeon_winsys::cs_memory_below_limit to drivers | Marek Olšák | 2016-08-06 | 1 | -2/+3
  Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
* gallium/radeon: move last_gfx_fence from radeonsi to common code | Marek Olšák | 2016-08-03 | 1 | -5/+5
  Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
* gallium/radeon: remove RADEON_FLUSH_KEEP_TILING_FLAGS flag | Marek Olšák | 2016-07-19 | 1 | -3/+0
  always set
  Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
* radeonsi: explicitly choose center locations for 1xAA on Polaris | Nicolai Hähnle | 2016-07-08 | 1 | -1/+2
  Unlike SC, the small primitive filter does not automatically use center locations in 1xAA mode, so this is needed to avoid artifacts caused by the small primitive filter discarding triangles that it shouldn't.
  As a side effect of how the effective number of samples is now calculated, this patch also avoids submitting the sample locations for line/poly smoothing when they're not really needed.
  Cc: 12.0 <mesa-stable@lists.freedesktop.org>
  Reviewed-by: Marek Olšák <marek.olsak@amd.com>
* radeonsi: fix fractional odd tessellation spacing for Polaris | Marek Olšák | 2016-06-24 | 1 | -0/+1
  ported from Vulkan (and no source explains why this is needed)
  Cc: 12.0 <mesa-stable@lists.freedesktop.org>
  Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
* radeon: check VM faults from DMA flush | Nicolai Hähnle | 2016-06-24 | 1 | -1/+1
  Reviewed-by: Marek Olšák <marek.olsak@amd.com>
* radeonsi: move gfx fence wait out of si_check_vm_faults | Nicolai Hähnle | 2016-06-24 | 1 | -1/+7
  Reviewed-by: Marek Olšák <marek.olsak@amd.com>
* radeonsi: extract IB and bo list saving into separate functions | Nicolai Hähnle | 2016-06-24 | 1 | -23/+2
  Reviewed-by: Marek Olšák <marek.olsak@amd.com>
* radeonsi: Reinitialize all descriptors in CE preamble. | Bas Nieuwenhuizen | 2016-06-10 | 1 | -0/+3
  This fixes a problem with the CE preamble and restoring only stuff in the preamble when needed.
  To illustrate, suppose we have two graphics IBs 1 and 2, which are submitted in that order. Furthermore suppose IB 1 does not use CE RAM, but IB 2 does, and we have a context switch at the start of IB 1, but not between IB 1 and IB 2.
  The old code put the CE RAM loads in the preamble of IB 2. As the preamble of IB 1 does not have the loads and the preamble of IB 2 does not get executed, the old values are not loaded into CE RAM.
  Fix this by always restoring the entire CE RAM.
  v2: - Just load all descriptor set buffers instead of load and store the entire CE RAM.
      - Leave the ce_ram_dirty tracking in place for the non-preamble case.
  v3: - Fixed parameter alignment.
      - Rebased to master (Nicolai's descriptor series).
  Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
  Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
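The two-IB scenario in this commit message can be modelled with a small toy simulation. This is only a sketch under stated assumptions, not driver code: `ib_model` and `reads_stale_ce_ram` are hypothetical names, and the model assumes that a context switch invalidates CE RAM and that a preamble executes only when a context switch precedes its IB.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model of one graphics IB. */
struct ib_model {
    bool uses_ce_ram;       /* the IB reads descriptors from CE RAM */
    bool preamble_loads_ce; /* its preamble reloads CE RAM contents */
    bool ctx_switch_before; /* a context switch happens right before it */
};

/* Returns true if some IB reads CE RAM while its contents are stale.
 * Assumptions: a context switch invalidates CE RAM, and a preamble
 * only executes when a context switch precedes its IB. */
static bool reads_stale_ce_ram(const struct ib_model *ibs, size_t count)
{
    assert(ibs || count == 0);

    bool ce_ram_valid = false;

    for (size_t i = 0; i < count; i++) {
        if (ibs[i].ctx_switch_before) {
            /* Another context may have overwritten CE RAM; only a
             * preamble that contains the loads restores it. */
            ce_ram_valid = ibs[i].preamble_loads_ce;
        }
        if (ibs[i].uses_ce_ram && !ce_ram_valid)
            return true;
    }
    return false;
}
```

In this model, putting the loads only in the preamble of IB 2 leaves IB 2 reading stale data, while putting them in every preamble does not.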
* radeon/winsys: introduce radeon_winsys_cs_chunk | Nicolai Hähnle | 2016-06-01 | 1 | -4/+11
  We will chain multiple chunks together and will keep pointers to the older chunks to support IB dumping.
  Reviewed-by: Marek Olšák <marek.olsak@amd.com>
* gallium/radeon: use cs_check_space throughout | Nicolai Hähnle | 2016-06-01 | 1 | -3/+2
  Reviewed-by: Marek Olšák <marek.olsak@amd.com>
* radeonsi: don't flush TC at the end of IBs on DRM >= 3.2.0 | Marek Olšák | 2016-05-31 | 1 | -2/+3
  It's not needed since it was fixed in the kernel.
  Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
* gallium/radeon: add radeon_emitted to check for non-trivial IBs | Nicolai Hähnle | 2016-05-17 | 1 | -2/+2
  Reviewed-by: Marek Olšák <marek.olsak@amd.com>
* radeonsi: remove flushes at the beginning and end of IBs done by the kernel | Marek Olšák | 2016-04-28 | 1 | -12/+12
  Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
* radeonsi: add RW_BUFFERS only once in si_ce_needed_cs_space | Marek Olšák | 2016-04-26 | 1 | -2/+2
  Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
* radeonsi: Enable loading into CE RAM. | Bas Nieuwenhuizen | 2016-04-21 | 1 | -0/+5
  We need to enable a bit in the CONTEXT_CONTROL packet for the loads to work.
  v2: Style issues.
  Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
  Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
* radeonsi: do not do two full flushes on every compute dispatch | Bas Nieuwenhuizen | 2016-04-19 | 1 | -0/+1
  v2: Add more CS_PARTIAL_FLUSH events. Essentially every place with waits on finishing for pixel shaders also has a write after read hazard with compute shaders. Invalidating L2 waits implicitly on pixel and compute shaders, so, we don't need a CS_PARTIAL_FLUSH for switching FBO.
  v3: Add CS_PARTIAL_FLUSH events even if we already have INV_GLOBAL_L2. According to Marek the INV_GLOBAL_L2 events don't wait for compute shaders to finish, so wait for them explicitly.
  Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
  Reviewed-by: Marek Olšák <marek.olsak@amd.com>
  Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
  Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
* radeonsi: do per cs setup for compute shaders once per cs | Bas Nieuwenhuizen | 2016-04-19 | 1 | -0/+2
  Also removes PKT3_CONTEXT_CONTROL as that is already being done by si_begin_new_cs, when emitting init_config.
  v2: - Use radeon_set_sh_reg_seq.
      - Also set COMPUTE_STATIC_THREAD_MGMT_SE2 / SE3 for CIK+
  Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
  Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
  Reviewed-by: Marek Olšák <marek.olsak@amd.com>
* radeonsi: Create CE IB. | Bas Nieuwenhuizen | 2016-04-19 | 1 | -1/+34
  Based on work by Marek Olšák.
  v2: Add preamble IB. Leaves the load packet in the space calculation as the radeon winsys might not be able to support a preamble.
  The added space calculation may look expensive, but is converted to a constant with (at least) -O2 and -O3.
  v3: - Fix code style.
      - Remove needed space for vertex buffer descriptors.
      - Fail when the preamble cannot be created.
  Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
  Reviewed-by: Marek Olšák <marek.olsak@amd.com>
* radeonsi: fix a critical SI hang since PIPELINESTAT_START/STOP was added | Marek Olšák | 2016-04-13 | 1 | -1/+2
  For some reason unknown to me, SI hangs if the event is written after CONTEXT_CONTROL.
* radeonsi: move scissor and viewport states into gallium/radeon | Marek Olšák | 2016-04-12 | 1 | -4/+4
  Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
  Reviewed-by: Grigori Goronzy <greg@chown.ath.cx>
  Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
* gallium/radeon: remove old CS tracing | Marek Olšák | 2016-03-20 | 1 | -2/+1
  Cons:
  - it was only integrated in r600g
  - it doesn't work with GPUVM
  - it records buffer contents at the end of IBs instead of at the beginning, so the replay isn't exact
  - it lacks an IB parser and user-friendliness
  A better solution is apitrace in combination with gallium/ddebug, which has a complete IB parser and can pinpoint hanging CP packets.
  Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
* radeonsi: implement forcing per-sample_interpolation using the shader key only | Marek Olšák | 2016-02-09 | 1 | -1/+0
  It was partly a state and partly emulated by shader code, but since we want to do this in a fragment shader prolog, we need to put it into the shader key, which will be used to generate the prolog.
  This also removes the spi_ps_input states and moves the registers to the PS state.
  Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
* radeonsi: rename cb_target_mask state to cb_render_state | Marek Olšák | 2016-02-02 | 1 | -1/+1
  and rename a variable in the function. SX_PS_DOWNCONVERT will be emitted here.
  Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
* radeonsi: calculate optimal GS ring sizes to fix GS hangs on Tonga | Marek Olšák | 2015-11-13 | 1 | -0/+2
  I discovered that increasing the ESGS ring size fixes GS hangs on Tonga, so let's do it properly.
  There is now a separate init_config_gs_rings state that is not immutable, because GS rings are resized when needed.
  This also saves some memory. Most apps won't need more than 1MB per ring per shader engine.
  Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
  Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
* gallium/radeon: atomize render condition (SET_PREDICATION) | Marek Olšák | 2015-11-13 | 1 | -0/+1
  Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
* gallium/radeon: inline the r600_rings structure | Marek Olšák | 2015-11-13 | 1 | -8/+8
  Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
* radeonsi: prevent recursion in si_context_gfx_flush | Marek Olšák | 2015-11-13 | 1 | -0/+7
  The recursion can only occur if you modify need_cs_space to always flush.
  Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
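One common way to prevent this kind of re-entrant flush is a guard flag on the context. The sketch below illustrates that general technique only; it is not taken from the commit, and `gfx_ctx`, `ctx_flush`, `ctx_need_cs_space` and the `always_flush` switch are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified context. */
struct gfx_ctx {
    bool flush_in_progress; /* guard flag against re-entry */
    unsigned num_flushes;   /* counts completed flushes */
    bool always_flush;      /* models a need_cs_space changed to always flush */
};

static void ctx_flush(struct gfx_ctx *ctx);

/* Stand-in for a space check that may decide to flush. */
static void ctx_need_cs_space(struct gfx_ctx *ctx)
{
    if (ctx->always_flush)
        ctx_flush(ctx);
}

static void ctx_flush(struct gfx_ctx *ctx)
{
    assert(ctx);

    /* A flush is already running further up the call stack: bail out. */
    if (ctx->flush_in_progress)
        return;
    ctx->flush_in_progress = true;

    /* Emitting the end-of-IB commands may itself ask for space,
     * which would otherwise call back into ctx_flush. */
    ctx_need_cs_space(ctx);

    ctx->num_flushes++;
    ctx->flush_in_progress = false;
}
```

With the guard in place, one call to `ctx_flush` performs exactly one flush even when the space check always requests one.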
* gallium/radeon: remove the IB flushing flag | Marek Olšák | 2015-11-13 | 1 | -3/+0
  Not needed anymore. A similar flag will be introduced in the next commit, which will be private in radeonsi.
  Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
* gallium/radeon: move GFX/DMA flushing from add_to_buffer_list to need_cs_space | Marek Olšák | 2015-11-13 | 1 | -0/+5
  need_cs_space isn't invoked so often and is called before all commands too.
  This is a lot cleaner. The code in radeon_add_to_buffer_list always seemed dodgy to me.
  Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
* radeonsi: rename cache flushing flags once more | Marek Olšák | 2015-11-13 | 1 | -5/+5
  KCACHE, TC L1 and TC L2 are renamed to:
  - SMEM L1
  - VMEM L1
  - GLOBAL L2
  You can easily tell what they are used for now. Shaders must deal with coherency issues between both L1s manually, e.g. by setting GLC=1 or by using s_dcache_*.
  BOTH_ICACHE_KCACHE was an unused definition.
  Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
* radeonsi: move SPI_PS_INPUT_ENA/ADDR registers to a separate state | Marek Olšák | 2015-10-03 | 1 | -0/+1
  This will be a derived state used for changing center->sample and centroid->sample at runtime.
  Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
* radeonsi: dump buffer lists while debugging | Marek Olšák | 2015-10-03 | 1 | -1/+14
  Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
* gallium/radeon: stop using "reloc" in a few places | Marek Olšák | 2015-10-03 | 1 | -1/+1
  Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
* radeonsi: add an option for debugging VM faults | Marek Olšák | 2015-10-03 | 1 | -0/+4
  Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
* radeonsi: don't send IB dword usage to si_need_cs_space | Marek Olšák | 2015-09-01 | 1 | -2/+1
  Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
  Acked-by: Christian König <christian.koenig@amd.com>
* radeonsi: don't count IB space for states, just use an upper bound | Marek Olšák | 2015-09-01 | 1 | -48/+5
  Since we don't put any resource descriptors in IBs, the space used by draw calls is quite small.
  Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
  Acked-by: Christian König <christian.koenig@amd.com>
* radeonsi: convert SPI state to an atom | Marek Olšák | 2015-09-01 | 1 | -0/+1
  Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
  Acked-by: Christian König <christian.koenig@amd.com>
* radeonsi: convert CB_TARGET_MASK setup to an atom | Marek Olšák | 2015-09-01 | 1 | -0/+1
  Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
  Acked-by: Christian König <christian.koenig@amd.com>
* radeonsi: convert stencil ref state into an atom | Marek Olšák | 2015-09-01 | 1 | -0/+1
  Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
  Acked-by: Christian König <christian.koenig@amd.com>
* radeonsi: convert blend color state into an atom | Marek Olšák | 2015-09-01 | 1 | -0/+1
  Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
  Acked-by: Christian König <christian.koenig@amd.com>
* radeonsi: convert sample mask state into an atom | Marek Olšák | 2015-09-01 | 1 | -0/+1
  Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
  Acked-by: Christian König <christian.koenig@amd.com>
* radeonsi: convert clip state into an atom | Marek Olšák | 2015-09-01 | 1 | -0/+1
  Reducing calloc overhead.
  Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
  Acked-by: Christian König <christian.koenig@amd.com>