path: root/src/gallium/drivers/radeonsi/si_perfcounter.c
Commit message | Author | Age | Files | Lines
* radeonsi: remove unnecessary #includes | Marek Olšák | 2016-10-04 | 1 | -1/+0
  Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
  Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
* gallium/radeon: emit relocations for query fences | Nicolai Hähnle | 2016-09-30 | 1 | -1/+1
  This is only needed for r600 which doesn't have ARB_query_buffer_object and therefore wouldn't really need the fences, but let's be optimistic about filling in this feature gap eventually.
  Cc: Dieter Nützel <Dieter@nuetzel-hh.de>
  Reviewed-by: Marek Olšák <marek.olsak@amd.com>
* gallium/radeon: add r600_gfx_{write,wait}_fence | Nicolai Hähnle | 2016-09-29 | 1 | -38/+3
  For bottom-of-pipe fences inside the gfx command stream.
  Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
  Reviewed-by: Marek Olšák <marek.olsak@amd.com>
* radeonsi: really wait for the second EOP event and not the first one | Marek Olšák | 2016-07-19 | 1 | -1/+5
  Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
* radeonsi: expose performance counters as 64 bit | Nicolai Hähnle | 2016-05-09 | 1 | -5/+8
  This is useful for shader-related counters, since they tend to quickly exceed 32 bits.
  Reviewed-by: Marek Olšák <marek.olsak@amd.com>
* radeonsi: don't flush CB/DB caches for performance counters | Marek Olšák | 2016-04-18 | 1 | -3/+6
  I'm not sure about this. This will make the engines go idle, but the caches will be unflushed. This should match app behavior without performance counters, which can be a good thing.
  Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
* radeonsi: add placeholder MC and SRBM performance counter groups | Nicolai Hähnle | 2016-02-05 | 1 | -16/+54
  Yet another change motivated by AMD GPUPerfStudio compatibility. These groups are not directly accessible from userspace, and AMD GPUPerfStudio does not actually query them - it just requires them to be there. Hence, adding a placeholder for now.
  Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
  Acked-by: Marek Olšák <marek.olsak@amd.com>
* radeonsi: re-order the SQ_xx performance counter blocks | Nicolai Hähnle | 2016-02-05 | 1 | -0/+21
  This is yet another change motivated by appeasing AMD GPUPerfStudio's hardcoding of performance counter group numbers.
  Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
  Acked-by: Marek Olšák <marek.olsak@amd.com>
* radeonsi: re-order the perfcounter hardware blocks | Nicolai Hähnle | 2016-02-05 | 1 | -12/+18
  As documented in the comment, AMD GPUPerfStudio unfortunately hardcodes the order of performance counter groups. Let's do the pragmatic thing and present the same order as Catalyst/Crimson.
  Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
  Acked-by: Marek Olšák <marek.olsak@amd.com>
* radeonsi: fix perfcounter selection for SI_PC_MULTI_BLOCK layouts | Nicolai Hähnle | 2015-12-15 | 1 | -1/+1
  The incorrectly computed register count caused lockups.
  Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
* radeon: delay the generation of driver query names until first use | Nicolai Hähnle | 2015-11-26 | 1 | -8/+7
  This shaves a bit more time off the startup of programs that don't actually use performance counters.
  Reviewed-by: Marek Olšák <marek.olsak@amd.com>
* radeonsi: implement AMD_performance_monitor for CIK+ | Nicolai Hähnle | 2015-11-25 | 1 | -0/+695
  Expose most of the performance counter groups that are exposed by Catalyst. Ideally, the driver will work with GPUPerfStudio at some point, but we are not quite there yet. In any case, this is the reason for grouping multiple instances of hardware blocks in the way it is implemented.

  The counters can also be shown using the Gallium HUD. If one is interested to see how work is distributed across multiple shader engines, one can set the environment variable RADEON_PC_SEPARATE_SE=1 to obtain finer-grained performance counter groups.

  Part of the implementation is in radeon because an implementation for older hardware would largely follow along the same lines, but exposing a different set of blocks which are programmed slightly differently.

  Reviewed-by: Marek Olšák <marek.olsak@amd.com>