| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
Using consistent naming allows us to create macros more easily.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
| |
There's an assert right above this.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
|
|
|
|
|
|
|
|
|
|
|
| |
While the current CFG code is valid in the case where a switch break also
happens to be a loop continue, it's a bit suboptimal. Since hardware is
capable of handling the continue as a direct jump, it's better to use a
continue instruction when we can than to bother with all of the nasty
switch break lowering.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It is possible that the break block of a switch is actually the continue of
the loop containing the switch. In this case, we need to identify the
break block as a continue and break out of current level of CFG handling.
If we don't, the continue portion of the loop will get handled twice, once
by following after the break and a second time by the loop handling code
handling it explicitly.
This fixes 6 of the new Vulkan CTS tests:
- dEQP-VK.spirv_assembly.instruction.graphics.opphi.out_of_order*
- dEQP-VK.spirv_assembly.instruction.graphics.selection_block_order.out_of_order*
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
|
|
|
|
|
| |
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
|
|
|
|
|
| |
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
|
|
|
|
|
| |
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
|
|
|
|
|
|
| |
CovID: 1373369
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
|
|
|
|
|
|
|
| |
These structs will be written to disk as part of the shader cache
so use uint32_t just to be safe.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
|
|
|
|
|
|
|
| |
The warning was also the wrong location, it should have been
in the else.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
|
|
|
|
|
|
|
| |
Almost all of the other drawing validation code is in api_validate.c
so put this function there as well.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
|
|
|
|
|
|
|
|
| |
Detect all of the CPUs in the system. Expose metrics
for min, max and current frequency in Hz.
Signed-off-by: Steven Toth <stoth@kernellabs.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
|
|
|
|
|
|
|
|
| |
This reverts commit dbfeb0ec12d6550e68de1bcd164e422e79bccf2d.
With max_value being rounded to 100, it's often wrong.
Reviewed-by: Brian Paul <brianp@vmware.com>
|
|
|
|
|
|
|
| |
Regression introduced by "gallium/radeon: zero all query buffers".
Cc: Michel Dänzer <michel@daenzer.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
|
|
|
| |
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
|
|
|
|
|
|
|
|
| |
This is only needed for r600 which doesn't have ARB_query_buffer_object and
therefore wouldn't really need the fences, but let's be optimistic about
filling in this feature gap eventually.
Cc: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
|
|
|
|
|
|
| |
We don't plan to use sub-allocated buffers with UVD, but just in case one
slips through, this increases the chances of things working out anyway.
Reviewed-by: Christian König <christian.koenig@amd.com>
|
|
|
|
|
|
|
| |
We don't plan to use sub-allocated buffers with VCE, but just in case one
slips through, this increases the chances of things working out anyway.
Reviewed-by: Christian König <christian.koenig@amd.com>
|
|
|
|
|
|
|
| |
Cc: Christian König <christian.koenig@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97976
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97969
Reviewed-by: Christian König <christian.koenig@amd.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implement support for power based sensors, reporting units in
milli-watts and watts.
Also, minor cleanup - change the related if block to a switch.
Tested with two different power sensors, including the nouveau
'power1' sensors on a GTX950 card.
Signed-off-by: Steven Toth <stoth@kernellabs.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commutativity is not allowed with SHLADD, but src2 can accept
loads. To allow the load propagation pass to do its job, add a
special case like for SUCLAMP because src1 is always an immediate.
This IMAD to SHLADD optimization helps a bunch of shaders from Tomb
Raider, Victor Vran, UE4 demos (+15% perf with Elemental) and Shadow
Warrior.
GF100/GK104:
total instructions in shared programs :2838045 -> 2834712 (-0.12%)
total gprs used in shared programs :396684 -> 396386 (-0.08%)
total local used in shared programs :34416 -> 34416 (0.00%)
local gpr inst bytes
helped 0 326 1105 1105
hurt 0 55 3 3
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
|
|
|
|
|
|
|
| |
Only and only if src1 is a power of 2 we can replace IMAD by SHLADD.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
|
|
|
|
|
|
|
|
| |
Unfortunately, we can't use the emit helpers for GF100/GK110
because src1 and src2 are swapped.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
|
|
|
|
|
|
|
|
|
|
| |
This instruction is available since SM20 (Fermi) and allow to do
(a << b) + c in one shot. In some situations, IMAD should be
replaced by SHLADD when b is a power of 2, and ADD+SHL should be
replaced by SHLADD as well.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
|
|
|
|
|
|
|
|
|
| |
envyas now uses a much better representation for those control
codes and it displays the different flags instead of an
unreadable hex number.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
|
|
|
|
|
|
|
| |
Most of the time, even the 512 bytes that we now get is more than sufficient
(pipeline stats queries are the largest at 184 bytes per shot).
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
|
|
|
| |
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
|
|
|
|
|
|
| |
v2: enable only when compute is available
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
|
|
|
|
|
|
| |
v2: fix a comment (Gustaw Smolarczyk)
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
|
|
|
|
|
|
| |
To ensure that fences are properly initialized.
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
|
|
|
|
| |
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We will support the waiting option in ARB_query_buffer_object using
WAIT_REG_MEM on an appropriate fence-like dword. Some queries conveniently
write their results with the highest bit set, and we can just use that;
for others, we have to write a fence explicitly.
ZPASS_DONE for occlusion queries writes its results with the high bit
set, but it writes up to 8 pairs of results (one for each DB). We have
to wait for all of these results, so let's just add an explicit fence.
The new function provides summary information to be used by subsequent
patches.
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
|
|
|
|
|
|
|
| |
Save compute shader state that will be used for the ARB_query_buffer_object
implementation.
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
|
|
|
|
|
|
|
|
|
| |
These functions extract the pipe state structure from the current
descriptors, for state saving.
v2: correctly dereference *buf (Bas)
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
|
|
|
|
|
|
| |
For bottom-of-pipe fences inside the gfx command stream.
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
|
|
|
|
|
|
|
|
|
|
| |
There are driver-specific context flags for barriers that are not covered
by the Gallium barrier interfaces.
The R600 settings of these flags may not be optimal, but we're not going
to use them yet anyway.
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
|
|
|
| |
Acked-by: Dave Airlie <airlied@redhat.com>
|
|
|
|
| |
Acked-by: Dave Airlie <airlied@redhat.com>
|
|
|
|
| |
Acked-by: Dave Airlie <airlied@redhat.com>
|
|
|
|
| |
Acked-by: Dave Airlie <airlied@redhat.com>
|
|
|
|
| |
Acked-by: Dave Airlie <airlied@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
V8: Feedback based on peer review
convert if block into a switch
Constify some func args
V7: Increase precision when measuring lmsensors volts
Flatten patch series.
V6: Feedback based on peer review
Simplify sensor initialization (arg passing).
Constify some func args
V5: Feedback based on peer review
Convert sprintf to snprintf
Convert char * to const char *
int arg converted to bool
Func changes to take a filename vs a larger struct.
Omit the space between '*' and the param name.
V4: Merged with master as of 2016/9/27 6pm
V3: Flatten the entire patchset ready for the ML
V2: Additional seperate patches based on feedback
a) configure.ac: Add a comment related to libsensors
b) HUD: Disable Block/NIC I/O stats by default.
Implement configuration option --enable-gallium-extra-hud=yes
and enable both statistics when this option is enabled.
c) Configure.ac: Minor cleanup to user visible configuration settings
d) Configure.ac: HUD stats - build system improvements
Move the -lsensors out of a deeper Makefile, bring it into the configure.ac.
Also, rename a compiler directive to more closely follow the standard.
V1: Initial release to the ML
Three new features:
1. Disk/block I/O device read/write stats MB/ps.
2. Network Interface RX/TX transfer statistics as a percentage
of the overall NIC speed.
3. lmsensor power, voltage and temperature sensors.
The lmsensor changes makes a dependency on libsensors so support
for the change is opt out by default.
Signed-off-by: Steven Toth <stoth@kernellabs.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
|
|
|
|
|
|
|
|
|
| |
The code already skips doing the depth stall on gen >= 8, and as we
enable new platforms this assertion will fail needlessly. Instead of
changing the caller, make this simple change.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
|
| |
|
|
|
|
|
|
|
|
|
| |
I found this in a shader that was doing an alpha test when alpha is fixed
at 1.0.
v2: Rebase on master (now the const value is "u32" not "u").
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1)
|
|
|
|
|
|
|
| |
Fixes lots of piglit tests crashing due to using uninitialized memory.
Fixes: ecd6fce2611e ("mesa/st: support lowering multi-planar YUV")
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|