| Commit message (Collapse) | Author | Age | Files | Lines |
|\ |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
For compute shaders, we free the selector after the shader has been
compiled, so we need to save this bit somewhere else. Also, make sure that
this type of bug cannot re-appear, by NULL-ing the selector pointer after
we're done with it.
This bug has been there since the feature was added, but was only exposed
in piglit arb_compute_variable_group_size-local-size by commit
9bfee7047b70cb0aa026ca9536465762f96cb2b1 (which is totally unrelated).
Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 42d5e91a2ae235c007c5d17935be9bb1c4ff388e)
|
| |
| |
| |
| |
| |
| |
| |
| | |
I had this exactly backwards, but apparently the piglit tests were all
landing in r0-r3 anyway.
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 977d8b526b983c8d19df00af224033389f8ab7c8)
|
| |
| |
| |
| |
| |
| |
| | |
Fixes piglit glsl-fs-shadow2D-clamp-z.
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 08d51487e3b8cfb14ca2ece9545b2e2ed344e3cc)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
It's much better to just skip the draw call entirely. Getting this
information out of register allocation will also be useful for
implementing threaded fragment shaders, which will need to retry
non-threaded if RA fails.
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4d019bd703e7c20d56d5b858577607115b4926a3)
|
|\ \
| |/ |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
In the event that multiple threads attempt to install a graph
concurrently, protect the shared list.
Signed-off-by: Steven Toth <stoth@kernellabs.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 381edca826ee27b1a49f19b0731c777bdf241b20)
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
We're missing the closedir() to the matching opendir().
Signed-off-by: Steven Toth <stoth@kernellabs.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 5a58323064b32442e2de23c95642bc421be696f8)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Instead of trying to maintain a reference counted list of valid HUD
objects, and freeing them accordingly, creating race conditions
between unanticipated multiple threads, simply accept they're
allocated once and never released until the process terminates.
They're a shared resource between multiple threads, so accept
they're always available for use.
Signed-off-by: Steven Toth <stoth@kernellabs.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 6ffed086795aaa84ab35668bb59d712cdde34da3)
|
| |
| |
| |
| |
| |
| |
| |
| | |
The 1/W was apparently not accurate enough, and we were getting sparklies
in the distance. The closed driver also did a N-R step here.
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 283d4d18e598793bbff7d9ba5a601bced9b36542)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This reverts commit d180de35320eafa3df3d76f0e82b332656530126.
This is a radeon specific hack that causes problems on nouveau
when combined with the SHARED flag later. If radeonsi needs a fix
for this, please fix it in the driver.
[chk]
Using linear surfaces for this makes sense because tilling isn't
beneficial and the surfaces can potentially be shared with other GPUs
using the VDPAU OpenGL interop.
[airlied]
I think we need a flag that isn't SHARED/LINEAR that is more
SHARED_OTHER_GPU.
[mareko]
Does radeonsi need PIPE_BIND_VIDEO_DECODE_OUTPUT that it would translate
into linear ?
[mareko]
My only concern is decoding performance. If the decoder works in 64x1
blocks, tiling will hurt. That's the theory. I don't know how the
decoder works.
Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> (I+A)
(cherry picked from commit d0d5f7600c2e8ab8d0c153787185f7a534753edd)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This fixes a crash in Deus Ex: Mankind Divided. Release builds were
unaffected, so it's not too serious.
Cc: 11.2 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 00baaa4752ab7e721218a2840cf0952d8c7c6eca)
|
| |
| |
| |
| |
| |
| |
| |
| | |
Fixes spec/arb_gpu_shader5/execution/built-in-functions/*-bitfield{Extract,Insert}
Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 5aef14932ac047dc5f1af311a26b7f41b140d79f)
|
| | |
|
| | |
|
| | |
|
| |
| |
| |
| | |
the push mutex acquired
|
| | |
|
| |
| |
| |
| |
| | |
* add dri2_create_from_texture to driswImageExtension
* add dri2FenceExtension to drisw_screen_extensions
|
| | |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
System boots up with gles_mesa/softpipe/llvmpipe.
NO_REF_TASK
Tested: local run
Change-Id: I629ed0ca9fad12e32270eb8e8bfa9f7681b68474
Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Redirect logs printed to stderr to logcat.
NO_REF_TASK
Tested: local run
Change-Id: I58e3966a608af361b86c54b4c95a92561b711968
Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
In the callchain destroy_surface->destroyDrawable->dri_put_drawable->
dri_put_drawable->DestroyBuffer
By the semantic of it, dri_destroy_buffer should not free drawable struct,
all vendor specific and legacy swrast version of the function do not.
wonder why no body else ran into this.
NO_REF_TASK
Tested: local run
Change-Id: Ibe82d82d2e34b162e64bf0b8805f8a4553d362d5
Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
|
| |
| |
| |
| |
| |
| |
| |
| | |
This is a try-and-error patch which fixes the Android-x86 black screen
issue of VMware on Linux host. Tested OK on VMware Workstation 12 Player.
But the red and blue colors are exchanged.
Note it doesn't affect VMware on Windows host.
|
| | |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Introduce load_pipe_screen() public entry point for other code which
dlopen()'s gralloc_dri.so for purposes of loading a pipe_screen. This way
drm_gralloc can avoid static linking of each gallium winsys and driver,
and avoid duplicated logic to figure out which pipe driver to load.
This is based on Rob Clark's work. I moved it into pipe_loader which seems
to be a better spot.
Signed-off-by: Rob Herring <robh@kernel.org>
|
| |
| |
| |
| |
| |
| |
| | |
This doesn't work yet because the exported include files can't be picked
up by the android build system unless the library has a 'lib' prefix.
Signed-off-by: Rob Herring <robh@kernel.org>
|
|/
|
|
|
|
|
|
|
|
|
|
|
| |
The changes add support for LLVM 3.7.0 for marshmallow,
while keeping support for LLVM 3.5.0 with lollipop.
MESA_LLVM_VERSION_PATCH=0 is compatible with radeonsi build in lollipop-x86,
since mesa 11.0 and newer do not check anymore for LLVM 3.5.2
This changes, combined with specific R600 patches for external/llvm,
enable building gallium radeonsi driver in marshmallow-x86.
The patch is applicable to 11.2.0devel, 11.1 and 11.0 branches.
|
|
|
|
|
|
|
|
|
|
|
| |
When the video coded size is different from frame size, we need the result
buffers are same as coded size, which are not size compatible with encode
required size, so that simply use no tunnel for this case instead of frame
by frame converting.
Signed-off-by: Leo Liu <leo.liu@amd.com>
Cc: 13.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 06e3cd6a45ae2ad19f77e0f283c46d5f85112847)
|
|
|
|
|
|
|
|
|
| |
Otherwise fails the check of matching between decoder size and buffers
size in kernel.
Signed-off-by: Leo Liu <leo.liu@amd.com>
Cc: 13.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d9b2c4048d55011bb04bd9848a3b47af7216389f)
|
|
|
|
|
|
|
|
| |
12.0 and older need the same fix but elsewhere.
Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 4bf45a6079b5cc6b0360b637c0c7baa456b8257d)
|
|
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: 11.2 12.0 13.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e24dc4316487eeaa6ee8aa5c709546d814e96f03)
|
|
|
|
|
|
|
|
|
|
| |
The emitter tried to emit sub instead of subr when src0 has
actually a NEG modifier.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0 12.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 84e946380b2d5ddc62a107b667be39abf1932704)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This affects GF100:GK110 chipsets, but not GM107+ where the
logic is a bit different. The emitters tried to emit sub
instead of subr when src0 has a NEG modifier.
This fixes the following piglit tests glsl-fs-loop-nested
and glsl-vs-loop-nested.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 1ec7227d44dceae8de7b93f846bbd33d66007909)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Maybe this is why SDMA has been broken for many amdgpu users?
SDMA is the only block which is used with imported textures and relies
on this variable. DB also uses it, but it doesn't get imported textures,
so it's unaffected.
I do get SDMA failures on Tonga before this patch if R600_DEBUG=testdma
is changed to use imported textures.
Cc: 11.2 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 6ec3b2a4b1d41b83a4721d06b42c49f55e695cbf)
|
|
|
|
|
|
|
|
|
| |
This should fix random GPU hangs on Hawaii and Fiji.
Cc: 11.2 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit dce05b342355eac9296ee7110385b16d6edb059d)
|
|
|
|
|
|
|
|
|
| |
Oh my god, I wonder what catastrophic issues this was causing on SI.
Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 8a21f52d73936e23a314a288a36782a698c7c1b9)
|
|
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7b2712c367891e96384226a1fa94679a814235d0)
|
|
|
|
|
|
|
|
|
|
|
| |
Only one face of Cubetextures was locked when in DEFAULT Pool.
Fixes:
https://github.com/iXit/Mesa-3D/issues/129
CC: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit eed605a473554575305e1bf10c3641761a85feb9)
|
|
|
|
|
|
|
|
|
|
| |
In the format fallback path,
the height was used instead of the depth.
CC: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit fe7bb46134162c9a9a18832f1746991aa78121e8)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Leak introduced by:
a83dce01284f220b1bf932774730e13fca6cdd20
The patch also moves the part to
release changed.vs_const_i and changed.vs_const_b
before the if (!cb.buffer_size) check,
to avoid reuploading every draw call if
integer or boolean constants are dirty, but the shaders
use no constants.
Signed-off-by: Axel Davy <axel.davy@ens.fr>
CC: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 25beccb379731b0e6fc728982190779da47aa6fd)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Long story short, 3D and CP are aliased on Fermi and initializing
compute after pushing the MS sample coordinate offsets seems to
corrupt 3D state for weird reasons.
I still don't have the faintest clue what is going on, but
this seems to only affect Fermi generation. A possible fix
could be to use two different channels, one for 3D and one
for CP.
This fixes a bunch of regressions pinpointed by piglit.
Fixes: "nvc0: fix up image support for allowing multiple samples"
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 42273edf79c2500957f51690499aa3405cc689db)
|
|
|
|
|
|
|
|
|
|
| |
Fixes spec/arb_tessellation_shader/execution/dvec[23]-vs-tcs-tes, among
others.
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 4a2dbfff05f7be271c2aa72e783e24b31906db51)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With ARB_gpu_shader5, texture offsets can be any source, including TEMPs
and IN's. Make sure to process them as regular sources so that we pick
up masks, etc.
This should fix some CTS tests that feed offsets directly to
textureGatherOffset, and we were not picking up the input use, thus not
advertising it in the shader header.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dave Airlie <airlied@redhat.com>
Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit cd45d758ff87305ceecca899fe7325779bb6755b)
|
|
|
|
|
|
|
|
|
|
| |
The state tracker tries to attach the info to the wrong shader. This is
easy enough to protect against.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 313fba5ee1de9416930e45da8aff63a24763940b)
|
|
|
|
|
|
| |
st/mesa does this for us.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The per-element fetch has quite some calculations which are constant,
these can be moved outside both the per-element as well as the main
shader loop (llvm can figure out it's constant mostly on its own, however
this can have a significant compile time cost).
Similarly, it looks easier swapping the fetch loops (outer loop per attrib,
inner loop filling up the per vertex elements - this way the aos->soa
conversion also can be done per attrib and not just at the end though again
this doesn't really make much of a difference in the generated code). (This
would also make it possible to vectorize the calculations leading to the
fetches.)
There's also some minimal change simplifying the overflow math slightly.
All in all, the generated code seems to look slightly simpler (depending
on the actual vs), but more importantly I've seen a significant reduction
in compile times for some vs (albeit with old (3.3) llvm version, and the
time reduction is only really for the optimizations run on the IR).
v2: adapt to other draw change.
No changes with piglit.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previous attempts to zero initialize all inputs were not really optimal
(though no performance impact was measurable). In fact this is not really
necessary, since we know the max number of inputs used.
Instead, just generate fetch for up to max inputs used by the shader,
directly replacing inputs for which there was no vertex element by zero.
This also cleans up key generation, which previously would have stored
some garbage for these elements.
And also drop the assertion which indicates such bogus usage by a
debug_printf (the whole point of initializing the undefined inputs was to
make this case safe to handle).
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
|
|
|
|
|
|
|
|
| |
Compilation to actual machine code can easily take as much time as the
optimization passes on the IR if not more, so print this out too.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For the texturing packs, things looked pretty terrible. For every
lerp, we were repacking the values, and while those look sort of cheap
with 128bit, with 256bit we end up with 2 of them instead of just 1 but
worse, plus 2 extracts too (the unpack, however, works fine with a
single instruction, albeit only with llvm 3.8 - the vpmovzxbw).
Ideally we'd use more clever pack for llvmpipe backend conversion too
since we actually use the "wrong" shuffle (which is more work) when doing
the fs twiddle just so we end up with the wrong order for being able to
do native pack when converting from 2x8f -> 1x16b. But this requires some
refactoring, since the untwiddle is separate from conversion.
This is only used for avx2 256bit pack/unpack for now.
Improves openarena scores by 8% or so, though overall it's still pretty
disappointing how much faster 256bit vectors are even with avx2 (or
rather, aren't...). And, of course, eliminating the needless
packs/unpacks in the first place would eliminate most of that advantage
(not quite all) from this patch.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
|