| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
|
|
|
|
|
|
|
|
|
| |
Commit 65dfb30 added exec_list EmptyUniformLocations, but only
initialized the list if ARB_explicit_uniform_location was enabled,
leading to crashes if the extension was not available.
Cc: "11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
tl;dr: For many types of GL object, we can *NEVER* use the Gen function.
In OpenGL ES (all versions!) and OpenGL compatibility profile,
applications don't have to call Gen functions. The GL spec is very
clear about how you can mix-and-match generated names and non-generated
names: you can use any name you want for a particular object type until
you call the Gen function for that object type.
Here's the problem scenario:
- Application calls a meta function that generates a name. The first
Gen will probably return 1.
- Application decides to use the same name for an object of the same
type without calling Gen. Many demo programs use names 1, 2, 3,
etc. without calling Gen.
- Application calls the meta function again, and the meta function
replaces the data. The application's data is lost, and the app
fails. Have fun debugging that.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
|
|
|
|
|
| |
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
tl;dr: For many types of GL object, we can *NEVER* use the Gen function.
In OpenGL ES (all versions!) and OpenGL compatibility profile,
applications don't have to call Gen functions. The GL spec is very
clear about how you can mix-and-match generated names and non-generated
names: you can use any name you want for a particular object type until
you call the Gen function for that object type.
Here's the problem scenario:
- Application calls a meta function that generates a name. The first
Gen will probably return 1.
- Application decides to use the same name for an object of the same
type without calling Gen. Many demo programs use names 1, 2, 3,
etc. without calling Gen.
- Application calls the meta function again, and the meta function
replaces the data. The application's data is lost, and the app
fails. Have fun debugging that.
Fixes piglit tests:
- object-namespace-pollution glGetTexImage-compressed framebuffer
- object-namespace-pollution glGenerateMipmap framebuffer
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
|
|
|
|
|
|
|
| |
object handle
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
|
|
|
|
|
|
|
| |
API object handle
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
|
|
|
|
|
| |
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This enables later patches that will stop calling _mesa_GenFramebuffers
or _mesa_CreateFramebuffers which pollute the framebuffer namespace.
For framebuffers, the Bind call is still necessary.
sed -i -e 's/_mesa_GenFramebuffers/_mesa_CreateFramebuffers/' \
src/mesa/drivers/common/*.c
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This enables later patches that will stop calling _mesa_GenFramebuffers
or _mesa_CreateFramebuffers which pollute the framebuffer namespace.
For framebuffers, the Bind call is still necessary.
sed -i -e 's/_mesa_GenFramebuffers/_mesa_CreateFramebuffers/' \
src/mesa/drivers/dri/i965/*.c
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
API object handle
Some meta operations can be called recursively. Future changes (the
"Don't pollute the ... namespace" changes) will cause objects with
invalid names to be used. If a nested meta operation tries to restore
an object named 0xDEADBEEF, it will fail.
This also fixes another latent bug in meta. In a multithreaded,
multicontext application, one thread can delete an object that is bound
in another thread. That object continues to exist until it is unbound
(i.e., its refcount drops to zero). Meta unbinds objects all over the
place. As a result, the rebind in _mesa_meta_end could fail because the
object vanished!
See https://bugs.freedesktop.org/show_bug.cgi?id=92363#c8.
Using _mesa_reference_<object type> to save and restore the objects
prevents the refcount from going to zero.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
|
|
|
|
|
|
|
|
|
|
| |
Fixing dd_function_table::BindFramebuffer will come later because that
change is probably not suitable for stable.
v2: Fix whitespace issue noticed by Topi.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
sed -i -e 's/_mesa_CheckFramebufferStatus(GL_DRAW_FRAMEBUFFER/_mesa_check_framebuffer_status(ctx, ctx->DrawBuffer/' \
-e 's/_mesa_CheckFramebufferStatus(GL_FRAMEBUFFER[^)]*/_mesa_check_framebuffer_status(ctx, ctx->DrawBuffer/' \
-e 's/_mesa_CheckFramebufferStatus(GL_READ_FRAMEBUFFER/_mesa_check_framebuffer_status(ctx, ctx->ReadBuffer/' \
$(grep -rl _mesa_CheckFramebufferStatus src/mesa/drivers)
The second expression catches both GL_FRAMEBUFFER and GL_FRAMEBUFFER_EXT.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
|
|
|
|
|
| |
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
a GL API handle
Also change the name of the function to
_mesa_meta_framebuffer_texture_image. The function is basically a
wrapper around _mesa_framebuffer_texture (which is used to implement
glFramebufferTexture1D and friends), so it makes sense for it's name to
be similar to that.
The next patch will clean _mesa_meta_framebuffer_texture_image up
considerably.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(This is commit 4a1c8a3037cd29938b2a6e2c680c341e9903cfbe for vec4 mode.)
Using the push model for inputs is much more efficient than pulling
inputs - the hardware can simply copy a large chunk into URB registers
at thread creation time, rather than having the thread send messages to
request data from the L3 cache. Unfortunately, it's possible to have
more TES inputs than fit in registers, so we have to fall back to the
pull model in some cases.
However, it turns out that most tessellation evaluation shaders are
fairly simple, and don't use many inputs. An arbitrary cut-off of
24 vec4 slots (12 registers) should suffice. (I chose this instead of
the 32 vec4 slots used in the scalar backend to avoid regressing a few
Piglit tests due to the vec4 register allocator being too stupid to
figure out what to do. We probably ought to fix that, but it's a
separate issue.)
Improves performance in GPUTest's tessmark_x64 microbenchmark by
41.5394% +/- 0.288519% (n = 115) at 1024x768 on my Clevo W740SU
(with Iris Pro 5200).
Improves performance in Synmark's Gl40TerrainFlyTess microbenchmark by
38.3576% +/- 0.759748% (n = 42).
v2: Simplify abs/negate handling, as requested by Matt.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
When reusing a depth attachment as a stencil, we need to propogate
the layered bit, otherwise we fail to complete the framebuffer.
discovered running ./bin/fbo-depth-array depth-layered-clear
on virgl on haswell.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
MESA_DRI_MODULE_PATH is only getting set for classic DRI drivers and may or
may not be set correctly for gallium_dri.so depending on the makefile
include ordering. For Android 6 and earlier it is fine, but with build
system changes in AOSP master, it is not.
Move the path variables to a single place at the top level and introduce
MESA_DRI_MODULE_REL_PATH for Android 5 and later which require relative
paths. With this, there is a single variable to change.
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
|
|
|
|
|
|
|
|
|
|
| |
With the Android build system changes to ninja/kati, the use of
.SECONDEXPANSION is no longer supported. Fix this by avoiding rule specific
variables and using $(transform-generated-source).
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commits a39a8fbbaa12 ("nir: move to compiler/") and eb63640c1d38
("glsl: move to compiler/") broke Android builds. Fix them.
There is also a missing dependency between generated NIR headers and
several libraries. This isn't a new issue, but seems to have been
exposed by the NIR move.
Built with i915, i965, freedreno, r300g, r600g, vc4, and virgl enabled.
Cc: "11.2" <mesa-stable@lists.freedesktop.org>
Cc: Mauro Rossi <issor.oruam@gmail.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
|
|
|
|
|
|
|
|
|
|
|
| |
The two extensions are identical, and are largely taking bits of already
existing desktop functionality. We continue to do a poor job of
supporting the 'precise' keyword, just like we do on desktop.
This passes the relevant dEQP tests that I could find.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
|
|
|
|
|
|
|
|
| |
Could be exposed on earlier GLES versions if we supported EXT_sRGB, but
we don't, for now.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
|
|
|
|
|
|
| |
Trivial
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
|
|
|
|
|
|
|
| |
v2: Fix some bad indentation. Suggested by Curro.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
|
|
|
|
|
|
|
|
|
|
| |
This will now never occur. The empty if-else part would have already
been removed leaving an empty if-endif part.
No shader-db changes.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On BDW,
total instructions in shared programs: 8448571 -> 8448367 (-0.00%)
instructions in affected programs: 21000 -> 20796 (-0.97%)
helped: 116
HURT: 0
v2: Remove spurious attempt to combine the if_block with the (removed!)
else_block. Suggested by Matt and Curro. Correct the comment
describing what the new pass does. Suggested by Matt.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
|
|
|
|
|
|
|
|
| |
This provides a trivial simplification now, and it makes some future
changes more straight forward.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
|
|
|
|
|
|
|
|
|
| |
'git diff -w' is a bit more illustrative. A couple declarations were
moved, the continue was removed, and the code was reindented. This will
simplify future changes.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
|
|
|
|
|
|
|
| |
The same code appeared in both branches; pull it above the if statement.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
|
|
|
|
|
|
|
|
| |
The caller already computes it. Now that we have stage specific
functions, it's really easy to pass this in.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
|
|
|
|
|
|
|
|
| |
The caller already computes it. Now that we have stage specific
functions, it's really easy to pass this in.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that each stage is directly calling brw_nir_lower_io(), and we have
per-stage helper functions, it makes sense to just call the relevant one
directly, rather than going through multiple switch statements.
This also eliminates stupid function parameters, such as the two that
only apply to vertex attributes.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
|
|
|
|
|
|
|
|
| |
These functions are both giant switch statements where most cases don't
overlap at all. Let's put the bulk of the work in per-stage helpers.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
|
|
|
|
|
|
|
|
|
| |
Most cases already call nir_lower_io explicitly for input and output
lowering. This catch all isn't very useful anymore - we can just add it
to the remaining cases.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
|
|
|
|
|
|
|
|
|
|
| |
This simplifies things. Every caller of brw_nir_lower_io() immediately
calls brw_postprocess_nir(). The only real change this will have is
that we get an extra brw_nir_optimize() call when compiling compute
shaders, but that seems fine.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We've now hit literally every case other than geometry shaders (and
compute shaders, but those are a no-op). So, let's just move geometry
shaders over too and be done with it.
The only advantage to doing this at link time was to save the expense
of running the pass on recompiles. But we're already running a lot of
passes, and the extra code complexity isn't worth it.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
|
|
|
|
|
|
|
|
|
| |
Shorter than compiler->scalar_stage[MESA_SHADER_GEOMETRY], which can
help with line-wrapping.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The Vulkan driver wants to be able to delete fragment outputs that are
beyond key.nr_color_regions; this is a lot easier if we lower outputs at
specialization time rather than link time.
(Rationale added to commit message by Ken)
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Without this, on SIMD 16 the send instruction destination will appear
to write more than one destination register, causing the simulator to
report an error.
Of course, the send instruction can actually write more than one
destination register regardless of the type set for the destination,
so this is a bit strange.
Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
|
|
|
|
|
|
| |
It was already done in get_mesa_program()
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Allows us to transform
mad res src0 src1 src2
mov.sat dst -res
into
mad.sat dst -src0 -src1 src2
instructions in affected programs: 3712 -> 3688 (-0.65%)
helped: 24
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Allows us to transform
add res src0 src1
mov.sat dst -res
into
add.sat dst -src0 -src1
No shader-db changes.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Allows us to transform
mul res src0 src1
mov.sat dst -res
into
mul.sat dst src0 -src1
instructions in affected programs: 45246 -> 45054 (-0.42%)
helped: 162
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It's not correct to CSE these multiplies
mul.sat dst1, -a, b
mul.sat dst2, a, b
by emitting a negated MOV from dst1 to dst2:
mul.sat dst1, -a, b
mov dst2, -dst1
Take 2.0*2.0 for example. The first multiply would produce 0.0 and the
second would produce 1.0.
Fixes bad generated code in 18 to 22 shaders:
instructions in affected programs: 432 -> 464 (7.41%)
helped: 4
HURT: 18
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
|
|
|
|
|
|
|
|
|
|
|
| |
RGBA8 and BGRA8 unorm formats are compatible with the various
mem_copy functions. Their sRGB counterparts are also compatible
because they're also color-renderable (of importance when the
specified resource is a readbuffer) and they share the same
physical layout.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
|
|
|
|
|
| |
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
|
|
|
|
|
| |
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
|
|
|
|
|
| |
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
|
|
|
|
|
| |
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
|
|
|
|
|
|
|
| |
Not called from any other file. Remove _mesa_ prefix and update comments.
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
|