| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
| |
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
|
|
|
|
|
|
|
|
|
| |
CACHE_NEW_CS_PROG hasn't existed in quite a long time...the old
comment was there, but not the actual bit.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
|
|
|
|
|
|
| |
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
|
|
|
|
|
|
|
|
|
| |
3DSTATE_PS doesn't need this. 3DSTATE_PS_EXTRA however does,
for brw_color_buffer_write_enabled().
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
|
|
|
|
|
|
| |
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
|
|
|
|
|
|
|
|
|
| |
Needed for user clip plane enables. Broken since this code was
introduced.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
|
|
|
|
|
| |
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
|
|
|
|
|
|
|
| |
This will make it easier to add more operations.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Just generate an __intrinsic_atomic_add with a negated parameter.
Some background on the non-obvious reasons for the the big change to
builtin_builder::call()... this is cribbed from some discussion with
Ilia on mesa-dev.
Why change builtin_builder::call() to allow taking dereferences and
create them here rather than just feeding in the ir_variables directly?
The problem is the neg_data ir_variable node would have to be in two
lists at the same time: the instruction stream and parameters. The
ir_variable node is automatically added to the instruction stream by the
call to make_temp. Restructuring the code so that the ir_variables
could be in parameters then move them to the instruction stream would
have been pretty terrible.
ir_call in the instruction stream has an exec_list that contains
ir_dereference_variable nodes.
The builtin_builder::call method previously took an exec_list of
ir_variables and created a list of ir_dereference_variable. All of the
original users of that method wanted to make a function call using
exactly the set of parameters passed to the built-in function (i.e.,
call __intrinsic_atomic_add using the parameters to atomicAdd). For
these users, the list of ir_variables already existed: the list of
parameters in the built-in function signature.
This new caller doesn't do that. It wants to call a function with a
parameter from the function and a value calculated in the function. So,
I changed builtin_builder::call to take a list that could either be a
list of ir_variable or a list of ir_dereference_variable. In the former
case it behaves just as it previously did. In the latter case, it uses
(and removes from the input list) the ir_dereference_variable nodes
instead of creating new ones.
text data bss dec hex filename
6036395 283160 28608 6348163 60dd83 lib64/i965_dri.so before
6036923 283160 28608 6348691 60df93 lib64/i965_dri.so after
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
text data bss dec hex filename
6038043 283160 28608 6349811 60e3f3 lib64/i965_dri.so before
6036507 283160 28608 6348275 60ddf3 lib64/i965_dri.so after
v2: s/ir_intrinsic_atomic_sub/ir_intrinsic_atomic_counter_sub/. Noticed
by Ilia.
v3: Silence unhandled enum in switch warnings in st_glsl_to_tgsi.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
|
|
|
|
| |
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
|
|
|
|
|
|
| |
Fixes unused variable warning in release build.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
|
|
|
|
|
|
|
| |
This function is only ever used by an assert() this fixes an
unused function warning in release builds.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
|
|
|
|
| |
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
|
|
|
|
|
|
| |
This fixes an unused variable warning on release builds.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
|
|
|
|
|
| |
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch causes 2 regressions in khronos' gles cts tests
on various intel platforms.
Failing tests:
ES3-CTS.functional.state_query.integers.viewport_getinteger
ES3-CTS.functional.state_query.integers.viewport_getfloat
Here is an explanation of what's causing the failures:
CTS tests are not clamping the x, y location of the viewport's
bottom-left corner as recommended by ARB_viewport_array and
OES_viewport_array:
"The location of the viewport's bottom-left corner, given by (x,y), are
clamped to be within the implementation-dependent viewport bounds range.
The viewport bounds range [min, max] tuple may be determined by
calling GetFloatv with the symbolic constant VIEWPORT_BOUNDS_RANGE_OES"
Khronos CTS merge request to fix the test case:
https://gitlab.khronos.org/opengl/cts/merge_requests/399
V2: Initialize the relevant variables for GL_OES_viewport_array on gen8+
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
|
|
|
|
|
| |
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
|
|
|
|
|
| |
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In core profile, we support up to 16 viewports. However, in the
majority of cases, only 1 of them is actually used - we only need
the others if the last shader stage prior to the rasterizer writes
gl_ViewportIndex.
Processing all 16 viewports adds additional CPU overhead, which hurts
CPU-intensive workloads such as Glamor. This meant that switching to
core profile actually penalized Glamor to an extent, which is
unfortunate.
This patch tracks the number of relevant viewports, switching between
1 and ctx->Const.MaxViewports if gl_ViewportIndex is written. A new
BRW_NEW_VIEWPORT_COUNT flag tracks this. This could mean re-emitting
viewport state when switching, but hopefully this is offset by doing
1/16th of the work in the common case. The new flag is also lighter
weight than BRW_NEW_VUE_MAP_GEOM_OUT, which we were using in one case.
According to Eric Anholt, x11perf -copypixwin10 performance improves by
11.5094% +/- 3.10841% (n=10) on his Skylake.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
|
|
|
|
|
|
| |
Using consistent naming allows us to create macros more easily.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
| |
Using consistent naming allows us to create macros more easily.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
| |
There's an assert right above this.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
|
|
|
|
|
|
|
| |
These structs will be written to disk as part of the shader cache
so use uint32_t just to be safe.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
|
|
|
|
|
|
|
| |
The warning was also the wrong location, it should have been
in the else.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
|
|
|
|
|
|
|
| |
Almost all of the other drawing validation code is in api_validate.c
so put this function there as well.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
|
|
|
|
|
|
|
|
|
| |
The code already skips doing the depth stall on gen >= 8, and as we
enable new platforms this assertion will fail needlessly. Instead of
changing the caller, make this simple change.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Marek Olšák <marek.olsak@amd.com>
|
|
|
|
|
|
|
| |
For now that's never since advanced blend hasn't been piped through.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Marek Olšák <marek.olsak@amd.com>
|
|
|
|
|
|
| |
These will be used by the on disk shader cache.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
| |
Signed-off-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Support multi-planar YUV for external EGLImage's (currently just in the
dma-buf import path) by lowering to multiple texture fetch's for each
plane and CSC in shader.
There was some discussion of alternative approaches for tracking the
additional UV or U/V planes:
https://lists.freedesktop.org/archives/mesa-dev/2016-September/127832.html
They all seemed worse than pipe_resource::next
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
|
|
| |
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
|
|
| |
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
|
|
|
|
| |
We already pass the shader so we can just get the stage from this.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
|
|
|
|
|
|
|
| |
This is needed to be consistent with other drivers.
Signed-off-by: Emilio Cobos Álvarez <me@emiliocobos.me>
Reviewed-by: Brian Paul <brianp@vmware.com>
|
|
|
|
|
|
|
| |
Usually, there's no user-specified texture swizzle so we can optimize
the swizzle_swizzle() function and skip the loop/switch.
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some demos, like Heaven, were creating and destroying a large number
of sampler views because of a swizzle issue.
Basically, we compute the sampler view's swizzle by examining the
texture format, user swizzle, depth mode, etc. Later, during validation
we recompute that swizzle (in case something like depth mode changes)
and see if it matches the view's swizzle.
In the case of PIPE_FORMAT_RGTC2_UNORM, get_texture_format_swizzle
returned SWIZZLE_XYZW but the u_sampler_view_default_template() function
was setting the sampler view's swizzle to SWIZZLE_XY01. This mismatch
caused the validation step to always "fail" so we'd destroy the old
sampler view and create a new one.
By removing the conditional, the sampler view's swizzle and the computed
texture swizzle match and validation "passes". When creating a new sampler
view, we always want to use the texture swizzle which we just computed.
Fixes VMware issue 1733389.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
|
|
|
|
|
|
|
| |
This reverts commit f5a6aab4031bc4754756c1773411728ad9a73381.
This broke some tests. It seems gl_transform_feedback_info gets memset
to 0 so we were losing the values in BufferStride before we used them.
|
|
|
|
|
|
|
|
| |
It makes more sense to have this here where we store the other values
from xfb qualifiers. The struct it was previously part of is now only
used to store values that come from the api.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
|
|
|
|
|
| |
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Adam Jackson <ajax@redhat.com>
|
|
|
|
|
| |
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Adam Jackson <ajax@redhat.com>
|
|
|
|
|
|
|
|
| |
Now that we have gen_device_info mutable, we can update its values and drop
all copies we had in brw_context.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Make gen_device_info a mutable structure so we can update the fields that
can be refined by querying the kernel (like subslices and EU numbers).
This patch does not make any functional change, it just makes
gen_get_device_info() fill a structure rather than returning a const
pointer.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
|
|
|
|
|
|
|
|
| |
This just up-converts them to doubles. Not great, but this is what all
the other variants also do.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
|
|
|
|
|
|
|
| |
This is needed for GL_OES_viewport_array.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
|
|
|
|
|
|
|
|
|
| |
The expectation is that drivers will set this based on
OES_geometry_shader and ARB_viewport_array support. This is a separate
enable on the same reasoning as for OES_texture_cube_map_array.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
|