| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
| |
This should get us the same decode code generated, but with a lot less
custom code in the driver.
|
|
|
|
|
| |
This means doing Newton-Raphson on the RCP, but it's probably actually a
good thing to be accurate on.
|
|
|
|
|
| |
This makes fewer programs with loops assertion fail, replacing them with
the rendering failure warning.
|
|
|
|
|
| |
In particular it's been hard to find the point where we switch from
dumping pre-optimization QIR and post-optimization QIR.
|
|
|
|
|
|
| |
It's not doing anything according to shader-db now that we're using NIR.
It would have had to be reworked significantly anyway, to handle control
flow.
|
|
|
|
|
|
| |
We were generating piles of FRAG_W for interpolation, only to CSE them
away immediately. Since this is the only thing that CSE is doing for us
any more, just avoid making the CSE work necessary.
|
|
|
|
|
|
|
| |
This is one of two uses of the current QIR CSE pass according to
shader-db. The NIR pass means that we'll end up doing Newton-Raphson on
our RCP, which we weren't doing before, but that's probably actually a
good thing.
|
|
|
|
|
| |
Now that we're using NIR for our optimization, there's no need for this
tricky code.
|
|
|
|
|
|
|
|
|
| |
This matches the "foreach x in container" pattern found in many other
programming languages. Generated by the following regular expression:
s/nir_foreach_function(\([^,]*\),\s*\([^,]*\))/nir_foreach_function(\2, \1)/
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
|
|
|
|
|
|
|
|
|
|
|
| |
This matches the "foreach x in container" pattern found in many other
programming languages. Generated by the following regular expression:
s/nir_foreach_instr(\([^,]*\),\s*\([^,]*\))/nir_foreach_instr(\2, \1)/
and similar expressions for nir_foreach_instr_safe etc.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
|
|
|
|
|
|
|
| |
A later patch will add lower_flrp64 option to NIR.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
|
|
|
|
|
|
|
| |
Part of fixing piglit EXT_framebuffer_multisample/sample-coverage inverted
(there is also a bug with RCL tiled blits)
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
|
|
|
|
|
|
| |
There's no reason we couldn't do non-MSAA full resolution tile buffer
load/stores, but we would have claimed buffer overflow was being
attempted. Nothing does this currently.
|
|
|
|
|
| |
I noticed this as a problem with ET:QW traces emitting coverage code when
the framebuffer was supposed to be single sampled.
|
|
|
|
|
|
|
|
|
|
| |
This was a bug from the MSAA enabling. Tests for surfaces with
nr_samples==1 instead of 0 (generally GL renderbuffers) would incorrectly
fail out.
Fixes the ARB_framebuffer_sRGB piglit tests other than srgb_conformance.
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
|
|
|
|
|
|
|
|
|
| |
I had made the previous blit fix non-MSAA only because I was thinking
about how the hardware infers stride from the RENDERING_CONFIG packet.
However, I'm also inferring the stride for both MSAA src and dst in
vc4_render_cl.c from the width argument in the ioctl.
Fixes 15 EXT_framebuffer_multisample piglit tests.
|
|
|
|
|
|
|
|
|
| |
Even when begin_query succeeds, there can still be failures in query handling.
For example for radeon, additional buffers may have to be allocated when
queries span multiple command buffers.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
|
|
|
|
|
|
|
| |
Use PIPE_SWIZZLE_* everywhere.
Use X/Y/Z/W/0/1 instead of RED, GREEN, BLUE, ALPHA, ZERO, ONE.
The new enum is called pipe_swizzle.
Acked-by: Jose Fonseca <jfonseca@vmware.com>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
|
|
|
|
|
|
|
| |
Single-sampled texture miplevels > 1 are stored in POT-aligned areas, but
we only get one value to control the stride of the src and dst for single
sampled buffers. A RCL tile blit from level != 1 to level == 0 would
therefore load from the wrong stride.
|
|
|
|
| |
This was for TGSI, which we no longer have to deal with.
|
|
|
|
|
|
|
| |
We need to fix up the offset to point at the face of the cube. Fixes
piglit fbo-cubemap, copyteximage CUBE, and glean's fbo test.
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
|
|
|
|
|
|
| |
Fixes piglit mixed-immediate-and-vbo, and may significantly improve
performance of applications that store a 4-byte IB in the same VBO as
vertex data.
|
|
|
|
|
|
| |
Catches the cause of failure in
arb_vertex_buffer_object-mixed-immediate-and-vbo, I've had this class of
failure before, and it probably won't be the last time.
|
|
|
|
|
|
|
| |
If we're going to sample from or render to them at some particular size,
we'd better make sure that they actually are that size. Causes some tests
under simulation to generate appropriate error messages instead of
failures.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This code started out like the T case, iterating over utile offsets, but I
had partially switched it to iterating over pixel offsets. I hadn't
caught this before because it's unusual to do piecemeal uploads to small
textures.
Fixes bad text rendering in QT5 apps, which use a 256x16 glyph cache.
Also fixes 6 piglit tests related to glTexSubImage() and
glGetTexSubImage().
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
|
|
|
|
|
|
|
| |
The old version of the pass only worked on globals and locals and always
left inputs, outputs, uniforms, etc. alone.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
| |
Fixes rendering failures in glmark2's refract and
bump:render-mode=high-poly demos, and partially in its terrain demo.
|
|
|
|
|
|
| |
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
|
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
|
|
|
|
|
| |
This gives us one less set of special instruction generation cases, and
instead just the case for returning the correct register to read.
|
|
|
|
|
|
|
|
| |
This lets us write the Z directly from the FTOI for computed Z, and may
let us coalesce color writes in the future.
No change in my shader-db, but clearly drops an instruction in piglit's
early-z test.
|
|
|
|
|
| |
The separate declaration of the struct is not helping clarity, and I was
going to be writing a whole lot more of these in the upcoming patches.
|
| |
|
|
|
|
|
|
|
| |
It wasn't correctly flagged everywhere, and QPU generation now handles the
only remaining case that was paying attention to it.
No change on shader-db.
|
|
|
|
|
|
|
| |
Normal SFU writes couldn't have SF because they were marked as
multi_instruction, but tex_result and tlb_color_read weren't. This ended
up not being a problem according to anything in shader-db, but it seems
possible.
|
|
|
|
|
|
|
|
|
|
|
| |
There used to be multi-instruction operations that would use src[] twice,
which is why we couldn't do some optimizations on them. This is no longer
the case.
total instructions in shared programs: 77973 -> 77969 (-0.01%)
instructions in affected programs: 84 -> 80 (-4.76%)
total estimated cycles in shared programs: 234165 -> 234157 (-0.00%)
estimated cycles in affected programs: 92 -> 84 (-8.70%)
|
|
|
|
| |
This gets us better validation of our NIR transformations.
|
|
|
|
|
|
|
|
| |
I liked having all my NIR be scalar, but nir_validate() complains that the
intrinsic writes 4 components but the destination we set up was only 1
component. I could generate a new scalar variant, but it's a lot easier
to just leave it as a vec4. This doesn't hurt codegen since we GC unused
uniforms, and UCP dot products use all the components anyway.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We don't really suppor control flow yet, but it's a lot nicer to render
something and warn on stderr than to crash.
Fixes the following piglit tests:
- shaders/complex-loop-analysis-bug
- shaders/glsl-fs-discard-04
Converts the following piglit tests from crash to fail:
- shaders/glsl-fs-continue-inside-do-while
- shaders/glsl-fs-loop
- shaders/glsl-fs-loop-continue
- shaders/glsl-fs-loop-nested
- shaders/glsl-texcoord-array
- shaders/glsl-vs-continue-inside-do-while
- shaders/glsl-vs-loop
- shaders/glsl-vs-loop-continue
- shaders/glsl-vs-loop-nested
No piglit regressions.
v2 (Eric): Add stronger stderr warning.
Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
|
|
|
|
|
|
| |
We shouldn't have any NIR functions present since all GLSL functions get
inlined, but this would be a more informative error if it does happen.
Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ensure NIR control flow graph nodes that are unhandled in QIR
are reported with sufficient verbosity to aid debugging.
This improves piglit outputs, amongst other tools.
There are no other remaining uses of assert(0) as a blunt tool
within vc4.
Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
|
|
|
|
|
|
| |
Found with grep and inspection. Test compiled on RPi hw.
Assists any future effort to remove TGSI as an intermediate stage.
Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add PIPE_CAP to determine if the GL extension
'GL_ARB_framebuffer_no_attachments' shall be
supported.
The driver is required to support 'PIPE_FORMAT_NONE'
via its 'is_format_supported()' callback in order
to determine the MSAA modes the hardware supports so
that values requested from the application using
'GL_ARB_framebuffer_no_attachments' may be quantized
to what the hardware expects.
V.2:
Fix doc for a more detailed description of the PIPE_CAP
and the corresponding GL constant.
V.3:
Renamed and repurposed once again.
V.4:
Remove CAP from cap_mapping array.
[airlied: fix damaged whitespace]
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
|
|
|
|
|
|
| |
Found with grep and inspection. Test compiled on RPi hw.
Assists any future effort to remove TGSI as an intermediate stage.
Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2: Squash multiple commits addressing the new parameter in different
files so we don't break the build (Iago)
v3: Fix tgsi (Samuel)
v4: Fix nir_clone.c (Samuel)
v5: Fix vc4 and freedreno (Iago)
v6 (Sam)
- Fix build errors in nir_lower_indirect_derefs
- Use helper to get type size from nir_alu_type.
Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some opcodes need explicit bitsizes, and sometimes we need to use the
double version when constant folding.
v2: fix output type for u2f (Iago)
v3: do not change vecN opcodes to be float. The next commit will add
infrastructure to enable 64-bit integer constant folding so this is isn't
really necessary. Also, that created problems with source modifiers in
some cases (Iago)
v4 (Jason):
- do not change bcsel to work in terms of floats
- leave ldexp generic
Squashed changes to handle different bit sizes when constant
folding since otherwise we would break the build.
v2:
- Use the bit-size information from the opcode information if defined (Iago)
- Use helpers to get type size and base type of nir_alu_type enum (Sam)
- Do not fallback to sized types to guess bit-size information. (Jason)
Squashed changes in i965 and gallium/nir drivers to support sized types.
These functions should only see sized types, but we can't make that change
until we make sure that nir uses the sized versions in all the relevant places.
A later commit will address this.
Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that the field exists in the instruction, we can make discards less
special. As a bonus, that means that we should be able to merge some more
.sf instructions together when we get around to that.
This causes some scheduling changes, as it allows tlb_color_reads to be
delayed past the discard condition setup. Since the tlb_color_read ends
up later, this may mean performance improvements, but I haven't tested.
total instructions in shared programs: 78114 -> 78035 (-0.10%)
instructions in affected programs: 1922 -> 1843 (-4.11%)
total estimated cycles in shared programs: 234318 -> 234329 (0.00%)
estimated cycles in affected programs: 8200 -> 8211 (0.13%)
|
|
|
|
|
|
|
|
|
| |
The register allocator doesn't really do anything about the temp, so it
doesn't seem like it should matter. However, the scheduler would think
that a new def is being created.
This doesn't change anything yet, but it avoids a bunch of regressions in
the next commit.
|