| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
so that we don't abuse PIPE_BIND_VERTEX_BUFFER all the time.
|
| |
|
| |
|
|
|
|
|
| |
We ask for Hyper-Z access when clearing a zbuffer.
We release it if no zbuffer clear has been done for 2 seconds.
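The acquire/release policy above can be sketched as a small timestamp check. This is only an illustration: the struct, the function names, and the microsecond clock are assumptions, not the driver's actual code, and a real driver would have to ask the kernel for access (which may be refused).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state; names are illustrative, not taken from r300g. */
struct hyperz_state {
    bool owned;               /* do we currently hold Hyper-Z access? */
    int64_t last_zclear_usec; /* time of the most recent zbuffer clear */
};

/* 2 seconds, as stated in the commit message. */
#define HYPERZ_IDLE_TIMEOUT_USEC (2 * 1000 * 1000)

/* Called when a zbuffer is cleared: take access and remember the time.
 * Assumption: the request always succeeds in this sketch. */
static void hyperz_on_zclear(struct hyperz_state *hz, int64_t now_usec)
{
    hz->owned = true;
    hz->last_zclear_usec = now_usec;
}

/* Called periodically (for example at flush time). Releases access if no
 * zbuffer clear happened for more than 2 seconds. Returns true on release. */
static bool hyperz_maybe_release(struct hyperz_state *hz, int64_t now_usec)
{
    if (hz->owned &&
        now_usec - hz->last_zclear_usec > HYPERZ_IDLE_TIMEOUT_USEC) {
        hz->owned = false;
        return true;
    }
    return false;
}
```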
|
|
|
|
|
| |
Renaming a few files, types, and functions.
Also make the winsys independent of r300g.
|
|
|
|
|
|
|
|
|
|
| |
The drivers have been changed so that they behave as if all of the flags
were set. This is already implicit in most hardware drivers and required
for multiple contexts.
Some state trackers were also abusing the PIPE_FLUSH_RENDER_CACHE flag
to decide whether flush_frontbuffer should be called.
New flag ST_FLUSH_FRONT has been added to st_api.h as a replacement.
|
|
|
|
| |
Thread offloading is sometimes not desirable, e.g. when mapping a buffer.
|
|
|
|
| |
Running any older kernel is not recommended anyway.
|
|
|
|
|
| |
This was getting hard to maintain and didn't bring any real benefits.
Instead, validate buffers when the vertex array state is dirty.
|
|
|
|
|
|
|
|
|
|
| |
This is a multi-threading optimization which hides the kernel overhead
behind a thread. It improves performance in CPU-limited apps by 2-15%.
Of course you must have at least 2 cores for it to make any difference.
It can be disabled with:
export RADEON_THREAD=0
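A minimal sketch of how such a switch could be read, assuming the feature is on by default and only an explicit "0" disables it. The helper names are hypothetical; the driver may parse the variable differently.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: decide from the raw variable value.
 * NULL (unset) means enabled; only the string "0" disables. */
static bool radeon_thread_enabled(const char *value)
{
    return value == NULL || strcmp(value, "0") != 0;
}

/* Hypothetical wrapper that reads the environment variable named in the
 * commit message. */
static bool radeon_thread_enabled_from_env(void)
{
    return radeon_thread_enabled(getenv("RADEON_THREAD"));
}
```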
|
|
|
|
| |
So finally we have them.
|
|
|
|
| |
The winsys takes care of flushing automatically.
|
|
|
|
|
| |
I thought I couldn't skip emitting this packet in some cases.
Well it looks like I can.
|
|
|
|
|
| |
We don't have to unmap and recreate the upload buffer when a flush occurs.
This should also prevent buffer allocations from failing.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The motivation behind this rework is to get some speed by reducing
CPU overhead. The performance increase depends on many factors,
but it's measurable (I think it's about a 10% increase in Torcs).
This commit replaces libdrm's radeon_cs_gem with our own implementation.
It's optimized specifically for r300g, but r600g could use it as well.
Reloc writes and space checking are faster and simpler than their
counterparts in libdrm (the time complexity of all the functions
is O(1) in nearly all scenarios, thanks to hashing).
(libdrm's radeon_bo_gem is still being used in the driver.)
It works like this:
cs_add_reloc(cs, buf, read_domain, write_domain) adds a new relocation and
also adds the size of 'buf' to the used_gart and used_vram winsys variables
based on the domains, which are simply or'd for the accounting purposes.
The adding is skipped if the reloc is already present in the list, but it
still accounts for any newly-referenced domains.
cs_validate is then called, which just checks:
used_vram/gart < vram/gart_size * 0.8
The 0.8 number allows for some memory fragmentation. If the validation
fails, the pipe driver flushes CS and tries to do the validation again,
i.e. it validates only that one operation. If it fails again, it drops
the operation on the floor and prints some nasty message to stderr.
cs_write_reloc(cs, buf) just writes a reloc that has been added using
cs_add_reloc. The read_domain and write_domain parameters have been removed,
because we already specify them in cs_add_reloc.
The space checking has been tested by putting small values in vram/gart_size
variables.
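The accounting and the 0.8 check described above can be sketched as follows. This is a simplified illustration, not the winsys code: the struct layouts, the domain constants, and the fixed-size table are assumptions, and a linear search stands in for the hashing the message mentions.

```c
#include <stdbool.h>
#include <stdint.h>

/* All types and constants here are hypothetical stand-ins. */
enum { DOMAIN_GTT = 1u << 0, DOMAIN_VRAM = 1u << 1 };

#define MAX_RELOCS 64

struct buf { uint64_t size; };

struct reloc { const struct buf *buf; unsigned domains; };

struct cs {
    struct reloc relocs[MAX_RELOCS];
    unsigned num_relocs;
    uint64_t used_gart, used_vram; /* bytes referenced per domain */
    uint64_t gart_size, vram_size; /* total bytes available per domain */
};

/* Charge the buffer size to every domain set in 'domains'. */
static void account(struct cs *cs, const struct buf *buf, unsigned domains)
{
    if (domains & DOMAIN_GTT)
        cs->used_gart += buf->size;
    if (domains & DOMAIN_VRAM)
        cs->used_vram += buf->size;
}

/* Add a relocation. Read and write domains are or'd for accounting.
 * If the buffer is already listed, only newly referenced domains are
 * charged. Returns false if the (sketch-only) table is full. */
static bool cs_add_reloc(struct cs *cs, const struct buf *buf,
                         unsigned read_domain, unsigned write_domain)
{
    unsigned domains = read_domain | write_domain;

    for (unsigned i = 0; i < cs->num_relocs; i++) {
        if (cs->relocs[i].buf == buf) {
            unsigned added = domains & ~cs->relocs[i].domains;
            account(cs, buf, added);
            cs->relocs[i].domains |= added;
            return true;
        }
    }
    if (cs->num_relocs == MAX_RELOCS)
        return false;
    cs->relocs[cs->num_relocs].buf = buf;
    cs->relocs[cs->num_relocs].domains = domains;
    cs->num_relocs++;
    account(cs, buf, domains);
    return true;
}

/* The check from the message: usage must stay below 80% of each pool. */
static bool cs_validate(const struct cs *cs)
{
    return cs->used_gart < cs->gart_size * 0.8 &&
           cs->used_vram < cs->vram_size * 0.8;
}
```

A caller would add the relocs for one operation, call cs_validate, and on failure flush and retry once before dropping the operation, as the message describes.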
|
|
|
|
| |
because the upload buffers are reused for subsequent draw operations.
|
|
|
|
| |
This prevents needless buffer validation (CS space checking).
|
|
|
|
| |
This also removes DBG_STATS (the stats can be obtained with valgrind instead).
|
|
|
|
| |
.. instead of calling r500_index_bias_supported(..) every draw call.
|
|
|
|
| |
https://bugs.freedesktop.org/show_bug.cgi?id=29901
|
| |
|
|
|
|
|
|
|
|
|
| |
The current code reuses the same vbo over and over; however, after a flush
we'd stall and wait for mapping on the vbo when we should just fire and forget.
On a gears test this brings me from ~620 to ~750 on my rv530 in swtcl mode.
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This implements fast Z clear, Z compression, and HiZ support for r300->r500
GPUs.
It also allows cbzb clears when fast Z clears are being used for the ZB.
It requires a kernel with hyper-z support.
Thanks to Marek Olšák <maraeo@gmail.com>, who started this off, and Alex Deucher at AMD for providing lots of hints.
v2:
squashed zmask ram size fix
squashed r300g/blitter: fix Z readback when compressed
v3:
rebase around texture changes in master - .1 fix more bits
v4:
migrated to using u_mm in r300_texture to manage hiz/zmask rams consistently
disabled HiZ when using OQ
flush z-cache before turning hyper-z off
update hyper-z state on dsa state change
store depthclearvalue across cbzb clears and replace it afterwards.
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
|
|
|
|
|
| |
The Draw flush inside r300_flush was the culprit.
Also, no need to flush Draw when changing a state since the flush is
already inside swtcl_draw_vbo.
|
|
|
|
| |
fix vb/ib uploads
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
An initial implementation made by Dave Airlie.
For it to be used, a color-only clear must be invoked and exactly one
point-sampled render target must be set. The render target must be
macrotiled (for us to overcome alignment issues) and bpp must be either
16 or 32.
I can't see a difference in performance. :(
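The conditions listed above can be written as a single predicate. This is a sketch only: the struct, its fields, and the function name are hypothetical, and the real driver may check these properties in a different place or form.

```c
#include <stdbool.h>

/* Hypothetical description of a clear request; field names are
 * illustrative and not taken from r300g. */
struct clear_request {
    bool clears_color;
    bool clears_depth_stencil;
    unsigned num_render_targets;
    bool point_sampled;
    bool macrotiled;
    unsigned bpp; /* bits per pixel of the render target */
};

/* True only if every condition from the commit message holds:
 * color-only clear, exactly one point-sampled render target,
 * macrotiled, and 16 or 32 bpp. */
static bool fast_clear_eligible(const struct clear_request *r)
{
    return r->clears_color &&
           !r->clears_depth_stencil &&
           r->num_render_targets == 1 &&
           r->point_sampled &&
           r->macrotiled &&
           (r->bpp == 16 || r->bpp == 32);
}
```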
Conflicts:
src/gallium/drivers/r300/r300_blit.c
|
|
|
|
| |
FDO bug #28563.
|
|
|
|
| |
The fewer macros, the better.
|
| |
|
| |
|
| |
|
|
|
|
| |
See comments in the code.
|
| |
|
| |
|
|
|
|
|
|
| |
It seems to me that the resource management code is the most blurry.
Also some of my notes about fastfill are wrong, removing them too.
|
| |
|
| |
|
| |
|
|
|
|
| |
Now SWTCL renders at least "something", we're close...
|
| |
|
| |
|
|\
| |
| |
| |
| |
| |
| | |
Conflicts:
src/gallium/auxiliary/pipebuffer/pb_buffer_fenced.c
src/gallium/auxiliary/util/Makefile
src/gallium/drivers/r300/r300_state_derived.c
|
| | |
|
|/
|
|
| |
It fixes almost all regressions introduced lately.
|
|
|
|
| |
Split from the fastpath WIP.
|
| |
|
| |
|
|
|
|
|
|
|
|
| |
Add support for begin/end in each CS so we don't get any other
processes rendering in between.
TODO:
blame other parts of driver for this not working like Z.
|