external_mesa3d.git

	Commit message	Author	Age	Files	Lines
...
*	swr: [rasterizer core] implement depth bounds test	Tim Rowley	2016-10-11	6	-9/+101
	Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
*	swr: [rasterizer core] update/add formats	Tim Rowley	2016-10-11	7	-1705/+2592
	Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
*	swr: [rasterizer core] SwrStoreTiles api change	Tim Rowley	2016-10-11	7	-19/+27
	SwrStoreTiles now takes a mask of surfaces to store. Reduces overhead when storing multiple render targets. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
*	swr: [rasterizer scripts] add ENABLE_ASSERT_DIALOGS knob for windows	Tim Rowley	2016-10-11	1	-0/+8
	Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
*	swr: [rasterizer archrast] add mako template	Tim Rowley	2016-10-11	4	-2/+117
	Add template for generating code to save events to a file. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
*	swr: [rasterizer core] disable cull for rect_list	Tim Rowley	2016-10-11	1	-0/+8
	Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
*	swr: [rasterizer core] add support for "RAW" surface format	Tim Rowley	2016-10-11	2	-0/+29
	Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
*	swr: [rasterizer core] align Macrotile FIFO memory to SIMD size	Tim Rowley	2016-10-11	4	-9/+23
	Align and use streaming store instructions for BE fifo queues. Provides slightly faster enqueue and doesn't pollute the caches. Add appropriate memory fences to ensure streaming writes are globally visible. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
*	swr: [rasterizer common] remove threadviz code	Tim Rowley	2016-10-11	3	-94/+0
	Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
*	swr: [rasterizer memory] split load/store for compile speed	Tim Rowley	2016-10-11	17	-1836/+2290
	Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
*	i915g: fix incorrect gl_FragCoord value	Nicholas Bishop	2016-10-10	1	-1/+6
	On Intel Pineview M hardware, the i915 gallium driver doesn't output the correct gl_FragCoord. It seems to always have an X coord of 0.0 and a Y coord of the window's height in pixels, e.g. 600.0f or such.
	I believe this is a regression caused in part by this commit: afa035031ff9e0c07a2297d864e46c76f7bfff58
	The old behavior used the output at index zero, while the new behavior uses actual zeroes. In the case of gl_FragCoord the output at index zero happened to be the correct one, so the behavior appeared correct although the code already had a bug.
	Fixed by checking for I915_SEMANTIC_POS when setting up texCoords. If the generic_mapping is I915_SEMANTIC_POS, look for the TGSI_SEMANTIC_POSITION instead of a TGSI_SEMANTIC_GENERIC output.
	https://bugs.freedesktop.org/show_bug.cgi?id=97477
	Reviewed-by: Stéphane Marchesin <marcheu@chromium.org> Tested-by: Stéphane Marchesin <marcheu@chromium.org>
*	softpipe: Cap to 2 GB on 32 bits	Axel Davy	2016-10-10	1	-0/+6
	On 32-bit systems, application memory is quite limited. softpipe uses application memory. To help prevent memory exhaustion, limit reported memory availability to 2 GB. Some gallium nine apps check reported memory by allocating resources until memory is full. Gallium nine refuses allocations when 80% of the reported memory limit is used. This change helps some apps to start. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Roland Scheidegger <sroland@vmware.com>
*	llvmpipe: Cap to 2 GB on 32 bits	Axel Davy	2016-10-10	1	-0/+6
	On 32-bit systems, application memory is quite limited. llvmpipe uses application memory. To help prevent memory exhaustion, limit reported memory availability to 2 GB. Some gallium nine apps check reported memory by allocating resources until memory is full. Gallium nine refuses allocations when 80% of the reported memory limit is used. This change helps some apps to start. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Roland Scheidegger <sroland@vmware.com>
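The capping described in the softpipe and llvmpipe messages above can be sketched as follows. This is a minimal illustration, not the drivers' actual code: the function name `reported_memory_mb`, the `pointer_size` parameter, and the `CAP_MB` constant are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constant: 2 GB expressed in megabytes. */
#define CAP_MB 2048u

/* Hypothetical helper: returns the memory size (in MB) a software
 * rasterizer would report. On a 32-bit build (pointer_size == 4) the
 * value is clamped to 2 GB, because the driver allocates from the
 * application's own limited address space. */
static uint64_t
reported_memory_mb(uint64_t system_memory_bytes, unsigned pointer_size)
{
    assert(pointer_size == 4 || pointer_size == 8);

    uint64_t mb = system_memory_bytes >> 20; /* bytes -> megabytes */

    if (pointer_size == 4 && mb > CAP_MB)
        return CAP_MB;
    return mb;
}
```

A real driver would presumably pass `sizeof(void *)` for `pointer_size`; it is a parameter here only so both cases can be exercised on one machine.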
*	nvc0: fix valid range for shader buffers	Samuel Pitoiset	2016-10-10	3	-0/+3
	When offset != 0, the valid range was wrong because the second argument of util_range_add() is end, not size. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
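The start/end versus start/size mix-up named in this message can be illustrated with a simplified stand-in. The `struct range`, `range_init`, `range_add`, and `mark_buffer_valid` names below are hypothetical and only mimic the shape of a helper whose second argument is an end offset; they are not Mesa's implementation.

```c
#include <assert.h>

/* Simplified stand-in for a valid-range tracker (hypothetical). */
struct range {
    unsigned start;
    unsigned end;
};

/* An empty range has start > end. */
static void
range_init(struct range *r)
{
    r->start = ~0u;
    r->end = 0;
}

/* Grow the range so that it covers [start, end). Note that the last
 * argument is an END offset, not a size. */
static void
range_add(struct range *r, unsigned start, unsigned end)
{
    if (start < r->start)
        r->start = start;
    if (end > r->end)
        r->end = end;
}

/* Correct caller: converts (offset, size) to (start, end).
 * Passing `size` directly would only be right when offset == 0. */
static void
mark_buffer_valid(struct range *r, unsigned offset, unsigned size)
{
    range_add(r, offset, offset + size);
}
```

With `offset = 256` and `size = 64`, the buggy form `range_add(r, 256, 64)` would leave the end at 64, below the start, whereas the corrected call records the range 256..320.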
*	nvc0/ir: fix overwriting of value backing non-constant gather offset	Ilia Mirkin	2016-10-10	1	-2/+2
	Normally the value is an immediate, which is moved to some temporary, so there's no problem. In the case of a non-constant offset (as allowed by ARB_gpu_shader5), we have to take care to copy it first before using it to build up the bits. This fixes a compilation error observed in F1 2015. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: mesa-stable@lists.freedesktop.org
*	nv50/ir: only stick one preret per function	Ilia Mirkin	2016-10-10	1	-4/+7
	A function with multiple returns would have had multiple preret settings at the top of the function. While this is unlikely to have caused issues since we don't use functions in earnest, it could have in some cases overflowed the call stack, in case a function had a lot of early returns. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
*	radeonsi: make more use of si_have_tgsi_compute	Nicolai Hähnle	2016-10-10	1	-3/+1
	Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>
*	gallium/radeon: assign a name to LLVM output variables in debug builds	Nicolai Hähnle	2016-10-10	1	-1/+6
	This can be helpful with R600_DEBUG=preoptir. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>
*	gallium/radeon: avoid redundant work with overlapping in/out arrays	Nicolai Hähnle	2016-10-10	1	-1/+4
	Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>
*	radeonsi: support ARB_compute_variable_group_size	Nicolai Hähnle	2016-10-10	5	-17/+53
	Not sure if it's possible to avoid programming the block size twice (once for the userdata and once for the dispatch). Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>
*	gallium: add missing zero-init for resource templates	Rob Clark	2016-10-07	1	-0/+1
	Mostly test code, plus one spot I noticed in r600. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
*	freedreno: don't try to shadow layered textures	Rob Clark	2016-10-07	1	-0/+3
	We will only hit this with multi-planar YUV external images, so we would probably never hit this code path in the first place. But if we did, it wouldn't do the right thing so just bail. Signed-off-by: Rob Clark <robdclark@gmail.com>
*	freedreno/a3xx+a4xx: fix clip-plane lowering state	Rob Clark	2016-10-07	2	-0/+6
	If enabled clip-planes have changed, we need to mark program state dirty. Signed-off-by: Rob Clark <robdclark@gmail.com>
*	vc4: Don't worry about partial Z/S clear if the other is already cleared.	Eric Anholt	2016-10-06	1	-3/+7
	We have to be careful to not smash the value they're clearing to, but other than that we're fine. Avoids quad clears in Processing, which likes to do glClear(Z|S); glClear(Z). Improves performance of Processing's QuadRendering demo at 5000 quads by 5.46507% +/- 1.35576% (n=15 before, 32 after)
*	vc4: Try to fix the HW-2116 workaround.	Eric Anholt	2016-10-06	1	-9/+10
	We were incrementing the count at the end of vc4_start_draw(), except that that function returns immediately if we've already started drawing on this batch. It also failed to count the statechanges from the GFXH-515 workaround. This incidentally allows repeated glClear() to be coalesced, because the fast clears aren't counted in draw_calls_queued any more. Fixes most of the extra flushes in Processing, which emits glClear(Z|S); glClear(Z); glClear(C) during its frame setup. Improves performance of Processing's QuadRendering demo at 5000 quads by 3.33538% +/- 2.05846% (n=21 before, 15 after)
*	vc4: Drop dead argument from vc4_start_draw().	Eric Anholt	2016-10-06	1	-3/+3
*	vc4: Fix fallback to quad clears of depth in GLX.	Eric Anholt	2016-10-06	4	-25/+64
	The fix in the vc4-jobs series ended up triggering the fallback path on GLX apps that use depth but not stencil.
*	vc4: Add the format name in miptree_debug.	Eric Anholt	2016-10-06	1	-2/+4
	I was curious if my Z/S buffer was actually ZS or ZX, and the vc4 format of "0" didn't tell me much.
*	vc4: Fix perf debug formatting on partial Z/S clear.	Eric Anholt	2016-10-06	1	-1/+1
*	vc4: Drop destination register when it's unused.	Eric Anholt	2016-10-06	1	-1/+22
	This slightly reduces instructions on shader-db, but I think it's just perturbing register allocation -- the allocator should have always trivially colored these nodes, before. This commit is just to make QIR code failing more intelligible when register allocation fails.
*	vc4: Fix live intervals analysis for screening defs in if statements.	Eric Anholt	2016-10-06	3	-5/+20
	If a conditional assignment is only conditioned on the exec mask, that's still screening off the value in the executed channels (and, since we're not storing to the unexecuted channels, we don't care what's in there). Fixes a bunch of extra register pressure on Processing's Ribbons demo, which is failing to allocate.
*	vc4: Fix simulator when more than one vc4_screen is opened.	Eric Anholt	2016-10-06	3	-3/+39
	We would hit an assertion failure when setting up the simulator the second time around. This at least postpones the assertion failure until we've closed all of the first set of screens and started opening a new set.
*	vc4: Fix assertion fails from trying to cast non-ALU instrs to ALU.	Eric Anholt	2016-10-06	1	-0/+2
	Fixes 100 piglit tests since the assertions were added to nir.h. What's amazing is that these tests used to pass, even when casting garbage.
*	nv50/ir: fix wrong check when optimizing MAD to SHLADD	Samuel Pitoiset	2016-10-07	1	-1/+1
	Checking if MAD is supported is definitely wrong, and it's more likely a typo I introduced a few days ago which breaks NV50 because SHLADD is not supported there. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
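The rewrite this message refers to relies on the identity a * 2^n + c == (a << n) + c, and the bug was gating it on the wrong operation. A minimal sketch of such a check follows; the `mad_to_shladd` function and the `target_has_shladd` flag are hypothetical and do not reproduce the nv50/ir code.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when v has exactly one bit set. */
static bool
is_pow2(uint32_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/* Base-2 logarithm of a power of two. */
static unsigned
log2_pow2(uint32_t v)
{
    unsigned n = 0;
    while (v >>= 1)
        n++;
    return n;
}

/* Hypothetical check: can mad(a, imm, c) be rewritten as
 * shladd(a, shift, c)? The gate must test support for the operation
 * being EMITTED (SHLADD), not the one being replaced (MAD). */
static bool
mad_to_shladd(uint32_t imm, bool target_has_shladd, unsigned *shift)
{
    if (!target_has_shladd)
        return false;
    if (!is_pow2(imm))
        return false;

    *shift = log2_pow2(imm);
    return true;
}
```

For a multiplier of 8 the sketch yields a shift of 3, and `5 * 8 + 7` equals `(5 << 3) + 7`.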
*	nvc0: dump program binary only when NV50_PROG_DEBUG is set	Samuel Pitoiset	2016-10-07	1	-1/+1
	When the chipset is forced with NV50_PROG_CHIPSET, we actually only want to output the binary if NV50_PROG_DEBUG is also enabled. Otherwise, this pollutes the shader-db output. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
*	nvc0: expose ARB_compute_variable_group_size	Samuel Pitoiset	2016-10-07	1	-2/+6
	Only expose 512 threads/block on Fermi to not be limited by 32 GPRs/thread. v4: - use 512 threads on Fermi, 1024 on Kepler+ Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
*	nv50/ir: set number of threads/block for variable local size	Samuel Pitoiset	2016-10-07	1	-0/+2
	When a variable local size is defined as specified by ARB_compute_variable_group_size, the fixed local size is set to 0 and a SIGFPE occurs when we compute the maximum number of regs. This allows using 64 GPRs/thread. v4: - use 512 threads on Fermi, 1024 on Kepler+ Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
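The SIGFPE described here is an integer division by zero: the per-thread register budget is the register file size divided by the thread count, and the thread count is 0 when the local size is variable. A minimal sketch follows, assuming a hypothetical register file of 32768 entries (chosen only because it matches the 512-thread and 64-GPR figures in the message) and hypothetical function names.

```c
#include <assert.h>

/* Hypothetical: pick the thread count used for register budgeting.
 * A fixed local size of 0 means the size is only known at dispatch
 * time, so fall back to the maximum variable group size. */
static unsigned
effective_threads(unsigned fixed_local_size, unsigned max_variable_threads)
{
    return fixed_local_size ? fixed_local_size : max_variable_threads;
}

/* Registers available to each thread. Dividing by a thread count of 0
 * is what raised SIGFPE, so it is rejected up front here. */
static unsigned
max_gprs_per_thread(unsigned regfile_size, unsigned threads_per_block)
{
    assert(threads_per_block != 0);
    return regfile_size / threads_per_block;
}
```

Under these assumptions, a variable-size shader budgeted for 512 threads gets 64 registers per thread, while a fixed 1024-thread shader gets 32.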
*	gallium: add PIPE_COMPUTE_CAP_MAX_VARIABLE_THREADS_PER_BLOCK	Samuel Pitoiset	2016-10-07	5	-0/+9
	v3: - use a new case statement in r600_pipe_common.c - fix compilation of softpipe... Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
*	nv50/ir: optimize sub(a, 0) to a	Karol Herbst	2016-10-06	1	-0/+3
	helped some ue4 demos and divinity OS shaders
	total instructions in shared programs : 2818674 -> 2818606 (-0.00%)
	total gprs used in shared programs    : 379273 -> 379273 (0.00%)
	total local used in shared programs   : 9505 -> 9505 (0.00%)
	total bytes used in shared programs   : 25837792 -> 25837192 (-0.00%)
	          local   gpr  inst  bytes
	helped        0     0    33     33
	hurt          0     0     0      0
	Signed-off-by: Karol Herbst <karolherbst@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
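The algebraic simplification named in this commit's subject, sub(a, 0) to a, can be sketched on a toy instruction record. The `struct instr` layout and `fold_sub_zero` function are hypothetical and are not the nv50/ir data structures.

```c
#include <stdbool.h>

/* Toy opcode set (hypothetical). */
enum opcode { OP_MOV, OP_SUB };

/* Toy instruction: dst = op(src0_reg, src1), where src1 is either an
 * immediate value or a register index. */
struct instr {
    enum opcode op;
    int src0_reg;
    bool src1_is_imm;
    int src1;
};

/* Rewrite sub(a, 0) into mov(a). Returns true if the instruction
 * was changed. */
static bool
fold_sub_zero(struct instr *i)
{
    if (i->op == OP_SUB && i->src1_is_imm && i->src1 == 0) {
        i->op = OP_MOV; /* the second operand is now ignored */
        return true;
    }
    return false;
}
```

A later copy-propagation pass could then remove the resulting move entirely, which is one plausible way a small instruction-count reduction like the one reported would arise.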
*	nir: Make nir_foo_first/last_cf_node return a block instead	Jason Ekstrand	2016-10-06	2	-11/+6
	One of NIR's invariants is that control flow lists always start and end with blocks. There's no good reason why we should return a cf_node from these functions since we know that it's always a block. Making it a block lets us remove a bunch of code. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
*	svga: add svga_mksstats.h to the sources list	Emil Velikov	2016-10-06	1	-0/+1
	Otherwise it won't be picked in the tarball and the build will fail. Fixes: 0035f7f1365 ("svga: add guest statistic gathering interface") Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
*	nvc0: dump program binary when chipset has been forced	Samuel Pitoiset	2016-10-05	1	-0/+5
	Currently, program binaries are only dumped at upload time, but when the chipset has been forced via NV50_PROG_CHIPSET we might want to show the generated code, especially with shaderdb. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
*	radeonsi: fix texture border colors for compute shaders	Marek Olšák	2016-10-05	1	-0/+12
	There are VM faults without this. Cc: 12.0 <mesa-stable@lists.freedesktop.org> Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
*	radeonsi: fix interpolateAt opcodes for .zw components	Marek Olšák	2016-10-05	1	-1/+1
	Not returning garbage in .zw seems pretty important. This fixes: GL45-CTS.shader_multisample_interpolation.render.interpolate_at__check. Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
*	radeonsi: add assertions to validate interpolation flags	Marek Olšák	2016-10-05	1	-0/+34
	Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
*	radeonsi: interpolate colors after interpolation weight shuffling	Marek Olšák	2016-10-05	1	-48/+48
	Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
*	ddebug: dump most driver information with GALLIUM_DDEBUG=always	Marek Olšák	2016-10-05	1	-1/+5
	Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
*	nv50/ra: let simplify return an error and handle that	Karol Herbst	2016-10-05	1	-5/+7
	Fixes a crash in the case where simplify reports an error. Signed-off-by: Karol Herbst <karolherbst@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
*	virgl: Fix build regression of commit 8a943564	Nicolai Hähnle	2016-10-05	1	-1/+1
*	gallium/radeon: implement set_device_reset_callback	Nicolai Hähnle	2016-10-05	4	-0/+40
	Check for device reset on flush. It would be nicer if the kernel just reported this as an error on the submit ioctl (and similarly for fences), but this will do for now. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>