external_mesa3d.git

	Commit message	Author	Age	Files	Lines
*	nv50/ir: teach insnCanLoad() about SHLADD	Samuel Pitoiset	2016-09-29	1	-0/+2
	Commutativity is not allowed with SHLADD, but src2 can accept loads. To allow the load propagation pass to do its job, add a special case like for SUCLAMP because src1 is always an immediate. This IMAD to SHLADD optimization helps a bunch of shaders from Tomb Raider, Victor Vran, UE4 demos (+15% perf with Elemental) and Shadow Warrior.
	GF100/GK104:
	total instructions in shared programs :2838045 -> 2834712 (-0.12%)
	total gprs used in shared programs :396684 -> 396386 (-0.08%)
	total local used in shared programs :34416 -> 34416 (0.00%)
	        local  gpr  inst  bytes
	helped      0  326  1105   1105
	hurt        0   55     3      3
	Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
*	nv50/ir: add preliminary support for SHLADD	Samuel Pitoiset	2016-09-29	1	-2/+9
	This instruction has been available since SM20 (Fermi) and allows computing (a << b) + c in one shot. In some situations, IMAD (a * b + c) should be replaced by SHLADD when b is a power of 2, and ADD+SHL should be replaced by SHLADD as well. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
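The equivalence behind the SHLADD commit above can be sketched in a few lines. This is only an illustration: `shladd`, `imad` and `log2IfPow2` are hypothetical helper names, not functions from the Mesa codegen, and the sketch models the arithmetic only, not the IR rewrite itself.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; none of these exist in Mesa.

// Semantics given in the commit message: (a << b) + c.
static uint32_t shladd(uint32_t a, uint32_t b, uint32_t c)
{
    return (a << b) + c;
}

// Integer multiply-add: a * b + c.
static uint32_t imad(uint32_t a, uint32_t b, uint32_t c)
{
    return a * b + c;
}

// Returns log2(v) when v is a power of two, or -1 otherwise.
static int log2IfPow2(uint32_t v)
{
    if (v == 0 || (v & (v - 1)) != 0)
        return -1;
    int shift = 0;
    while ((v >> shift) != 1u)
        ++shift;
    return shift;
}
```

When `log2IfPow2(b)` returns a non-negative shift `s`, `imad(a, b, c)` and `shladd(a, s, c)` give the same result, which is the condition under which the replacement is valid.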
*	nvc0/ir: fix comments about instructions info	Samuel Pitoiset	2016-09-17	1	-2/+3
	The comment for the commutative flags was wrong because OP_MUL is before OP_MAD. While we are at it, add missing opcodes, and fix the comment about the short forms. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
*	nvc0/ir: allow min/max instructions to be dual-issued in pairs	Karol Herbst	2016-09-03	1	-2/+12
	Changes for GpuTest /test=pixmark_piano /benchmark /no_scorebox /msaa=0 /benchmark_duration_ms=60000 /width=1024 /height=640:
	inst_executed: 1.03G
	inst_issued1: 614M -> 580M
	inst_issued2: 213M -> 230M
	score: 1021 -> 1030
	Signed-off-by: Karol Herbst <karolherbst@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
*	nvc0/ir: don't dual-issue ops that depend or interfere with each other	Karol Herbst	2016-09-03	1	-0/+6
	Signed-off-by: Karol Herbst <karolherbst@gmail.com> Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> [imirkin: rewrite to split up the helpers and move more logic to target] Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
*	nouveau: Add support for SV_WORK_DIM	Hans de Goede	2016-07-02	1	-0/+1
	Add support for SV_WORK_DIM for nvc0 and nve4. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
*	nvc0/ir: limit max number of regs based on availability in SM	Ilia Mirkin	2016-05-30	1	-1/+3
	This effectively limits registers to 32 and 64 for Fermi and Kepler, respectively, when 1024 threads are used, but allows the full amount to be used with smaller thread sizes. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
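The limit described in the commit above amounts to dividing the SM's register file among the threads and capping the result. The sketch below is an assumption-laden illustration, not the Mesa implementation: `maxRegsPerThread` is a hypothetical name, the register file sizes are simply the values implied by the commit's figures (32 * 1024 and 64 * 1024), and the per-thread encoding cap is a caller-supplied assumption.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical sketch; names and parameters are assumptions, not Mesa code.
//
// regFileSize: registers available in one SM. The commit's figures imply
//              32 * 1024 for Fermi and 64 * 1024 for Kepler.
// threads:     threads that must be resident at once (must be non-zero).
// isaLimit:    assumed per-thread cap imposed by the instruction encoding.
static uint32_t maxRegsPerThread(uint32_t regFileSize, uint32_t threads,
                                 uint32_t isaLimit)
{
    // Each thread gets an equal share of the register file, but never more
    // than the encoding allows.
    return std::min(isaLimit, regFileSize / threads);
}
```

With 1024 threads the share of the register file is the binding constraint (32 or 64 registers); with fewer threads the assumed encoding cap takes over, which matches the "full amount" wording in the commit.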
*	nv50/ir: fix a comment in canDualIssue()	Samuel Pitoiset	2016-05-21	1	-1/+1
	Trivial. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
*	nouveau: codegen: Use FILE_MEMORY_BUFFER for buffers	Hans de Goede	2016-04-20	1	-0/+1
	Some of the lowering steps we currently do for FILE_MEMORY_GLOBAL only apply to buffers, making it impossible to use FILE_MEMORY_GLOBAL for OpenCL global buffers. This commit changes the buffer code to use FILE_MEMORY_BUFFER at the ir_from_tgsi and lowering steps, freeing use of FILE_MEMORY_GLOBAL for use with OpenCL global buffers. Note that after lowering, buffer accesses use the FILE_MEMORY_GLOBAL register file.
	Tested with piglit on a gf119 and a gk107:
	./piglit run -o shader -t '.arb_shader_storage_buffer_object.' results/shader
	[9/9] pass: 9 /
	./piglit run -o shader -t '.arb_compute_shader.' results/shader
	[20/20] skip: 4, pass: 16 |
	Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
*	nvc0/ir: be careful about propagating very large offsets into const load	Ilia Mirkin	2016-01-14	1	-0/+10
	Indirect constbuf indexing works by using very large offsets. However, if an indirect constbuf index load is const-propagated, it becomes a very large const offset. Take that into account when legalizing the SSA by moving the high parts of that offset into the file index. Also disallow very large (or small) indices on most other instructions. This fixes regressions in ubo_array_indexing/*-two-arrays piglit tests. Fixes: abd326e81b (nv50/ir: propagate indirect loads into instructions) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
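"Moving the high parts of that offset into the file index" can be pictured as a simple bit split. The following is a rough sketch under stated assumptions, not the code from the commit: the `ConstRef` struct and `legalizeConstOffset` are hypothetical, and the 16-bit width of the in-buffer offset is an assumption made for illustration.

```cpp
#include <cstdint>

// Hypothetical model of a constant-buffer reference; not a Mesa type.
struct ConstRef {
    uint32_t fileIndex; // which constant buffer is addressed
    uint32_t offset;    // byte offset within that buffer
};

// Assumed width of the offset field; chosen for illustration only.
static const uint32_t kOffsetBits = 16;

// Fold the bits of the offset that do not fit into the offset field
// into the buffer index, keeping only the low bits as the offset.
static ConstRef legalizeConstOffset(ConstRef ref)
{
    ref.fileIndex += ref.offset >> kOffsetBits;
    ref.offset &= (1u << kOffsetBits) - 1u;
    return ref;
}
```

Under these assumptions, a reference to buffer 0 at offset `0x30010` becomes buffer 3 at offset `0x10`, while offsets that already fit are left unchanged.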
*	nvc0: add ARB_shader_draw_parameters support	Ilia Mirkin	2015-12-30	1	-0/+3
	Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
*	nvc0/ir: fix up mul+add -> mad algebraic opt, enable for integers	Ilia Mirkin	2015-12-07	1	-2/+0
	For some reason this has been disabled for integers ever since codegen was merged, despite there being emission code for IMAD. Seems to work. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
*	nv50/ir: avoid looking at uninitialized srcMods entries	Ilia Mirkin	2015-12-03	1	-1/+1
	Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
*	nvc0/ir: Teach insnCanLoad about double immediates	Hans de Goede	2015-11-06	1	-6/+19
	Teach insnCanLoad about double immediates. Together with the "Add support for merge-s to the ConstantFolding pass" commit, this turns the following (nvc0) code:
	1: mov u32 $r2 0x00000000 (8)
	2: mov u32 $r3 0x3fe00000 (8)
	3: add f64 $r0d $r0d $r2d (8)
	Into:
	1: add f64 $r0d $r0d 0.500000 (8)
	Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
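The example in the commit above relies on the two 32-bit moves forming the low and high halves of one IEEE 754 double. The sketch below shows why `0x00000000` and `0x3fe00000` merge into `0.5`; `mergeToDouble` is a hypothetical helper written for this illustration and is not part of the ConstantFolding pass.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper, not Mesa code: combine the low and high 32-bit
// halves of a register pair into the double they represent.
static double mergeToDouble(uint32_t lo, uint32_t hi)
{
    static_assert(sizeof(double) == sizeof(uint64_t),
                  "sketch assumes a 64-bit double");
    const uint64_t bits = (static_cast<uint64_t>(hi) << 32) | lo;
    double value;
    // memcpy is the portable way to reinterpret the bit pattern.
    std::memcpy(&value, &bits, sizeof(value));
    return value;
}
```

Here `mergeToDouble(0x00000000, 0x3fe00000)` yields `0.5`, the immediate that appears in the folded `add f64` instruction.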
*	nvc0/ir: tess factors are now sysvals, adapt codegen to expect that	Ilia Mirkin	2015-07-23	1	-1/+2
	Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
*	nvc0/ir: no instruction can load a double immediate	Ilia Mirkin	2015-02-20	1	-0/+2
	Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
*	nv50/ir: avoid array overrun when checking for supported mods	Ilia Mirkin	2014-09-08	1	-1/+1
	Reported by Coverity. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
*	nvc0/ir: use SM35 ISA with GK20A	Alexandre Courbot	2014-05-27	1	-5/+10
	GK20A is mostly compatible with GK104, but uses the SM35 ISA. Use the GK110 path when this chip is detected. Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
*	nvc0: maxwell isa has no per-instruction join modifier	Ben Skeggs	2014-05-15	1	-1/+2
	Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
*	nvc0: allow for easier modification of compiler library routines	Ben Skeggs	2014-05-15	1	-12/+12
	Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
*	nvc0/ir: add support for new bitfield manipulation opcodes	Ilia Mirkin	2014-04-28	1	-1/+6
	This adds support for IBFE, UBFE, BFI, LSB, IMSB, UMSB, BREV and POPC, which are all required for ARB_gs5 support. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
*	nvc0/ir: add support for SAMPLEMASK sysval	Ilia Mirkin	2014-04-26	1	-0/+1
	Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
*	nvc0: add support for PIPE_CAP_SAMPLE_SHADING	Ilia Mirkin	2014-04-26	1	-0/+2
	Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
*	nv50/ir/gk110: add 64/128-bit fetch/export support	Ilia Mirkin	2014-03-18	1	-7/+0
	Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
*	nvc0: fixup gk110 and up not being listed in various switch statements	Ben Skeggs	2013-12-06	1	-4/+8
	Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
*	Move nv30, nv50 and nvc0 to nouveau.	Johannes Obermayr	2013-09-11	1	-0/+604
	It is planned to ship openSUSE 13.1 with -shared libs. nouveau.la, nv30.la, nv50.la and nvc0.la are currently LIBADDs in all nouveau related targets. This change makes it possible to easily build one shared libnouveau.so which is then LIBADDed. Also dlopen will be faster for one library instead of three and build time on -jX will be reduced. Whitespace fixes were requested by 'git am'. Signed-off-by: Johannes Obermayr <johannesobermayr@gmx.de> Acked-by: Christoph Bumiller <christoph.bumiller@speed.at> Acked-by: Ian Romanick <ian.d.romanick@intel.com>