external_mesa3d.git

	Commit message	Author	Age	Files	Lines
...
*	i965/vec4: Move can_do_writemask to vec4_instruction	Jason Ekstrand	2016-04-15	3	-30/+30
\| \| \| \|	Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965/surface_formats: Update some formats for more recent gens	Jason Ekstrand	2016-04-15	1	-12/+12
\| \| \| \| \| \| \| \|	The surface format table hasn't entirely been kept up-to-date. This commit marks a couple more compressed formats as sampleable on gen8+ and adds the A4B4G4R4 format as renderable on gen9. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965: Expose the surface format table	Jason Ekstrand	2016-04-14	3	-18/+48
\| \| \| \|	Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	dri: Fix robust context creation via EGL attribute	Chad Versace	2016-04-14	1	-2/+23
\| \| \| \| \| \| \| \| \| \| \|	driCreateContextAttribs() emits an error if bit __DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS is set for an ES context. But, EGL_EXT_create_context_robustness and EGL 1.5 both allow creation of robust ES contexts. One requests a robust ES context by setting the EGL_CONTEXT_OPENGL_ROBUST_ACCESS attribute, which Mesa's EGL layer translates into the __DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS bit. Reviewed-by: Marek Olšák <marek.olsak@amd.com>
*	i965: Push everything if pull_param == NULL	Jason Ekstrand	2016-04-14	2	-2/+14
\| \| \| \|	Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965/fs: Push small uniform arrays	Jason Ekstrand	2016-04-14	1	-23/+53
\| \| \| \| \| \| \| \| \| \| \| \| \|	Unfortunately, this also means that we need to use a slightly different algorithm for assign_constant_locations. The old algorithm worked based on the assumption that each read of a uniform value read exactly one float. If it encountered a MOV_INDIRECT, it would immediately bail and push the whole thing. Since we can now read ranges using MOV_INDIRECT, we need to be able to push a series of floats without breaking them up. To do this, we use an algorithm similar to the one in split_virtual_grfs. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965/fs: Rename demote_pull_constants to lower_constant_loads	Jason Ekstrand	2016-04-14	2	-3/+3
\| \| \| \| \|	Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965/vec4: Get rid of the uniform_size array	Jason Ekstrand	2016-04-14	6	-33/+0
\| \| \| \| \|	Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965/vec4: Use MOV_INDIRECT instead of reladdr for indirect push constants	Jason Ekstrand	2016-04-14	4	-51/+50
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit moves us to an instruction based model rather than a register-based model for indirects. This is more accurate anyway as we have to emit instructions to resolve the reladdr. It's also a lot simpler because it gets rid of the recursive reladdr problem by design. One side-effect of this is that we need a whole new algorithm in move_uniform_array_access_to_pull_constants. This new algorithm is much more straightforward than the old one and is fairly similar to what we're already doing in the FS backend. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965/fs: Get rid of the param_size array	Jason Ekstrand	2016-04-14	4	-15/+0
\| \| \| \| \|	Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965/fs: Stop relying on param_size in assign_constant_locations	Jason Ekstrand	2016-04-14	1	-27/+17
\| \| \| \| \| \| \| \| \| \| \| \| \|	Now that we have MOV_INDIRECT opcodes, we have all of the size information we need directly in the opcode. With a little restructuring of the algorithm used in assign_constant_locations we don't need param_size anymore. The big thing to watch out for now, however, is that you can have two ranges overlap where neither contains the other. In order to deal with this, we make the first pass just flag what needs pulling and defer assigning pull constant locations until later. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965/fs: Get rid of reladdr	Jason Ekstrand	2016-04-14	2	-10/+2
\| \| \| \| \| \| \|	We aren't using it anymore. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965/fs: Use MOV_INDIRECT for all indirect uniform loads	Jason Ekstrand	2016-04-14	2	-40/+87
\| \| \| \| \| \| \| \| \| \| \|	Instead of using reladdr, this commit changes the FS backend to emit a MOV_INDIRECT whenever we need an indirect uniform load. We also have to rework some of the other bits of the backend to handle this new form of uniform load. The obvious change is that demote_pull_constants now acts more like a lowering pass when it hits a MOV_INDIRECT. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965/fs: Add support for MOV_INDIRECT on pre-Broadwell hardware	Jason Ekstrand	2016-04-14	2	-13/+66
\| \| \| \| \| \| \| \| \| \|	While we're at it, we also add support for the possibility that the indirect is, in fact, a constant. This shouldn't happen in the common case (if it does, that means NIR failed to constant-fold something), but it's possible so we should handle it. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965/fs: Fix regs_read() for MOV_INDIRECT with a non-zero subnr	Jason Ekstrand	2016-04-14	1	-1/+1
\| \| \| \| \| \| \|	The subnr field is in bytes so we don't need to multiply by type_sz. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
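As a rough illustration of the unit issue described above, the sketch below counts how many registers a read spans when the starting sub-register offset is already expressed in bytes. The helper name `regs_spanned`, its parameters, and the 32-byte `REG_SIZE` constant are assumptions made for this example; it is not the driver's regs_read() code.

```c
#include <assert.h>

/* Assumed register size in bytes for this sketch. */
#define REG_SIZE 32u

/*
 * Hypothetical helper: number of whole registers touched by a read of
 * read_size_bytes that starts subnr_bytes into the first register.
 * The offset is already in bytes, so it is added as-is and not scaled
 * by the element type size.
 */
static unsigned regs_spanned(unsigned subnr_bytes, unsigned read_size_bytes)
{
    assert(subnr_bytes < REG_SIZE);
    assert(read_size_bytes > 0);
    return (subnr_bytes + read_size_bytes + REG_SIZE - 1) / REG_SIZE;
}
```

With a 4-byte starting offset, a 32-byte read crosses into a second register; multiplying that offset by a type size a second time would overcount the registers read.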
*	i965/fs: Don't force MASK_DISABLE on INDIRECT_MOV instructions	Jason Ekstrand	2016-04-14	1	-1/+0
\| \| \| \| \| \| \|	It should work fine without it and the visitor can set it if it wants. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965/fs: Add support for doing MOV_INDIRECT on uniforms	Jason Ekstrand	2016-04-14	1	-1/+4
\| \| \| \| \|	Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965: Make intel_get_param return an int	Ben Widawsky	2016-04-14	1	-10/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This will fix the spurious error message: "Failed to query GPU properties." that was unintentionally added in cc01b63d730. This patch changes the function to return an int so that the caller is able to act on the return value. The equivalent of this patch was in the original series that fixed up the warning, but I dropped it at the last moment. It is required to get the desired behavior of not warning when trying to query GPU properties from the kernel unless there is something the user can do about it. v2: Use strerror (Jason); make EINVAL check similar in all places (Ian). NOTE: Broadwell appears to actually have some issue where the kernel returns ENODEV when it shouldn't. I will investigate this separately. Reported-by: Chris Forbes <chrisf@ijw.co.nz> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
*	i965/vec4: Use UD rather than D for uniform indirects	Jason Ekstrand	2016-04-14	2	-6/+6
\| \| \| \|	Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965/fs: Use UD type for offsets in VARYING_PULL_CONSTANT_LOAD	Jason Ekstrand	2016-04-14	2	-3/+4
\| \| \| \| \|	Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	nir/dead_variables: Configurably work with any variable mode	Jason Ekstrand	2016-04-13	1	-1/+1
\| \| \| \| \| \| \|	The old version of the pass only worked on globals and locals and always left inputs, outputs, uniforms, etc. alone. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965: Switch to NIR for ldexp lowering.	Kenneth Graunke	2016-04-13	2	-2/+1
\| \| \| \| \| \| \| \| \| \| \|	The old GLSL IR based lowering doesn't quite work right in all cases, and fails several dEQP-GLES31 and Vulkan CTS tests. Jason's new approach in NIR passes all the tests. There's not likely to be a ton of advantage to lowering early in GLSL IR anyway, so...switch. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>
*	i965: Implement the new imod and irem opcodes	Jason Ekstrand	2016-04-13	2	-0/+72
\| \| \| \|	Reviewed-by: Matt Turner <mattst88@gmail.com>
*	i965/vec4: Inline get_pull_constant_offset	Jason Ekstrand	2016-04-13	2	-25/+14
\| \| \| \| \| \| \|	It's not really doing enough anymore to justify a helper function. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
*	i965/tiled_memcpy: Fix rgba8_copy_16_aligned_dst() typo	Kristian Høgsberg Kristensen	2016-04-12	1	-4/+4
\| \| \| \| \| \| \| \| \|	Copy and paste error in commit eafeb8db66dae7619ff3cb039706b990d718cba7: i965/tiled_memcpy: Unroll bytes==64 case. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
*	i965/tiled_memcpy: Unroll bytes==64 case.	Matt Turner	2016-04-12	1	-0/+16
\| \| \| \|	Reviewed-by: Roland Scheidegger <sroland@vmware.com>
*	i965/tiled_memcpy: Provide SSE2 for RGBA8 <-> BGRA8 swizzle.	Roland Scheidegger	2016-04-12	1	-3/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The existing code uses SSSE3, and because it isn't built in a separate file compiled with SSSE3 enabled, it is usually not used (that, of course, could be fixed...), whereas SSE2 is always present with 64-bit builds. This should be pretty much as fast as the pshufb version, albeit those code paths aren't really used on chips without llc in any case. v2: fix andnot argument order, add comments. v3: use pshuflw/hw instead of shifts (suggested by Matt Turner), cut comments. v4: [mattst88] Rebase. Reviewed-by: Matt Turner <mattst88@gmail.com>
*	i965/tiled_memcpy: Move SSSE3 code back into inline functions.	Matt Turner	2016-04-12	1	-18/+24
\| \| \| \| \| \|	This will make adding SSE2 code a lot cleaner. Reviewed-by: Roland Scheidegger <sroland@vmware.com>
*	i965/tiled_memcpy: Optimize RGBA -> BGRA swizzle.	Matt Turner	2016-04-12	1	-8/+11
\| \| \| \| \| \| \|	Replaces four byte loads and four byte stores with a load, bswap, rotate, store; or a movbe, rotate, store. Reviewed-by: Roland Scheidegger <sroland@vmware.com>
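The load, bswap, rotate, store sequence described above can be sketched for a single pixel as follows. This is a minimal illustration that assumes a little-endian host; the names `bswap32`, `ror32`, and `rgba8_to_bgra8_pixel` are invented for the example and are not the functions in the driver.

```c
#include <stdint.h>
#include <string.h>

/* Portable 32-bit byte swap (hypothetical helper for this sketch). */
static uint32_t bswap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Rotate right by n bits, valid for 0 < n < 32. */
static uint32_t ror32(uint32_t v, unsigned n)
{
    return (v >> n) | (v << (32 - n));
}

/*
 * Convert one pixel from R,G,B,A byte order to B,G,R,A byte order.
 * Assumes a little-endian host: the load yields 0xAABBGGRR, the byte
 * swap yields 0xRRGGBBAA, and rotating right by 8 yields 0xAARRGGBB,
 * which is stored back to memory as B,G,R,A.
 */
static void rgba8_to_bgra8_pixel(uint8_t *dst, const uint8_t *src)
{
    uint32_t px;
    memcpy(&px, src, sizeof(px));   /* one 4-byte load */
    px = ror32(bswap32(px), 8);     /* swap R and B, keep G and A */
    memcpy(dst, &px, sizeof(px));   /* one 4-byte store */
}
```

Using memcpy for the load and store keeps the sketch free of alignment and aliasing assumptions; compilers typically lower it to a single move.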
*	dri/i965: fix incorrect rgbFormat in intelCreateBuffer().	Haixia Shi	2016-04-12	1	-8/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It is incorrect to assume that pixel format is always in BGR byte order. We need to check bitmask parameters (such as \|redMask\|) to determine whether the RGB or BGR byte order is requested. v2: reformat code to stay within 80 character per line limit. v3: just fix the byte order problem first and investigate SRGB later. v4: rebased on top of the GLES3 sRGB workaround fix. v5: rebased on top of the GLES3 sRGB workaround fix v2. Signed-off-by: Haixia Shi <hshi@chromium.org> Reviewed-by: Stéphane Marchesin <marcheu@chromium.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
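A minimal sketch of the mask check described above, assuming 32-bit pixels on a little-endian host so that a red mask of 0x000000ff places red in the first byte in memory. The enum and the function `order_from_red_mask` are hypothetical names for this example and do not reflect the actual code in intelCreateBuffer().

```c
#include <stdint.h>

/* Hypothetical labels for the two byte orders discussed above. */
enum channel_order { ORDER_RGB, ORDER_BGR };

/*
 * Pick the byte order from the red channel bitmask instead of assuming
 * BGR. Any mask other than the low byte is treated as BGR here, which
 * is a simplification made for this sketch.
 */
static enum channel_order order_from_red_mask(uint32_t red_mask)
{
    return red_mask == 0x000000ffu ? ORDER_RGB : ORDER_BGR;
}
```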
*	dri/i965: extend GLES3 sRGB workaround to cover all formats	Haixia Shi	2016-04-12	1	-4/+3
\| \| \| \| \| \| \| \| \| \|	It is incorrect to assume BGRA byte order for the GLES3 sRGB workaround. v2: use _mesa_get_srgb_format_linear to handle all formats Signed-off-by: Haixia Shi <hshi@chromium.org> Reviewed-by: Stéphane Marchesin <marcheu@chromium.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965: Add autogenerated 'brw_nir_trig_workarounds.c' to gitignore	Eduardo Lima Mitev	2016-04-12	1	-0/+1
\| \| \| \|	Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965: Port INTEL_PRECISE_TRIG=1 to NIR.	Kenneth Graunke	2016-04-11	7	-28/+58
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This makes the extra multiply visible to NIR's algebraic optimizations (for constant reassociation) as well as constant folding. This means that when the result of sin/cos is multiplied by a constant, we can eliminate the extra multiply altogether, reducing the cost of the workaround. It also means we only have to implement it in one place, rather than in both backends. This makes INTEL_PRECISE_TRIG=1 cost nothing on GPUTest/Volplosion, which has a ton of sin() calls, but always multiplies them by an immediate constant. The extra multiply gets folded away. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>
*	i965: Pass brw_compiler into brw_preprocess_nir() instead of is_scalar.	Kenneth Graunke	2016-04-11	2	-3/+6
\| \| \| \| \| \| \| \| \| \|	I want to be able to read other fields. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>
*	nir/lower_system_values: Add support for several computed values	Jason Ekstrand	2016-04-11	1	-1/+2
\| \| \| \|	Reviewed-by: Rob Clark <robdclark@gmail.com>
*	i965: fix struct type in comment	Timothy Arceri	2016-04-11	1	-1/+1
\| \| \| \|	Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
*	i965: enable OES_texture_buffer on gen7+	Ilia Mirkin	2016-04-10	1	-0/+1
\| \| \| \| \| \| \| \| \|	It will only end up getting exposed on gen8+ since it requires GL ES 3.1, but it should be ready to go on gen7 when support for GL ES 3.1 is completed there. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965/disasm: Decode per-slot offsets.	Kenneth Graunke	2016-04-09	1	-0/+5
\| \| \| \| \| \| \|	We just never bothered to decode this. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
*	i965/disasm: Decode "channel mask present" bit correctly.	Kenneth Graunke	2016-04-09	1	-4/+15
\| \| \| \| \| \| \| \|	Bit 15 means "interleave" for most messages, but for SIMD8 messages it means "use channel masks". Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
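The dual meaning of bit 15 could be decoded along these lines. This is only a sketch: the function name `urb_bit15_label`, the `is_simd8` flag, and the returned label strings are assumptions for illustration, not the disassembler's actual output.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical decoder for bit 15 of a message descriptor. Returns NULL
 * when the bit is clear. When it is set, the label depends on the
 * message kind: SIMD8 messages treat it as "channel mask present",
 * other messages treat it as "interleave".
 */
static const char *urb_bit15_label(uint32_t desc, bool is_simd8)
{
    if (!(desc & (1u << 15)))
        return NULL;
    return is_simd8 ? "masked" : "interleave";
}
```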
*	i965/disasm: Simplify the URB opcode printing with ?:.	Kenneth Graunke	2016-04-09	1	-7/+6
\| \| \| \| \|	Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
*	i965/tiled_memcopy: Get rid of the direction parameter to get_memcpy	Jason Ekstrand	2016-04-08	5	-22/+5
\| \| \| \| \| \| \| \| \|	Now that we can use the much simpler rgba8_copy function, we don't need to hand different functions out based on direction. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Chad Versace <chad.versace@intel.com>
*	i965/tiled_memcpy: Rework the RGBA -> BGRA mem_copy functions	Jason Ekstrand	2016-04-08	1	-76/+63
\| \| \| \| \| \| \| \| \| \| \| \| \|	This splits the two copy functions into three: One for unaligned copies, one for aligned sources, and one for aligned destinations. Thanks to the previous commit, we are now guaranteed that the aligned ones will only operate on aligned memory so they should be safe. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93962 Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Chad Versace <chad.versace@intel.com>
*	i965/tiled_memcopy: Add aligned mem_copy parameters to the [de]tiling functions	Jason Ekstrand	2016-04-08	1	-32/+43
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Each of the [de]tiling functions has three mem_copy calls: 1) left edge to tile boundary; 2) tile boundary to tile boundary, in a loop; 3) tile boundary to right edge. Copies 2 and 3 start at a tile edge so the pointer to tiled memory is guaranteed to be at least 16-byte aligned. Copy 1, on the other hand, starts at some arbitrary place in the tile so it doesn't have any such alignment guarantees. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Chad Versace <chad.versace@intel.com>
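The three-way split described above can be sketched as a small helper that finds the two tile boundaries inside a byte range. The struct `span_split`, the function `split_span`, and its parameters are hypothetical names for this example, and the tile width is assumed to be a power of two; the real [de]tiling functions are organized differently.

```c
#include <assert.h>

/*
 * Hypothetical result type: the range [x0, x1) is divided into
 *   head: [x0, head_end)        may start unaligned
 *   body: [head_end, body_end)  whole tiles, starts on a tile boundary
 *   tail: [body_end, x1)        starts on a tile boundary
 * Any of the three pieces may be empty.
 */
struct span_split {
    unsigned head_end;
    unsigned body_end;
};

static struct span_split split_span(unsigned x0, unsigned x1, unsigned tile_w)
{
    struct span_split s;

    assert(tile_w != 0 && (tile_w & (tile_w - 1)) == 0); /* power of two */
    assert(x0 <= x1);

    /* Round x0 up to the next tile boundary, clamped to the range end. */
    s.head_end = (x0 + tile_w - 1) & ~(tile_w - 1);
    if (s.head_end > x1)
        s.head_end = x1;

    /* Round x1 down to a tile boundary, never before the head's end. */
    s.body_end = x1 & ~(tile_w - 1);
    if (s.body_end < s.head_end)
        s.body_end = s.head_end;

    return s;
}
```

Only the head piece can begin at an arbitrary offset within a tile, which is why it is the one copy that cannot rely on an aligned tiled-memory pointer.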
*	i965: Check eu/subslices are > 0	Ben Widawsky	2016-04-08	1	-1/+1
\| \| \| \| \| \| \| \|	Now that the check is restricted to gen8+, we should always get back a non-zero positive value for the EU and subslice counts. Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965: Fix eu/subslice warning	Ben Widawsky	2016-04-08	1	-11/+23
\| \| \| \| \| \| \| \| \| \| \|	Older gen platforms do not actually return a value for subslice and EU total; (IMO, confusingly) they return -ENODEV. This patch defers the SSEU setup until we have the actual GPU generation to avoid useless warnings when running on older platforms with older kernels. Reported-by: Mark Janes <mark.a.janes@intel.com> Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965: Extract SSEU configuration info	Ben Widawsky	2016-04-08	1	-14/+21
\| \| \| \| \|	Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965/sf_state: Pull flat_enables out of prog_data	Jason Ekstrand	2016-04-06	4	-27/+5
\| \| \| \| \| \| \| \| \| \|	Previously, we were walking over the shader source to figure out which inputs should be marked flat. Now, we can just pull it out of prog_data. This is needed for properly setting up 3DSTATE_SF/SBE for Vulkan and it also means that it will get properly cached. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965/fs: Add a flat_inputs field to prog_data	Jason Ekstrand	2016-04-06	2	-0/+37
\| \| \| \| \|	Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	brw/device_info: Add a helper for getting a device name	Jason Ekstrand	2016-04-06	2	-0/+13
\| \| \| \| \| \| \|	This is needed by the Vulkan driver. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
*	i965/fs_surface_builder: Mask signed integers after conversion	Jason Ekstrand	2016-04-06	1	-0/+18
\| \| \| \| \|	Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>