external_mesa3d.git

	Commit message	Author	Age	Files	Lines
*	i965/tiled_memcpy: don't unconditionally use __builtin_bswap32	Jonathan Gray	2016-04-21	1	-1/+14
	Use the defines Mesa configure sets to indicate presence of the bswap32 builtins. This lets i965 work on OpenBSD again after the changes that were made in 0a5d8d9af42fd77fce1492d55f958da97816961a. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
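A minimal sketch of the guard this message describes, not the committed code. The macro name `HAVE___BUILTIN_BSWAP32` and the helper name `bswap32_sketch` are assumptions for illustration; the message only says that configure-provided defines are used.

```c
#include <stdint.h>

/* Hypothetical helper: use the compiler builtin only when the build
 * system reports it, otherwise fall back to shifts and masks. */
static inline uint32_t
bswap32_sketch(uint32_t n)
{
#if defined(HAVE___BUILTIN_BSWAP32) /* assumed configure define */
   return __builtin_bswap32(n);
#else
   return (n >> 24) |
          ((n >> 8) & 0x0000ff00u) |
          ((n << 8) & 0x00ff0000u) |
          (n << 24);
#endif
}
```

Either branch returns the same value, so callers do not need to know which one was compiled in.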
*	i965/tiled_memcpy: Fix rgba8_copy_16_aligned_dst() typo	Kristian Høgsberg Kristensen	2016-04-12	1	-4/+4
	Copy and paste error in commit eafeb8db66dae7619ff3cb039706b990d718cba7: i965/tiled_memcpy: Unroll bytes==64 case. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
*	i965/tiled_memcpy: Unroll bytes==64 case.	Matt Turner	2016-04-12	1	-0/+16
	Reviewed-by: Roland Scheidegger <sroland@vmware.com>
*	i965/tiled_memcpy: Provide SSE2 for RGBA8 <-> BGRA8 swizzle.	Roland Scheidegger	2016-04-12	1	-3/+40
	The existing code uses SSSE3, and because it isn't compiled in a separate file compiled with that, it is usually not used (that, of course, could be fixed...), whereas SSE2 is always present with 64-bit builds. This should be pretty much as fast as the pshufb version, albeit those code paths aren't really used on chips without llc in any case. v2: fix andnot argument order, add comments v3: use pshuflw/hw instead of shifts (suggested by Matt Turner), cut comments v4: [mattst88] Rebase Reviewed-by: Matt Turner <mattst88@gmail.com>
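The following is a rough sketch of how an SSE2 swizzle built on and/andnot plus pshuflw/pshufhw could look; it is not the committed code. The function name `swizzle_rgba8_bgra8_sketch` is hypothetical, and the sketch uses unaligned loads and stores plus a scalar fallback so that it works for any pointer and on builds without SSE2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: swap the R and B bytes of each 4-byte pixel. */
static void
swizzle_rgba8_bgra8_sketch(void *dst, const void *src, size_t bytes)
{
   uint8_t *d = dst;
   const uint8_t *s = src;
   size_t i = 0;

   assert(bytes % 4 == 0);

#if defined(__SSE2__)
   /* Selects the G and A bytes of each pixel, which stay in place. */
   const __m128i agmask = _mm_set1_epi32((int)0xFF00FF00u);

   for (; i + 16 <= bytes; i += 16) {
      __m128i px = _mm_loadu_si128((const __m128i *)(s + i));
      __m128i ag = _mm_and_si128(agmask, px);
      /* _mm_andnot_si128(a, b) computes (~a) & b: only R and B remain. */
      __m128i rb = _mm_andnot_si128(agmask, px);
      /* pshuflw/pshufhw: swap the two 16-bit halves of every pixel,
       * which makes R and B trade places. */
      __m128i br = _mm_shufflehi_epi16(
         _mm_shufflelo_epi16(rb, _MM_SHUFFLE(2, 3, 0, 1)),
         _MM_SHUFFLE(2, 3, 0, 1));
      _mm_storeu_si128((__m128i *)(d + i), _mm_or_si128(ag, br));
   }
#endif

   /* Scalar tail, and the whole copy when SSE2 is unavailable. */
   for (; i < bytes; i += 4) {
      uint8_t r = s[i], g = s[i + 1], b = s[i + 2], a = s[i + 3];
      d[i] = b;
      d[i + 1] = g;
      d[i + 2] = r;
      d[i + 3] = a;
   }
}
```

A version that assumes 16-byte alignment could use aligned loads or stores instead, but only where the caller can guarantee that alignment.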
*	i965/tiled_memcpy: Move SSSE3 code back into inline functions.	Matt Turner	2016-04-12	1	-18/+24
	This will make adding SSE2 code a lot cleaner. Reviewed-by: Roland Scheidegger <sroland@vmware.com>
*	i965/tiled_memcpy: Optimize RGBA -> BGRA swizzle.	Matt Turner	2016-04-12	1	-8/+11
	Replaces four byte loads and four byte stores with a load, bswap, rotate, store; or a movbe, rotate, store. Reviewed-by: Roland Scheidegger <sroland@vmware.com>
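The load, bswap, rotate, store sequence can be sketched roughly as below. This assumes a little-endian host; the names `rgba8_copy_sketch`, `bswap32_portable` and `ror32` are illustrative and are not taken from the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Portable byte swap; a compiler may lower this to bswap or movbe. */
static inline uint32_t
bswap32_portable(uint32_t n)
{
   return (n >> 24) | ((n >> 8) & 0x0000ff00u) |
          ((n << 8) & 0x00ff0000u) | (n << 24);
}

/* Rotate right by d bits, valid for 0 < d < 32. */
static inline uint32_t
ror32(uint32_t n, unsigned d)
{
   return (n >> d) | (n << (32 - d));
}

/* Hypothetical helper: copy 4-byte pixels, swapping R and B.
 * Assumes a little-endian host. */
static void
rgba8_copy_sketch(void *dst, const void *src, size_t bytes)
{
   uint8_t *d = dst;
   const uint8_t *s = src;

   assert(bytes % 4 == 0);

   for (size_t i = 0; i < bytes; i += 4) {
      uint32_t px;
      memcpy(&px, s + i, 4);           /* bytes R,G,B,A -> 0xAABBGGRR */
      px = ror32(bswap32_portable(px), 8); /* 0xRRGGBBAA -> 0xAARRGGBB */
      memcpy(d + i, &px, 4);           /* stored as bytes B,G,R,A */
   }
}
```

The byte swap reverses all four channels and the 8-bit rotate moves alpha back to the top byte, so only R and B end up exchanged.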
*	i965/tiled_memcopy: Get rid of the direction parameter to get_memcpy	Jason Ekstrand	2016-04-08	1	-2/+1
	Now that we can use the much simpler rgba8_copy function, we don't need to hand different functions out based on direction. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Chad Versace <chad.versace@intel.com>
*	i965/tiled_memcpy: Rework the RGBA -> BGRA mem_copy functions	Jason Ekstrand	2016-04-08	1	-76/+63
	This splits the two copy functions into three: One for unaligned copies, one for aligned sources, and one for aligned destinations. Thanks to the previous commit, we are now guaranteed that the aligned ones will only operate on aligned memory so they should be safe. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93962 Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Chad Versace <chad.versace@intel.com>
*	i965/tiled_memcopy: Add aligned mem_copy parameters to the [de]tiling functions	Jason Ekstrand	2016-04-08	1	-32/+43
	Each of the [de]tiling functions has three mem_copy calls:
	1) Left edge to tile boundary
	2) Tile boundary to tile boundary in a loop
	3) Tile boundary to right edge
	Copies 2 and 3 start at a tile edge so the pointer to tiled memory is guaranteed to be at least 16-byte aligned. Copy 1, on the other hand, starts at some arbitrary place in the tile so it doesn't have any such alignment guarantees. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Chad Versace <chad.versace@intel.com>
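One way to picture the three copies is to split a row's byte range at tile boundaries. The sketch below is an illustration under assumptions, not the driver's code: `struct row_spans`, `split_row` and the tile width argument `tw` are hypothetical names, and `tw` is assumed to be a power of two.

```c
#include <assert.h>
#include <stdint.h>

#define ALIGN_DOWN_SKETCH(a, b) ((a) & ~((b) - 1))
#define ALIGN_UP_SKETCH(a, b)   ALIGN_DOWN_SKETCH((a) + (b) - 1, (b))

/* [x0, x1): left edge to first tile boundary (no alignment guarantee)
 * [x1, x2): whole tiles, copied in a loop (starts on a tile boundary)
 * [x2, x3): last tile boundary to right edge (starts on a tile boundary) */
struct row_spans {
   uint32_t x0, x1, x2, x3;
};

static struct row_spans
split_row(uint32_t x0, uint32_t x3, uint32_t tw)
{
   struct row_spans s;

   assert(tw != 0 && (tw & (tw - 1)) == 0); /* power of two */
   assert(x0 <= x3);

   s.x0 = x0;
   s.x3 = x3;
   s.x1 = ALIGN_UP_SKETCH(x0, tw);
   s.x2 = ALIGN_DOWN_SKETCH(x3, tw);

   /* The whole range sits inside one tile: treat it all as the
    * unaligned left-edge copy and leave the other two spans empty. */
   if (s.x1 > s.x2)
      s.x1 = s.x2 = x3;

   return s;
}
```

With this split, only the `[x0, x1)` span needs a copy routine that tolerates an unaligned tiled pointer; the other two spans can use a variant that assumes alignment.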
*	i965: Enable tiled mem_copy with sRGB-formatted resources	Nanley Chery	2016-02-24	1	-2/+6
	RGBA8 and BGRA8 unorm formats are compatible with the various mem_copy functions. Their sRGB counterparts are also compatible because they're also color-renderable (of importance when the specified resource is a readbuffer) and they share the same physical layout. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
*	Revert "i965: Provide sse2 version for rgba8 <-> bgra8 swizzle"	Roland Scheidegger	2016-02-02	1	-57/+12
	This reverts commit ab30426e335116e29473faaafe8b57ec760516ee. Apparently the memory isn't quite as aligned when this gets called as it should be, causing crashes. (Albeit this looks independent from this code, should crash just as well if ssse3 is enabled when compiling without this patch.) https://bugs.freedesktop.org/show_bug.cgi?id=93962
*	i965: Provide sse2 version for rgba8 <-> bgra8 swizzle	Roland Scheidegger	2016-02-02	1	-12/+57
	The existing code used ssse3, and because it isn't compiled in a separate file compiled with that, it is usually not used (that, of course, could be fixed...), whereas sse2 is always present at least with 64bit builds. This should be pretty much as fast as the pshufb version, albeit those code paths aren't really used on chips without llc in any case. v2: fix andnot argument order, add comments v3: use pshuflw/hw instead of shifts (suggested by Matt Turner), cut comments Reviewed-by: Matt Turner <mattst88@gmail.com>
*	i965: Fix typos in license	Ian Romanick	2015-09-10	1	-2/+2
	grep -lr 'sub license' | while read f; do \
	    sed --in-place -e 's/sub license/sublicense/' $f ;\
	done
	grep -lr 'NON-INFRINGEMENT' | while read f; do \
	    sed --in-place -e 's/NON-INFRINGEMENT/NONINFRINGEMENT/' $f ;\
	done
	As noted by Matt, both of these changes match the MIT license text found at http://opensource.org/licenses/MIT. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Matt Turner <mattst88@gmail.com>
*	i965: Mark paths in linear <-> tiled functions as unreachable().	Matt Turner	2015-03-17	1	-0/+16
	text	data	bss	dec	hex	filename
	9663	0	0	9663	25bf	intel_tiled_memcpy.o before
	8215	0	0	8215	2017	intel_tiled_memcpy.o after
	Reviewed-by: Carl Worth <cworth@cworth.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
*	i965: Tell intel_get_memcpy() which direction the memcpy() is going.	Matt Turner	2015-03-05	1	-38/+86
	The SSSE3 swizzling code was written for fast uploads to the GPU and assumed the destination was always 16-byte aligned. When we began using this code for fast downloads as well we didn't do anything to account for the fact that the destination pointer given by glReadPixels() or glGetTexImage() is not guaranteed to be suitably aligned. With SSSE3 enabled (at compile-time), some applications would crash when an SSE aligned-store instruction tried to store to an unaligned destination (or an assertion that the destination is aligned would trigger). To remedy this, tell intel_get_memcpy() whether we're uploading or downloading so that it can select whether to assume the destination or source is aligned, respectively. Cc: 10.5 <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89416 Tested-by: Uriy Zhuravlev <stalkerg@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
*	i965/tiled_memcpy: Support a signed linear pitch	Jason Ekstrand	2015-01-30	1	-15/+15
	Reviewed-by: Chad Versace <chad.versace@intel.com>
*	i965/tiled_memcpy: Add tiled-to-linear paths	Sisinty Sasmita Patra	2015-01-26	1	-0/+272
	This commit adds tiled copy functions for copying from tiled memory to linear memory. These are very similar to the existing linear-to-tiled paths. v2: Jason Ekstrand <jason.ekstrand@intel.com> - New commit message - Various whitespace fixes - Added ptrdiff_t casts as done in commit 225a09790 v3: Jason Ekstrand <jason.ekstrand@intel.com> - Fixed a comment Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>
*	i965: Refactor tiled memcpy functions and move them into their own file	Sisinty Sasmita Patra	2015-01-26	1	-0/+450
	This commit refactors the tiled_memcpy code in intel_tex_subimage.c and moves it into its own intel_tiled_memcpy files. Also, xtile_copy and ytile_copy are renamed to linear_to_xtiled and linear_to_ytiled respectively. The *_faster functions are similarly renamed. There was also a bit of logic to select between the libc-provided memcpy function and our custom memcpy that does an RGBA -> BGRA swizzle. This was moved into an intel_get_memcpy function so that rgba8_copy can live (and be inlined) in intel_tiled_memcpy.c. v2: Jason Ekstrand <jason.ekstrand@intel.com> - Better commit message - Fix up the copyright on the intel_tiled_memcpy files - Various whitespace fixes - Moved a bunch of stuff that did not need to be exposed from intel_tiled_memcpy.h to intel_tiled_memcpy.c - Added proper documentation for intel_get_memcpy - Incorporated the ptrdiff_t tweaks from commit 225a09790 v3: Jason Ekstrand <jason.ekstrand@intel.com> - Fixed a comment - Move the tile size constants into the .c file Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>