path: root/src/gallium/auxiliary/translate
Commit message | Author | Age | Files | Lines
* translate: fix start_instance parameter in sse version | Ilia Mirkin | 2016-06-21 | 1 | -7/+7
  The generic version gets this right already, but this was using an incorrect formula in SSE.

  Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
  Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
  Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
* gallium: merge PIPE_SWIZZLE_* and UTIL_FORMAT_SWIZZLE_* | Marek Olšák | 2016-04-22 | 1 | -55/+55
  Use PIPE_SWIZZLE_* everywhere. Use X/Y/Z/W/0/1 instead of RED, GREEN, BLUE, ALPHA, ZERO, ONE. The new enum is called pipe_swizzle.

  Acked-by: Jose Fonseca <jfonseca@vmware.com>
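The X/Y/Z/W/0/1 selectors mentioned above can be illustrated with a small sketch. The enum and function names below are hypothetical and are not copied from the Mesa headers; the only point is how four component selectors plus the constants 0 and 1 resolve against a source vector.

```c
/* Hypothetical selector enum for illustration only; the enumerator
 * names are not taken from the Mesa headers. */
enum sketch_swizzle {
   SKETCH_SWIZZLE_X,
   SKETCH_SWIZZLE_Y,
   SKETCH_SWIZZLE_Z,
   SKETCH_SWIZZLE_W,
   SKETCH_SWIZZLE_0,
   SKETCH_SWIZZLE_1
};

/* Resolve one selector against a four-component source vector. */
static float
swizzle_component(const float src[4], enum sketch_swizzle sel)
{
   switch (sel) {
   case SKETCH_SWIZZLE_0:
      return 0.0f;
   case SKETCH_SWIZZLE_1:
      return 1.0f;
   default:
      return src[sel];   /* X..W index the source directly */
   }
}

/* Apply a full four-selector swizzle to a vector. */
static void
apply_swizzle(float dst[4], const float src[4],
              const enum sketch_swizzle swz[4])
{
   int i;
   for (i = 0; i < 4; i++)
      dst[i] = swizzle_component(src, swz[i]);
}
```

Naming the selectors after positions rather than colours keeps the same enum meaningful for vertex data, where the components are not colours at all.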
* gallium/auxiliary: Sanitize NULL checks into canonical form | Edward O'Callaghan | 2015-12-06 | 3 | -3/+3
  Use NULL tests of the form `if (ptr)' or `if (!ptr)'. They do not depend on the definition of the symbol NULL. Further, they provide the opportunity for the accidental assignment, are clear and succinct.

  Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
  Signed-off-by: Marek Olšák <marek.olsak@amd.com>
* gallium: replace INLINE with inline | Ilia Mirkin | 2015-07-21 | 2 | -6/+6
  Generated by running:

    git grep -l INLINE src/gallium/ | xargs sed -i 's/\bINLINE\b/inline/g'
    git grep -l INLINE src/mesa/state_tracker/ | xargs sed -i 's/\bINLINE\b/inline/g'
    git checkout src/gallium/state_trackers/clover/Doxyfile

  and manual edits to

    src/gallium/include/pipe/p_compiler.h
    src/gallium/README.portability

  to remove mentions of the inline define.

  Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
  Acked-by: Marek Olšák <marek.olsak@amd.com>
* gallium: Use util_cpu_to_le{16,32} in many more places. | Matt Turner | 2015-02-23 | 1 | -32/+8
  ... and util_le{16,32}_to_cpu. I think I've used the right ones for describing the actual operation performed (even though they're both just "byte-swap this if I'm on big-endian").

  The Linux Kernel has typedefs __le32/__be32 and friends that static analysis tools can use to check that byte-orderings are correct. It might be interesting to apply that here as well.

  Reviewed-by: Eric Anholt <eric@anholt.net>
  Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
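The remark that both directions are "byte-swap this if I'm on big-endian" can be sketched as follows. The `sketch_` functions are hypothetical stand-ins written for this illustration, not the Mesa helpers, and the endianness probe is one portable way to detect byte order at run time.

```c
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the Mesa helpers. */

/* Returns non-zero when the most significant byte is stored first. */
static int
host_is_big_endian(void)
{
   const uint16_t probe = 1;
   return *(const uint8_t *)&probe == 0;
}

/* Reverse the byte order of a 32-bit value. */
static uint32_t
bswap32(uint32_t v)
{
   return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
          ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* CPU order to little-endian: swap only on big-endian hosts. */
static uint32_t
sketch_cpu_to_le32(uint32_t v)
{
   return host_is_big_endian() ? bswap32(v) : v;
}

/* Little-endian to CPU order: the same operation, named for intent. */
static uint32_t
sketch_le32_to_cpu(uint32_t v)
{
   return host_is_big_endian() ? bswap32(v) : v;
}
```

The two functions are identical in behaviour; keeping both names documents which direction a call site intends, which is the distinction the commit message draws.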
* rtasm,translate: Re-enable SSE on Mingw64. | José Fonseca | 2014-11-20 | 1 | -1/+1
  This reverts f4dd0991719ef3e2606920c5100b372181c60899.

  The src/gallium/tests/unit/translate_test.c gives the same results on MinGW 64-bits as on Linux 64-bits. And since MinGW is often used for development/testing due to its convenience, it's better not to have this sort of differences relative to MSVC.

  Reviewed-by: Roland Scheidegger <sroland@vmware.com>
* translate_sse: Use the correct buffer index in this fast path. | Andreas Hartmetz | 2014-04-29 | 1 | -1/+3
  It is possible that there are multiple input buffers but only one is relevant for translation. Then there will be only a single translation group, which might need to source data from a buffer index != 0.

  Fixes wrong vertex shader inputs as observed while debugging with an application and driver combination that requires translation of a vertex attribute in a non-trivial set of attributes and input buffers.

  Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
* translate: fix buffer overflows | Zack Rusin | 2014-03-04 | 4 | -6/+18
  Because in draw we always inject position at slot 0 whenever fragment shader would take the maximum number of inputs (32) it meant that we had PIPE_MAX_ATTRIBS + 1 slots to translate, which meant that we were crashing with fragment shaders that took the maximum number of attributes as inputs. The actual max number of attributes we need to translate thus is PIPE_MAX_ATTRIBS + 1.

  Signed-off-by: Zack Rusin <zackr@vmware.com>
  Reviewed-by: José Fonseca <jfonseca@vmware.com>
  Reviewed-by: Roland Scheidegger <sroland@vmware.com>
  Reviewed-by: Matthew McClure <mcclurem@vmware.com>
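The off-by-one described above can be sketched with a fixed-size element table. Everything here is hypothetical: the `SKETCH_` constants, the struct, and the helpers are invented for illustration, with the attribute limit set to 32 as stated in the message.

```c
/* Hypothetical constants and types for illustration only. */
#define SKETCH_MAX_ATTRIBS 32
/* One extra slot for the position that is injected at slot 0. */
#define SKETCH_MAX_TRANSLATE_ELEMENTS (SKETCH_MAX_ATTRIBS + 1)

struct sketch_key {
   unsigned nr_elements;
   unsigned element[SKETCH_MAX_TRANSLATE_ELEMENTS];
};

/* Append one element; returns 1 on success, 0 when the table is full. */
static int
sketch_key_add(struct sketch_key *key, unsigned value)
{
   if (key->nr_elements >= SKETCH_MAX_TRANSLATE_ELEMENTS)
      return 0;
   key->element[key->nr_elements++] = value;
   return 1;
}

/* Store the injected position plus `attribs` shader inputs and
 * return how many elements ended up in the table. */
static unsigned
sketch_fill(unsigned attribs)
{
   struct sketch_key key = { 0 };
   unsigned i;

   sketch_key_add(&key, 0);               /* injected position */
   for (i = 0; i < attribs; i++) {
      if (!sketch_key_add(&key, i + 1))   /* refuse instead of overflowing */
         break;
   }
   return key.nr_elements;
}
```

With the array sized to the attribute limit alone, the 33rd element in the maximum case would be written past the end; sizing it to the limit plus one and bounds-checking the append covers both the legitimate maximum and any excess.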
* translate: reindent translate_sse.c | Brian Paul | 2014-02-02 | 1 | -472/+474
  Trivial.
* translate: deal with size overflows by casting to ptrdiff_t | Ilia Mirkin | 2014-01-27 | 2 | -3/+7
  This was discovered as a result of the draw-elements-base-vertex-neg piglit test, which passes very negative offsets in, followed up by large indices. The nouveau code correctly adjusts the pointer, but the translate code needs to do the proper inverse correction. Similarly fix up the SSE code to do a 64-bit multiply to compute the proper offset.

  Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
  Reviewed-by: Brian Paul <brianp@vmware.com>
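The overflow in question can be sketched by comparing a 32-bit multiply with one widened to `ptrdiff_t` first. The function names are hypothetical and this is not the translate code; it only shows why the cast has to happen before the multiplication rather than after it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration; not the Mesa code. */

/* 32-bit unsigned multiply: the product wraps modulo 2^32. */
static uint32_t
offset_32bit(uint32_t stride, uint32_t index)
{
   return stride * index;
}

/* Widen both operands first so the product is computed at pointer
 * width. On targets where ptrdiff_t is only 32 bits wide this cannot
 * help, so the sketch assumes a 64-bit ptrdiff_t. */
static ptrdiff_t
offset_wide(uint32_t stride, uint32_t index)
{
   return (ptrdiff_t)stride * (ptrdiff_t)index;
}
```

For a stride of 16 and an index of 0x10000000, the 32-bit version wraps to 0, while the widened version yields 2^32 on a 64-bit target, which is the offset that actually has to be added to the adjusted pointer.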
* s/Tungsten Graphics/VMware/ | José Fonseca | 2014-01-17 | 6 | -16/+16
  Tungsten Graphics Inc. was acquired by VMware Inc. in 2008. Leaving the old copyright name is creating unnecessary confusion, hence this change.

  This was the sed script I used:

    $ cat tg2vmw.sed
    # Run as:
    #
    #   git reset --hard HEAD && find include scons src -type f -not -name 'sed*' -print0 | xargs -0 sed -i -f tg2vmw.sed
    #

    # Rename copyrights
    s/Tungsten Gra\(ph\|hp\)ics,\? [iI]nc\.\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./g
    /Copyright/s/Tungsten Graphics\(,\? [iI]nc\.\)\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./
    s/TUNGSTEN GRAPHICS/VMWARE/g

    # Rename emails
    s/alanh@tungstengraphics.com/alanh@vmware.com/
    s/jens@tungstengraphics.com/jowen@vmware.com/g
    s/jrfonseca-at-tungstengraphics-dot-com/jfonseca-at-vmware-dot-com/
    s/jrfonseca\?@tungstengraphics.com/jfonseca@vmware.com/g
    s/keithw\?@tungstengraphics.com/keithw@vmware.com/g
    s/michel@tungstengraphics.com/daenzer@vmware.com/g
    s/thomas-at-tungstengraphics-dot-com/thellstom-at-vmware-dot-com/
    s/zack@tungstengraphics.com/zackr@vmware.com/

    # Remove dead links
    s@Tungsten Graphics (http://www.tungstengraphics.com)@Tungsten Graphics@g

    # C string src/gallium/state_trackers/vega/api_misc.c
    s/"Tungsten Graphics, Inc"/"VMware, Inc"/

  Reviewed-by: Brian Paul <brianp@vmware.com>
* translate_sse: Fix generated code argument handling for msabi on x86_64 | Jon TURNEY | 2013-10-18 | 1 | -3/+11
  translate_sse.c contains code for msabi on x86_64, but it appears to be untested. Currently arguments 1 and 2 passed to the generated code are moved as 32-bit quantities into the registers used by sysvabi, irrespective of the architecture. Since these may be pointers, they must be moved as 64-bit quantities to avoid truncation.

  Commit f4dd0991719ef3e2606920c5100b372181c60899 disabled translate_sse.c on MinGW x86_64, I don't know if it was due to this issue, or a different one...

  Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
  Reviewed-by: Brian Paul <brianp@vmware.com>
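The truncation described above can be modelled in plain C. This is a hypothetical model of the two kinds of register move, not the code generated by translate_sse.c; the function names are invented for the illustration.

```c
#include <stdint.h>

/* Hypothetical model of moving an argument between registers;
 * not the code emitted by translate_sse.c. */

/* A 32-bit move keeps only the low 32 bits of the argument. */
static uint64_t
move_arg_32bit(uint64_t arg)
{
   return (uint32_t)arg;
}

/* A 64-bit move preserves the whole value, so pointers survive. */
static uint64_t
move_arg_64bit(uint64_t arg)
{
   return arg;
}
```

Any pointer located above the 4 GiB boundary loses its upper half under the 32-bit move, so the generated function would dereference an unrelated address. Pointers that happen to sit below 4 GiB pass through unchanged, so such a bug can go unnoticed in some runs.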
* draw: cleanup and fix instance id computation | Zack Rusin | 2013-07-25 | 2 | -6/+1
  The instance id system value always starts at 0, even if the specified start instance is larger than 0. Instead of implicitly setting instance id to instance id plus start instance and then having to subtract instance id when computing the buffer offsets, let's just set instance id to the proper instance id. This fixes instance id computation and cleans up buffer offset computation.

  Signed-off-by: Zack Rusin <zackr@vmware.com>
  Reviewed-by: Roland Scheidegger <sroland@vmware.com>
  Reviewed-by: Brian Paul <brianp@vmware.com>
  Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
* draw/translate: fix instancing | Zack Rusin | 2013-06-28 | 3 | -11/+42
  We were incorrectly computing the buffer offset when using the instances. The buffer offset is always equal to:

    start_instance * stride + (instance_num / instance_divisor) * stride

  We were completely ignoring the start instance, quite often producing instances that were completely wrong, e.g. if start instance = 5, instance divisor = 2, then on the first iteration it should be: 5 * stride, not (5/2) * stride as we'd have currently, and if start instance = 1, instance divisor = 3, then on the first iteration it should be: 1 * stride, not 0 as we'd have. This fixes it and adjusts all the code to the changes.

  Signed-off-by: Zack Rusin <zackr@vmware.com>
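The offset formula quoted in this message can be written out as a small C helper. The function name and parameter list are hypothetical, chosen to mirror the terms of the formula; this is a sketch of the arithmetic, not the draw or translate code.

```c
#include <stddef.h>

/* Hypothetical helper, not the Mesa implementation. Computes
 *   start_instance * stride + (instance_num / instance_divisor) * stride
 * as a byte offset. instance_divisor must be non-zero here. */
static size_t
instance_offset(unsigned start_instance, unsigned instance_num,
                unsigned instance_divisor, unsigned stride)
{
   /* Integer division: the same element is reused for
    * instance_divisor consecutive instances. */
   size_t index = (size_t)start_instance + instance_num / instance_divisor;
   return index * stride;
}
```

With a stride of 16 bytes, the two cases from the message work out as follows: start instance 5 with divisor 2 gives 80 bytes on the first iteration (5 * stride), and start instance 1 with divisor 3 gives 16 bytes (1 * stride). The start instance is added outside the division, which is exactly the part the old code got wrong.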
* translate: Fix the fetch function assertions. | José Fonseca | 2012-12-04 | 1 | -1/+3
  fetch_rgba_float is NULL for integer formats, and vice-versa.

  Reviewed-by: Brian Paul <brianp@vmware.com>
* translate: Fix typo in is_legal_int_format_combo. | Vinson Lee | 2012-08-07 | 1 | -1/+1
  Fixes same on both sides defect reported by Coverity.

  Signed-off-by: Vinson Lee <vlee@freedesktop.org>
  Reviewed-by: José Fonseca <jfonseca@vmware.com>
* translate: Free elt8_func/elt16_func too. | José Fonseca | 2012-06-29 | 1 | -1/+3
  These were leaking.

  Reviewed-by: Brian Paul <brianp@vmware.com>
  Reviewed-by: Roland Scheidegger <sroland@vmware.com>
* translate: implement translation of 10_10_10_2 types | Marek Olšák | 2012-01-05 | 1 | -0/+148
  This is for GL_ARB_vertex_type_2_10_10_10_rev. I just took the code from u_format_table.c. It's based on pack_rgba_float. I had no other choice. The u_format hooks are not exactly compatible with translate. The cleanup of it is left for future work.

  Reviewed-by: Dave Airlie <airlied@redhat.com>
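Packing four floats into a 10_10_10_2 word can be sketched as below. This is a hypothetical illustration, not the code taken from u_format_table.c: the function names are invented, the bit layout (first component in the low 10 bits, two-bit last component at the top) is an assumption, and the rounding rule is one common choice that may differ from what Mesa does.

```c
#include <stdint.h>

/* Hypothetical illustration; not the code from u_format_table.c. */

/* Convert a float in [0, 1] to an unsigned normalized integer of the
 * given width, clamping out-of-range input and mapping NaN to 0. */
static uint32_t
unorm_from_float(float f, unsigned bits)
{
   const uint32_t max = (1u << bits) - 1u;

   if (!(f > 0.0f))           /* negative, zero, or NaN */
      return 0;
   if (f >= 1.0f)
      return max;
   return (uint32_t)(f * (float)max + 0.5f);   /* round to nearest */
}

/* Assumed layout: R in bits 0-9, G in 10-19, B in 20-29, A in 30-31. */
static uint32_t
pack_r10g10b10a2_unorm(const float rgba[4])
{
   return  unorm_from_float(rgba[0], 10)        |
          (unorm_from_float(rgba[1], 10) << 10) |
          (unorm_from_float(rgba[2], 10) << 20) |
          (unorm_from_float(rgba[3], 2)  << 30);
}
```

Under these assumptions, the vector (1, 0, 0, 1) packs to 0xC00003FF: the red field holds 1023 and the two-bit alpha field holds 3.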
* translate: implement translation of (pure) integer formats | Marek Olšák | 2012-01-05 | 1 | -94/+252
  The conversion is limited to only a few cases, because converting to any other type shouldn't happen in any driver.

  Reviewed-by: Dave Airlie <airlied@redhat.com>
* translate: implement translation of half floats in the generic codepath | Marek Olšák | 2012-01-05 | 1 | -0/+21
* translate: check for PIPE_SUBSYSTEM_EMBEDDED | Brian Paul | 2011-09-22 | 1 | -1/+1
* rtasm,translate: Disable on Mingw-w64. | José Fonseca | 2011-09-06 | 1 | -1/+1
  Causes crash and stack corruption. Needs more investigation. Disable for now.
* translate: disable clamping of instanced array indexes | Brian Paul | 2011-04-19 | 2 | -9/+16
  This fixes piglit's draw-instanced-divisor test for softpipe on both the generic and SSE paths. This is temporary until we have the correct per-array max_index information.
* translate: s/varient/variant/ | Brian Paul | 2011-04-15 | 1 | -44/+44
* translate: Respect translate_buffer::max_index. | José Fonseca | 2011-04-01 | 1 | -2/+17
* secure malloc in translate_cache_create | Tim Wiederhake | 2011-01-24 | 1 | -0/+4
  Signed-off-by: Brian Paul <brianp@vmware.com>
* translate: remove unused prototypes | Brian Paul | 2010-10-25 | 1 | -9/+0
* translate: use function typedefs, casts to silence warnings | Brian Paul | 2010-10-25 | 2 | -27/+35
* translate_sse: clear state for each function emission | Luca Barbieri | 2010-08-24 | 1 | -3/+3
  Fixes #29771.
* translate_sse: fix x86-64 | Luca Barbieri | 2010-08-23 | 1 | -0/+1
* translate_sse: add R32G32B32A32_FLOAT -> X8X8X8X8_UNORM for EMIT_4UB | Jakob Bornecrantz | 2010-08-22 | 1 | -0/+26
  Changed by me to use movd instead of movss to avoid penalties.
* translate_sse: refactor constant management | Luca Barbieri | 2010-08-22 | 1 | -81/+76
* translate_sse: Silence uninitialized variable warnings. | Vinson Lee | 2010-08-21 | 1 | -0/+14
  Initialize variables on error paths.
* translate_sse: enable on Win64 | Luca Barbieri | 2010-08-20 | 1 | -2/+1
  According to Vinson, enabling it causes no regressions
* translate_sse: fix emit_load_sse2 | Luca Barbieri | 2010-08-19 | 1 | -0/+2
* translate_sse: don't overwrite source buffer pointer | Luca Barbieri | 2010-08-18 | 1 | -5/+5
  We were putting the source pointer in a register used as a temporary, breaking all paths that don't read the data in a single instruction.
* translate: Move loop variable declaration outside for loop. | Vinson Lee | 2010-08-16 | 1 | -1/+2
  Fixes MSVC build.
* translate: Remove unused temporary register. | José Fonseca | 2010-08-16 | 1 | -1/+0
  Assuming the side-effect of x86_make_reg is also unnecessary.
* translate: Eliminate void pointer arithmetic. | José Fonseca | 2010-08-16 | 1 | -1/+1
  Non-portable.
* translate_sse: major rewrite (v5) | Luca Barbieri | 2010-08-16 | 2 | -239/+936
  NOTE: Win64 is untested, and is thus currently disabled. If you have such a system, please enable it and report whether it works. To enable it, change src/gallium/auxiliary/translate/translate.c

  Changes in v5:
  - On Win64, preserve %xmm6 and %xmm7 as required by the ABI
  - Use _WIN64 instead of WIN64

  Changes in v4:
  - Use x86_target() and x86_target_caps()
  - Enable translate_sse in x86-64, but not in Win64

  Changes in v3:
  - Win64 support (untested)
  - Use u_cpu_detect.h constants instead of #ifs

  Changes in v2:
  - Minimize #ifs
  - Give a name to magic number CHANNELS_0001
  - Add support for CPUs without SSE (only memcpy and swizzles, like non SSE2)
  - Fixed comments

  translate_sse is currently very limited to the point of being useless in essentially all cases.

  In particular, it only supports some float32 and unorm8 formats and doesn't work on x86-64.

  This commit rewrites it to support:
  1. Dumb memory copy for any pair of identical formats
  2. All formats that are swizzles of each other
  3. Converting 32/64-bit floats and all 8/16/32-bit integers to 32-bit float
  4. Converting unorm8/snorm8 to snorm16 and uscaled8/sscaled8 to sscaled16
  5. Support for x86-64 (doesn't take advantage of it in any way though)

  This new translate can even be useful to translate index buffers for cards that lack 8-bit index support.

  It passes the testsuite I wrote, but note that this is a major change, and more testing would be great.
* translate: add support for 8/16-bit indices | Luca Barbieri | 2010-08-16 | 3 | -19/+92
  Currently, only 32-bit indices are supported, but for some use cases translate needs support for all types.
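Reading indices of different widths can be sketched with a single helper that widens every element to 32 bits. The function name is hypothetical, and selecting the width with a `switch` is a simplification made for this illustration; it says nothing about how the translate code dispatches between index sizes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper for illustration; not the translate code.
 * Fetches element i from an index buffer whose elements are
 * index_size bytes wide (1, 2 or 4) and widens it to 32 bits.
 * Returns 0 for an unsupported index size. */
static uint32_t
fetch_index(const void *indices, unsigned index_size, size_t i)
{
   switch (index_size) {
   case 1:
      return ((const uint8_t *)indices)[i];
   case 2:
      return ((const uint16_t *)indices)[i];
   case 4:
      return ((const uint32_t *)indices)[i];
   default:
      return 0;
   }
}
```

Widening at the fetch means everything downstream can keep working with 32-bit indices, whatever width the caller's index buffer uses.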
* translate_sse: remove useless generated function wrappers | Luca Barbieri | 2010-08-16 | 1 | -51/+4
  Currently translate_sse puts two trivial wrappers in the translate vtable.

  These slow it down and enlarge the source code for no gain, except perhaps the ability to set a breakpoint there, so remove them.

  Breakpoints can be set on the caller of the translate functions, with no loss of functionality.
* translate_generic: factor out common code between linear and indexed | Luca Barbieri | 2010-08-16 | 1 | -115/+62
  This moves the common code into a separate ALWAYS_INLINE function.
* translate_generic: use memcpy if possible (v3) | Luca Barbieri | 2010-08-16 | 1 | -33/+75
  Changes in v3:
  - If we can do a copy, don't try to get an emit func, as that can assert(0)

  Changes in v2:
  - Add comment regarding copy_size

  When used in GPU drivers, translate can be used to simultaneously perform a gather operation, and convert away from unsupported formats.

  In this use case, input and output formats will often be identical: clearly it would make sense to use a memcpy in this case.

  Instead, translate will insist on converting to and from 32-bit floating point numbers.

  This is not only extremely expensive, but it also loses precision for 32/64-bit integers and 64-bit floating point numbers.

  This patch changes translate_generic to just use memcpy if the formats are identical, non-blocked, and with an integral number of bytes per pixel (note that all sensible vertex formats are like this).
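The three conditions for the memcpy path (identical formats, non-blocked, whole number of bytes per pixel) can be sketched as follows. The struct and function names are hypothetical and much simpler than Mesa's format descriptions; only the decision logic is meant to match the message.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified format description for illustration. */
struct sketch_format {
   int      id;            /* identity of the format */
   unsigned block_bits;    /* bits per block */
   unsigned block_width;   /* pixels per block; 1 means non-blocked */
   unsigned block_height;
};

/* Returns the number of bytes to memcpy per element, or 0 when the
 * conversion path has to be used instead. */
static size_t
copy_size_for(const struct sketch_format *in,
              const struct sketch_format *out)
{
   if (in->id != out->id)
      return 0;                       /* formats differ */
   if (in->block_width != 1 || in->block_height != 1)
      return 0;                       /* blocked format */
   if (in->block_bits % 8 != 0)
      return 0;                       /* not a whole number of bytes */
   return in->block_bits / 8;
}

/* Copies one element when the fast path applies. Returns 1 on copy,
 * 0 when the caller must fall back to a converting emit function. */
static int
emit_element(void *dst, const void *src,
             const struct sketch_format *in,
             const struct sketch_format *out)
{
   size_t n = copy_size_for(in, out);

   if (n == 0)
      return 0;
   memcpy(dst, src, n);
   return 1;
}
```

Deciding on the copy before looking up any conversion function mirrors the v3 note above: a format with no converter can still be copied, so the lookup must not be attempted first.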
* translate: allow clients to ask for supported output formats | Luca Barbieri | 2010-08-11 | 3 | -0/+88
  Currently translate asserts on unsupported output formats, making it impossible to use for some purposes, such as testing whether it actually works on all formats it supports.

  Removing the assert was met with opposition, so this change allows clients to ask whether an output format is supported, and they are thus able to avoid attempting to use it.

  Since this is just an addition to the API, no adverse effect is possible, and it makes the testsuite work again.
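The query-before-use pattern described here can be sketched with a toy format list. All names below are hypothetical and the set of "supported" formats is invented for the example; it does not reflect which formats translate actually handles.

```c
#include <stdbool.h>

/* Hypothetical format list for illustration only. */
enum sketch_format {
   SKETCH_FORMAT_R32G32B32A32_FLOAT,
   SKETCH_FORMAT_R8G8B8A8_UNORM,
   SKETCH_FORMAT_UNHANDLED
};

/* Query function: lets a client check a format up front instead of
 * running into an assertion later. */
static bool
sketch_is_output_format_supported(enum sketch_format format)
{
   switch (format) {
   case SKETCH_FORMAT_R32G32B32A32_FLOAT:
   case SKETCH_FORMAT_R8G8B8A8_UNORM:
      return true;
   default:
      return false;
   }
}

/* A test suite would iterate over all formats and exercise only the
 * supported ones; this returns how many it would exercise. */
static unsigned
sketch_count_testable(const enum sketch_format *formats, unsigned count)
{
   unsigned i, n = 0;

   for (i = 0; i < count; i++) {
      if (sketch_is_output_format_supported(formats[i]))
         n++;
   }
   return n;
}
```

The assertion inside the translation path can stay as it is, since well-behaved clients never reach it; that is why the change is purely an addition to the API.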
* Revert "translate_generic: return NULL instead of assert(0) if format not supported" | Luca Barbieri | 2010-08-11 | 1 | -6/+9
  This reverts commit 16b45ca7cefb3432b4133fe9d0b1dbfe3f286131.

  José Fonseca asked for a revert.

  Note that the testsuite will now segfault since it attempts to test all possible formats.
* translate_generic: fix broken A8R8G8B8_UNORM output | Luca Barbieri | 2010-08-11 | 1 | -3/+9
  translate was attempting to output A8R8G8B8_UNORM as if it were R8G8B8A8_UNORM.

  Now the tests just added pass.
* translate_generic: return NULL instead of assert(0) if format not supported | Luca Barbieri | 2010-08-11 | 1 | -9/+6
  This gives the caller a chance to recover (or crash anyway otherwise).
* gallium/translate: make generic_run() and generic_run_elts() more alike | Brian Paul | 2010-08-03 | 1 | -19/+44
  Plus more debug code and do clamping in generic_run().
* translate: don't crash on elts paths with instances | Zack Rusin | 2010-06-16 | 1 | -10/+13