summaryrefslogtreecommitdiffstats
path: root/src/gallium/auxiliary/gallivm
Commit message (Collapse)AuthorAgeFilesLines
* gallivm: Translate all util_cpu_caps bits to LLVM attributes.Jose Fonseca2015-10-221-2/+34
| | | | | | | | | | | | | | This should prevent disparity between features Mesa and LLVM believe are supported by the CPU. http://lists.freedesktop.org/archives/mesa-dev/2015-October/thread.html#96990 Tested on a i7-3720QM w/ LLVM 3.3 and 3.6. v2: Increase SmallVector initial size as suggested by Gustaw Smolarczyk. Reviewed-by: Roland Scheidegger <sroland@vmware.com> CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
* gallium: add PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINTMarek Olšák2015-10-201-0/+2
| | | | | | | | | | | | | | This avoids a serious r600g bug leading to a GPU hang. The chances this bug will get fixed are pretty low now. I deeply regret listening to others and not pushing this patch, leaving other users with a GPU-crashing driver. Yes, it should be fixed in the compiler and it's ugly, but users couldn't care less about that. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86720 Cc: 11.0 10.6 <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>
* gallivm: implement the correct version of LRPMarek Olšák2015-10-171-6/+13
| | | | | | | | | The previous version has precision issues. This can be a problem with tessellation. Sadly, I can't find the article where I read it anymore. I'm not sure if the unsafe-fp-math flag would be enough to revert this. v2: added the comment
* gallivm: set correct opcode info from unary/binary/ternary emitsMarek Olšák2015-10-171-3/+6
| | | | | | | | and clear the emit_data structure. The new radeonsi min/max opcode implementation requires this. (it looks good according to Roland S.)
* gallivm: Allow drivers and state trackers to initialize gallivm LLVM targets v2Tom Stellard2015-10-022-7/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Drivers and state trackers that use LLVM for generating code, must register the targets they use with LLVM's global TargetRegistry. The TargetRegistry is not thread-safe, so all targets must be added to the registry before it can be queried for target information. When drivers and state trackers initialize their own targets, they need a way to force gallivm to initialize its targets at the same time. Otherwise, there can be a race condition in some multi-threaded applications (e.g. glx-multihreaded-shader-compile in piglit), when one thread creates a context for a driver that uses LLVM (e.g. radeonsi) and another thread creates a gallivm context (glxContextCreate does this). The race happens when the driver thread initializes its LLVM targets and then starts using the registry before the gallivm thread has a chance to register its targets. This patch allows users to force gallivm to register its targets by calling the gallivm_init_llvm_targets() function. v2: - Use call_once and remove mutexes and static initializations. - Replace gallivm_init_llvm_{begin,end}() with gallivm_init_llvm_targets(). Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
* llvmpipe: convert double to long long instead of unsigned long longOded Gabbay2015-09-041-1/+1
| | | | | | | | | | | | | | | | | round(val*dscale) produces a double result, as val and dscale are double. However, LLVMConstInt receives unsigned long long, so there is an implicit conversion from double to unsigned long long. This is an undefined behavior. Therefore, we need to first explicitly convert the round result to long long, and then let the compiler handle conversion from that to unsigned long long. This bug manifests itself in POWER, where all IMM values of -1 are being converted to 0 implicitly, causing a wrong LLVM IR output. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>
* gallivm: Fix GCC unused-variable warning.Vinson Lee2015-07-311-2/+1
| | | | | | | | | | lp_bld_tgsi_soa.c: In function 'lp_emit_immediate_soa': lp_bld_tgsi_soa.c:3065:18: warning: unused variable 'size' [-Wunused-variable] const uint size = imm->Immediate.NrTokens - 1; ^ Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>
* gallivm: add LLVMAttribute parameter to lp_build_intrinsicMarek Olšák2015-07-315-10/+15
| | | | | | This will help remove some duplicated code from radeon. Reviewed-by: Dave Airlie <airlied@redhat.com>
* gallivm: Fix profile build.Jose Fonseca2015-07-231-1/+1
|
* gallivm: Add ifdefs so raw_debug_stream is only defined when usedTom Stellard2015-07-231-0/+2
| | | | | | | Its only use is to implement a custom version of LLVMDumpValue on some Windows and embedded platforms. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
* gallivm: Don't use raw_debug_ostream for dissasemblingTom Stellard2015-07-231-14/+13
| | | | | | | | All LLVM API calls that require an ostream object have been removed from the disassemble() function, so we don't need to use this class to wrap _debug_printf() we can just call this function directly. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
* gallium: replace INLINE with inlineIlia Mirkin2015-07-2111-41/+41
| | | | | | | | | | | | | | | | Generated by running: git grep -l INLINE src/gallium/ | xargs sed -i 's/\bINLINE\b/inline/g' git grep -l INLINE src/mesa/state_tracker/ | xargs sed -i 's/\bINLINE\b/inline/g' git checkout src/gallium/state_trackers/clover/Doxyfile and manual edits to src/gallium/include/pipe/p_compiler.h src/gallium/README.portability to remove mentions of the inline define. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Marek Olšák <marek.olsak@amd.com>
* gallivm: Initialize LLVM Modules's DataLayout to an empty string.Tom Stellard2015-07-201-5/+23
| | | | | | | | | | | | | | This fixes crashes in llvmpipe with LLVM 3.8 and also some piglit tests on radeonsi that use the draw module. This is just a temporary solution. The correct solution will require creating a TargetMachine during gallivm initialization and pulling the DataLayout from there. This will be a somewhat invasive change, and it will need to be validatated on multiple LLVM versions. https://llvm.org/bugs/show_bug.cgi?id=24172 Reviewed-by: Roland Scheidegger <sroland@vmware.com>
* gallivm: fix lp_build_compare_extRoland Scheidegger2015-07-062-1/+4
| | | | | | | | | | | | The expansion should always be to the same width as the input arguments no matter what, since these functions should work with any bit width of the arguments (the sext is a no-op on any sane simd architecture). Thus, fix the caller expecting differently. This fixes https://bugs.freedesktop.org/show_bug.cgi?id=91222 Tested-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
* gallivm: add fp64 support. (v2.1)Dave Airlie2015-07-018-31/+553
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds support for ARB_gpu_shader_fp64 and ARB_vertex_attrib_64bit to llvmpipe. Two things that don't mix well are SoA and doubles, see emit_fetch_double, and emit_store_double_chan in this. I've also had to split emit_data.chan, to add src_chan, which can be different for doubles. It handles indirect double fetches from temps, inputs, constants and immediates. It doesn't handle double stores to indirects, however it appears the mesa/st doesn't currently emit these, it always does UARL/MOV combos, which will work fine. tested with piglit, no regressions, all the fp64 tests seem to pass. v2: switch to using shuffles for fetch/store (Roland) assert on indirect double stores - mesa/st never emits these (it uses MOV) fix indirect temp/input/constant/immediates (Roland) typos/formatting fixes (Roland) v2.1: cleanup some long lines, emit_store_double_chan cleanups. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
* softpipe,llvmpipe: fix PIPE_SHADER_CAP_MAX_INPUTS valueMarek Olšák2015-06-251-1/+1
| | | | | | | | | | PIPE_MAX_SHADER_INPUTS was recently bumped to 80 because of tessellation. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91099 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91101 Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>
* draw/gallivm: add invocation ID support for llvmpipe.Dave Airlie2015-06-232-0/+6
| | | | | | | This extends the draw code to add support for invocations. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
* llvmpipe: Truncate the binned constants to max const buffer size.Jose Fonseca2015-06-191-1/+5
| | | | | | | | | Tested with Ilia Mirkin's gzdoom.trace and "arb_uniform_buffer_object-maxuniformblocksize fsexceed" piglit test without my earlier fix to fail linkage when UBO exceeds GL_MAX_UNIFORM_BLOCK_SIZE. Reviewed-by: Roland Scheidegger <sroland@vmware.com>
* gallivm: Only build lp_profile() body when PROFILE is definedTom Stellard2015-06-121-1/+1
| | | | | | | The only use of lp_profile() is wrapped in #if defined(PROFILE), so there is no reason to build it unless this macro is defined. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
* tgsi/ureg: don't emit in/out arrays if drivers don't support ranged declarationsMarek Olšák2015-06-051-0/+1
| | | | | | Softpipe, llvmpipe, r300g, and radeonsi pass tests. Other drivers need testing. Freedreno and nv30 are definitely broken. Other drivers seem to be alright.
* gallivm: silence unused var warnings for non-debug buildBrian Paul2015-06-011-0/+1
| | | | Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
* gallivm: Remove stub disassemblerSymbolLookupCB.Jose Fonseca2015-06-011-13/+1
| | | | | | | | | It's incompletete -- it wasn't filling ReferenceType so it was causing garbagge on the disassembly. Furthermore it seems impossible to get the jump information through this interface. The solution for function size problem is to effectively book-keep the machine code start and end address while JIT'ing.
* gallivm: make sampling more robust when the sampler setup is bogusRoland Scheidegger2015-05-291-6/+32
| | | | | | | | | Pure integer formats cannot be sampled with linear tex / mip filters. In GL such a setup would make the texture incomplete. We shouldn't rely on the state tracker though to filter that out, just return all zeros instead of dying in the lerp. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
* gallivm: Use the LLVM's C disassembly interface.Jose Fonseca2015-05-291-223/+37
| | | | | | | | | | | It doesn't do everything we want. In particular it doesn't allow to detect jumps or return opcodes. Currently we detect the x86's RET opcode. Even though it's worse for LLVM 3.3, it's an improvement for LLVM 3.7, which was totally busted. Reviewed-by: Roland Scheidegger <sroland@vmware.com>
* gallivm: Disable frame pointer omission on LLVM 3.7.Jose Fonseca2015-05-291-0/+10
| | | | Reviewed-by: Roland Scheidegger <sroland@vmware.com>
* gallivm: Workaround LLVM PR23628.Jose Fonseca2015-05-281-0/+11
| | | | | | | | | | Temporarily undefine DEBUG macro while including LLVM C++ headers, leveraging the push/pop_macro pragmas, which are supported both by GCC and MSVC. https://bugs.freedesktop.org/show_bug.cgi?id=90621 Trivial.
* gallivm: Do not use NoFramePointerElim with LLVM 3.7.Vinson Lee2015-05-272-0/+4
| | | | | | | | | TargetOptions::NoFramePointerElim was removed in llvm-3.7.0svn r238244 "Remove NoFramePointerElim and NoFramePointerElimOverride from TargetOptions and remove ExecutionEngine's dependence on CodeGen. NFC." Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
* gallium: remove TGSI_SAT_MINUS_PLUS_ONEMarek Olšák2015-05-202-35/+2
| | | | | | | | It's a remnant of some old NV extension. Unused. I also have a patch that removes predicates if anyone is interested. Reviewed-by: Roland Scheidegger <sroland@vmware.com>
* llvmpipe: enable ARB_texture_viewRoland Scheidegger2015-05-131-1/+1
| | | | | | | | | | | | | | | All the functionality was pretty much there, just not tested. Trivially fix up the missing pieces (take target info from view not resource), and add some missing bits for cubes. Also add some minimal debug validation to detect uninitialized target values in the view... 49 new piglits, 47 pass, 2 fail (both related to fake multisampling, not texture_view itself). No other piglit changes. v2: move sampler view validation to sampler view creation, update docs. Reviewed-by: Brian Paul <brianp@vmware.com>
* gallivm: Fix build against LLVM 3.7 SVN r235265Nick Sarnie2015-04-202-2/+2
| | | | | | | | | LLVM removed JITEmitDebugInfo from TargetOptions since they weren't used v2: Be consistent with the LLVM version check (Aaron Watry) Signed-off-by: Nick Sarnie <commendsarnex@gmail.com> Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>
* gallivm: Fix build since llvm-3.7.0svn r234495Nick Sarnie2015-04-101-4/+0
| | | | | | | | Revert 50e9fa2ed69cb5f76f66231976ea789c0091a64d as LLVM reverted their change. Signed-off-by: Nick Sarnie <commendsarnex@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
* gallivm: Fix build since llvm-3.7.0svn r234460.Vinson Lee2015-04-091-0/+4
| | | | | | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89963 Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
* gallivm: don't use control flow when doing indirect constant buffer lookupsRoland Scheidegger2015-04-091-57/+36
| | | | | | | | | | | | | | | | | | | | | | | llvm goes crazy when doing that, using way more memory and time, though there's probably more to it - this points to a very much similar issue as fixed in 8a9f5ecdb116d0449d63f7b94efbfa8b205d826f. In any case I've seen a quite plain looking vertex shader with just ~50 simple tgsi instructions (but with a dozen or so such indirect constant buffer lookups) go from a terribly high ~440ms compile time (consuming 25MB of memory in the process) down to a still awful ~230ms and 13MB with this fix (with llvm 3.3), so there's still obvious improvements possible (but I have no clue why it's so slow...). The resulting shader is most likely also faster (certainly seemed so though I don't have any hard numbers as it may have been influenced by compile times) since generally fetching constants outside the buffer range is most likely an app error (that is we expect all indices to be valid). It is possible this fixes some mysterious vertex shader slowdowns we've seen ever since we are conforming to newer apis at least partially (the main draw loop also has similar looking conditionals which we probably could do without - if not for the fetch at least for the additional elts condition.) v2: use static vars for the fake bufs, minor code cleanups Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
* gallivm: (trivial) fix the logic deciding if function call should be used...Roland Scheidegger2015-04-011-3/+1
| | | | | Copy and paste bug with the img filter decision. Since there's only 2 different filters anyway just drop this bit.
* gallivm: do some hack heuristic to disable texture functionsRoland Scheidegger2015-04-011-0/+40
| | | | | | | | | | | | | | We've seen some cases where performance can hurt quite a bit. Technically, the more simple the function the more overhead there is for using a function for this (and the less benefits this provides). Hence don't do this if we expect the generated code to be simple. There's an even more important reason why this hurts performance, which is shaders reusing the same unit with some of the same inputs, as llvm cannot figure out the calculations are the same if they are performned in the function (even just reusing the same unit without any input being the same provides such optimization opportunities though not very much). This is something which would need to be handled by IPO passes however.
* gallivm: implement TG4 for ARB_texture_gatherRoland Scheidegger2015-03-312-40/+133
| | | | | | | | | | | | | | | | This is quite trivial, essentially just follow all the same code you'd use with linear min/mag (and no mip) filter, then just skip the filtering after looking up the texels in favor of direct assignment of the right channel to the result. (This is though not true for the multi-offset version if we'd want to support it - for this would probably need to do something along the lines of 4x nearest sampling due to the necessity of doing coord wrapping individually per texel.) Supports multi-channel formats. From the SM5 gather cap bit, should support non-constant offsets, plus shadow comparisons (the former untested), but not component selection (should be easy to implement but all this stuff is not really exposable anyway for now). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
* gallivm: add gather support to sampler interfaceRoland Scheidegger2015-03-313-21/+34
| | | | | | Luckily thanks to the revamped interface this is a lot less work now... Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
* gallivm: simplify sampler interfaceRoland Scheidegger2015-03-314-220/+205
| | | | | | | | | | | | | | | | | | This has got a bit out of control with more and more parameters added. Worse, whenever something in there changes all callees have to be updated for that, even though they don't really do much with any parameter in there except pass it on to the actual sampling function. Hence simply put almost everything into a struct. Also instead of relying on some arguments being NULL, be explicit and set this in a key (which is just reused for function generation for simplicity). (The code still relies on them being NULL in the end for now.) Technically there is a minimal functional change here for shadow sampling: if shadow sampling is done is now determined explicitly by the texture function (either sample_c or the gl-style tex func inherit this from target) instead of the static texture state. These two should always match, however. Otherwise, it should generate all the same code. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
* gallivm: Fix build against LLVM 3.7 SVN r233648Michel Dänzer2015-03-311-0/+5
| | | | Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
* gallivm: fix texture function name (key) when using txf/ldRoland Scheidegger2015-03-281-2/+5
| | | | | | | | | | When using the texel fetch functions rather than ordinary texturing, the arguments are all int vecs instead of float vecs, not to mention the actual function would look completely different. Hence this must be included in the texture function name (which serves as the key) otherwise things crash badly when a shader accesses the same texture and sampler unit with both txf/ld and ordinary texturing instructions with otherwise matching keys.
* gallivm: Fix build since llvm r233411Jan Vesely2015-03-271-0/+4
| | | | | Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
* gallivm: use llvm function calls for texturing instead of inliningRoland Scheidegger2015-03-272-26/+438
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are issues with inlining everything, most notably llvm will use much more memory (and be slower) when compiling. Ideally we'd probably use functions for shader functions too but texture sampling usually is responsible for quite some IR (it can easily reach 80% of total IR instructions) so this seems like a good start. This still generates a different function for all different combinations just like before, however it is possible llvm is missing some optimization opportunities - it is believed though such opportunities should be somewhat rare, but at least for now it can still be switched off (at compile time only). It should probably make compiled code also smaller because the same function should be used for different variants in the same module (so for the opaque/partial or linear/elts variants). No piglit change (though it does indeed speed up unrealistic tests like fp-indirections2 by a factor of 30 or so). Has a small negative performance impact in openarena - I suspect this could be fixed by running some IPO passes (despite the private linkage, llvm right now does NO optimization at all wrt anything going past the call, even if there's just one caller - so things like values stored before the call and then always written by the function etc. will not be optimized away, nor will dead arguments (which we mostly shouldn't have) be eliminated, always constant arguments promoted etc.). v2: use proper return values instead of pointer function arguments. llvm supports aggregate return values, which do wonders here eliminating unnecessary stack variables - everything in fact will be returned in registers even without any IPO optimizations. It makes the code simpler too. With this I could not measure a peformance impact in openarena any longer (though since there's still no constant value propagation etc. into the tex functions this does not mean it couldn't have a negative impact elsewhere). v3: fix some minor issues suggested by Jose, and do disassembly (and the profiling) without hacks. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
* gallivm: pass jit_context pointer through to samplingRoland Scheidegger2015-03-275-74/+132
| | | | | | | | | | | | | | The callbacks used for getting the dynamic texture/sampler state were using the jit_context from the generated jit function. This works just fine, however that way it's impossible to generate separate functions for texture sampling, as will be done in the next commit. Hence, pass this pointer through all interfaces so it can be passed to a separate function (technically, it would probably be possible to extract this pointer from the current function instead, but this feels hacky and would probably require some more hacks if we'd use real functions instead of inlining all shader functions at some point). There should be no difference in the generated code for now. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
* gallivm: Use MCInstrInfo in the disassembler for querying instruction infoTom Stellard2015-03-231-7/+1
| | | | This fixes the build since llvm r232885 and also simplifies the code.
* gallivm: Silence unused variable warnings on release builds.Jose Fonseca2015-03-222-0/+4
| | | | Reviewed-by: Brian Paul <brianp@vmware.com>
* gallivm: remove unused 'builder' variableBrian Paul2015-03-191-1/+0
| | | | | Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
* gallivm: Use INFINITY directly.Jose Fonseca2015-03-181-8/+1
| | | | | | Already done below. Reviewed-by: Brian Paul <brianp@vmware.com>
* gallivm: abort properly when running out of buffer space in lp_disassemblyRoland Scheidegger2015-03-171-4/+8
| | | | | | | Before this actually ran into an infinite loop printing out "invalid"... Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
* gallium: add FMA and DFMA opcodes (v3)Marek Olšák2015-03-161-0/+1
| | | | | | | | | Needed by ARB_gpu_shader5. v2: select DMAD for FMA with double precision v3: add and select DFMA Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
* gallivm: (trivial) Fix typo in comment introduced by 70dc8aAlexandre Demers2015-03-131-1/+1
| | | | | | | Fix typo in comment introduced by 70dc8a Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Jose Fonseca <jfonseca@vmware.com>