path: root/src/gallium/auxiliary/gallivm/lp_bld_arit.c
Commit message | Author | Age | Files | Lines
...
* gallivm: implement iabs/issg opcode. | Dave Airlie | 2012-05-09 | 1 | -1/+1
  Reimplemented by Olivier Galibert <galibert@pobox.com>
  Signed-off-by: Dave Airlie <airlied@redhat.com>
* gallivm: Updated lp_build_log2_approx to use a more accurate polynomial. | James Benton | 2012-04-05 | 1 | -28/+37
  Tested with lp_test_arit with 100% passes and piglit tests with 100% pass for log but some tests still fail for pow.
  Signed-off-by: José Fonseca <jfonseca@vmware.com>
* gallivm: Updated lp_build_polynomial to compute odd and even terms separately to decrease data dependency for faster runtime. | James Benton | 2012-04-05 | 1 | -7/+25
  Signed-off-by: José Fonseca <jfonseca@vmware.com>
* gallivm: fix floating type in lp_build_mod helper | Roland Scheidegger | 2012-03-05 | 1 | -1/+1
  untested, but cannot have worked before.
* gallivm: only do rcp/mul for floating | Dave Airlie | 2012-02-28 | 1 | -1/+2
  rcp asserts on type.floating so don't go passing non-floating things into it.
  Signed-off-by: Dave Airlie <airlied@redhat.com>
* gallivm: add frem support to the lp_build_mod helper. | Dave Airlie | 2012-02-28 | 1 | -1/+2
  for completeness.
  Signed-off-by: Dave Airlie <airlied@redhat.com>
* gallivm: add integer and unsigned mod arit functions. (v2) | Dave Airlie | 2012-02-28 | 1 | -0/+20
  use a single entry point, as per Jose's suggestion.
  Signed-off-by: Dave Airlie <airlied@redhat.com>
* llvmpipe: Don't assume vector is 4 wide in lp_build_sin()/lp_build_cos() | José Fonseca | 2012-02-20 | 1 | -81/+60
  Reviewed-by: Dave Airlie <airlied@redhat.com>
* llvmpipe: Use lp_build_ifloor_fract for exp2 calculation. | José Fonseca | 2011-10-16 | 1 | -5/+1
  Instead of separate ifloor / fract calls.
  No change for SSE4.1 code, but less FP<->SI conversions on non SSE4.1 systems.
* gallivm: Add a note about log2 computation and denormalized numbers. | José Fonseca | 2011-07-22 | 1 | -0/+6
* gallivm: Fix lp_build_exp2 order 4-5 polynomial coefficients and bump order. | José Fonseca | 2011-07-22 | 1 | -12/+12
  Not sure how I computed these, but they were wrong (which explains why bumping the polynomial order before never improved precision).
  This allows to pass the EXP test cases of PSPrecision/VSPrecision DCTs.
* gallivm: Increase lp_build_rsqrt() precision. | José Fonseca | 2011-07-22 | 1 | -1/+1
  Add an iteration step, which makes rsqrt precision go from 12 bits to 24, and fixes RSQ/NRM test case of PSPrecision/VSPrecision DCTs.
  There are no uses of this function outside shader translation.
* gallivm: Fix lp_build_exp/lp_build_log. | José Fonseca | 2011-07-22 | 1 | -2/+2
  Never used so far -- we only used the base 2 variants -- which is why it went unnoticed so far.
* gallivm/llvmpipe: remove lp_build_context::builder | Brian Paul | 2010-12-02 | 1 | -130/+171
  The field was redundant. Use the gallivm->builder value instead.
* gallivm/llvmpipe: squash merge of the llvm-context branch | Brian Paul | 2010-11-30 | 1 | -132/+157
  This branch defines a gallivm_state structure which contains the LLVMBuilderRef, LLVMContextRef, etc. All data structures built with this object can be periodically freed during a "garbage collection" operation.
  The gallivm_state object has to be passed to most of the builder functions where LLVMBuilderRef used to be used.
  Conflicts:
    src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
    src/gallium/drivers/llvmpipe/lp_state_setup.c
* gallivm: Add a note about SSE4.1's nearest mode rounding. | José Fonseca | 2010-10-18 | 1 | -0/+6
* gallivm: remove newlines | Brian Paul | 2010-10-12 | 1 | -2/+0
* gallivm: Less code duplication in log computation. | José Fonseca | 2010-10-09 | 1 | -34/+79
* gallivm: faster iround implementation for sse2 | Roland Scheidegger | 2010-10-09 | 1 | -1/+53
  sse2 supports round to nearest directly (or rather, assuming default nearest rounding mode in MXCSR). Use intrinsic to use this rather than round (sse41) or bit manipulation whenever possible.
* gallivm: fix trunc/itrunc comment | Roland Scheidegger | 2010-10-09 | 1 | -6/+6
  trunc of -1.5 is -1.0 not 1.0...
* gallivm: Combined ifloor & fract helper. | José Fonseca | 2010-10-06 | 1 | -0/+42
  The only way to ensure we don't do redundant FP <-> SI conversions.
* gallivm: Fast implementation of iround(log2(x)) | José Fonseca | 2010-10-06 | 1 | -0/+35
  Not tested yet, but should be correct.
* gallivm: Use a faster (and less accurate) log2 in lod computation. | José Fonseca | 2010-10-06 | 1 | -0/+44
* gallivm: Take the type signedness in consideration in round/ceil/floor. | José Fonseca | 2010-10-06 | 1 | -48/+59
* gallivm: Use SSE4.1's ROUNDSS/ROUNDSD for scalar rounding. | José Fonseca | 2010-09-29 | 1 | -21/+71
* gallivm: Add unorm support to lp_build_lerp() | José Fonseca | 2010-09-22 | 1 | -9/+75
  Unfortunately this can cause segfault with LLVM 2.6, if x is a constant.
* gallivm: Add a new debug flag to warn about performance issues. | José Fonseca | 2010-09-11 | 1 | -4/+13
* gallivm: Fix lp_build_sum_vector. | José Fonseca | 2010-08-30 | 1 | -6/+4
  The result is scalar, so when argument is zero/undef we can pass vector zero/undef.
  Also, support the scalar case.
* util: remove util_is_pot in favor of util_is_power_of_two | Marek Olšák | 2010-08-29 | 1 | -1/+1
  The function was duplicated.
* gallivm: Emit DIVPS instead of RCPPS. | José Fonseca | 2010-08-21 | 1 | -12/+24
  See comments for detailed rationale.
  Thanks to Michal Krol and Zack Rusin for detecting and investigating this in detail.
* gallivm: Refactor the Newton-Rapshon steps, and disable once again. | José Fonseca | 2010-08-14 | 1 | -28/+83
  It causes a very ugly corruption on the Earth's halo on Google Earth.
* gallivm: Fix and enable the extra Newton/Raphson step in lp_build_rcp(). | José Fonseca | 2010-08-11 | 1 | -2/+2
  Thanks to Michal for spotting this.
* gallivm: Fix bitwise operations for floats, division for integers | nobled | 2010-08-10 | 1 | -3/+14
  http://bugs.freedesktop.org/29407
  Signed-off-by: José Fonseca <jfonseca@vmware.com>
* gallivm: Even more type checking | nobled | 2010-08-10 | 1 | -1/+11
  http://bugs.freedesktop.org/29407
  Signed-off-by: José Fonseca <jfonseca@vmware.com>
* gallivm: More type checks. | José Fonseca | 2010-08-09 | 1 | -0/+43
* gallivm: Don't call LLVMBuildFNeg on llvm-2.6. | José Fonseca | 2010-08-09 | 1 | -4/+3
  It didn't exist yet.
* gallivm: Always use floating-point operators for floating-point types | nobled | 2010-08-09 | 1 | -71/+110
  This fixes the assert added in LLVM 2.8:
    assert(getType()->isIntOrIntVectorTy() && "Tried to create an integer operation on a non-integer type!")
  But it also fixes some subtle bugs, since we should've been doing this since LLVM 2.6 anyway.
  Includes a modified patch from steckdenis@yahoo.fr for the FNeg instructions in emit_fetch(); thanks for pointing those out.
  http://bugs.freedesktop.org/29404
  http://bugs.freedesktop.org/29407
  Signed-off-by: José Fonseca <jfonseca@vmware.com>
* gallivm: Add type checks for the basic operations. | José Fonseca | 2010-08-08 | 1 | -0/+12
* gallivm: Remove unnecessary header. | Vinson Lee | 2010-07-06 | 1 | -1/+0
* gallivm: finish implementation of lp_build_iceil() | Brian Paul | 2010-07-06 | 1 | -19/+67
  Plus fix minor error in lp_build_iceil() by tweaking the offset value.
  And add a bunch of comments for the round(), trunc(), floor(), ceil() functions.
* gallivm: Remove unnecessary headers. | Vinson Lee | 2010-05-26 | 1 | -2/+0
* gallivm: Efficient implementation of sin/cos. | Qicheng Christopher Li | 2010-05-24 | 1 | -105/+429
  Based on Julien Pommier's SSE and SSE2 algorithms.
  Signed-off-by: José Fonseca <jfonseca@vmware.com>
* gallivm: Silent warning. | José Fonseca | 2010-05-10 | 1 | -1/+1
* gallivm: cosf/sinf are macros on MSVC. | José Fonseca | 2010-05-10 | 1 | -2/+12
  So taking the function address does not work.
* gallivm: Actually do floor/ceil/trunc for scalars. | José Fonseca | 2010-05-08 | 1 | -166/+26
  Also start axing the code duplication for scalar case.
  The solution is to treat the scalar case specially in a few innermost functions, and leave outer functions untouched.
* gallivm: Use a minimax polynomial for exp2 in range [0,1] instead [-0.5,5]. | José Fonseca | 2010-05-08 | 1 | -14/+41
  The advantage of range[-0.5, 0.5] is that it doesn't require floor (for which intrinsics are only available in SSE4.1).
  But the EXP opcode pretty much forces us to use floor, and there is a good floor approximation around truncation available anyway.
  This fixes EXP failures in VShader DCT.
* gallivm: The the JIT engine to use our sinf()/cosf() on Windows. | José Fonseca | 2010-05-08 | 1 | -18/+79
  A quick hack to get the right results, as there are many DCT tests which use these opcodes to generate data to test other opcodes.
* gallicm: Newton-Raphson step to improve precision. | José Fonseca | 2010-05-04 | 1 | -2/+27
  Disabled as it doesn't make VS/PSPrecision DCT happy, and it would unnecessarily slow some cases where it is not needed.
* gallivm: Disable llvm.cos.v4f32 and llvm.sin.v4f32 instrinsics on Windows. | José Fonseca | 2010-04-27 | 1 | -0/+18
  Runtime linking doesn't quite work. Just comment them out for now to prevent crashes.
  These will go away in the future because calling 4 times CRT's cosf()/sinf() is over-precise and under-performing.
* gallivm: LLVMConstBitCast -> LLVMBuildBitCast | José Fonseca | 2010-04-24 | 1 | -2/+4
  As the argument in general might not be a constant.
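
The lp_build_polynomial entry of 2012-04-05 above describes computing odd and even terms separately to reduce data dependency. A minimal scalar sketch of that idea in plain C follows. It is not the gallivm code, which emits LLVM IR; the names poly_horner and poly_even_odd are hypothetical and chosen only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Plain Horner evaluation of c[0] + c[1]*x + ... + c[n-1]*x^(n-1).
 * Every multiply-add depends on the previous one (one long chain). */
static double poly_horner(const double *c, size_t n, double x)
{
    assert(c != NULL || n == 0);
    double r = 0.0;
    for (size_t i = n; i-- > 0; )
        r = r * x + c[i];
    return r;
}

/* Even/odd split: p(x) = E(x^2) + x * O(x^2), where E holds the
 * even-index coefficients and O the odd-index ones. The two Horner
 * chains in x^2 do not depend on each other, so each chain is about
 * half as long and the hardware may overlap them. */
static double poly_even_odd(const double *c, size_t n, double x)
{
    assert(c != NULL || n == 0);
    double x2 = x * x;
    double even = 0.0;
    double odd = 0.0;
    for (size_t i = n; i-- > 0; ) {
        if (i % 2 == 0)
            even = even * x2 + c[i];
        else
            odd = odd * x2 + c[i];
    }
    return even + x * odd;
}
```

Both functions compute the same polynomial in exact arithmetic; in floating point the results can differ in the last bits because the operations are ordered differently.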