| Commit message | Author | Age | Files | Lines |
... | |
|
|
|
|
|
| |
Reimplemented by Olivier Galibert <galibert@pobox.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
|
|
|
|
|
| |
Tested with lp_test_arit with 100% passes and piglit tests with 100%
pass for log but some tests still fail for pow.
Signed-off-by: José Fonseca <jfonseca@vmware.com>
|
|
|
|
|
|
| |
separately to decrease data dependency for faster runtime.
Signed-off-by: José Fonseca <jfonseca@vmware.com>
|
|
|
|
| |
untested, but cannot have worked before.
|
|
|
|
|
|
|
| |
rcp asserts on type.floating so don't go passing non-floating
things into it.
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
|
|
|
|
| |
for completeness.
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
|
|
|
|
| |
use a single entry point, as per Jose's suggestion.
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
|
|
| |
Reviewed-by: Dave Airlie <airlied@redhat.com>
|
|
|
|
|
|
|
| |
Instead of separate ifloor / fract calls.
No change for SSE4.1 code, but fewer FP<->SI conversions on non-SSE4.1
systems.
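The saving described above can be sketched in scalar C. This is only an illustration under assumptions: the name ifloor_fract_sketch, its signature, and the input range check are hypothetical, and the actual gallivm code emits LLVM vector IR rather than scalar C.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar sketch: compute floor(x) as an int and the
 * fractional part together, so the float <-> int conversions are
 * done once and shared between both results. */
static int ifloor_fract_sketch(float x, float *fract)
{
    assert(fract != NULL);
    assert(x >= -1.0e9f && x <= 1.0e9f);  /* assumed: x fits in an int */

    int i = (int)x;            /* FP -> SI, truncates toward zero */
    if ((float)i > x)          /* SI -> FP, reused below */
        i--;                   /* truncation rounded up: step down to floor */
    *fract = x - (float)i;     /* fractional part, nominally in [0, 1) */
    return i;
}
```

Calling separate ifloor and fract helpers would repeat the truncation and the conversion back to float; the combined form shares them.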
|
| |
|
|
|
|
|
|
|
| |
Not sure how I computed these, but they were wrong (which explains why
bumping the polynomial order before never improved precision).
This allows the EXP test cases of the PSPrecision/VSPrecision DCTs to pass.
|
|
|
|
|
|
|
| |
Add an iteration step, which makes rsqrt precision go from 12 bits to
24, and fixes the RSQ/NRM test cases of the PSPrecision/VSPrecision DCTs.
There are no uses of this function outside shader translation.
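A refinement step of this kind can be sketched as below, assuming it is a Newton-Raphson iteration (the commit message does not name the method). The function name rsqrt_refine_sketch is hypothetical; the starting estimate would come from a hardware approximation good to about 12 bits.

```c
#include <assert.h>

/* Hypothetical sketch: one Newton-Raphson step for y ~= 1/sqrt(x),
 *     y' = 0.5 * y * (3 - x * y * y)
 * The relative error is roughly squared by each step, so a 12-bit
 * estimate becomes accurate to about float precision. */
static float rsqrt_refine_sketch(float x, float y_est)
{
    assert(x > 0.0f);
    return 0.5f * y_est * (3.0f - x * y_est * y_est);
}
```

For example, refining an estimate of 1/sqrt(4) that is off by one part in 4096 brings it to within about 1e-7 of 0.5.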
|
|
|
|
|
| |
Never used until now -- we only used the base 2 variants -- which is why
it went unnoticed.
|
|
|
|
| |
The field was redundant. Use the gallivm->builder value instead.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This branch defines a gallivm_state structure which contains the
LLVMBuilderRef, LLVMContextRef, etc. All data structures built with
this object can be periodically freed during a "garbage collection"
operation.
The gallivm_state object has to be passed to most of the builder
functions where LLVMBuilderRef used to be used.
Conflicts:
src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
src/gallium/drivers/llvmpipe/lp_state_setup.c
|
| |
|
| |
|
| |
|
|
|
|
|
|
| |
SSE2 supports round to nearest directly (or rather, it does when MXCSR is
in the default nearest rounding mode). Use the intrinsic for this rather
than round (SSE4.1) or bit manipulation whenever possible.
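As an illustration of the conversion this relies on, here is a small C sketch. The helper name iround4_sketch is hypothetical; it uses the SSE2 intrinsic _mm_cvtps_epi32 (cvtps2dq) where available and falls back to the C library's lrintf elsewhere. Both honour the current rounding mode, which is assumed to be the default round-to-nearest-even.

```c
#include <math.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: round four floats to the nearest integers.
 * Assumes the rounding mode is the default (round to nearest even). */
static void iround4_sketch(const float in[4], int out[4])
{
#if defined(__SSE2__)
    __m128 v = _mm_loadu_ps(in);
    __m128i r = _mm_cvtps_epi32(v);        /* cvtps2dq, rounds per MXCSR */
    _mm_storeu_si128((__m128i *)out, r);
#else
    for (int i = 0; i < 4; i++)
        out[i] = (int)lrintf(in[i]);       /* rounds per current mode */
#endif
}
```

Note that ties go to the even integer in this mode, so 2.5 rounds to 2 and 3.5 rounds to 4.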
|
|
|
|
| |
trunc of -1.5 is -1.0 not 1.0...
|
|
|
|
| |
The only way to ensure we don't do redundant FP <-> SI conversions.
|
|
|
|
| |
Not tested yet, but should be correct.
|
| |
|
| |
|
| |
|
|
|
|
| |
Unfortunately this can cause a segfault with LLVM 2.6, if x is a constant.
|
| |
|
|
|
|
|
|
|
| |
The result is scalar, so when argument is zero/undef we can pass vector
zero/undef.
Also, support the scalar case.
|
|
|
|
| |
The function was duplicated.
|
|
|
|
|
|
|
| |
See comments for detailed rationale.
Thanks to Michal Krol and Zack Rusin for detecting and investigating this
in detail.
|
|
|
|
| |
It causes a very ugly corruption on the Earth's halo on Google Earth.
|
|
|
|
| |
Thanks to Michal for spotting this.
|
|
|
|
|
|
| |
http://bugs.freedesktop.org/29407
Signed-off-by: José Fonseca <jfonseca@vmware.com>
|
|
|
|
|
|
| |
http://bugs.freedesktop.org/29407
Signed-off-by: José Fonseca <jfonseca@vmware.com>
|
| |
|
|
|
|
| |
It didn't exist yet.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes the assert added in LLVM 2.8:
assert(getType()->isIntOrIntVectorTy() &&
"Tried to create an integer operation on a non-integer type!")
But it also fixes some subtle bugs, since we should've been doing this
since LLVM 2.6 anyway.
Includes a modified patch from steckdenis@yahoo.fr for the
FNeg instructions in emit_fetch(); thanks for pointing those out.
http://bugs.freedesktop.org/29404
http://bugs.freedesktop.org/29407
Signed-off-by: José Fonseca <jfonseca@vmware.com>
|
| |
|
| |
|
|
|
|
|
|
| |
Plus fix minor error in lp_build_iceil() by tweaking the offset value.
And add a bunch of comments for the round(), trunc(), floor(), ceil()
functions.
|
| |
|
|
|
|
|
|
| |
Based on Julien Pommier's SSE and SSE2 algorithms.
Signed-off-by: José Fonseca <jfonseca@vmware.com>
|
| |
|
|
|
|
| |
So taking the function address does not work.
|
|
|
|
|
|
| |
Also start axing the code duplication for the scalar case. The solution is
to treat the scalar case specially in a few innermost functions, and leave
the outer functions untouched.
|
|
|
|
|
|
|
|
|
|
| |
The advantage of the range [-0.5, 0.5] is that it doesn't require floor
(for which intrinsics are only available in SSE4.1).
But the EXP opcode pretty much forces us to use floor, and there is a
good floor approximation around truncation available anyway.
This fixes EXP failures in VShader DCT.
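The floor-based range reduction can be sketched in scalar C as follows. This is a sketch under stated assumptions, not the gallivm implementation: the name exp2_sketch is hypothetical, the degree-5 Taylor polynomial is a stand-in for whatever minimax coefficients the real code uses, and the input is assumed to stay within the normal float exponent range.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch of 2^x = 2^ipart * 2^fpart, with
 * ipart = floor(x) and fpart = x - ipart in [0, 1). */
static float exp2_sketch(float x)
{
    assert(x > -126.0f && x < 127.0f);   /* keep 2^ipart a normal float */

    /* floor(x) built from truncation plus a correction */
    int ipart = (int)x;
    if ((float)ipart > x)
        ipart--;
    float fpart = x - (float)ipart;

    /* 2^fpart = e^(fpart * ln 2), degree-5 Taylor polynomial
     * (illustrative only, accurate to a few parts in 1e4) */
    float t = fpart * 0.69314718f;
    float p = 1.0f + t * (1.0f + t * (0.5f + t * (1.0f / 6 +
              t * (1.0f / 24 + t * (1.0f / 120)))));

    /* 2^ipart assembled directly from the IEEE-754 exponent bits */
    uint32_t bits = (uint32_t)(ipart + 127) << 23;
    float scale;
    memcpy(&scale, &bits, sizeof scale);

    return p * scale;
}
```

With floor, the integer part falls out directly, which is what an EXP-style opcode needs; a [-0.5, 0.5] reduction would use round to nearest instead.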
|
|
|
|
|
| |
A quick hack to get the right results, as there are many DCT tests
which use these opcodes to generate data to test other opcodes.
|
|
|
|
|
| |
Disabled as it doesn't make VS/PSPrecision DCT happy, and it would
unnecessarily slow some cases where it is not needed.
|
|
|
|
|
|
|
|
| |
Runtime linking doesn't quite work.
Just comment them out for now to prevent crashes. These will go away in
the future because calling the CRT's cosf()/sinf() four times is
over-precise and under-performing.
|
|
|
|
| |
As the argument in general might not be a constant.
|