| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
Work in progress (disabled).
USE_8x2_TILE_BACKEND define in knobs.h enables AVX512 code paths
(emulated on non-AVX512 HW).
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
|
|
|
|
|
|
|
|
| |
Fixes: b3bd8bb611bb465d2e5e ("swr: [rasterizer core] add support for "RAW" surface format")
CovID: 1373647
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
|
|
|
|
| |
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
|
|
|
|
| |
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
|
|
|
|
| |
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
|
|
|
|
|
|
|
|
|
| |
Align the BE fifo queues and write to them with streaming store
instructions. This provides slightly faster enqueue and doesn't
pollute the caches. Add appropriate memory fences to ensure streaming
writes are globally visible.
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
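
The streaming-store idea above can be sketched as follows. This is a minimal illustration, not the code from the commit: QueueBlock, EnqueueEntry and PublishEntries are hypothetical names, and the 16-byte entry size is an assumption. Only the SSE/SSE2 intrinsics _mm_stream_si128 and _mm_sfence are real.

```cpp
#include <cassert>
#include <cstdint>
#include <xmmintrin.h>  // _mm_sfence
#include <emmintrin.h>  // SSE2: _mm_stream_si128, _mm_set_epi32

// Hypothetical queue block. The 64-byte alignment satisfies the 16-byte
// alignment that _mm_stream_si128 requires for its destination.
struct alignas(64) QueueBlock {
    static constexpr int kEntries = 64;
    int32_t data[kEntries * 4];  // assumption: each entry is four 32-bit values
};

// Write one 16-byte entry with a non-temporal (streaming) store, which
// bypasses the cache hierarchy instead of pulling the line into it.
inline void EnqueueEntry(QueueBlock& q, int slot,
                         int32_t a, int32_t b, int32_t c, int32_t d) {
    assert(slot >= 0 && slot < QueueBlock::kEntries);
    __m128i v = _mm_set_epi32(d, c, b, a);  // 'a' lands in the lowest lane
    _mm_stream_si128(reinterpret_cast<__m128i*>(&q.data[slot * 4]), v);
}

// Streaming stores are weakly ordered; a store fence makes them globally
// visible before any later store (for example, a tail-index update).
inline void PublishEntries() {
    _mm_sfence();
}
```

A producer would call EnqueueEntry for each item and PublishEntries once before advancing whatever index the consumer polls.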
|
|
|
|
| |
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
|
|
|
|
|
|
|
| |
- Fix conflict between Windows MemoryFence and llvm::sys::MemoryFence
- Declare gettid()
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
|
|
|
|
| |
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
|
|
|
|
| |
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
|
|
|
|
| |
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
|
|
|
|
| |
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
|
|
|
|
|
|
|
|
|
| |
Refactor to leave the existing simd_* intrinsics in "simdintrin.h"
unchanged and add corresponding simd16_* intrinsics, with emulation,
in "simd16intrin.h" alongside them. We can then adopt them piecemeal
rather than taking an all-or-nothing approach to bringing up avx512.
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
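
One way to picture the emulation is a 16-wide type built from two 8-wide halves, with each 16-wide operation forwarding to the existing 8-wide one. The sketch below is an assumption about the general shape only. Vec8, Vec16 and the vec*_ functions are hypothetical stand-ins written in portable scalar C++, not the simd_* / simd16_* definitions from the headers.

```cpp
#include <cassert>

// Stand-in for the existing 8-wide vector type (hypothetical).
struct Vec8 {
    float v[8];
};

// Stand-in for an existing 8-wide operation that is left unchanged.
inline Vec8 vec8_add(const Vec8& a, const Vec8& b) {
    Vec8 r;
    for (int i = 0; i < 8; ++i) {
        r.v[i] = a.v[i] + b.v[i];
    }
    return r;
}

inline Vec8 vec8_set1(float x) {
    Vec8 r;
    for (int i = 0; i < 8; ++i) {
        r.v[i] = x;
    }
    return r;
}

// Emulated 16-wide type: a low and a high 8-wide half.
struct Vec16 {
    Vec8 lo;
    Vec8 hi;
};

// Each 16-wide operation applies the 8-wide operation to both halves, so
// callers can move to the 16-wide interface one function at a time.
inline Vec16 vec16_add(const Vec16& a, const Vec16& b) {
    return Vec16{vec8_add(a.lo, b.lo), vec8_add(a.hi, b.hi)};
}

inline Vec16 vec16_set1(float x) {
    return Vec16{vec8_set1(x), vec8_set1(x)};
}
```

On hardware with native 16-wide registers the same interface could map straight to the wide instructions, leaving call sites unchanged.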
|
|
|
|
| |
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
|
|
|
|
|
|
| |
Enable KNOB_SIMD_WIDTH = 16 for AVX512 pre-work and low-level simd utils.
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
|
|
|
|
| |
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
|
|
|
|
|
|
|
| |
Currently, most code paths between AVX2 and AVX512 are identical
(see changes to knobs.h).
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix build error with icc.
CXX libswrAVX_la-swr_clear.lo
icpc: command line warning #10006: ignoring unknown option '-Wdelete-non-virtual-dtor'
In file included from ./rasterizer/jitter/jit_api.h(31),
from swr_context.h(30),
from swr_clear.cpp(24):
./rasterizer/common/os.h(135): error: expected an identifier
void _mm256_storeu2_m128i(__m128i *hi, __m128i *lo, __m256i a)
^
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. Don't clear bucket descriptions to fix issues with sim level
buckets getting out of sync.
2. Close out threadviz file descriptors in ClearThreads().
3. Skip jitter-based buckets when multithreaded. Thread local storage
through llvm jit functions needs to be fixed before we can enable
this.
4. Fix buckets StopCapture to correctly detect capture complete.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
|
|
|
|
| |
Output forward slashes instead of backslashes on unix/linux.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
|
|
|
|
| |
Fix static code analysis errors found by Coverity on Linux.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
|
|
|
|
|
| |
Evaluate the threadviz knob lazily, since the initialization order of
globals across translation units is not defined.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
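
A common way to get this kind of lazy evaluation in C++ is a function-local static, which is initialized on first call rather than at program start. The sketch below assumes that pattern. It is not taken from the commit, and ReadThreadVizKnob, ThreadVizEnabled, KnobReadCount and the environment variable name are all hypothetical.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Counts how often the knob is actually read (for illustration only).
inline int& KnobReadCount() {
    static int count = 0;
    return count;
}

// Hypothetical knob source: an environment variable with a made-up name.
inline bool ReadThreadVizKnob() {
    ++KnobReadCount();
    const char* value = std::getenv("EXAMPLE_ENABLE_THREADVIZ");
    return value != nullptr && std::strcmp(value, "0") != 0;
}

// The function-local static is initialized the first time control reaches
// it (thread-safe since C++11), so the result does not depend on the order
// in which globals in other translation units are constructed.
inline bool ThreadVizEnabled() {
    static const bool enabled = ReadThreadVizKnob();
    return enabled;
}
```

A global object that needs the knob in its constructor can call ThreadVizEnabled() safely, because the read happens on demand.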
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
|
|
|
|
|
|
|
| |
BackendPixelRate should now be easier to read and maintain.
Small perf bump from moving some of the pfn's to inline functions
without template params.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
|
|
|
|
|
|
| |
v2: use _mm_cmpunord_ps for vIsNaN
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
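
For reference, a NaN test built on _mm_cmpunord_ps can look like the sketch below. The unordered compare of a value with itself is true only for NaN lanes. IsNaN4 and IsNaNMask4 are hypothetical helper names, not the vIsNaN code from the change.

```cpp
#include <cassert>
#include <limits>
#include <xmmintrin.h>  // SSE: _mm_cmpunord_ps, _mm_movemask_ps, _mm_setr_ps

// Returns an all-ones lane wherever the input lane is NaN, else all zeros.
// A lane compared with itself is "unordered" only when it holds a NaN.
inline __m128 IsNaN4(__m128 v) {
    return _mm_cmpunord_ps(v, v);
}

// Packs the per-lane result into the low four bits (bit i = lane i).
inline int IsNaNMask4(__m128 v) {
    return _mm_movemask_ps(IsNaN4(v));
}
```

This assumes the compiler is not in a fast-math mode that treats NaN as impossible.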
|
|
|
|
| |
Acked-by: Brian Paul <brianp@vmware.com>
|
|
|
|
| |
Acked-by: Brian Paul <brianp@vmware.com>
|
|
|
|
|
|
| |
Reduce list traversal during Alloc and Free.
Add the ability to have multiple lists based on alloc size (not used for now).
|
| |
Pass pointer to core buckets mgr back to sim layer.
Add support for RDTSC_START/RDTSC_STOP macros in the builder.
Each unique shader now has a unique bucket associated with it,
enabling more detailed reporting at the shader level. Currently,
due to an llvm issue with thread local storage, 64-bit runs require
single-threaded mode.