path: root/src/gallium
Commit message | Author | Age | Files | Lines
* softpipe: check for SP_NEW_STIPPLE when building quad pipeline | Brian Paul | 2014-10-31 | 1 | -0/+1
    Fixes polygon stipple if both DO_PSTIPPLE_IN_DRAW_MODULE and DO_PSTIPPLE_IN_HELPER_MODULE are zero/off.

    Reviewed-by: Charmaine Lee <charmainel@vmware.com>
* r600g: Fix build with opencl and radeonsi disabled | Tom Stellard | 2014-10-31 | 1 | -6/+6
* clover: Fix bug when binary programs are passed to clBuildProgram() v2 | Tom Stellard | 2014-10-31 | 2 | -6/+14
    This was a regression introduced by 611d66fe4513e53bde052dd2bab95d448c909a2a.

    Passing a binary program to clBuildProgram() is legal, but passing one to clCompileProgram() is not.

    v2:
      - Code cleanups.

    Reviewed-by: Francisco Jerez <currojerez@riseup.net>
* clover: Factor input validation of clCompileProgram into a new function v2 | Tom Stellard | 2014-10-31 | 1 | -10/+23
    This factors out the validation that is common with clBuildProgram().

    v2:
      - Code cleanups.

    Reviewed-by: Francisco Jerez <currojerez@riseup.net>
* radeonsi/compute: Enable PIPE_SHADER_IR_NATIVE for compute shaders v2 | Tom Stellard | 2014-10-31 | 4 | -59/+127
    v2:
      - Drop dependency on LLVM >= 3.5.1
      - Rename si_create_shader() to si_shader_binary_read()
* r600g/compute: Enable PIPE_SHADER_IR_NATIVE for compute shaders v2 | Tom Stellard | 2014-10-31 | 8 | -97/+180
    v2:
      - Drop dependency on LLVM >= 3.5.1
* gallium/radeon: Add query for symbol specific config information | Tom Stellard | 2014-10-31 | 3 | -0/+86
    This adds a query which allows drivers to access the config information of a specific function within the LLVM generated ELF binary. This makes it possible for the driver to handle ELF binaries with multiple kernels / global functions.
* r300g: remove enabled/disabled hyperz and AA compression messages | Marek Olšák | 2014-10-30 | 1 | -2/+0
    It's annoying with octave.

    Reported by Michael Burian.

    Cc: 10.2 10.3 <mesa-stable@lists.freedesktop.org>
* r600g: Delete unused variable 'max_global_size' in 'r600_get_compute_param' | Dieter Nützel | 2014-10-30 | 1 | -1/+0
    Signed-off-by: Dieter Nützel <Dieter@nuetzel-hh.de>
* radeon/llvm: Dynamically allocate branch/loop stack arrays | Michel Dänzer | 2014-10-29 | 2 | -6/+37
    This prevents us from silently overflowing the stack arrays, and allows arbitrary stack depths.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85454
    Cc: mesa-stable@lists.freedesktop.org
    Reported-and-Tested-by: Nick Sarnie <commendsarnex@gmail.com>
    Reviewed-by: Marek Olšák <marek.olsak@amd.com>
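The idea of replacing a fixed-size stack array with a heap-allocated one that grows on demand can be sketched as follows. This is only an illustration under assumed names (`block_stack`, `block_stack_push`, and so on are hypothetical); it is not the code from the commit.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable stack; names are illustrative, not Mesa's. */
struct block_stack {
    int *entries;      /* heap-allocated storage */
    unsigned depth;    /* number of entries in use */
    unsigned capacity; /* number of entries allocated */
};

/* Push a value, growing the array when it is full.
 * Returns 0 on success, -1 if memory could not be allocated. */
static int block_stack_push(struct block_stack *s, int value)
{
    if (s->depth == s->capacity) {
        unsigned new_capacity = s->capacity ? s->capacity * 2 : 16;
        int *grown = realloc(s->entries, new_capacity * sizeof(*grown));
        if (!grown)
            return -1; /* the old array is still valid and unchanged */
        s->entries = grown;
        s->capacity = new_capacity;
    }
    s->entries[s->depth++] = value;
    return 0;
}

/* Pop the most recently pushed value; the stack must not be empty. */
static int block_stack_pop(struct block_stack *s)
{
    assert(s->depth > 0);
    return s->entries[--s->depth];
}

/* Release the storage and reset the stack to its empty state. */
static void block_stack_free(struct block_stack *s)
{
    free(s->entries);
    s->entries = NULL;
    s->depth = 0;
    s->capacity = 0;
}
```

Because the capacity check happens on every push, nesting deeper than the initial capacity grows the array instead of writing past its end.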
* vc4: Add support for ARL and indirect register access on TGSI_FILE_CONSTANT. | Eric Anholt | 2014-10-28 | 10 | -34/+407
    Fixes 14 ARB_vp tests (which had no lowering done), and should improve performance of indirect uniform array access in GLSL.
* vc4: Fix mixup of return type in reloc_tex(). | Eric Anholt | 2014-10-28 | 1 | -2/+2
* vc4: Drop redundant check for is_tmu_write(). | Eric Anholt | 2014-10-28 | 1 | -3/+0
    This function is only called when it would return true.
* vc4: Don't forget to validate code that's got PROG_END on it. | Eric Anholt | 2014-10-28 | 1 | -5/+6
    This signal doesn't terminate the program immediately; it terminates the program soon afterward. So the code in the instruction carrying the signal still has to be validated.
* vc4: Add .dir-locals.el for kernel style in the kernel code. | Eric Anholt | 2014-10-28 | 1 | -0/+12
* vc4: Fix a couple missing '\n's in error output. | Eric Anholt | 2014-10-28 | 2 | -2/+2
* r300g/vdpau: enable again | David Heidelberger | 2014-10-28 | 1 | -0/+1
    Signed-off-by: David Heidelberger <david.heidelberger@ixit.cz>
    Signed-off-by: Marek Olšák <marek.olsak@amd.com>
* r300g: only set clip_halfz for chips with HW TCL | Marek Olšák | 2014-10-28 | 1 | -1/+1
    I forgot that we cannot emit vertex shader state on a chip without VS. In such a case, clip_halfz is handled by the Draw module.
* radeonsi: fix incorrect index buffer max size for lowered 8-bit indices | Marek Olšák | 2014-10-28 | 1 | -1/+1
    Cc: 10.2 10.3 mesa-stable@lists.freedesktop.org
    Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
* radeonsi: fix polygon mode for points and lines and point/line fill modes | Marek Olšák | 2014-10-28 | 1 | -3/+3
    Fixes piglit/polygon-mode-offset.

    Cc: 10.2 10.3 mesa-stable@lists.freedesktop.org
    Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
* r600g: fix polygon mode for points and lines and point/line fill modes | Marek Olšák | 2014-10-28 | 2 | -6/+6
    Fixes piglit/polygon-mode-offset.

    Cc: 10.2 10.3 mesa-stable@lists.freedesktop.org
    Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
* r600g: Implement sm5 UBO/sampler indexing | Glenn Kennard | 2014-10-28 | 7 | -19/+164
    Caveat: Shaders using UBO/sampler indexing will not be optimized by SB, due to SB not currently supporting the necessary CF_INDEX_[01] index registers.

    Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
* r600g: Implement sm5 interpolation functions | Glenn Kennard | 2014-10-28 | 2 | -3/+237
    Requires evergreen/cayman

    Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
* nv50: handle inverted render conditions | Tobias Klausmann | 2014-10-26 | 4 | -10/+51
    This enables ARB_conditional_render_inverted.

    Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
    Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
* freedreno/ir3: consider instruction neighbors in cp | Rob Clark | 2014-10-25 | 2 | -11/+178
    Fanin (merge) nodes require their srcs to be "adjacent" in consecutive scalar registers. Keep track of instruction neighbors in the copy-propagation step and avoid eliminating mov's which would cause an instruction to need multiple distinct left and/or right neighbors.

    This lets us not fall on our face when we encounter things like:

        1: MOV TEMP[2], IN[0].xyzw
        2: TEX OUT[0].xy, TEMP[2], SAMP[0], SHADOW2D
        3: MOV TEMP[2].xy, IN[0].yxzz
        4: TEX OUT[0].zw, TEMP[2], SAMP[0], SHADOW2D
        5: END

    Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/ir3: always mov tex coords | Rob Clark | 2014-10-25 | 1 | -54/+30
    Always insert extra mov's for the tex coord into the fanin. This simplifies things a bit, and avoids a scenario where multiple sam instructions can have mutually exclusive inputs to their fanins, for example:

        1: TEX OUT[0].xy, IN[0].xyxx, SAMP[0], 2D
        2: TEX OUT[0].zw, IN[0].yxxx, SAMP[0], 2D

    The CP pass can always remove the mov's that are not actually needed, so it is better to start out with too many mov's in the front end than not enough.

    Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno: rename a couple debug flags | Rob Clark | 2014-10-25 | 3 | -7/+7
    dscis -> noscis
    dbypass -> nobypass

    This is a bit more consistent with nobin, etc., and IMO the names are a bit more sensible.

    Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/ir3: skip virtual outputs in standalone compiler | Rob Clark | 2014-10-25 | 1 | -0/+3
    Kills get added to the outputs list, to ensure they get scheduled. But they aren't *really* outputs so skip them in the header comment block.

    Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/ir3: standalone compiler updates for ir3test | Rob Clark | 2014-10-25 | 4 | -18/+51
    In order to test compiler changes more easily, spit out the assembled shader with some header information so that we can know about inputs/outputs more easily. See:

        git://people.freedesktop.org/~robclark/ir3test

    In ir3test we have a big collection of tgsi shaders and reference ir3_compiler outputs. When making compiler changes, regenerate the compiler outputs and feed to ir3test to compare the new vs reference shader.

    Signed-off-by: Rob Clark <robclark@freedesktop.org>
* ilo: improve blob decoding | Chia-I Wu | 2014-10-25 | 1 | -8/+31
    The last few dwords were skipped if the total number of dwords was not a multiple of 4. Change the formatting for better readability.

    Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
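The failure mode described here (a dump loop that only handles full rows of four dwords) is easy to reproduce in any hex dumper. A minimal sketch of a loop that also emits the final partial row, assuming a hypothetical `dump_dwords` helper and an output format chosen for this example rather than taken from ilo:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: print `count` dwords, four per row, including a
 * final partial row. Returns the number of dwords printed.
 */
static unsigned dump_dwords(FILE *out, const uint32_t *dw, unsigned count)
{
    unsigned printed = 0;

    assert(out && (dw || count == 0));

    for (unsigned i = 0; i < count; i += 4) {
        /* The last row holds 1-3 dwords when count is not a multiple of 4. */
        unsigned row = count - i < 4 ? count - i : 4;

        fprintf(out, "0x%08x:", i * 4); /* byte offset of the row */
        for (unsigned j = 0; j < row; j++) {
            fprintf(out, " 0x%08x", (unsigned)dw[i + j]);
            printed++;
        }
        fputc('\n', out);
    }
    return printed;
}
```

Clamping the row length, rather than looping over `count / 4` full rows, is what keeps the trailing dwords from being dropped.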
* llvmpipe: Ensure the packed input of the lp_test_format is aligned. | José Fonseca | 2014-10-24 | 1 | -2/+10
    Fixes:
      - https://bugs.freedesktop.org/show_bug.cgi?id=85377
      - http://llvm.org/bugs/show_bug.cgi?id=21365

    Reviewed-by: Roland Scheidegger <sroland@vmware.com>
* llvmpipe: Flush stdout on lp_test_* unit tests. | José Fonseca | 2014-10-24 | 2 | -0/+3
    So that the order of test messages and gallivm/llvmpipe debug output is preserved.

    Reviewed-by: Roland Scheidegger <sroland@vmware.com>
* gallium: introduce PIPE_CAP_CLIP_HALFZ. | Mathias Fröhlich | 2014-10-24 | 15 | -0/+20
    In preparation for ARB_clip_control. Let the driver decide if it supports pipe_rasterizer_state::clip_halfz being set to true.

    v3: Initially enable on ilo.

    Reviewed-by: Roland Scheidegger <sroland@vmware.com>
    Reviewed-by: Brian Paul <brianp@vmware.com>
    Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>
* vc4: Reuse uniform_data/contents indices when making uniforms. | Eric Anholt | 2014-10-24 | 1 | -0/+7
    This allows vc4_opt_cse.c to CSE-away operations involving the same uniform values.

    total instructions in shared programs: 37341 -> 36906 (-1.16%)
    instructions in affected programs: 10233 -> 9798 (-4.25%)
    total uniforms in shared programs: 10523 -> 10320 (-1.93%)
    uniforms in affected programs: 2467 -> 2264 (-8.23%)
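The general technique of returning an existing index for a repeated (contents, data) pair, so that two requests for the same uniform become identical operands that a CSE pass can merge, might look like the sketch below. The struct layout, the `get_uniform` name, and the fixed table size are assumptions made for illustration; they are not the vc4 data structures.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_UNIFORMS 64 /* arbitrary limit for this sketch */

/* Hypothetical uniform table: parallel arrays of kind and payload. */
struct uniform_table {
    int contents[MAX_UNIFORMS];  /* what kind of uniform this is */
    uint32_t data[MAX_UNIFORMS]; /* payload, e.g. a constant's index */
    unsigned count;              /* number of entries in use */
};

/*
 * Return the index of the (contents, data) pair, appending it only if it
 * is not already present. Returns -1 if the table is full.
 */
static int get_uniform(struct uniform_table *t, int contents, uint32_t data)
{
    assert(t);

    for (unsigned i = 0; i < t->count; i++) {
        if (t->contents[i] == contents && t->data[i] == data)
            return (int)i; /* reuse the existing slot */
    }
    if (t->count == MAX_UNIFORMS)
        return -1;
    t->contents[t->count] = contents;
    t->data[t->count] = data;
    return (int)t->count++;
}
```

A linear search is enough for a sketch; a larger table would more plausibly use a hash lookup.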
* vc4: When asked to discard-map a whole resource, discard it. | Eric Anholt | 2014-10-24 | 1 | -14/+28
    This saves a bunch of extra flushes when texsubimaging a whole texture that's been used for rendering, or subdataing a whole BO. In particular, this massively reduces the runtime of piglit texture-packed-formats (when the probes have been moved out of the inner loop).
* vc4: Refactor flushing before mapping a BO. | Eric Anholt | 2014-10-24 | 3 | -12/+13
    I'm going to want to make some other decisions here before flushing.
* vc4: Allow dead code elimination of unused varyings. | Eric Anholt | 2014-10-24 | 5 | -5/+31
    total instructions in shared programs: 39022 -> 37341 (-4.31%)
    instructions in affected programs: 26979 -> 25298 (-6.23%)
    total uniforms in shared programs: 11242 -> 10523 (-6.40%)
    uniforms in affected programs: 5836 -> 5117 (-12.32%)
* vc4: Add debug output to match shaderdb info to program dumps. | Eric Anholt | 2014-10-24 | 4 | -7/+29
    I'm going to be using VC4_DEBUG=shaderdb,norast to do shaderdb stats, but when debugging regressions, I want to match shaderdb output to shader disassembly.
* radeon: enable Hyper-Z on r600g and radeonsi by default | Andreas Boll | 2014-10-24 | 4 | -5/+5
    This reverts commit 01e637114914453451becc0dc8afe60faff48d84.

    Since then many Hyper-Z issues have been fixed or worked around. Enable Hyper-Z by default so that we get enough feedback for the upcoming mesa 10.4 release.

    If you have issues with Hyper-Z, try to disable it using the environment variable R600_DEBUG=nohyperz and please report the issue on the bugtracker.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75011
    See also: https://bugs.freedesktop.org/show_bug.cgi?id=75112
    Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
    Reviewed-by: Marek Olšák <marek.olsak@amd.com>
* Revert "freedreno/a3xx: only emit dirty consts" | Rob Clark | 2014-10-23 | 2 | -9/+5
    This reverts commit 94bb33617d1e8978dc52b8aaa4eb41bfb6703f79, which somehow broke gnome-shell and needs more investigation. For now, revert.
* freedreno: fix PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE | Rob Clark | 2014-10-23 | 1 | -7/+6
    fd_bo_cpu_prep() doesn't realize the bo is already referenced in unflushed cmdstream. It could be made to do so (but would have to be implemented twice, ie. both for msm and kgsl). But we still can't do the expected thing if the caller isn't using _NOSYNC. Because of the way the tiling works, we need to build quite a bit of cmdstream at flush time, which is not possible to do at the libdrm level.

    So rather than trying to make fd_bo_cpu_prep() smarter than it can possibly be, just *always* discard and reallocate if the PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE flag is set.

    Signed-off-by: Rob Clark <robclark@freedesktop.org>
* clover: use correct typenames for compat::pair's first/second | Emil Velikov | 2014-10-23 | 1 | -2/+2
    Seems to be a typo judging from the overall declaration of the template.

    Cc: EdB <edb+mesa@sigluy.net>
    Cc: Francisco Jerez <currojerez@riseup.net>
    Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
    Reviewed-by: Francisco Jerez <currojerez@riseup.net>
* auxiliary/os: get the mmap/munmap wrappers working with android | Emil Velikov | 2014-10-23 | 1 | -5/+12
    - Use macro for munmap under Android.
    - The STATIC_ASSERT uses an off_t, which is not used under Android for mmap. As the size of loff_t does not vary the way that of off_t does, just ignore the assert.
    - Wrap the long lines to improve readability.

    Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
* gallium/nouveau: fully build the driver under android | Mauro Rossi | 2014-10-23 | 1 | -1/+1
    Fix the trivial typo in the variable name.

    Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
* wgl: stw_pixelformat_get_info: correct type for index variable | Alon Levy | 2014-10-23 | 1 | -1/+1
    Signed-off-by: Alon Levy <alevy@redhat.com>
    Reviewed-by: Brian Paul <brianp@vmware.com>
* u_math.h: fix 64 to 32 bit truncation warning | Alon Levy | 2014-10-23 | 1 | -1/+1
    Signed-off-by: Alon Levy <alevy@redhat.com>
    Reviewed-by: Brian Paul <brianp@vmware.com>
* gallivm: Fix build with LLVM 3.3. | José Fonseca | 2014-10-23 | 1 | -0/+2
    The setMCJITMemoryManager method doesn't exist in LLVM 3.3.

    I thought I had tested the latest version of my earlier change with LLVM 3.3, but it looks like I missed it.

    Trivial.
* gallivm: Properly update for removal of JITMemoryManager in LLVM 3.6. | José Fonseca | 2014-10-23 | 2 | -38/+41
    JITMemoryManager was removed in LLVM 3.6, and replaced by its base class RTDyldMemoryManager.

    This change fixes our JIT memory manager specializations to derive from RTDyldMemoryManager in LLVM 3.6 instead of JITMemoryManager.

    This enables llvmpipe to run with LLVM 3.6.

    However, lp_free_generated_code is basically a no-op because there are not enough hook points in RTDyldMemoryManager to track and free the code of a module. In other words, with MCJIT, code once created stays allocated until process destruction. This is not specific to LLVM 3.6 -- it will happen whenever MCJIT is used regardless of version.

    Reviewed-by: Roland Scheidegger <sroland@vmware.com>
* gallivm: Fix white-space. | José Fonseca | 2014-10-23 | 1 | -7/+7
    Replace tabs with spaces.

    Reviewed-by: Roland Scheidegger <sroland@vmware.com>
* gallivm,llvmpipe,clover: Bump required LLVM version to 3.3. | José Fonseca | 2014-10-23 | 7 | -119/+8
    We'll need to update gallivm for the interface changes in LLVM 3.6, and the fewer older LLVM versions we support, the less hairy that will be.

    As a consequence, the HAVE_AVX define can disappear. (Note HAVE_AVX meant whether the LLVM version supports AVX or not. Runtime support for AVX is always checked and enforced independently.)

    Verified llvmpipe builds and runs with LLVM 3.3, 3.4, and 3.5.

    Reviewed-by: Roland Scheidegger <sroland@vmware.com>