Commit message | Author | Age | Files | Lines
* mesa: Add performance debug for meta code. | Eric Anholt | 2013-04-21 | 2 | -3/+35
  I noticed a fallback in regnum through sysprof, and wanted a nicer way to get information about it.

  Reviewed-by: Brian Paul <brianp@vmware.com>
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* intel: Mention how much data we're trying to subdata in perf debug. | Eric Anholt | 2013-04-21 | 1 | -2/+3
  Reviewed-by: Brian Paul <brianp@vmware.com>
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* Revert "gallivm: Emit vector selects." | José Fonseca | 2013-04-21 | 1 | -2/+14
  It caused numerous regressions (LLVM 3.1) in blending. In particular:

  - lp_test_blend

    type=u8nx16 rgb_func=sub rgb_src_factor=zero rgb_dst_factor=inv_src_color alpha_func=rev_sub alpha_src_factor=one alpha_dst_factor=const_color ... MISMATCH
      Src:  0 0 0 b5 49 29 0 a2 0 21 de 0 c3 1b ec 0
      Src1: 2d 85 14 0 f8 0 79 a1 99 0 d8 0 59 16 0 0
      Dst:  0 a9 97 0 c0 0 78 0 0 8b aa f0 bd 0 78 f6
      Con:  7d 0 c0 0 0 bb 77 0 0 0 50 0 40 51 0 0
      Res:  0 0 0 0 0 29 0 0 0 0 c8 0 97 1b e3 0
      Ref:  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

    type=u8nx16 rgb_func=max rgb_src_factor=one rgb_dst_factor=inv_const_color alpha_func=min alpha_src_factor=zero alpha_dst_factor=inv_src1_alpha ... MISMATCH
      Src:  d 0 0 e9 0 37 35 f0 62 0 0 b2 e9 f7 0 5c
      Src1: 8f 0 bf 0 a8 5 0 0 c4 0 d7 7 92 a 0 17
      Dst:  cb 0 1e 0 0 0 19 8e 0 4d 0 0 0 0 3 46
      Con:  aa 5a 5f 8f 0 0 bc 92 0 88 0 0 b7 8a c0 88
      Res:  44 0 13 0 0 0 7 8e 0 24 0 0 0 0 1 40
      Ref:  44 0 13 0 0 37 35 0 62 24 0 0 e9 f7 1 0

  This reverts commit 1e266c7ef01251ecf72347a2ba1d174b035cbe3b.
* llvmpipe: verify function on blend test. | José Fonseca | 2013-04-21 | 1 | -0/+2
* llvmpipe: Don't support Z32_FLOAT_S8X24_UINT texture sampling support either. | José Fonseca | 2013-04-20 | 1 | -4/+6
  Because we don't support it, and the u_format fallback doesn't work for zs formats.

  Reviewed-by: Brian Paul <brianp@vmware.com>
* llvmpipe: Ignore depth-stencil state if format has no depth/stencil. | José Fonseca | 2013-04-20 | 1 | -4/+10
  Prevents assertion failures inside the driver for such state combinations.

  Reviewed-by: Brian Paul <brianp@vmware.com>
* gallivm: Disable LLVM 2.7 workaround on other versions. | José Fonseca | 2013-04-20 | 1 | -2/+1
  2.7 was a particularly trouble-ridden release. Furthermore, the bug can no longer be reproduced ever since the first_level state was taken into account.

  Reviewed-by: Brian Paul <brianp@vmware.com>
* gallivm: Emit vector selects. | José Fonseca | 2013-04-20 | 1 | -12/+2
  They are supported on LLVM 3.1, at least on x86. (I haven't tested on PPC though.)

  Actually lp_build_linear_mip_levels() already has been emitting them for some time.

  This avoids intrinsics, which tend to be an obstacle for certain optimization passes.

  Reviewed-by: Brian Paul <brianp@vmware.com>
* freedreno: move ir -> ir2 | Rob Clark | 2013-04-20 | 6 | -283/+283
  There will be a new IR for a3xx, which has a very different shader ISA (more scalar oriented). So rename to avoid conflicts later when I start adding a3xx support to the gallium driver.

  Signed-off-by: Rob Clark <robdclark@freedesktop.org>
* freedreno: cleanup some cruft left over from fdre | Rob Clark | 2013-04-20 | 2 | -133/+1
  The standalone shader assembler needed some meta-data to know about attributes/varyings/etc, to do the shader linkage. We don't need these parts with gallium/tgsi, so just get rid of them.

  Signed-off-by: Rob Clark <robdclark@freedesktop.org>
* gallivm: implement switch opcode | Roland Scheidegger | 2013-04-20 | 3 | -12/+340
  Should be able to handle all things which make this tricky to implement. Fallthroughs, including most notably into/out of default, should be handled correctly but are quite a mess. If we see largely unoptimized switches in the wild we should probably think about some "real" switch optimization pass, e.g. things like this:

    switch
    case1
      someinst
      brk
    case2
    default
    case3
      someinst
      brk
    case4
      someinst
    endswitch

  are legal, but the pointless case2/case3 statements not only cause condition evaluation but will turn this into a "fake" fallthrough case (because mask and defaultmask are already updated for case2 when default is encountered), requiring executing code twice. If default is at the end though, there's never any code re-execution, and if that's not the case, if there's no fallthrough into (not even a fake one) and out of default, there's no code re-execution either.

  v2: add comments, and use enum for break type instead of magic boolean.

  Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
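  The case layout in this message has a direct analogue in C, where `default` may also sit between other labels. The following is a minimal scalar sketch only: the function name `run_switch` and its return codes are hypothetical, and it does not model gallivm's per-lane execution masks, just which body a given selector reaches.

```c
/* Scalar analogue of the TGSI example above: case 2 and case 3 are
 * empty labels that share a body with default, which sits in the
 * middle of the switch. The return value (1, 2 or 3) is a hypothetical
 * code identifying which "someinst" ran. */
int run_switch(int selector)
{
    int executed = 0;

    switch (selector) {
    case 1:
        executed = 1;   /* first someinst */
        break;          /* brk */
    case 2:             /* pointless label: falls through into default */
    default:
    case 3:             /* pointless label: same body as default */
        executed = 2;   /* second someinst */
        break;          /* brk */
    case 4:
        executed = 3;   /* third someinst, then endswitch */
    }
    return executed;
}
```

  In scalar code the empty labels cost nothing. In a masked SIMD implementation each label updates a mask, which is why the message describes them as producing a "fake" fallthrough.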
* gallivm: use uint build context for mask instead of float | Roland Scheidegger | 2013-04-20 | 1 | -1/+1
  Unsurprisingly no one was using it except for grabbing builder.

  Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
* gallivm/tgsi: fix up breakc | Roland Scheidegger | 2013-04-20 | 3 | -2/+6
  It seems there was a typo in gallivm breakc handling (I am actually still not sure it is really needed but otherwise that statement really should go away). Also fix the wrong src argument type, even though they weren't really used.

  Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
* svga: remove TGSI_OPCODE_BREAKC instruction translation | Roland Scheidegger | 2013-04-20 | 1 | -1/+0
  While initially that opcode probably was meant for something along the lines of sm3 break_comp, it has never worked that way (not even the argument count was right), and now the opcode has quite different semantics, so just remove it.

  (Discovered by Jose Fonseca)
* gallium: document breakc and switch/case/default/endswitch | Roland Scheidegger | 2013-04-20 | 1 | -6/+51
  Docs were missing; the opcode-from-hell switch in particular is anything but obvious.

  Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
* gallivm: increase nesting limit to 66 | Roland Scheidegger | 2013-04-20 | 1 | -2/+4
  This is still not really correct, since at least for sm 4.0 the nesting limit is 64 per subroutine, and subroutine nesting itself has a limit of 32, so since we have a flat stack we'd need 32*64. But this should probably be better fixed with per-subroutine stacks, since otherwise these structures get really big (like 100kB for the lp_exec_mask).

  Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
* draw: implement primitive assembler | Zack Rusin | 2013-04-18 | 7 | -4/+386
  Input assembler needs to be able to decompose adjacency primitives into something that can be understood by the rest of the pipeline. The specs say that the adjacency primitives are *only* visible in the geometry shader, for everything else they need to be decomposed. Which in most of the cases is not an issue, because the geometry shader always decomposes them for us, but without geometry shader we were passing unchanged adjacency primitives to the rest of the pipeline and causing crashes everywhere.

  This commit introduces a primitive assembler which, if geometry shader is missing and the input primitive is one of the adjacency primitives, decomposes them into something that the rest of the pipeline can understand.

  Signed-off-by: Zack Rusin <zackr@vmware.com>
  Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
  Reviewed-by: Roland Scheidegger <sroland@vmware.com>
  Reviewed-by: Brian Paul <brianp@vmware.com>
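  To illustrate what "decomposing" an adjacency primitive means, here is a minimal sketch for the two list types. It assumes the OpenGL vertex layout (4 vertices per line with adjacency, of which the middle two form the line; 6 per triangle with adjacency, of which the even-numbered ones form the triangle). The function names are hypothetical, and this is not the draw module's actual code.

```c
#include <stddef.h>

/* Lines with adjacency: each primitive is (adj, start, end, adj).
 * Only the middle two indices form the visible line segment.
 * Returns the number of indices written to out. */
size_t decompose_lines_adj(const unsigned *in, size_t count, unsigned *out)
{
    size_t n = 0;
    size_t i;

    for (i = 0; i + 4 <= count; i += 4) {
        out[n++] = in[i + 1];
        out[n++] = in[i + 2];
    }
    return n;
}

/* Triangles with adjacency: each primitive has 6 indices, and the
 * triangle itself uses positions 0, 2 and 4; the odd positions are
 * the adjacent vertices and are dropped.
 * Returns the number of indices written to out. */
size_t decompose_tris_adj(const unsigned *in, size_t count, unsigned *out)
{
    size_t n = 0;
    size_t i;

    for (i = 0; i + 6 <= count; i += 6) {
        out[n++] = in[i + 0];
        out[n++] = in[i + 2];
        out[n++] = in[i + 4];
    }
    return n;
}
```

  Trailing indices that do not make up a whole primitive are ignored here. Strip variants need different index arithmetic and are left out of this sketch.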
* util/prim: fix decomposed counts for adjacency primitives | Zack Rusin | 2013-04-18 | 1 | -4/+4
  Signed-off-by: Zack Rusin <zackr@vmware.com>
  Reviewed-by: Brian Paul <brianp@vmware.com>
  Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
* draw/so: uses the correct index with the pre clipped coordinates | Zack Rusin | 2013-04-18 | 1 | -6/+6
  pre_clip_pos is a float[4]; we just used float (*)[4] to be able to jump within the array of vertex_headers with it. So if the idx happened to be anything but 0, we'd actually read from some garbage in memory. Change it to just be a simple pointer instead of casting it to something that it's not. As suggested by Jose.

  Signed-off-by: Zack Rusin <zackr@vmware.com>
  Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
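  The stride mismatch described here can be sketched with byte offsets. Everything below is an assumption for illustration: `struct vertex` is a hypothetical stand-in for a vertex header, and the helper names are invented. Indexing through a pointer to float[4] advances 16 bytes per step, while the next vertex's field lies a full struct away.

```c
#include <stddef.h>

/* Hypothetical stand-in for a vertex header; the real layout differs. */
struct vertex {
    float clip[4];
    float pre_clip_pos[4];
    float data[4];
};

/* Offset reached by indexing a pointer-to-float[4] that starts at the
 * first vertex's pre_clip_pos: each step is only sizeof(float[4]). */
size_t offset_via_array_pointer(size_t idx)
{
    return offsetof(struct vertex, pre_clip_pos) + idx * sizeof(float[4]);
}

/* Offset of the field inside vertex number idx: each step is a whole
 * vertex. */
size_t offset_via_vertex_stride(size_t idx)
{
    return idx * sizeof(struct vertex) + offsetof(struct vertex, pre_clip_pos);
}

/* The straightforward accessor: index the vertices, then take the
 * field as a plain float pointer. */
const float *pre_clip_of(const struct vertex *verts, size_t idx)
{
    return verts[idx].pre_clip_pos;
}
```

  The two offsets agree only for idx 0, which matches the message's observation that any other index read from the wrong place.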
* glapi: Add counter information for glBufferData(), like glBufferSubData(). | Eric Anholt | 2013-04-19 | 1 | -2/+2
  This causes this function to become asynchronous with glthread.
* glapi: Add parameter count information for uniforms. | Eric Anholt | 2013-04-19 | 2 | -42/+42
  This is the kind of information that would have been present for GLX, if GLX supported modern GL. This allows these entrypoints to get automatic asynchronous marshalling code generated for glthread.
* glapi: skip padding in get_called_parameter_string | Paul Berry | 2013-04-19 | 1 | -0/+2
  This bug is currently benign, since get_called_parameter_string() is currently only used for functions that return true for glx_function.has_different_protocol(), and none of those functions include padding. However, in order to implement marshalling of GL API functions, we'll need to use get_called_parameter_string() far more often.

  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* mesa: Fix up program_parse.y to avoid uninitialized $$ | Paul Berry | 2013-04-19 | 1 | -0/+5
  Without this patch, $$.negate, $$.rgba_valid, and $$.xyzw_valid take on garbage values. At the moment this problem is benign (the garbage values happen to be zero), but in my experiments executing GL operations on a background thread, the garbage values change, leading to piglit failures.

  Reviewed-by: Eric Anholt <eric@anholt.net>
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* mesa: Use quotes on bool driconf options to prevent stdbool.h breakage. | Eric Anholt | 2013-04-19 | 5 | -41/+48
  Since stdbool.h's "true" and "false" are #defines, they got expanded when used as macro arguments, and that expanded value was stored in the XML string, producing XML that driconf would then fail to parse. Currently no drivers include stdbool along with driconf, but I keep accidentally doing so on intel as we move towards using normal C.

  v2: rebase on master.

  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
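  The expansion problem can be reproduced with a pair of nested macros. This is a sketch under assumptions: the macro names are hypothetical and are not the real driconf macros, and `yes` is a stand-in defined the way C99's stdbool.h defines `true` (as a macro expanding to 1), so the example does not depend on the C standard version in use.

```c
#include <string.h>

#define yes 1   /* stand-in for stdbool.h's C99 "#define true 1" */

/* Inner macro: stringifies its arguments into an XML fragment. */
#define OPT_XML(name, def) \
    "<option name=\"" #name "\" default=\"" #def "\"/>"

/* Outer convenience macro: def is not an operand of # here, so it is
 * fully macro-expanded before OPT_XML ever sees it. */
#define OPT_VSYNC(def) OPT_XML(vsync, def)

/* Quoted variant: the default arrives as a string literal, which the
 * preprocessor leaves alone. */
#define OPT_XML_Q(name, def) \
    "<option name=\"" #name "\" default=\"" def "\"/>"
#define OPT_VSYNC_Q(def) OPT_XML_Q(vsync, def)

/* The unquoted form ends up with default="1" in the XML string. */
const char *xml_unquoted(void) { return OPT_VSYNC(yes); }

/* The quoted form keeps the word exactly as written. */
const char *xml_quoted(void) { return OPT_VSYNC_Q("yes"); }
```

  Passing the default as a quoted string sidesteps the expansion entirely, which is the approach the commit subject describes.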
* svga: whitespace, comment fixes in svga_pipe_query.c | Brian Paul | 2013-04-19 | 1 | -41/+49
* svga: whitespace, comment fixes in svga_pipe_fs/vs.c | Brian Paul | 2013-04-19 | 2 | -48/+41
* gallivm: Fix half floats with MCJIT. | José Fonseca | 2013-04-19 | 1 | -0/+3
  Prevents:

    LLVM ERROR: Cannot select: intrinsic %llvm.x86.vcvtph2ps.128
* Revert "i965: Check reg.nr for BRW_ARF_NULL instead of reg.file." | Matt Turner | 2013-04-18 | 1 | -1/+1
  This reverts commit ecdda414d361ab4430fd5747c9217687c1f3d63f.

  Commit was supposed to be a simple typo fix. Clearly needs more investigating.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63688
* configure.ac: Remove gallium-g3dvl flag. | Matt Turner | 2013-04-18 | 1 | -16/+1
  It's next to useless, since it just allows you to turn off VDPAU and XvMC with a single switch. Just check whether Gallium drivers are enabled instead.

  Reviewed-by: Christian König <christian.koenig@amd.com>
* radeonsi: add support for compressed texture v2 | Jerome Glisse | 2013-04-18 | 2 | -2/+76
  Most tests pass; the remaining issues are with border color and swizzle.

  Based on ircnick<maelcum> patch.

  v2: Restaged commit hunk

  Signed-off-by: Jerome Glisse <jglisse@redhat.com>
* radeonsi: add 2d tiling support for texture v3 | Jerome Glisse | 2013-04-18 | 3 | -73/+21
  v2: Remove leftover code
  v3: Restage the commit properly so hunks of the first one are not in the second one.

  Signed-off-by: Jerome Glisse <jglisse@redhat.com>
* gallium: handle drirc disable_glsl_line_continuations option | Vadim Girlin | 2013-04-19 | 4 | -1/+8
  NOTE: This is a candidate for the 9.1 branch

  Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
  Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
  Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
* llvmpipe: Take in consideration all current constant buffers when mapping. | José Fonseca | 2013-04-18 | 1 | -3/+9
  Reviewed-by: Roland Scheidegger <sroland@vmware.com>
  Reviewed-by: Zack Rusin <zackr@vmware.com>
* nv50: add remaining RGBX formats | Christoph Bumiller | 2013-04-18 | 1 | -4/+12
  Not all are supported as render targets.

  The state tracker fallback of using RGBA instead of RGBX currently fails for blending; we could work around this by clearing their alpha to 1 and modifying the color mask to disable writing alpha.
* st/mesa: optionally apply texture swizzle to border color v2 | Christoph Bumiller | 2013-04-18 | 20 | -7/+115
  This is the only sane solution for nv50 and nvc0 (really, trust me), but since on other hardware the border colour is tightly coupled with texture state they'd have to undo the swizzle, so I've added a cap.

  The dependency of update_sampler on the texture updates was introduced to avoid doing the apply_depthmode to the swizzle twice.

  v2: Moved swizzling helper to u_format.c, extended the CAP to provide more accurate information.
* nv50: set BORDER_COLOR_SRGB in sampler objects | Christoph Bumiller | 2013-04-18 | 2 | -19/+35
* nv50: fix 4th component of Lx_SINT/UINT formats | Christoph Bumiller | 2013-04-18 | 1 | -6/+6
* r600g: Fix build with --enable-opencl | Tom Stellard | 2013-04-18 | 1 | -1/+2
* mesa: enable GL_ARB_texture_float if TEXTURE_FLOAT_ENABLED is defined | Brian Paul | 2013-04-18 | 1 | -1/+3
  Per message on mesa-users list, this wasn't working before.

  Note: This is a candidate for the stable branches.

  Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
  Reviewed-by: Eric Anholt <eric@anholt.net>
* gallivm: change cubemaps / derivatives handling, take 55 | Roland Scheidegger | 2013-04-18 | 3 | -104/+119
  Turns out the previous "fix" for handling per-pixel face selection and derivatives didn't work out that well - the derivatives were wrong by quite a bit. In theory, transformation of the derivatives into cube space should work, but would be _a lot_ more work than the "simplified" transform used. So, for explicit derivatives, I'm just giving up and going back to not honoring them. For implicit derivatives (and the fake explicit ones) however we try something a little different: we just calculate rho as we would for a 3d texture, that is after scaling the coords by the inverse major axis. This gives the same results as calculating the derivs after projection of the coords to the same face as long as all pixels hit the same face (and only without rho_no_opt, otherwise it should be a bit worse). And when not all pixels are hitting the same face, the results aren't so hot but not catastrophically bad (I believe not off by more than a factor of 2 without no_rho_approx and not more than sqrt(2) with no_rho_approx). I think this is better than just picking the wrong face but who knows...

  Reviewed-by: Brian Paul <brianp@vmware.com>
  Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
* gallivm: Add no_rho_approx debug option | Roland Scheidegger | 2013-04-18 | 3 | -118/+185
  This will calculate rho correctly as

    sqrt(max((ds/dx)^2 + (dt/dx)^2 + (dr/dx)^2, (ds/dy)^2 + (dt/dy)^2 + (dr/dy)^2))

  instead of

    max(|ds/dx|, |dt/dx|, |dr/dx|, |ds/dy|, |dt/dy|, |dr/dy|)

  (for 3 coords - 2 coords work analogously, for 1 coord there's no point doing the exact version), for both implicit and explicit derivatives. While such approximation seems to be allowed in OpenGL, some APIs may be less forgiving, and the error can be quite large (sqrt(2) for 2 coords, sqrt(3) for 3 coords, so wrong by nearly one mip level in the latter case). This also helps to single out "real" bugs from "expected" ones, so it is debug only (though at least combined with no_brilinear I didn't really see much of a performance difference but only tested with a debug build - at least with implicit mipmaps the instruction count is almost exactly the same though the instructions are more complex (1 sqrt and mul/adds instead of and/max mostly)). The code when the option isn't set stays exactly the same.

  v2: rename no_rho_opt to no_rho_approx.

  Reviewed-by: Brian Paul <brianp@vmware.com>
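  The size of the approximation error can be checked with a small sketch. The function names are hypothetical and this is not gallivm code. To stay free of libm it compares squared values: the exact rho is the square root of the first function's result, and the approximate rho squared is the second function's result.

```c
/* Squared length of a 3-component derivative vector. */
static float length_sq(const float d[3])
{
    return d[0] * d[0] + d[1] * d[1] + d[2] * d[2];
}

/* Exact rho squared: the larger of the squared lengths of the
 * x-derivative and y-derivative vectors. */
float rho_exact_sq(const float ddx[3], const float ddy[3])
{
    float lx = length_sq(ddx);
    float ly = length_sq(ddy);
    return lx > ly ? lx : ly;
}

/* Approximate rho squared: the largest absolute component over both
 * derivative vectors, squared. */
float rho_approx_sq(const float ddx[3], const float ddy[3])
{
    float m = 0.0f;
    int i;

    for (i = 0; i < 3; i++) {
        float ax = ddx[i] < 0.0f ? -ddx[i] : ddx[i];
        float ay = ddy[i] < 0.0f ? -ddy[i] : ddy[i];
        if (ax > m)
            m = ax;
        if (ay > m)
            m = ay;
    }
    return m * m;
}
```

  With all three x-derivatives equal to 1 and the y-derivatives zero, the exact value squared is 3 and the approximate value squared is 1, so rho itself differs by a factor of sqrt(3), the worst case quoted above for 3 coords.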
* llvmpipe: Support half integer pixel center fs coord. | José Fonseca | 2013-04-18 | 4 | -3/+28
  Tested with graw/fs-fragcoord 2/3, and piglit glsl-arb-fragment-coord-conventions.

  Reviewed-by: Roland Scheidegger <sroland@vmware.com>
* llvmpipe: Remove the static interpolation. | José Fonseca | 2013-04-18 | 3 | -384/+19
  No longer used. If we ever want the old behavior we can run a loop unroller pass.

  Reviewed-by: Roland Scheidegger <sroland@vmware.com>
* gallivm: Drop pos arg from lp_build_tgsi_soa. | José Fonseca | 2013-04-18 | 4 | -8/+2
  Never used.

  Reviewed-by: Roland Scheidegger <sroland@vmware.com>
* docs: update release notes for 9.2 | Andreas Boll | 2013-04-18 | 1 | -3/+8
  Reviewed-by: Matt Turner <mattst88@gmail.com>
* ralloc: Move declarations before statements. | José Fonseca | 2013-04-18 | 1 | -2/+4
  Trivial. Should fix MSVC build.
* configure: enable vdpau and xvmc detection, with gallium | Emil Velikov | 2013-04-17 | 1 | -2/+8
  Currently the vdpau and xvmc detection code is enabled for all builds. The state trackers exist only within gallium. Enable it whenever at least one gallium driver is selected.

  v2: removed stray '-a'
  [mattst88 v3]: Removed stray $.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63645
  Reviewed-by: Matt Turner <mattst88@gmail.com>
  Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
* i965: Check reg.nr for BRW_ARF_NULL instead of reg.file. | Matt Turner | 2013-04-17 | 1 | -1/+1
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Implement work-around for CMP with null dest on Haswell. | Matt Turner | 2013-04-17 | 1 | -0/+12
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i915g: Release old fragment shader sampler views with current pipe | Stuart Abercrombie | 2013-04-17 | 1 | -3/+8
  We were trying to use a destroy method from a deleted context. This fix is based on what's in the svga driver.

  Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>