path: root/src/gallium/auxiliary
Commit message | Author | Age | Files | Lines
* gallium/util: avoid unreferencing random memory on buffer alloc failure | Ilia Mirkin | 2015-09-28 | 1 | -1/+1
    Found by Coverity
    Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
    Reviewed-by: Albert Freeman <albertwdfreeman@gmail.com>
    Reviewed-by: Marek Olšák <marek.olsak@amd.com>
* gallium/u_blitter: handle allocation failures | Marek Olšák | 2015-09-24 | 1 | -0/+6
    Cc: 11.0 <mesa-stable@lists.freedesktop.org>
    Acked-by: Christian König <christian.koenig@amd.com>
    Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
* gallium/ttn: Convert to using VARYING_SLOT_* / FRAG_RESULT_*. | Eric Anholt | 2015-09-16 | 2 | -14/+173
    This avoids exceeding the size of the .index bitfield since it got truncated, and should make our NIR look more like the NIR that the rest of the NIR developers are working on.
    v2: split out vc4 updates, first patch uses varying_slot_to_tgsi_semantic() helper, and second patch does the actual conversion.
    v3: add frag_result_to_tgsi_semantic() helper and don't try to map frag_results to semantic name/index as if they were varying_slot's
    v4: use VERT_ATTRIB_ for VS inputs
    v5: Fix vc4 build.
    Signed-off-by: Rob Clark <robclark@freedesktop.org>
* tgsi: add a TXQS opcode to retrieve the number of texture samples | Ilia Mirkin | 2015-09-13 | 1 | -1/+2
    Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
    Reviewed-by: Roland Scheidegger <sroland@vmware.com>
    Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
* tgsi/scan: add support to figure out max nesting depth | Rob Clark | 2015-09-13 | 2 | -0/+21
    Sometimes a useful thing for compilers (or, for example, tgsi_to_nir) to know. And pretty trivial for scan to figure this out for us.
    Signed-off-by: Rob Clark <robclark@freedesktop.org>
    Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
    Reviewed-by: Brian Paul <brianp@vmware.com>
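The nesting-depth scan described above can be pictured as a single pass with a counter. The following is only a sketch under stated assumptions: the `sketch_opcode` enum and `sketch_max_nesting_depth()` are hypothetical stand-ins, not the real TGSI opcode tokens or the scanner's actual code.

```c
#include <stddef.h>

/* Hypothetical opcode set for illustration only; the real scanner
 * walks TGSI tokens and checks TGSI_OPCODE_* values. */
enum sketch_opcode {
   SKETCH_OP_MOV,
   SKETCH_OP_IF,
   SKETCH_OP_ENDIF,
   SKETCH_OP_BGNLOOP,
   SKETCH_OP_ENDLOOP
};

/* Walk the instruction stream once, incrementing the depth at each
 * block opener and decrementing at each closer, and remember the
 * highest depth that was reached. */
static inline unsigned
sketch_max_nesting_depth(const enum sketch_opcode *ops, size_t count)
{
   unsigned depth = 0;
   unsigned max_depth = 0;

   for (size_t i = 0; i < count; i++) {
      switch (ops[i]) {
      case SKETCH_OP_IF:
      case SKETCH_OP_BGNLOOP:
         depth++;
         if (depth > max_depth)
            max_depth = depth;
         break;
      case SKETCH_OP_ENDIF:
      case SKETCH_OP_ENDLOOP:
         /* Guard against malformed input with unbalanced closers. */
         if (depth > 0)
            depth--;
         break;
      default:
         break;
      }
   }
   return max_depth;
}
```

A consumer such as a TGSI-to-NIR translator could use a value like this to size its if/loop stack up front instead of growing it dynamically.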
* tgsi, softpipe: Constify tgsi_sampler in query_lod vfunc | Krzesimir Nowak | 2015-09-11 | 1 | -1/+1
    A followup from previous commit - since all functions called by query_lod take pointers to const sp_sampler_view and const sp_sampler, which are taken from tgsi_sampler subclass, we can make the tgsi_sampler const itself now.
    Reviewed-by: Brian Paul <brianp@vmware.com>
    Reviewed-by: Roland Scheidegger <sroland@vmware.com>
* tgsi,softpipe: capitalize the tgsi_sampler_control enum values | Brian Paul | 2015-09-11 | 2 | -24/+25
    We use capitalized enum values everywhere else. This improves understanding a bit too.
    Reviewed-by: Roland Scheidegger <sroland@vmware.com>
* tgsi: Add code for handling lodq opcode | Krzesimir Nowak | 2015-09-10 | 2 | -0/+56
    This introduces new vfunc in tgsi_sampler just for this opcode. I decided against extending get_samples vfunc to return the mipmap level and LOD - the function's prototype is already too scary and doing the sampling for textureQueryLod would be a waste of time.
    v2:
    - split overly long lines
    Reviewed-by: Brian Paul <brianp@vmware.com>
* tgsi: Remove trailing backslash in comment | Krzesimir Nowak | 2015-09-10 | 1 | -1/+1
    It clearly is here by accident.
    Reviewed-by: Brian Paul <brianp@vmware.com>
* gallium/ttn: fix cursor handling vs builder | Rob Clark | 2015-09-09 | 1 | -8/+6
    After inserting instructions the cursor.option becomes _after_instr (even if it started life as an _after_block). So we cannot simply stash the current cursor on the if/loop_stack. Otherwise we end up inserting instructions after the endif/endloop in the block preceding the if/loop.
    Signed-off-by: Rob Clark <robclark@freedesktop.org>
    Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* auxiliary: rework the python generated sources rules | Emil Velikov | 2015-09-09 | 1 | -12/+17
    There are a few bits this commit aims to resolve:
    One can generalise the mkdir rule to a simple MKDIR_P $(@D) which will expand appropriately even if we change the subdir name, and/or add new rules.
    We can also drop the explicit $(srcdir) prefix for the dependency rules, as they are not strictly required, nor used elsewhere in mesa.
    Finally replace $< with explicit filename to be consistent through the file, and honour PYTHON_FLAGS.
    v2: Add comprehensive commit summary/message (Ian, Matt)
    Cc: 11.0 <mesa-stable@lists.freedesktop.org>
    Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
* llvmpipe: convert double to long long instead of unsigned long long | Oded Gabbay | 2015-09-04 | 1 | -1/+1
    round(val*dscale) produces a double result, as val and dscale are double. However, LLVMConstInt receives unsigned long long, so there is an implicit conversion from double to unsigned long long. This is undefined behavior for negative values. Therefore, we need to first explicitly convert the round result to long long, and then let the compiler handle conversion from that to unsigned long long.
    This bug manifests itself on POWER, where all IMM values of -1 are being converted to 0 implicitly, causing a wrong LLVM IR output.
    Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
    CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
    Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
    Reviewed-by: Roland Scheidegger <sroland@vmware.com>
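The conversion order described above can be shown in isolation. This is a minimal sketch, not the llvmpipe code: `sketch_double_to_ull()` is a hypothetical helper, and it takes the already-rounded double (the result of `round(val*dscale)` in the commit) as its argument.

```c
/* Convert an already-rounded double to the unsigned 64-bit value that
 * an API taking unsigned long long expects.
 *
 * Converting a negative double straight to an unsigned type is
 * undefined behavior in C, so go through long long first. The
 * signed-to-unsigned step is then well defined (it wraps modulo 2^64),
 * which gives -1 the all-ones bit pattern. */
static inline unsigned long long
sketch_double_to_ull(double rounded)
{
   long long as_signed = (long long) rounded;

   return (unsigned long long) as_signed;
}
```

With this ordering, an immediate of -1.0 ends up as 0xffffffffffffffff on every platform rather than whatever the target's float-to-unsigned instruction happens to produce.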
* gallium/pb_bufmgr_cache: add a way to remove buffers from the cache explicitly | Marek Olšák | 2015-09-03 | 2 | -6/+41
    This must be done before exporting a buffer as dmabuf fds, because we lose track of who is using it and can't trust the reference counter.
    Cc: 11.0 <mesa-stable@lists.freedesktop.org>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
* u_upload_mgr: remove the return value from u_upload_data | Marek Olšák | 2015-09-03 | 3 | -22/+18
    Reviewed-by: Brian Paul <brianp@vmware.com>
* u_upload_mgr: remove the return value from u_upload_buffer | Marek Olšák | 2015-09-03 | 2 | -31/+18
    Reviewed-by: Brian Paul <brianp@vmware.com>
* u_upload_mgr: remove the return value from u_upload_alloc_buffer | Marek Olšák | 2015-09-03 | 1 | -11/+9
    Reviewed-by: Brian Paul <brianp@vmware.com>
* u_upload_mgr: remove the return value from u_upload_alloc | Marek Olšák | 2015-09-03 | 3 | -34/+34
    The return buffer or the returned pointer can be used instead.
    Reviewed-by: Brian Paul <brianp@vmware.com>
* u_upload_mgr: optimize u_upload_alloc | Marek Olšák | 2015-09-03 | 1 | -15/+17
    This is probably the most called util function. It does almost nothing, yet it can consume 10% of the CPU on the profile. This drops it down to 5%.
    Reviewed-by: Brian Paul <brianp@vmware.com>
* tgsi/scan: add uses_doubles to tgsi scanner | Dave Airlie | 2015-09-02 | 2 | -1/+5
    This allows drivers to work out if a shader contains any double opcodes easily.
    Signed-off-by: Dave Airlie <airlied@redhat.com>
* auxiliary/os: Don't implement os_get_option() on embedded builds. | José Fonseca | 2015-09-01 | 1 | -0/+2
    Let it be defined externally instead, allowing setting mechanisms other than environment variables.
    Reviewed-by: Zack Rusin <zackr@vmware.com>
    Reviewed-by: Matthew McClure <mcclurem@vmware.com>
* util: add a couple primitive restart helper functions | Brian Paul | 2015-09-01 | 3 | -0/+331
    The first function translates prim restart indexes to be 0xffff or 0xffffffff. The second splits indexed primitives with restart indexes into sub-primitives without restart indexes.
    Reviewed-by: Roland Scheidegger <sroland@vmware.com>
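The two ideas in this commit message can be sketched on a plain 16-bit index array. Everything below is an assumption for illustration: the function names, signatures, and the iterator-style interface are hypothetical and are not the helpers this commit adds, which operate on gallium index buffers.

```c
#include <stddef.h>
#include <stdint.h>

/* Idea 1: rewrite every occurrence of an arbitrary restart index to
 * the fixed value 0xffff, for hardware that only supports that value.
 * This sketch does not handle source data that already contains
 * 0xffff as an ordinary index. */
static inline void
sketch_translate_restart_u16(const uint16_t *src, uint16_t *dst,
                             size_t count, uint16_t restart_index)
{
   for (size_t i = 0; i < count; i++)
      dst[i] = (src[i] == restart_index) ? 0xffff : src[i];
}

/* Idea 2: find the next run of indices that contains no restart
 * index, starting at *pos. Returns 1 and fills start/count when a run
 * is found, 0 when the array is exhausted. Each run can then be drawn
 * as its own sub-primitive without primitive restart enabled. */
static inline int
sketch_next_sub_prim(const uint16_t *indices, size_t total,
                     uint16_t restart_index, size_t *pos,
                     size_t *start, size_t *count)
{
   size_t i = *pos;

   /* Skip any restart indices in front of the next run. */
   while (i < total && indices[i] == restart_index)
      i++;
   if (i == total) {
      *pos = total;
      return 0;
   }

   *start = i;
   while (i < total && indices[i] != restart_index)
      i++;
   *count = i - *start;
   *pos = i;
   return 1;
}
```

A caller would loop on `sketch_next_sub_prim()` and issue one draw per returned range, which is the fallback path for devices with no restart support at all.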
* tgsi: add tgsi utility to transform a fragment shader to support aa point | Charmaine Lee | 2015-09-01 | 3 | -0/+346
    This adds a tgsi utility tgsi_add_aa_point to transform a fragment shader to support anti-aliased wide point by computing the fragment distance from the point center. This utility assumes the geometry shader is emitting an extra generic output with point coord data. The semantic index of this generic output is passed to the tgsi_add_aa_point utility.
    Reviewed-by: Brian Paul <brianp@vmware.com>
* tgsi: adds tgsi utility to transform a shader to support point sprite | Charmaine Lee | 2015-09-01 | 3 | -0/+622
    This adds a tgsi utility tgsi_add_point_sprite to transform a geometry shader to emulate wide points by drawing quads. This utility adds an extra output for the original point position if the point position is to be written to a stream output buffer. It also assumes the driver will add a constant for inverse viewport scale after the user defined constants.
    Reviewed-by: Brian Paul <brianp@vmware.com>
* tgsi: add new tgsi_two_side.c utility code | Brian Paul | 2015-09-01 | 3 | -0/+264
    This could be used by any driver where the device doesn't directly support two-sided lighting. This code modifies a fragment shader to accept back-face colors and choose between the front/back colors depending on the triangle's front-face sign.
* util: add util_strcasecmp() wrapper | Brian Paul | 2015-09-01 | 1 | -0/+3
    Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
* gallium/util: add a utility to create geometry passthrough shader | Charmaine Lee | 2015-09-01 | 2 | -0/+57
    Reviewed-by: Brian Paul <brianp@vmware.com>
* gallium/util: fix returning empty box for rectangle intersection | Roland Scheidegger | 2015-09-01 | 1 | -1/+6
    These functions deal with inclusive coordinates, hence a 0/0/0/0 rect returned when there's no intersection doesn't actually represent an empty rectangle. Hence return 0/-1/0/-1 instead.
    This fixes some problems in llvmpipe with empty scissor rects (which up to now didn't really matter because while the intersect test returned the wrong result all pixels were scissored away later anyway).
* gallium/util: return FALSE for intersection if there's empty rectangles | Roland Scheidegger | 2015-09-01 | 1 | -1/+6
    It isn't really obvious if the intersection test should take empty rectangles into account or if the caller should do it. It looks like most callers actually verified one of the rects but not the other, but now that an empty rect is returned correctly, that other rect could actually be empty, leading to more bugs. Hence just verify both rects for emptiness in the intersection test itself, which makes the code easier in the caller (though it will be slower if the caller knows the rectangles are non-empty).
    Reviewed-by: Zack Rusin <zackr@vmware.com>
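The two rectangle fixes above both follow from the coordinates being inclusive. The sketch below illustrates that reasoning; `struct sketch_rect` and the three functions are hypothetical names with an assumed x0/x1/y0/y1 field order, not the actual gallium utility code.

```c
/* Rectangle with inclusive bounds: x0 == x1 and y0 == y1 describes a
 * single pixel, so 0/0/0/0 is NOT empty. */
struct sketch_rect {
   int x0, x1;
   int y0, y1;
};

/* With inclusive bounds a rect is empty only if a max is below a min. */
static inline int
sketch_rect_is_empty(const struct sketch_rect *r)
{
   return r->x0 > r->x1 || r->y0 > r->y1;
}

/* Intersection test that checks both inputs for emptiness itself, so
 * callers don't have to. */
static inline int
sketch_rect_intersects(const struct sketch_rect *a,
                       const struct sketch_rect *b)
{
   if (sketch_rect_is_empty(a) || sketch_rect_is_empty(b))
      return 0;
   return !(a->x0 > b->x1 || a->x1 < b->x0 ||
            a->y0 > b->y1 || a->y1 < b->y0);
}

/* Compute the intersection; when there is none, return 0/-1/0/-1,
 * which is a genuinely empty rect under inclusive coordinates. */
static inline struct sketch_rect
sketch_rect_intersection(const struct sketch_rect *a,
                         const struct sketch_rect *b)
{
   struct sketch_rect r = { 0, -1, 0, -1 };

   if (!sketch_rect_intersects(a, b))
      return r;

   r.x0 = a->x0 > b->x0 ? a->x0 : b->x0;
   r.x1 = a->x1 < b->x1 ? a->x1 : b->x1;
   r.y0 = a->y0 > b->y0 ? a->y0 : b->y0;
   r.y1 = a->y1 < b->y1 ? a->y1 : b->y1;
   return r;
}
```

The key design point is that the "no intersection" result must itself fail `sketch_rect_is_empty()`'s inverse, so that feeding it back into another intersection test cannot produce a false positive.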
* tgsi: add some more helper functions | Charmaine Lee | 2015-09-01 | 1 | -4/+65
    This patch adds some more helper functions such as
    - tgsi_transform_temps_decl
    - tgsi_transform_output_decl
    - tgsi_transform_dst_reg
    - tgsi_transform_src_reg
    Reviewed-by: Brian Paul <brianp@vmware.com>
* tgsi: added tgsi_is_shadow_target() helper | Brian Paul | 2015-09-01 | 2 | -0/+21
* tgsi: add negate parameter to tgsi_transform_kill_inst() | Brian Paul | 2015-09-01 | 4 | -5/+8
    Reviewed-by: Charmaine Lee <charmainel@vmware.com>
* util: added ffsll() function | Brian Paul | 2015-09-01 | 1 | -0/+20
    v2: fix errant _GNU_SOURCE test, per Matt Turner.
    Reviewed-by: Matt Turner <mattst88@gmail.com>
    Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
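For readers unfamiliar with `ffsll()`: it returns the 1-based position of the least significant set bit of a 64-bit value, or 0 if no bit is set. A portable fallback might look like the sketch below; `sketch_ffsll()` is a hypothetical name and a deliberately simple loop, not the code this commit adds.

```c
/* Return the 1-based index of the lowest set bit, or 0 when val is 0.
 * The value is handled as unsigned so that shifting is well defined
 * even when the top bit is set. */
static inline int
sketch_ffsll(long long val)
{
   unsigned long long bits = (unsigned long long) val;
   int pos = 1;

   if (bits == 0)
      return 0;

   while ((bits & 1) == 0) {
      bits >>= 1;
      pos++;
   }
   return pos;
}
```

Where the platform provides `ffsll()` or a compiler builtin, that would normally be preferred over a loop like this.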
* util: added util_set_index_buffer() | Brian Paul | 2015-09-01 | 2 | -0/+18
    Like util_set_vertex_buffers_count(), this basically just copies a pipe_index_buffer object, taking care of refcounting.
* gallium/util: add u_bit_scan_consecutive_range | Marek Olšák | 2015-09-01 | 1 | -0/+20
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
    Acked-by: Christian König <christian.koenig@amd.com>
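The commit message gives no description, so the following is an interpretation based on the function name only: pull the lowest run of consecutive set bits out of a mask, report where it starts and how long it is, and clear it. `sketch_bit_scan_consecutive_range()` is a hypothetical loop-based sketch; the signature and the zero-mask behavior are assumptions.

```c
#include <stdint.h>

/* Find the lowest run of consecutive 1 bits in *mask, store its first
 * bit position in *start and its length in *count, and clear those
 * bits from *mask. For a zero mask this sketch reports an empty range
 * (start 0, count 0). */
static inline void
sketch_bit_scan_consecutive_range(uint32_t *mask, int *start, int *count)
{
   uint32_t m = *mask;
   uint32_t run_bits;
   int s = 0;
   int c = 0;

   if (m == 0) {
      *start = 0;
      *count = 0;
      return;
   }

   /* Skip zero bits below the run. */
   while (!(m & 1u)) {
      m >>= 1;
      s++;
   }
   /* Count the consecutive one bits. */
   while (m & 1u) {
      m >>= 1;
      c++;
   }

   /* Shifting a 32-bit value by 32 is undefined, so treat the
    * all-ones case separately. */
   run_bits = (c == 32) ? 0xffffffffu : (((uint32_t) 1 << c) - 1u);

   *start = s;
   *count = c;
   *mask &= ~(run_bits << s);
}
```

Calling it in a loop until the mask is zero lets a driver update a contiguous range of slots per iteration instead of one slot per set bit.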
* gallium/util: fix debug_get_flags_option on 32-bit | Dave Airlie | 2015-08-29 | 1 | -3/+4
    On 32-bit we need to use PRIu64 flags for printfs, otherwise this segfaults in R600_DEBUG=help.
    Reviewed-by: Marek Olšák <marek.olsak@amd.com>
    Cc: "11.0" <mesa-stable@lists.freedesktop.org>
    Signed-off-by: Dave Airlie <airlied@redhat.com>
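The class of bug fixed here arises when a 64-bit value is passed to a printf-style function with a conversion meant for a narrower type: on 32-bit targets the varargs then get out of step, and a later `%s` can end up dereferencing garbage. A minimal sketch of the portable form follows; `sketch_format_flag()` and its output format are hypothetical and not the actual debug_get_flags_option code.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Format a flag name and its 64-bit value. PRIu64 expands to the
 * correct length modifier for uint64_t on the current platform, so
 * the argument list is consumed correctly on both 32-bit and 64-bit
 * builds. Returns what snprintf returns. */
static inline int
sketch_format_flag(char *buf, size_t size, const char *name,
                   uint64_t value)
{
   return snprintf(buf, size, "%s: %" PRIu64, name, value);
}
```

The same macro family (`PRIx64`, `PRId64`, and so on) covers hexadecimal and signed output.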
* nir: Convert the builder to use the new NIR cursor API. | Kenneth Graunke | 2015-08-27 | 1 | -17/+17
    The NIR cursor API is exactly what we want for the builder's insertion point. This simplifies the API, the implementation, and is actually more flexible as well.
    This required a bit of reworking of TGSI->NIR's if/loop stack handling; we now store cursors instead of cf_node_lists, for better or worse.
    v2: Actually move the cursor in the after_instr case.
    v3: Take advantage of nir_instr_insert (suggested by Connor).
    v4: vc4 build fixes (thanks to Eric).
    Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
    Reviewed-by: Eric Anholt <eric@anholt.net> [v1]
    Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> [v4]
    Acked-by: Connor Abbott <cwabbott0@gmail.com> [v4]
* gallium/util: fix code formatting in u_blitter.h | Brian Paul | 2015-08-27 | 1 | -30/+25
    Trivial.
* gallium/ddebug: new pipe for hang detection and driver state dumping (v2) | Marek Olšák | 2015-08-26 | 1 | -0/+8
    v2: lots of improvements
    This is like identity or trace, but simpler. It doesn't wrap most states.
    Run with:
        GALLIUM_DDEBUG=1000 [executable]
    where "executable" is the app and "1000" is in milliseconds, meaning that the context will be considered hung if a fence fails to signal in 1000 ms. If that happens, all shaders, context states, bound resources, draw parameters, and driver debug information (if any) will be dumped into:
        /home/$username/dd_dumps/$processname_$pid_$index.
    Note that the context is flushed after every draw/clear/copy/blit operation and then waited for to find the exact call that hangs.
    You can also do:
        GALLIUM_DDEBUG=always
    to do the dumping after every draw/clear/copy/blit operation without flushing and waiting.
    Examples of driver states that can be dumped are:
    - Hardware status registers saying which hw block is busy (hung).
    - Disassembled shaders in a human-readable form.
    - The last submitted command buffer in a human-readable form.
    v2: drop pipe-loader changes, drop SConscript rename dd.h -> dd_pipe.h
    Acked-by: Christian König <christian.koenig@amd.com>
    Acked-by: Alex Deucher <alexander.deucher@amd.com>
* gallium: add flags parameter to pipe_screen::context_create | Marek Olšák | 2015-08-26 | 2 | -2/+2
    This allows creating compute-only and debug contexts.
    Reviewed-by: Brian Paul <brianp@vmware.com>
    Acked-by: Christian König <christian.koenig@amd.com>
    Acked-by: Alex Deucher <alexander.deucher@amd.com>
* gallium/auxiliary: optimize rgb9e5 helper some more | Roland Scheidegger | 2015-08-26 | 1 | -45/+42
    I used this as some testing ground for investigating some compiler bits initially (e.g. lrint calls etc.), figured I could do much better in the end just for fun...
    This is mathematically equivalent, but uses some tricks to avoid doubles and also replaces some float math with ints. Good for another performance doubling or so.
    As a side note, some quick tests show that llvm's loop vectorizer would be able to properly vectorize this version (which it failed to do earlier due to doubles, producing a mess), giving another 3 times performance increase with sse2 (more with sse4.1), but this may not apply to mesa.
    No piglit change.
    Acked-by: Marek Olšák <marek.olsak@amd.com>
* gallium/auxiliary: optimize rgb9e5 helper a bit | Roland Scheidegger | 2015-08-26 | 1 | -18/+17
    This code (lifted straight from the extension) was doing things the most inefficient way you could think of. This drops some of the more expensive float operations, in particular
    - int-cast floors (pointless, values always positive)
    - 2 raised to (signed) integers (replace with simple exponent manipulation), getting rid of a misguided comment in the process (implement with table...)
    - float division (replace with mul of reverse of those exponents)
    This is like 3 times faster (measured for float3_to_rgb9e5), though it depends (e.g. llvm is clever enough to replace exp2 with ldexp whereas gcc is not, division is not too bad on cpus with early-exit divs).
    Note that keeping the double math for now (float x + 0.5), as the results may otherwise differ.
    Acked-by: Marek Olšák <marek.olsak@amd.com>
* gallium/ttn: Use nir_builder_insert() rather than poking at cf_list. | Kenneth Graunke | 2015-08-25 | 1 | -16/+16
    I intend to remove nir_builder::cf_node_list, so I can't have this code poking at it directly. The proper way is to set the insertion point and then simply insert things there.
    Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
    Reviewed-by: Eric Anholt <eric@anholt.net>
* nir: Store gl_shader_stage in nir_shader. | Kenneth Graunke | 2015-08-25 | 1 | -4/+21
    This makes it easy for NIR passes to inspect what kind of shader they're operating on.
    Thanks to Michel Dänzer for helping me figure out where TGSI stores the shader stage information.
    Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
    Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
    Reviewed-by: Eric Anholt <eric@anholt.net>
* nir: move control flow modification to its own file | Connor Abbott | 2015-08-24 | 1 | -0/+1
    We want to start reworking and expanding this code, but it'll be a lot easier to do once we disentangle it from the rest of the stuff in nir.c. Unfortunately, there are a few unavoidable dependencies in nir.c on methods we'd rather not expose publicly, since if not used in very specific situations they can cause Bad Things (tm) to happen. Namely, we need to do some magical control flow munging when adding/removing jumps. In the future, we may disallow adding/removing jumps in nir_instr_insert_*() and nir_instr_remove(), and use separate functions that are part of the control flow modification code, but for now we expose them and put them in a separate, private header.
    Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
    Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* util/u_blitter: implement alpha blending for pipe->blit | Marek Olšák | 2015-08-21 | 3 | -19/+41
* tgsi: fix parsing of tessellation shader inputs/outputs | Marcos Paulo de Souza | 2015-08-17 | 1 | -1/+16
    Tessellation control shaders write to outputs as OUT[ADDR[0].x][0], make sure to parse the indirect dimension on outputs. Also tess control inputs/outputs and tess eval input declarations need to receive the same treatment as geometry shader inputs.
    Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
* tgsi: set implicit array size for tess stages | Marcos Paulo de Souza | 2015-08-17 | 1 | -1/+5
    Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
* winsys/amdgpu: add a new winsys for the new kernel driver | Marek Olšák | 2015-08-14 | 1 | -1/+7
    v2:
    - lots of changes according to Emil Velikov's comments
    - implemented radeon_winsys::read_registers
    v3:
    - a lot of new work, much of it adapting to libdrm interface changes
    Squashed patches:
    winsys/amdgpu: implement radeon_winsys context support
    winsys/amdgpu: add reference counting for contexts
    winsys/amdgpu: add userptr support
    winsys/amdgpu: allocate IBs like normal buffers
    winsys/amdgpu: add IBs to the buffer list, adapt to interface changes
    winsys/amdgpu: don't use KMS handles as reloc hash keys
    winsys/amdgpu: sync buffer accesses to different rings
    winsys/amdgpu: use dependencies instead of waiting for last fence v2
    gallium/radeon: unify buffer_wait and buffer_is_busy in the winsys interface (amdgpu part)
    winsys/amdgpu: track fences per ring and be thread-safe
    winsys/amdgpu: simplify waiting on a variable in amdgpu_fence_wait
    gallium/radeon: allow the winsys to choose the IB size (amdgpu part)
    winsys/amdgpu: switch to new amdgpu_cs_query_fence_status interface
    winsys/amdgpu: handle fence and dependencies merge
    winsys/amdgpu follow libdrm change to move user fence into UMD
    winsys/amdgpu: use amdgpu_bo_va_op for va map/unmap v2
    winsys/amdgpu: use the new tiling flags
    winsys/amdgpu: switch to new GTT_USWC definition
    winsys/amdgpu: expose amdgpu_cs_query_reset_state to drivers
    winsys/amdgpu: fix valgrind warnings
    winsys/amdgpu: don't use VRAM with APUs that don't have much of it
    winsys/amdgpu: require LLVM 3.6.1 for VI because of bug fixes there
    winsys/amdgpu: remove amdgpu_winsys::num_cpus
    winsys/amdgpu: align BO size to page size
    winsys/amdgpu: reduce BO cache timeout
    winsys/amdgpu: remove useless flushing and waiting in amdgpu_bo_set_tiling
    winsys/amdgpu: use amdgpu_device_handle as a unique device ID instead of fd
    winsys/amdgpu: use safer access to amdgpu_fence_wait::signalled
    winsys/amdgpu: allow maximum IB size of 4 MB
    winsys/amdgpu: add ip_instance into amdgpu_fence
    gallium/radeon: add RING_COMPUTE instead of RADEON_FLUSH_COMPUTE
    winsys/amdgpu: set the ring type at CS initialization
    winsys/amdgpu: query the GART page size from the kernel
    winsys/amdgpu: correctly wait for shared buffers to become idle
    winsys/amdgpu: set the amdgpu_cs_fence structure only once at fence creation
    winsys/amdgpu: add a specific error message for cs_submit -> -ENOMEM
    winsys/amdgpu: check num_active_ioctls before calling amdgpu_bo_wait_for_idle
    winsys/amdgpu: clear user fence BO after allocating it
    winsys/amdgpu: fix user fences
    winsys/amdgpu: make amdgpu_winsys_create public
    winsys/amdgpu: remove thread offloading
    winsys/amdgpu: flatten the amdgpu_cs_context structure and simplify more
    v4: require libdrm 2.4.63
* vl: add HEVC profiles and defines | Christian König | 2015-08-14 | 1 | -0/+7
    Signed-off-by: Christian König <christian.koenig@amd.com>
    Reviewed-by: Leo Liu <leo.liu@amd.com>
* ttn: add buffer texture type | Rob Clark | 2015-08-12 | 1 | -0/+3
    Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
    Signed-off-by: Rob Clark <robclark@freedesktop.org>