summaryrefslogtreecommitdiffstats
path: root/src/gallium/auxiliary
Commit message (Collapse)AuthorAgeFilesLines
* Merge remote-tracking branch 'mesa/13.0' into nougat-x86Chih-Wei Huang2017-01-092-1/+4
|\
| * tgsi: fix the src type of TGSI_OPCODE_MEMBARMarek Olšák2016-12-141-0/+1
| | | | | | | | | | | | | | | | It's a literal integer. The next commit will need this. Cc: 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit 16ba04d6deea4f89cbaec00a001d5c2ac841692b)
| * cso: don't release sampler states that are boundMarek Olšák2016-12-141-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes random radeonsi GPU hangs in Batman Arkham: Origins (Wine) and probably many other games too. cso_cache deletes sampler states when the cache size is too big and doesn't check which sampler states are bound, causing use-after-free in drivers. Because of that, radeonsi uploaded garbage sampler states and the hardware went bananas. Other drivers may have experienced similar issues. Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> (cherry picked from commit 6dc96de303290e8d1fc294da478c4f370be98dea)
* | Merge remote-tracking branch 'mesa/13.0' into nougat-x86Chih-Wei Huang2016-11-164-64/+47
|\ \ | |/
| * gallium/hud: protect against and initialization raceSteven Toth2016-11-144-8/+41
| | | | | | | | | | | | | | | | | | | | In the event that multiple threads attempt to install a graph concurrently, protect the shared list. Signed-off-by: Steven Toth <stoth@kernellabs.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit 381edca826ee27b1a49f19b0731c777bdf241b20)
| * gallium/hud: close a previously opened handleSteven Toth2016-11-143-1/+6
| | | | | | | | | | | | | | | | | | We're missing the closedir() to the matching opendir(). Signed-off-by: Steven Toth <stoth@kernellabs.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit 5a58323064b32442e2de23c95642bc421be696f8)
| * gallium/hud: fix a problem where objects are free'd while in use.Steven Toth2016-11-144-55/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of trying to maintain a reference counted list of valid HUD objects, and freeing them accordingly, creating race conditions between unanticipated multiple threads, simply accept they're allocated once and never released until the process terminates. They're a shared resource between multiple threads, so accept they're always available for use. Signed-off-by: Steven Toth <stoth@kernellabs.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit 6ffed086795aaa84ab35668bb59d712cdde34da3)
* | android: fix building errors on Android 7.0Chih-Wei Huang2016-11-011-1/+1
| |
* | android: print debug info to logcatWuZhen2016-11-012-3/+11
| | | | | | | | | | | | | | | | | | | | Redirect logs printed to stderr to logcat. NO_REF_TASK Tested: local run Change-Id: I58e3966a608af361b86c54b4c95a92561b711968 Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
* | gallium: introduce load_pipe_screen()Rob Herring2016-11-012-0/+12
|/ | | | | | | | | | | | Introduce load_pipe_screen() public entry point for other code which dlopen()'s gralloc_dri.so for purposes of loading a pipe_screen. This way drm_gralloc can avoid static linking of each gallium winsys and driver, and avoid duplicated logic to figure out which pipe driver to load. This is based on Rob Clark's work. I moved it into pipe_loader which seems to be a better spot. Signed-off-by: Rob Herring <robh@kernel.org>
* draw: improve vertex fetch (v2)Roland Scheidegger2016-10-193-86/+134
| | | | | | | | | | | | | | | | | | | | | | | The per-element fetch has quite some calculations which are constant, these can be moved outside both the per-element as well as the main shader loop (llvm can figure out it's constant mostly on its own, however this can have a significant compile time cost). Similarly, it looks easier swapping the fetch loops (outer loop per attrib, inner loop filling up the per vertex elements - this way the aos->soa conversion also can be done per attrib and not just at the end though again this doesn't really make much of a difference in the generated code). (This would also make it possible to vectorize the calculations leading to the fetches.) There's also some minimal change simplifying the overflow math slightly. All in all, the generated code seems to look slightly simpler (depending on the actual vs), but more importantly I've seen a significant reduction in compile times for some vs (albeit with old (3.3) llvm version, and the time reduction is only really for the optimizations run on the IR). v2: adapt to other draw change. No changes with piglit. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
* draw: improved handling of undefined inputsRoland Scheidegger2016-10-191-21/+32
| | | | | | | | | | | | | | | Previous attempts to zero initialize all inputs were not really optimal (though no performance impact was measurable). In fact this is not really necessary, since we know the max number of inputs used. Instead, just generate fetch for up to max inputs used by the shader, directly replacing inputs for which there was no vertex element by zero. This also cleans up key generation, which previously would have stored some garbage for these elements. And also drop the assertion which indicates such bogus usage by a debug_printf (the whole point of initializing the undefined inputs was to make this case safe to handle). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
* gallivm: print out time for jitting functions with GALLIVM_DEBUG=perfRoland Scheidegger2016-10-191-0/+11
| | | | | | | | Compilation to actual machine code can easily take as much time as the optimization passes on the IR if not more, so print this out too. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
* gallivm: Use native packs and unpacks for the lerpsRoland Scheidegger2016-10-193-13/+156
| | | | | | | | | | | | | | | | | | | | | | | | For the texturing packs, things looked pretty terrible. For every lerp, we were repacking the values, and while those look sort of cheap with 128bit, with 256bit we end up with 2 of them instead of just 1 but worse, plus 2 extracts too (the unpack, however, works fine with a single instruction, albeit only with llvm 3.8 - the vpmovzxbw). Ideally we'd use more clever pack for llvmpipe backend conversion too since we actually use the "wrong" shuffle (which is more work) when doing the fs twiddle just so we end up with the wrong order for being able to do native pack when converting from 2x8f -> 1x16b. But this requires some refactoring, since the untwiddle is separate from conversion. This is only used for avx2 256bit pack/unpack for now. Improves openarena scores by 8% or so, though overall it's still pretty disappointing how much faster 256bit vectors are even with avx2 (or rather, aren't...). And, of course, eliminating the needless packs/unpacks in the first place would eliminate most of that advantage (not quite all) from this patch. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
* loader: remove loader_get_driver_for_fd() driver_typeEmil Velikov2016-10-181-1/+1
| | | | | | | | | | | | | | Reminiscent from the pre-loader days, were we had multiple instances of the loader logic in separate places and one could build a "GALLIUM_ONLY" version. Since that is no longer the case and the loaders (glx/egl/gbm) do not (and should not) require to know any classic/gallium specific we can drop the argument and the related code. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
* gallium/tgsi: add missing #includeMarek Olšák2016-10-181-0/+2
| | | | | Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
* pipe_loader_sw: Don't invoke Unix close() on Windows.Jose Fonseca2016-10-141-0/+2
| | | | Trivial.
* gallium: rename drm_driver_descriptor::{, driver_}nameEmil Velikov2016-10-141-15/+15
| | | | | | | | Historically we use "device name" for the name of the kernel module and "driver name" for the dri/other driver. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>
* gallium: remove unused drm_driver_descriptor::driver_nameEmil Velikov2016-10-141-12/+0
| | | | | | | | | | | | Likely unused since day 1, although I've only checked back until the st/dri unification with commit 29ca7d2c948 ("st/dri: merge dri/drm and dri/sw backends") Based on the comment, referencing drmOpenByName it's not something we want to bring back. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>
* tgsi: fix comment typo in tgsi_ureg.cBrian Paul2016-10-131-1/+1
| | | | Trivial.
* gallium/os: Use unsigned integers for size computationAxel Davy2016-10-131-2/+2
| | | | | | | | Use uint64_t instead of int64_t in the calculation, as the result is uint64_t. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
* tgsi/ureg: add ureg_DECL_output_layoutNicolai Hähnle2016-10-122-13/+38
| | | | | | | | | For specifying an exact location/component. v2: change the order of parameters (Dave) Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> (v1) Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)
* tgsi/ureg: add layout/component input declarationsNicolai Hähnle2016-10-122-12/+76
| | | | | | | v2: change the order of parameters (Dave) Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> (v1) Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)
* tgsi/scan: fix num_inputs/num_outputs for shaders with overlapping arraysNicolai Hähnle2016-10-121-8/+2
| | | | | | | v2: remove a tautological left-over assert (Marek) Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> (v1) Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)
* draw: initialize shader inputsRoland Scheidegger2016-10-121-0/+7
| | | | | | | | | This should make the code more robust if a shader tries to use inputs which aren't defined by the vertex element layout (which usually shouldn't happen). No piglit change. Reviewed-by: Brian Paul <brianp@vmware.com>
* gallium/util: Really allow aliasing of dst for u_box_union_*Axel Davy2016-10-101-11/+20
| | | | | | | | | | | | | | | | Gallium nine relies on aliasing to work with this function. Without this patch, dirty region tracking was incorrect, which could lead to incorrect textures or vertex buffers. Fixes several game bugs with nine. Fixes https://github.com/iXit/Mesa-3D/issues/234 Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>
* gallium/os: Fix overflow on 32 bitsAxel Davy2016-10-101-4/+10
| | | | | | | | | | | | | On systems with more than 4GB of ram, os_get_total_physical_memory was triggering an integer overflow for the linux and haiku path, when on 32 bits. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94561 Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>
* gallium/hud: Remove superfluous debugSteven Toth2016-10-064-52/+0
| | | | | | | No longer required. Signed-off-by: Steven Toth <stoth@kernellabs.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
* tgsi/scan: don't set interp flags for inputs only used by INTERP (v2)Marek Olšák2016-10-051-48/+57
| | | | | | | | | | | | (v1 pushed, then reverted) This fixes 9 randomly failing tests on radeonsi: GL45-CTS.shader_multisample_interpolation.render.interpolate_at_centroid.* v2: use input_interpolate[input] (correct) instead of input_interpolate[index] (incorrect) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
* gallivm: Use AVX2 gather instrinsics.Jose Fonseca2016-10-041-0/+95
| | | | | | v2: Use AVX2 gather for non aligned loads too. Reviewed-by: Roland Scheidegger <sroland@vmware.com>
* gallivm: Use 8 wide AoS sampling on AVX2.Roland Scheidegger2016-10-041-5/+6
| | | | | | | | | | v2: Make sure that with num_lods > 1 and min_filter != mag_filter we still enter the splitting path. So this case would still use 4-wide aos path (as a side note, the 4-wide aos sampling path could actually be improved quite a bit if we have avx2, by just doing the filtering with 256bit vectors). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
* gallivm: Basic AVX2 support.José Fonseca2016-10-044-28/+98
| | | | | | v2: pblendb -> pblendvb Reviewed-by: Roland Scheidegger <sroland@vmware.com>
* gallium/auxiliary: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC)Matt Whitlock2016-10-041-1/+2
| | | | | | | | | | Without this fix, duplicated file descriptors leak into child processes. See commit aaac913e901229d11a1894f6aaf646de6b1a542c for one instance where the same fix was employed. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Matt Whitlock <freedesktop@mattwhitlock.name> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
* vl/dri3: fix warning about incompatible pointer typeNayan Deshmukh2016-10-031-1/+1
| | | | | Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Leo Liu <leo.liu@amd.com>
* gallium/hud: Add support for CPU frequency monitoringSteven Toth2016-09-304-0/+286
| | | | | | | | Detect all of the CPUs in the system. Expose metrics for min, max and current frequency in Hz. Signed-off-by: Steven Toth <stoth@kernellabs.com> Reviewed-by: Brian Paul <brianp@vmware.com>
* Revert "gallium/hud: automatically print % if max_value == 100"Marek Olšák2016-09-301-12/+6
| | | | | | | | This reverts commit dbfeb0ec12d6550e68de1bcd164e422e79bccf2d. With max_value being rounded to 100, it's often wrong. Reviewed-by: Brian Paul <brianp@vmware.com>
* gallium/hud: Add power sensor supportSteven Toth2016-09-293-5/+44
| | | | | | | | | | | | | Implement support for power based sensors, reporting units in milli-watts and watts. Also, minor cleanup - change the related if block to a switch. Tested with two different power sensors, including the nouveau 'power1' sensors on a GTX950 card. Signed-off-by: Steven Toth <stoth@kernellabs.com> Reviewed-by: Brian Paul <brianp@vmware.com>
* gallium/hud: Add support for block I/O, network I/O and lmsensor statsSteven Toth2016-09-287-0/+1260
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | V8: Feedback based on peer review convert if block into a switch Constify some func args V7: Increase precision when measuring lmsensors volts Flatten patch series. V6: Feedback based on peer review Simplify sensor initialization (arg passing). Constify some func args V5: Feedback based on peer review Convert sprintf to snprintf Convert char * to const char * int arg converted to bool Func changes to take a filename vs a larger struct. Omit the space between '*' and the param name. V4: Merged with master as of 2016/9/27 6pm V3: Flatten the entire patchset ready for the ML V2: Additional seperate patches based on feedback a) configure.ac: Add a comment related to libsensors b) HUD: Disable Block/NIC I/O stats by default. Implement configuration option --enable-gallium-extra-hud=yes and enable both statistics when this option is enabled. c) Configure.ac: Minor cleanup to user visible configuration settings d) Configure.ac: HUD stats - build system improvements Move the -lsensors out of a deeper Makefile, bring it into the configure.ac. Also, rename a compiler directive to more closely follow the standard. V1: Initial release to the ML Three new features: 1. Disk/block I/O device read/write stats MB/ps. 2. Network Interface RX/TX transfer statistics as a percentage of the overall NIC speed. 3. lmsensor power, voltage and temperature sensors. The lmsensor changes makes a dependency on libsensors so support for the change is opt out by default. Signed-off-by: Steven Toth <stoth@kernellabs.com> Reviewed-by: Brian Paul <brianp@vmware.com>
* gallium/radeon/winsyses: reduce the number of pb_cache bucketsNicolai Hähnle2016-09-271-1/+1
| | | | | | | Small buffers are now handled via the slabs code, so separate buckets in pb_cache have become redundant. Reviewed-by: Marek Olšák <marek.olsak@amd.com>
* gallium/pipebuffer: add pb_slab utilityNicolai Hähnle2016-09-273-0/+409
| | | | | | | | | | | | | This is a simple framework for slab allocation from buffers that fits into the buffer management scheme of the radeon and amdgpu winsyses where bufmgrs aren't used. The utility knows about different sized allocations and explicitly manages reclaim of allocations that have pending fences. It manages all the free lists but does not actually touch buffer objects directly, relying on callbacks for that. Reviewed-by: Marek Olšák <marek.olsak@amd.com>
* gallium/u_math: add util_logbase2_ceilNicolai Hähnle2016-09-271-0/+12
| | | | | | For finding the exponent of the next power of two. Reviewed-by: Marek Olšák <marek.olsak@amd.com>
* mesa/st: support lowering multi-planar YUVRob Clark2016-09-261-1/+3
| | | | | | | | | | | | | | | Support multi-planar YUV for external EGLImage's (currently just in the dma-buf import path) by lowering to multiple texture fetch's for each plane and CSC in shader. There was some discussion of alternative approaches for tracking the additional UV or U/V planes: https://lists.freedesktop.org/archives/mesa-dev/2016-September/127832.html They all seemed worse than pipe_resource::next Signed-off-by: Rob Clark <robdclark@gmail.com>
* gallium/util: make use of strtol() in debug_get_num_option()Samuel Pitoiset2016-09-261-17/+8
| | | | | | | | | This allows to use hexadecimal numbers which are automatically detected by strtol() when the base is 0. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-by: Brian Paul <brianp@vmware.com>
* gallium/util: add comment on util_is_format_compatible()Brian Paul2016-09-211-0/+24
| | | | | | | | | From reading the code, it's not obvious what is src/dest compatible. The list of a->b copy-compatible formats comes from Jose's original check-in message, with some format name updates. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>
* gallivm: support negation on 64-bit integersNicolai Hähnle2016-09-211-0/+4
| | | | | | | This should be analogous to 32-bit integers. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
* gallivm/llvmpipe: prepare support for ARB_gpu_shader_int64.Dave Airlie2016-09-215-5/+500
| | | | | | | | | | | | | | | | This enables 64-bit integer support in gallivm and llvmpipe. v2: add conversion opcodes. v3: - PIPE_CAP_INT64 is not there yet - restrict DIV/MOD defaults to the CPU, as for 32 bits - TGSI_OPCODE_I2U64 becomes TGSI_OPCODE_U2I64 Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
* tgsi/softpipe: prepare ARB_gpu_shader_int64 support. (v3)Dave Airlie2016-09-211-132/+541
| | | | | | | | | | | | | | This adds all the opcodes to tgsi_exec for softpipe to use. v2: add conversion opcodes. v3: - no PIPE_CAP_INT64 yet - change TGSI_OPCODE_I2U64 to TGSI_OPCODE_U2I64 Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
* gallium/tgsi: add support for 64-bit integer immediates.Dave Airlie2016-09-216-2/+115
| | | | | | | | | | This adds support to TGSI for 64-bit integer immediates. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>
* gallium: add opcode and types for 64-bit integers. (v3)Dave Airlie2016-09-212-11/+85
| | | | | | | | | | | | | | | | | This just adds the basic support for 64-bit opcodes, and the new types. v2: add conversion opcodes. add documentation. v3: - make docs more consistent - change TGSI_OPCODE_I2U64 to TGSI_OPCODE_U2I64 Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2) Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
* vl/dri3: handle the case of different GPU(v4.2)Nayan Deshmukh2016-09-201-13/+53
| | | | | | | | | | | | | | | | | | | In case of prime when rendering is done on GPU other then the server GPU, use a seprate linear buffer for each back buffer which will be displayed using present extension. v2: Use a seprate linear buffer for each back buffer (Michel) v3: Change variable names and fix coding style (Leo and Emil) v4: Use PIPE_BIND_SAMPLER_VIEW for back buffer in case when a seprate linear buffer is used (Michel) v4.1: remove empty line v4.2: destroy the context and handle the case when create_context fails (Emil) Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Acked-by: Michel Dänzer <michel.daenzer@amd.com> Acked-by: Christian König <christian.koenig@amd.com>