summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/nouveau/nvc0
Commit message (Collapse)AuthorAgeFilesLines
* nouveau: more locking - make sure that fence work is always done withIlia Mirkin2016-11-011-1/+4
| | | | the push mutex acquired
* WIP nouveau: add lockingIlia Mirkin2016-11-0111-21/+158
|
* nvc0: use correct bufctx when invalidating CP texturesSamuel Pitoiset2016-10-271-1/+1
| | | | | | | Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit 7b2712c367891e96384226a1fa94679a814235d0)
* nvc0: do not break 3D state by pushing MS coordinates on FermiSamuel Pitoiset2016-10-241-43/+44
| | | | | | | | | | | | | | | | | | | Long story short, 3D and CP are aliased on Fermi and initializing compute after pushing the MS sample coordinate offsets seems to corrupt 3D state for weird reasons. I still don't have the faintest clue what is going on, but this seems to only affect Fermi generation. A possible fix could be to use two different channels, one for 3D and one for CP. This fixes a bunch of regressions pinpointed by piglit. Fixes: "nvc0: fix up image support for allowing multiple samples" Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (cherry picked from commit 42273edf79c2500957f51690499aa3405cc689db)
* nv50,nvc0: avoid reading out of bounds when getting bogus so infoIlia Mirkin2016-10-241-2/+5
| | | | | | | | | | The state tracker tries to attach the info to the wrong shader. This is easy enough to protect against. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org> (cherry picked from commit 313fba5ee1de9416930e45da8aff63a24763940b)
* nvc0: enable ARB_enhanced_layoutsSamuel Pitoiset2016-10-131-1/+1
| | | | | | | | All ARB_enhanced_layouts piglit tests pass without any changes in our compiler. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
* gallium: add PIPE_CAP_TGSI_ARRAY_COMPONENTSNicolai Hähnle2016-10-121-0/+1
| | | | | | | | This is a screen cap because drivers are expected to support it either for all shader types or for none of them. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Dave Airlie <airlied@redhat.com>
* nvc0: fix valid range for shader buffersSamuel Pitoiset2016-10-103-0/+3
| | | | | | | | When offset != 0, the valid range was wrong because the second argument of util_range_add() is end, not size. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
* nvc0: dump program binary only when NV50_PROG_DEBUG is setSamuel Pitoiset2016-10-071-1/+1
| | | | | | | | When the chipset is forced with NV50_PROG_CHIPSET, we actually only want to output the binary if NV50_PROG_DEBUG is also enabled. Otherwise, this pollutes the shader-db output. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
* nvc0: expose ARB_compute_variable_group_sizeSamuel Pitoiset2016-10-071-2/+6
| | | | | | | | | Only expose 512 threads/block on Fermi to not be limited by 32 GPRs/thread. v4: - use 512 threads on Fermi, 1024 on Kepler+ Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
* gallium: add PIPE_COMPUTE_CAP_MAX_VARIABLE_THREADS_PER_BLOCKSamuel Pitoiset2016-10-071-0/+2
| | | | | | | | | v3: - use a new case statement in r600_pipe_common.c - fix compilation of softpipe... Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
* nvc0: dump program binary when chipset has been forcedSamuel Pitoiset2016-10-051-0/+5
| | | | | | | | Currently, program binaries are only dumped at upload time, but when the chipset has been forced via NV50_PROG_CHIPSET we might want to show the generated code, especially with shaderdb. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
* nvc0: update GM107 sched control codes formatSamuel Pitoiset2016-09-291-2/+2
| | | | | | | | | envyas now uses a much better representation for those control codes and it displays the different flags instead of an unreadable hex number. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
* nvc0: allow to force compiling programs in debug buildSamuel Pitoiset2016-09-261-9/+10
| | | | | | | | | This adds a new envvar called NV50_PROG_CHIPSET which allows to compile shaders with a different target, especially useful for shader-db. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
* nvc0: get rid of nvc0_stage_sampler_states_bind_range()Samuel Pitoiset2016-09-191-74/+9
| | | | | | | Same thing as nvc0_stage_set_sampler_views_range(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
* nvc0: get rid of nvc0_stage_set_sampler_views_range()Samuel Pitoiset2016-09-191-89/+15
| | | | | | | | This function was quite similar to nvc0_stage_set_sampler_views() and I don't see any reasons to not remove it. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
* gm107/ir: allow indirect inputs to be loaded by frag shaderIlia Mirkin2016-09-101-1/+0
| | | | | | | | | Looks like the GM107 IPA op does not allow a separate offset when using an indirect register. Instead we must use AL2P like we do for indirect vertex operations on Kepler+. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
* gallium: remove PIPE_BIND_TRANSFER_READ/WRITEMarek Olšák2016-09-081-4/+2
| | | | | | | | not used in any useful way Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>
* nv50,nvc0: respect render condition enable flag when clearing rt/zsIlia Mirkin2016-09-031-4/+8
| | | | | | | | This is a newly added flag. We always pass false into it from nv50_clear_texture, but other callers may want to respect the render condition. (And the functions were originally spec'd to respect it.) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
* nvc0: reduce the initial code segment size to 512KBSamuel Pitoiset2016-09-011-1/+1
| | | | | Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
* nvc0: allow to resize the code segment dynamicallySamuel Pitoiset2016-09-011-1/+24
| | | | | | | | | | | | | When an application uses a ton of shaders, we need to evict them when the code segment is full but this is not really a good solution if monster shaders are used because code eviction will happen a lot. To avoid this, it seems better to dynamically resize the code segment area after each eviction. The maximum size is arbitrary fixed to 8MB which should be enough. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
* nvc0: add a new bin for the code segmentSamuel Pitoiset2016-09-012-4/+6
| | | | | | | | | To avoid the bins list to grow up indefinitely when the code segment size will be bumped, we need to separate that bin from the SCREEN one because it contains other resources like the uniform bo. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
* nvc0: add nvc0_screen_resize_text_area() helperSamuel Pitoiset2016-09-013-10/+40
| | | | | | | | This function will be helpful for resizing the code segment area when we need to evict all shaders. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
* nvc0: re-upload currently bound shaders after code evictionSamuel Pitoiset2016-09-011-0/+27
| | | | | | | | | | | | | This fixes a very old issue which happens when the code segment size is full. A bunch of real applications like Tomb Raider, F1 2015, Elemental, hit that issue because they use a ton of shaders. In this case, all shaders are evicted (for freeing space) but all currently bound shaders also need to be re-uploaded and SP_START_ID have to be updated accordingly. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
* nvc0: refactor the program upload processSamuel Pitoiset2016-09-013-32/+59
| | | | | | | | | | This refactoring will help for fixing the "out of code space" eviction issue because we will need to reupload the code for all currently bound shaders but it's slightly different than uploading a new fresh code. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
* nvc0: remove an attempt at uploading all IMMD into a CBSamuel Pitoiset2016-08-313-40/+0
| | | | | | | | | | | This has never been used because info->immd.bufSize is always 0 and anyways this is an experimental code which has never been completed. This gets rid of some unused code in the program validation process. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
* nvc0: fix indentation in nvc0_screen_init()Samuel Pitoiset2016-08-301-1/+1
| | | | | | | Trivial. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
* nvc0: check return value of nvc0_screen_resize_tls_area()Samuel Pitoiset2016-08-302-11/+8
| | | | | | | | While we are at it, make it static and change the return values policy to be consistent. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
* nvc0: make use of FAIL_SCREEN_INIT in nvc0_screen_create()Samuel Pitoiset2016-08-301-9/+7
| | | | | Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
* nvc0: undo overzealous enum usageIlia Mirkin2016-08-301-2/+2
| | | | | | | | Commit 7413625ad3 flipped a few functions too many to use pipe_shader_type. These functions actually take an integer that does not correspond 1:1 with the enum. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
* gallium: add cap to export device pointer sizeJan Vesely2016-08-291-0/+2
| | | | | | | | | v2: document the new cap v3: fix 80 char limit in screen.rst Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
* gallium: Use enum pipe_shader_type in set_shader_images()Kai Wasserbäch2016-08-291-1/+2
| | | | | Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Brian Paul <brianp@vmware.com>
* gallium: Use enum pipe_shader_type in set_shader_buffers()Kai Wasserbäch2016-08-291-1/+1
| | | | | Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Brian Paul <brianp@vmware.com>
* gallium: Use enum pipe_shader_type in set_sampler_views()Kai Wasserbäch2016-08-291-1/+1
| | | | | Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Brian Paul <brianp@vmware.com>
* gallium: Use enum pipe_shader_type in bind_sampler_states() (v2)Kai Wasserbäch2016-08-291-3/+8
| | | | | | | | | | | v1 → v2: - Fixed indentation (noted by Brian Paul) - Removed second assert from nouveau's switch statements (suggested by Brian Paul) Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>
* nvc0: invalidate textures/samplers on GK104+Samuel Pitoiset2016-08-242-12/+22
| | | | | | | | | | | | | | Like Fermi, textures and samplers are aliased between 3D and compute, especially the TIC_FLUSH/TSC_FLUSH methods and we have to re-validate these resources when switching between the two pipelines. This fixes a GPU hang with Elemental (and most likely with other UE4 demos). Tested on GK107 and GM107. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> CC: <mesa-stable@lists.freedesktop.org>
* gallium: add a cap to expose whether driver supports mixed color/zs bitsIlia Mirkin2016-08-231-0/+1
| | | | | | | | | | Some hardware can't render to color/depth buffers of mixed bitness. When that happens a fallback has to happen, but this allows the driver to express that this isn't an optimal scenario. The purpose of this is to remove such fbconfigs from the GLX/EGL config list. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>
* gallium: change pipe_image_view::first_element/last_element -> offset/sizeMarek Olšák2016-08-172-19/+9
| | | | | | | | | This is required by OpenGL. Our hardware supports this. Example: Bind RGBA32F with offset = 4 bytes. Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
* gallium: change pipe_sampler_view::first_element/last_element -> offset/sizeMarek Olšák2016-08-171-8/+10
| | | | | | | | | | | This is required by OpenGL. Our hardware supports this. Example: Bind RGBA32F with offset = 4 bytes. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97305 Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
* nv50,nvc0: fix depth range when halfz is enabledIlia Mirkin2016-08-141-2/+7
| | | | | | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97231 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
* gallium: add render_condition_enable param to clear_render_target/depth_stencilMarek Olšák2016-08-101-2/+4
| | | | | Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
* nvc0: enable ARB_tessellation_shader on GM107+Samuel Pitoiset2016-07-271-3/+0
| | | | | | | This exposes OpenGL 4.1 on Maxwell (tested on GM107 and GM206). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
* nvc0: fix up TCP header on GM107+Samuel Pitoiset2016-07-271-0/+9
| | | | | | | | | | The number of outputs patch (limited to 255) has moved in the TCP header, but blob seems to also set the old position. Also, the high 8-bits are now located inbetween the min/max parallel output read address at position 20. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
* nvc0: use nvc0_m2mf_push_linear() to reduce code duplicationSamuel Pitoiset2016-07-261-13/+3
| | | | | Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
* nvc0: use nve4_p2mf_push_linear() to reduce code duplicationSamuel Pitoiset2016-07-261-36/+9
| | | | | Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
* nvc0: upload sample locations on GM20xSamuel Pitoiset2016-07-243-5/+31
| | | | | | | | This fixes a bunch of multisample piglit tests on GM206, like bin/arb_texture_multisample-texelfetch 2 -auto -fbo Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
* nvc0: fix wrong indentation in nvc0_validate_fb()Samuel Pitoiset2016-07-231-141/+141
| | | | | | Trivial. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
* gallium: split transfer_inline_write into buffer and texture callbacksMarek Olšák2016-07-232-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | to reduce the call indirections with u_resource_vtbl. The worst call tree you could get was: - u_transfer_inline_write_vtbl - u_default_transfer_inline_write - u_transfer_map_vtbl - driver_transfer_map - u_transfer_unmap_vtbl - driver_transfer_unmap That's 6 indirect calls. Some drivers only had 5. The goal is to have 1 indirect call for drivers that care. The resource type can be determined statically at most call sites. The new interface is: pipe_context::buffer_subdata(ctx, resource, usage, offset, size, data) pipe_context::texture_subdata(ctx, resource, level, usage, box, data, stride, layer_stride) v2: fix whitespace, correct ilo's behavior Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Acked-by: Roland Scheidegger <sroland@vmware.com>
* nvc0/mme: fix offsets used for indirect drawsSamuel Pitoiset2016-07-222-8/+8
| | | | | | | | | | This fixes a regression introduced in 1da704a94c57aa0b0cf8faaa3236fe47dfb8f88c because the offset has moved from 0x180 to 0x1a0, and the macros have to be re-compiled. Fixes: 1da704a ("nvc0: increase the tex handles area size in the driver") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
* nvc0: fix offsets of MP perf counters input parametersSamuel Pitoiset2016-07-221-15/+15
| | | | | | | | | | | | | This fixes a regression introduced in 1da704a94c57aa0b0cf8faaa3236fe47dfb8f88c because the offset has moved from 0x600 to 0x620, and the kernels used for reading MP perf counters have to be re-assembled. This also fixes amd_performance_monitor_measure piglit. Fixes: 1da704a ("nvc0: increase the tex handles area size in the driver") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>