summaryrefslogtreecommitdiffstats
path: root/src/mesa/drivers/dri/i965/Makefile.sources
Commit message (Collapse)AuthorAgeFilesLines
* i965/sync: Rename intel_syncobj.c -> brw_sync.cChad Versace2016-10-051-1/+1
| | | | | Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* intel: Add a new "common" library for more code sharingJason Ekstrand2016-09-031-2/+0
| | | | | | | The first thing to go in this new library is brw_device_info. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
* i965: Move blorp into src/intel/blorpJason Ekstrand2016-08-291-15/+5
| | | | | | | | | At this point, blorp is completely driver agnostic and can be safely moved into its own folder. Soon, we hope to start using it for doing blits in the Vulkan driver. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
* i965/blorp: Pull the guts of blorp_exec into a driver-agnostic headerJason Ekstrand2016-08-291-5/+10
| | | | | Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
* i965/blorp: Remove no longer used state setup helpersJason Ekstrand2016-08-191-1/+0
| | | | | | | | Now that we're using genxml for everything, we no longer need the hand-rolled state emit helpers. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
* i965/blorp: Use genxml for gen8-9 state setupJason Ekstrand2016-08-191-1/+6
| | | | | Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
* i965/blorp: Use genxml for gen7 state setupJason Ekstrand2016-08-191-1/+6
| | | | | Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
* i965: Move gen6_blorp.c to a file that gets recompiled per-genJason Ekstrand2016-08-191-1/+3
| | | | | | | | | At the moment, it's only used for gen6 but that will change soon. We use the genX prefix for recompiled things in the Vulkan driver. It isn't great, but it seems to have worked ok. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
* i965/blorp: Move the non-static blorp state setup helpers to another fileJason Ekstrand2016-08-191-0/+1
| | | | | | | | We're about to start replacing blorp state setup code with packing structs and we want to feel free to delete files as we go. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
* i965: Roll intel_reg.h into brw_defines.hJason Ekstrand2016-08-191-1/+0
| | | | | | | | More than half of the stuff in intel_reg.h had nothing whatsoever to do with registers and really belongs in brw_defines.h anyway. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Implement the WaPreventHSTessLevelsInterference workaround.Kenneth Graunke2016-08-181-0/+1
| | | | | | | | | | | Fixes several GL44-CTS.tessellation_shader (and GL45 and ES31) subcases: - vertex_spacing - tessellation_shader_point_mode.points_verification - tessellation_shader_quads_tessellation.inner_tessellation_level_rounding Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
* i965: brw_blorp_blit.cpp -> blorp_blit.cJason Ekstrand2016-08-171-1/+1
| | | | Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
* i965: brw_blorp_clear.cpp -> blorp_clear.cJason Ekstrand2016-08-171-1/+1
| | | | Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
* i965: Split brw_blorp.c/h into multiple filesJason Ekstrand2016-08-171-0/+3
| | | | | | | | | | | This mega-commit pulls most of the i965-specific bits of blorp into the brw_blorp.c/h files which now contain nothing but i965 wrappers around "core blorp" calls. The "core blorp" api is moved into blorp.h and the internal blorp data structures are moved into blorp_priv.h. The new file blorp.c is created to house "core blorp" internals which are pulled from the old brw_blorp.c Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
* i965: Get rid of the do_lower_unnormalized_offsets passJason Ekstrand2016-07-221-1/+0
| | | | | | | | | We can do this in NIR now. No need to keep a GLSL pass lying around for it. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>
* i965: Get rid of gen6_surface_state.cJason Ekstrand2016-07-151-1/+0
| | | | | | | | | The only useful thing left was gen6_init_vtable_surface_functions which we can easily put in brw_wm_surface_state.c. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>
* i965: Move contents of brw_tex.c into intel_tex_validate.c.Kenneth Graunke2016-06-241-1/+0
| | | | | | | | | | | brw_tex.c is a tiny file containing a single function. It's closely tied to the validation logic in intel_tex_validate.c, so it makes sense to put both in the same file. While we're at it, update the function to our modern style. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965: Add nir based intrinsic lowering and thread ID uniformJordan Justen2016-06-011-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | We add a lowering pass for nir intrinsics. This pass can replace nir intrinsics with driver specific nir lower code. We lower the gl_LocalInvocationIndex intrinsic based on a uniform which is loaded with a thread specific ID. We also lower the gl_LocalInvocationID based on gl_LocalInvocationIndex. v2: * Create variable during lowering pass. (Ken) v3: * Don't create a variable, but instead just insert an intrisic call to load a uniform from the allocated location. (Jason) v4: * Don't run this pass if thread_local_id_index < 0 Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
* i965: Move brw_nir_lower_uniforms.cpp to i965_FILESJason Ekstrand2016-05-261-1/+1
| | | | | | | This gets it out of i965_compiler.la Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
* i965: Use ISL for surface format introspectionJason Ekstrand2016-05-231-2/+1
| | | | | With this, we can delete the surface format table in brw_surface_formats.c because all of the information we need is now in ISL.
* i965: Combine Gen4-7 and Gen8+ state base address emitters.Kenneth Graunke2016-05-161-1/+0
| | | | | | | | We're about to start calling it directly, and this means the callers won't have to think about generations. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
* i965: Use blorp for all clearsJason Ekstrand2016-05-141-1/+0
| | | | | | | We used to use a meta path on gen8 but we haven't since c7cf17ae758. We might as well delete the meta path since blorp works on all gens. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
* i965: Use blorp for all stencil blitsJason Ekstrand2016-05-141-1/+0
| | | | | | | We used to use a meta path because blorp didn't support 16x MSAA. Now it does, so we don't need the meta paths anymore. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
* i965: Use blorp for all updownsample blitsJason Ekstrand2016-05-141-1/+0
| | | | | | | We used to use a meta path because blorp didn't support 16x MSAA. Now it does, so we don't need the meta paths anymore. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
* i965/blorp: Delete the old blorp shader emit codeJason Ekstrand2016-05-141-2/+0
| | | | Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
* i965/fs: rename our lower_d2f pass to lower_d2xIago Toral Quiroga2016-05-101-1/+1
| | | | | | | Since it no longer handles conversions from double to float but from double to various other 32-bit types. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965/fs: add a pass for legalizing d2fConnor Abbott2016-05-101-0/+1
| | | | | | | | | We need to do this late, in order to avoid partial writes during the optimization loop. v2: Use subscript() instead of stride(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965/fs: add a pass for lowering PACK opcodesConnor Abbott2016-05-101-0/+1
| | | | | | v2: Use subscript() instead of stride() (Curro) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Reimplement ARB_transform_feedback2 on Haswell and later.Kenneth Graunke2016-05-091-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | My old implementation accumulated <start, end> pairs in a buffer, and eventually processed that data on the CPU. This meant flushing the batchbuffer and waiting for it to completely execute before we could map it, resulting in really long stalls. We could also run out of space in the buffer, and have to do this early. Instead, we can use Haswell's MI_MATH command to do the (end - start) subtraction, as well as the multiplication by 2 or 3 to convert from the number of primitives written to the number of vertices written. We still need to CS stall to read the counters, but otherwise everything is completely pipelined - there's no CPU<->GPU synchronization required. It also uses only 80 bytes in the buffer, no matter what. Improves performance in Manhattan on Skylake GT3e at 800x600 by 6.1086% +/- 0.954166% (n=9). At 1920x1080, improves performance by 2.82103% +/- 0.148596% (n=84). v2: Fix number of primitives -> number of vertices calculation for GL_TRIANGLES (I was multiplying by 4 instead of 3.) Caught by Jordan Justen. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
* i965: Implement ARB_query_buffer_object for HSW+Jordan Justen2016-05-041-0/+1
| | | | | | | | | | | | | | | v2: * Declare loop index variable at loop site (idr) * Make arrays of MI_MATH instructions 'static const' (idr) * Remove commented debug code (idr) * Updated comment in set_query_availability (Ken) * Replace switch with if/else in hsw_result_to_gpr0 (Ken) * Only divide GL_FRAGMENT_SHADER_INVOCATIONS_ARB by 4 on hsw and gen8 (Ken) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
* i965/blorp: Convert state setup to CJason Ekstrand2016-04-261-3/+3
| | | | | Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965/blorp: Convert brw_blorp.cpp to a C fileJason Ekstrand2016-04-261-1/+1
| | | | | Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965: add build rule for brw_nir_trig_workarounds.c on AndroidRob Herring2016-04-211-1/+3
| | | | | | | | | | | Commit bfd17c76c126 ("i965: Port INTEL_PRECISE_TRIG=1 to NIR.") added a generated file brw_nir_trig_workarounds.c which broke the Android build. Add the necessary makefiles to the Android build. Cc: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Rob Herring <robh@kernel.org> Tested-by: Chih-Wei Huang <cwhuang@linux.org.tw> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
* i965/blorp: Re-introduce clear programsTopi Pohjolainen2016-04-211-0/+1
| | | | | | | This partially reverts 2f28a0dc23165123cf1e8b5942acad37878edd8a Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965/blorp: Pipeline upload support for gen8Topi Pohjolainen2016-04-211-0/+1
| | | | | | | | v2 (Ken): Drop GEN8_RASTER_FRONT_WINDING_CCW in raster state Add emission of pma stall. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Expose the surface format tableJason Ekstrand2016-04-141-0/+1
| | | | Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Port INTEL_PRECISE_TRIG=1 to NIR.Kenneth Graunke2016-04-111-0/+1
| | | | | | | | | | | | | | | | | | | | This makes the extra multiply visible to NIR's algebraic optimizations (for constant reassociation) as well as constant folding. This means that when the result of sin/cos are multiplied by an constant, we can eliminate the extra multiply altogether, reducing the cost of the workaround. It also means we only have to implement it one place, rather than in both backends. This makes INTEL_PRECISE_TRIG=1 cost nothing on GPUTest/Volplosion, which has a ton of sin() calls, but always multiplies them by an immediate constant. The extra multiply gets folded away. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965: Add boilerplate function for QueryInternalFormat driver hookEduardo Lima Mitev2016-03-031-0/+1
| | | | | | | By default, we call back the driver's hook fallback function that has generic implementations for the all the queries. Reviewed-by: Dave Airlie <airlied@redhat.com>
* i965: Extract push constant state to a new fileBen Widawsky2016-02-171-0/+1
| | | | | | | | | | | Every stage has a corresponding 3DSTATE_CONSTANT_XS packet, so having the code to create and emit push constant buffers in genX_vs_state.c is a little strange. Moving it to a separate file seems more logical. v2 [Ken]: Rebase on master, explain motivation in the commit message. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Apply VS attribute workarounds in NIR.Kenneth Graunke2016-02-091-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch re-implements the pre-Haswell VS attribute workarounds. Instead of emitting shader code in the vec4 backend, we now simply call a NIR pass to emit the necessary code. This simplifies the vec4 backend. Beyond deleting code, it removes the primary use of ATTR as a destination. It also eliminates the requirement that the vec4 VS backend express the ATTR file in terms of VERT_ATTRIB_* locations, giving us a bit more flexibility. This approach is a little different: rather than munging the attributes at the top, we emit code to fix them up when they're accessed. However, we run the optimizer afterwards, so CSE should eliminate the redundant math. It may even be able to fuse it with other calculations based on the input value. shader-db does not handle non-default NOS settings, so I have no statistics about this patch. Note that the scalar backend does not implement VS attribute workarounds, as they are unnecessary on hardware which allows SIMD8 VS. v2: Do one multiply for FIXED rescaling and select components from either the original or scaled copy, rather than multiplying each component separately (suggested by Matt Turner). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
* i965: Move brw_compiler_create() to new brw_compiler.c.Matt Turner2016-02-011-0/+1
| | | | | | | A future patch will want to use designated initalizers, which aren't available in C++, but this is C. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
* i965: adding missing headers to the dist tarballEmil Velikov2016-01-181-0/+2
| | | | Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
* i965: Move GLSL lowering passes out of libi965_compiler.laKristian Høgsberg Kristensen2016-01-081-5/+5
| | | | | | | | | The scope of libi965_compiler.la is to be able to take nir shaders and generate i965 EU code. As such, we don't want the GLSL IR lowering passes in the library. With this change, libi965_compiler.la no longer needs to link to libglsl.la. Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965: Port tessellation evaluation shaders to vec4 mode.Kenneth Graunke2015-12-281-0/+1
| | | | | | | | | This can be used on Broadwell by setting INTEL_SCALAR_TES=0. More importantly, it will be used for Ivybridge and Haswell. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
* i965: Add tessellation control shaders.Kenneth Graunke2015-12-221-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | The TCS is the first tessellation shader stage, and the most complicated. It has access to each of the control points in the input patch, and computes a new output patch. There is one logical invocation per output control point; all invocations run in parallel, and can communicate by reading and writing output variables. One of the main responsibilities of the TCS is to write the special gl_TessLevelOuter[] and gl_TessLevelInner[] output variables which control how much new geometry the hardware tessellation engine will produce. Otherwise, it simply writes outputs that are passed along to the TES. We run in SIMD4x2 mode, handling two logical invocations per EU thread. The hardware doesn't properly manage the dispatch mask for us; it always initializes it to 0xFF. We wrap the whole program in an IF..ENDIF block to handle an odd number of invocations, essentially falling back to SIMD4x1 on the last thread. v2: Update comments (requested by Jordan Justen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
* i965: Add tessellation evaluation shadersKenneth Graunke2015-12-221-0/+1
| | | | | | | | | | | | | | | | | | | The TES is essentially a post-tessellator VS, which has access to the entire TCS output patch, and a special gl_TessCoord input. Otherwise, they're very straightforward. This patch implements SIMD8 tessellation evaluation shaders for Gen8+. The tessellator can generate a lot of geometry, so operating in SIMD8 mode (8 vertices per thread) is more efficient than SIMD4x2 mode (only 2 vertices per thread). I have another patch which implements SIMD4x2 mode for older hardware (or via an environment variable override). We currently handle all inputs via the pull model. v2: Improve comments (suggested by Jordan Justen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
* i965: Add tessellation shader surface support.Kenneth Graunke2015-12-111-0/+2
| | | | | | | | | | | | This is brw_gs_surface_state.c copy and pasted twice with search and replace. brw_binding_table.c code is similarly copy and pasted. v2: Drop dword_pitch related fields. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
* i965: Import tables enumerating the set of validated L3 configurations.Francisco Jerez2015-12-091-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | It should be possible to use additional L3 configurations other than the ones listed in the tables of validated allocations ("BSpec » 3D-Media-GPGPU Engine » L3 Cache and URB [IVB+] » L3 Cache and URB [*] » L3 Allocation and Programming"), but it seems sensible for now to hard-code the tables in order to stick to the hardware docs. Instead of setting up the arbitrary L3 partitioning given as input, the closest validated L3 configuration will be looked up in these tables and used to program the hardware. The included tables should work for Gen7-9. Note that the quantities are specified in ways rather than in KB, this is because the L3 control registers expect the value in ways, and because by doing that we can re-use a single table for all GT variants of the same generation (and in the case of IVB/HSW and CHV/SKL across different generations) which generally have different L3 way sizes but allow the same combinations of way allocations. v2: Use slice count from the devinfo structure instead of the gt number to implement get_l3_way_size(). Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
* i965: Create new files for HS/DS/TE state upload code.Kenneth Graunke2015-12-071-1/+5
| | | | | | | | | | | | | | | | For now, this just splits the existing code to disable these stages into separate atoms/files. We can then replace it with real code. v2: Bump the render atoms in this patch so it compiles (in my branch, I'd bumped it in an earlier patch). 61 seems to be the minimum that works, which doesn't match the old value + the number of atoms I added in this patch, so apparently we had some slop before. v3: Actually disable the DS unit on Gen8+. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> [v1] Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965: Remove useless gen6_blorp.h/gen7_blorp.h headers.Matt Turner2015-11-241-2/+0
| | | | Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>