summaryrefslogtreecommitdiffstats
path: root/src/mesa/drivers/dri/i965/brw_fs_channel_expressions.cpp
Commit message (Collapse)AuthorAgeFilesLines
* mesa: hook up core bits of GL_ARB_shader_group_voteIlia Mirkin2016-06-061-0/+5
| | | | | | Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Dave Airlie <airlied@redhat.com>
* i965/fs: handle fp64 opcodes in brw_do_channel_expressionsIago Toral Quiroga2016-05-101-9/+14
| | | | | | | | | | | | | | In the case of the pack opcode we are already doing the lowering in NIR, so no need to do it here. The unpack opcode operates on scalars, so it should not be lowered. In the case of frexp_sig and frexp_exp, they are lowered in lower_instructions, so we don't have to care about them. All the remaining opcodes involve conversions from and to doubles and are business as usual. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Switch to NIR for ldexp lowering.Kenneth Graunke2016-04-131-1/+1
| | | | | | | | | | | The old GLSL IR based lowering doesn't quite work right in all cases, and fails several dEQP-GLES31 and Vulkan CTS tests. Jason's new approach in NIR passes all the tests. There's not likely to be a ton of advantage to lowering early in GLSL IR anyway, so...switch. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965/gen7+: Use NIR for lowering of pack/unpack opcodes.Matt Turner2016-02-011-0/+8
| | | | Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
* glsl: Remove 2x16 half-precision pack/unpack opcodes.Matt Turner2016-02-011-3/+0
| | | | | | i965/fs was the only consumer, and we're now doing the lowering in NIR. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
* i965/fs: Switch from GLSL IR to NIR for un/packHalf2x16 scalarizing.Matt Turner2016-02-011-0/+4
| | | | Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
* glsl: move to compiler/Emil Velikov2016-01-261-2/+2
| | | | | | Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Matt Turner <mattst88@gmail.com> Acked-by: Jose Fonseca <jfonseca@vmware.com>
* nir: move glsl_types.{cpp,h} to compilerEmil Velikov2016-01-261-1/+1
| | | | | | | | Allows us to remove the SCons workaround :-) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Matt Turner <mattst88@gmail.com> Acked-by: Jose Fonseca <jfonseca@vmware.com>
* glsl: Delete the ir_binop_bfm and ir_triop_bfi opcodes.Kenneth Graunke2016-01-131-23/+7
| | | | | | | | | | | | | TGSI doesn't use these - it just translates ir_quadop_bitfield_insert directly. NIR can handle ir_quadop_bitfield_insert as well. These opcodes were only used for i965, and with Jason's recent patches, we can do this lowering in NIR (which also gains us SPIR-V handling). So there's not much point to retaining this GLSL IR lowering code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
* glsl: Remove ir_unop_any.Matt Turner2015-12-181-17/+0
| | | | | | | The GLSL IR to TGSI/Mesa IR paths for any_nequal have the same optimizations the ir_unop_any paths had. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
* i965: Clean up #includes in the compiler.Matt Turner2015-11-241-2/+0
| | | | Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
* nir: remove dependency on glslRob Clark2015-10-161-1/+1
| | | | | | | | | | | | | | | Move glsl_types into NIR, now that the dependency on glsl_symbol_table has been split out. Possibly makes sense to rename things at this point, but if we do that I'd like to keep it split out into a separate patch to make git history easier to follow (IMHO). v2: fix android build v3: I f***ing hate scons.. but at least it builds Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Rob Clark <robclark@freedesktop.org>
* glsl: Add parser/compiler support for unsized array's length()Samuel Iglesias Gonsalvez2015-09-251-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The unsized array length is computed with the following formula: array.length() = max((buffer_object_size - offset_of_array) / stride_of_array, 0) Of these, only the buffer size needs to be provided by the backends, the frontend already knows the values of the two other variables. This patch identifies the cases where we need to get the length of an unsized array, injecting ir_unop_ssbo_unsized_array_length expressions that will be lowered (in a later patch) to inject the formula mentioned above. It also adds the ir_unop_get_buffer_size expression that drivers will implement to provide the buffer length. v2: - Do not define a triop that will force backends to implement the entire formula, they should only need to provide the buffer size since the other values are known by the frontend (Curro). v3: - Call state->has_shader_storage_buffer_objects() in ast_function.cpp instead of using state->ARB_shader_storage_buffer_object_enable (Tapani). Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
* i965: add support for ARB_shader_subroutineDave Airlie2015-07-241-0/+1
| | | | | | | | This just adds some missing pieces to nir/i965, it is lightly tested on my Haswell. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Dave Airlie <airlied@redhat.com>
* glsl: Remove never used sin_reduced/cos_reduced.Matt Turner2015-04-061-2/+0
| | | | | | | | These were added in commit f2616e56, presumably in preparation for translating ARB vp/fp into GLSL IR. That never happened, and neither did a lowering pass that actually generated these instructions. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
* glsl: Allow vector logic ops to be generated.Matt Turner2015-03-241-6/+3
| | | | | | | They're not accessible from the source language, but optimizations are allowed to generate them. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
* Fix invalid extern "C" around header inclusion.Mark Janes2015-03-051-2/+0
| | | | | | | | | | | System headers may contain C++ declarations, which cannot be given C linkage. For this reason, include statements should never occur inside extern "C". This patch moves the C linkage statements to enclose only the declarations within a single header. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
* i965: just avoid warnings with fp64Dave Airlie2015-02-201-0/+13
| | | | | | | This just fills in some blanks to avoid warnings in the i965 driver. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Signed-off-by: Dave Airlie <airlied@redhat.com>
* i965/fs: Add support for ir_unop_saturateAbdiel Janulgue2014-08-311-0/+1
| | | | | Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
* i965/fs: Support fine/coarse derivative opcodesChris Forbes2014-08-151-0/+4
| | | | | | | | The quality level (fine/coarse/dont-care) is plumbed through to the generator as a constant in src1. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965/fs: Skip channel expressions splitting for interpolationChris Forbes2014-07-131-0/+25
| | | | | | | | The backend will have to do a message send, so we want to keep these in one piece, just like texture ops. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Use unreachable() instead of unconditional assert().Matt Turner2014-07-011-10/+5
| | | | Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
* i965: Move compiler debugging output to stderr.Eric Anholt2014-02-221-2/+2
| | | | | | Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965: Fix broken assertsChris Forbes2013-11-171-1/+1
| | | | | | | These would never fire. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Generate code for ir_binop_imul_high.Matt Turner2013-10-071-0/+1
| | | | | | v2: Make accumulator's type match the type of the operation. Noticed by Ken. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Generate code for ir_binop_carry and ir_binop_borrow.Matt Turner2013-10-071-0/+2
| | | | | | Using the ADDC and SUBB instructions on Gen7. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* glsl: Add support for ldexp.Matt Turner2013-09-171-0/+1
| | | | | v2: Drop frexp. Rebase on builtins rewrite. Reviewed-by: Paul Berry <stereotype441@gmail.com>
* i965: Add support for ir_triop_csel.Matt Turner2013-09-091-0/+1
| | | | | Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
* i965/fs: Add support for translating ir_triop_fma into MAD.Matt Turner2013-08-271-0/+1
| | | | Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
* i965: Add cases for ir_triop_vector_insert that assert.Kenneth Graunke2013-05-201-0/+1
| | | | | | | | | | brw_link_shader() unconditionally calls lower_vector_insert() with true as the second parameter. This means that both constant and variable indexed expressions will get lowered, so we should never see this in the backend. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965: Add cases for ir_binop_vector_extract that assert.Kenneth Graunke2013-05-201-0/+1
| | | | | | | | | | | | | | | do_vec_index_to_swizzle() should remove any vector extract operations with a constant index. It's unconditionally called from do_common_optimization(). do_vec_index_to_cond_assign() should remove the rest, and it is unconditionally called from brw_link_shader(). This means that we should never see ir_binop_vector_extract in the backend. Silences compiler warnings. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
* i965/fs: Add support for bit instructions.Matt Turner2013-05-061-0/+37
| | | | | | | | | | | | Don't bother scalarizing ir_binop_bfm, since its results are identical for all channels. v2: Subtract result of FBH from 31 (unless an error) to convert MSB counts to LSB counts. v3: Use op0->clone() in ir_triop_bfi to prevent (var_ref channel_expressions) from appearing multiple times in the IR. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> [v2]
* i965/fs: Use the LRP instruction for ir_triop_lrp when possible.Kenneth Graunke2013-02-281-1/+15
| | | | | | | | | | | | | | | | | | | v2 [mattst88]: - Add BRW_OPCODE_LRP to list of CSE-able expressions. - Fix op_var[] array size. - Rename arguments to emit_lrp to (x, y, a) to clear confusion. - Add LRP function to brw_fs.cpp/.h. - Corrected comment about LRP instruction arguments in emit_lrp. v3 [mattst88]: - Duplicate MAD code for LRP instead of using a function pointer. - Check for != GRF instead of == IMM in emit_lrp. - Lower LRP on gen < 6. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> 1
* i965: Assert that the 4x8 pack/unpack operations have been loweredMatt Turner2013-01-251-0/+4
| | | | | Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>
* i965/fs/gen7: Emit code for GLSL 3.00 pack/unpack operations (v4)Chad Versace2013-01-241-0/+12
| | | | | | | | | | | | | v2: Remove lewd comment. [for idr] v3: - Optimize away tmp register for packHalf2x16. [for anholt, paul] - Improve comments. [for anholt, paul] - Reduce near-duplicate code by removing vec4_visitor emit_pack/unpack methods. [for chadv] v4: Factor our UD/W register conversion into helper function. [for anholt] Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v2) Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
* glsl: Add a "ubo_load" expression type for fetches from UBOs.Eric Anholt2012-08-071-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Drivers will probably want to be able to take UBO references in a shader like: uniform ubo1 { float a; float b; float c; float d; } void main() { gl_FragColor = vec4(a, b, c, d); } and generate a single aligned vec4 load out of the UBO. For intel, this involves recognizing the shared offset of the aligned loads and CSEing them out. Obviously that involves breaking things down to loads from an offset from a particular UBO first. Thus, the driver doesn't want to see variable_ref(ir_variable("a")), and even more so does it not want to see array_ref(record_ref(variable_ref(ir_variable("a")), "field1"), variable_ref(ir_variable("i"))). where a.field1[i] is a row_major matrix. Instead, we're going to make a lowering pass to break UBO references down to expressions that are obvious to codegen, and amenable to merging through CSE. v2: Fix some partial thoughts in the ir_binop comment (review by Kenneth) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Add support for ir_unop_f2u to i965 backend.Paul Berry2012-06-151-0/+1
| | | | Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965: Add forgotten bitcast operations in brw_fs_channel_expressions.Kenneth Graunke2012-06-071-0/+4
| | | | Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
* intel: Convert from GLboolean to 'bool' from stdbool.h.Kenneth Graunke2011-10-181-1/+1
| | | | | | | | | | | | | | | | | I initially produced the patch using this bash command: for file in {intel,i915,i965}/*.{c,cpp,h}; do [ ! -h $file ] && sed -i 's/GLboolean/bool/g' $file && sed -i 's/GL_TRUE/true/g' $file && sed -i 's/GL_FALSE/false/g' $file; done Then I manually added #include <stdbool.h> to fix compilation errors, and converted a few functions back to GLboolean that were used in core Mesa's function pointer table to avoid "incompatible pointer" warnings. Finally, I cleaned up some whitespace issues introduced by the change. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chad Versace <chad@chad-versace.us> Acked-by: Paul Berry <stereotype441@gmail.com>
* i965: Fix Android build by removing relative includesChad Versace2011-08-301-3/+3
| | | | | | | | | | Replace each occurence of #include "../glsl/*.h" with #include "glsl/*.h" Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Chad Versace <chad@chad-versace.us>
* i965/fs: Implement new ir_unop_u2i and ir_unop_i2u opcodes.Kenneth Graunke2011-06-291-0/+2
| | | | | | | | | | No MOV is necessary since signed/unsigned integers share the same bit-representation; it's simply a question of interpretation. In particular, the fs_reg::imm union shouldn't need updating. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>
* Convert everything from the talloc API to the ralloc API.Kenneth Graunke2011-01-311-1/+1
|
* i965: Fix compile warning about missing opcodes.Eric Anholt2010-12-041-0/+5
|
* glsl: Remove the ir_binop_cross opcode.Kenneth Graunke2010-11-171-28/+0
|
* i965: Handle new ir_unop_round_even in channel expression splitting.Eric Anholt2010-10-261-0/+1
|
* i965: Move FS backend structures to a header.Eric Anholt2010-10-111-2/+0
| | | | It's time to start splitting some of this up.
* i965: Update expression splitting for the vector-result change to compares.Eric Anholt2010-09-221-8/+9
| | | | | | Fixes: glsl1-precision exp2 glsl1-precision log2
* i965: Fix the vector/expression splitting for the write_mask change.Eric Anholt2010-09-221-4/+1
| | | | +113 piglits.
* i965: Add switch cases for ir_unop_noise, which should have been lowered.Eric Anholt2010-09-091-0/+3
| | | | Fixes compiler warnings.
* i965: Add a pass for the FS to reduce vector expressions down to scalar.Eric Anholt2010-08-261-0/+365
This is a step towards implementing a GLSL IR backend for the 965 fragment shader. Because it has downsides with the current codegen, it is hidden under the environment variable INTEL_NEW_FS. This results in an increase in instruction count at the moment (1444 -> 1752 for glsl-fs-raytrace, 345 -> 359 on my demo), because dot products are turned into a series of multiplies and adds instead of a custom expansion of MULs and MACs, and by not splitting the variable types up we don't get tree grafting and thus there are extra moves of temporary storage. However, register count drops for the non-GLSL path (64 -> 56 on my demo shader) because the register allocator sees all the sub-operations.