| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix the linker to deal with intrinsic functions which are undefined
all the way down to the driver back-end, and introduce intrinsic
definition helpers in the built-in generator.
We still need to figure out what kind of interface we want for drivers
to communicate to the GLSL front-end which of the supported intrinsics
should use a default GLSL implementation and which should use a
hardware-specific override. As there's no default GLSL implementation
for atomic ops, this seems like something we can worry about later on.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
v2: Define local helper function to generate ir_call nodes in the
builtin generator.
|
|
|
|
|
|
|
|
|
| |
v2: Fix GLSL version in which the type became available. Add
contains_atomic() convenience method. Split off atomic counter
comparison error checking to a separate patch that will handle all
opaque types. Include new ir_variable fields for atomic types.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
|
|
|
|
|
|
|
|
|
| |
Future patches will need to call this function when there isn't an
ir_varible present to refer to.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ever since the addition of interface blocks with instance names, we have
had an implicit invariant:
var->type->is_interface() ==
(var->type == var->interface_type)
The odd use of == here is intentional because !var->type->is_interface()
implies var->type != var->interface_type.
Further, if var->type->is_array() is true, we have a related implicit
invariant:
var->type->fields.array->is_interface() ==
(var->type->fields.array == var->interface_type)
However, the ir_variable constructor doesn't maintain either invariant.
That seems kind of silly... and I tripped over it while writing some
other code. This patch makes the constructor do the right thing, and it
introduces some tests to verify that behavior.
v2: Add general-ir-test to .gitignore. Update the description of the
ir_variable invariant for arrays in the commit message. Both suggested
by Paul.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
|
|
|
|
|
|
|
|
|
| |
For interface blocks that contain arrays, this field will contain the
maximum element of each contained array that is accessed by the
shader. This is a first step toward supporting unsized arrays in
interface blocks.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
|
|
|
|
|
|
|
|
| |
These built-ins have two "out" parameters, which makes implementing them
efficiently with our current compiler infrastructure difficult. Instead,
implement them in terms of the existing ir_binop_mul IR (to return the
low 32-bits) and a new ir_binop_mul64 which returns the high 32-bits.
v2: Rename mul64 -> imul_high as suggested by Ken.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
| |
Calculates the carry out of the addition of two values and the
borrow from subtraction respectively. Will be used in uaddCarry() and
usubBorrow() built-in implementations.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
|
|
|
|
|
| |
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
|
|
|
|
|
|
|
|
|
|
| |
V2 [Chris Forbes]:
- Add new pattern, fixup parameter reading.
V3: Rebase onto new builtins machinery
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
|
|
|
|
|
|
|
|
| |
Note the parameter name change in the int version of ir_constant, to
avoid the conflict with the loop iterator.
v2: Make analogous change to builtin_builder::imm().
Reviewed-by: Paul Berry <stereotype441@gmail.com>
|
|
|
|
|
| |
v2: Drop frexp. Rebase on builtins rewrite.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
| |
It's a ?: that operates per-component on vectors. Will be used in
upcoming lowering pass for ldexp and the implementation of frexp.
csel(selector, a, b):
per-component result = selector ? a : b
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
builtin_info was originally going to be a structure containing a bunch
of information, but after various rewrites, it turned into a boolean
availability predicate.
builtin_avail is a better name than builtin_info, since it doesn't
store any information other than availability.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
|
|
|
|
|
|
|
|
| |
Matt noticed that this was missing. Nothing uses this currently.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
We already have ir_expression constructors for unary and binary
operations, which automatically infer the type based on the opcode and
operand types.
These are convenient and also required for ir_builder support.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
|
|
|
|
|
|
| |
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
| |
We can simply call the stored predicate function. If state is NULL,
just report that the function is available.
v2: Add a comment (requested by Paul Berry).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
| |
A signature is a built-in if and only if builtin_info != NULL, so we
don't actually need a separate flag bit. Making a boolean-valued
method allows existing code to ask the same question while not worrying
about the internal representation.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For the upcoming built-in function rewrite, we'll need to be able to
answer "Is this built-in function signature available?".
This is actually a somewhat complex question, since it depends on the
language version, GLSL vs. GLSL ES, enabled extensions, and the current
shader stage.
Storing such a set of constraints in a structure would be painful, so
instead we store a function pointer. When creating a signature, we
simply point to a predicate that inspects _mesa_glsl_parse_state and
answers whether the signature is available in the current shader.
Unfortunately, IR reader doesn't actually know when built-in functions
are available, so this patch makes it lie and say that they're always
present. This allows us to hook up the new functionality; it just won't
be useful until real data is populated. In the meantime, the existing
profile mechanism ensures built-ins are available in the right places.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
|
|
|
|
|
|
| |
v2: Add constant folding support.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit adds all of the parsing and semantics for GLSL 150 style
geometry shaders.
v2 (Paul Berry <stereotype441@gmail.com>): Add a few missing calls to
get_pipeline_stage(). Fix some signed/unsigned comparison warnings.
Fix handling of NULL consumer in assign_varying_locations().
v3 (Bryan Cain <bryancain3@gmail.com>): fix indexing order of 2D
arrays. Also, allow interpolation qualifiers in geometry shaders.
v4 (Paul Berry <stereotype441@gmail.com>): Eliminate
get_pipeline_stage()--it is no longer needed thanks to 030ca23 (mesa:
renumber shader indices according to their placement in pipeline).
Remove 2D stuff. Move vertices_per_prim() to ir.h, so that it will be
accessible from outside the linker. Remove
inject_num_vertices_visitor. Rework for GLSL 1.50.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
v5 (Paul Berry <stereotype441@gmail.com>): Split out
do_set_program_inouts() argument refactoring to a separate patch.
Move geom_array_resizing_visitor to later in the series.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The new opcode is used to generate a new vector with a single field from
the source vector replaced. This will eventually replace
ir_dereference_array of vectors in the LHS of assignments.
v2: Convert tabs to spaces. Suggested by Eric.
v3: Add constant expression handling for ir_triop_vector_insert. This
prevents the constant matrix inversion tests from regressing. Duh.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The new opcode is used to get a single field from a vector. The field
index may not be constant. This will eventually replace
ir_dereference_array of vectors. This is similar to the extractelement
instruction in LLVM IR.
http://llvm.org/docs/LangRef.html#extractelement-instruction
v2: Convert tabs to spaces. Suggested by Eric.
v3: Add array index range checking to ir_binop_vector_extract constant
expression handling. Suggested by Ken.
v4: Use CLAMP instead of MIN2(MAX2()). Suggested by Ken.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
| |
i965/Gen7+ and Radeon/Evergreen+ have bfm/bfi instructions to implement
bitfieldInsert() from ARB_gpu_shader5.
v2: Add ir_binop_bfm and ir_triop_bfi to st_glsl_to_tgsi.cpp.
Remove spurious temporary assignment and dereference.
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
|
|
|
|
|
|
| |
v2: Move use of ir_binop_bfm and ir_triop_bfi to a later patch.
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2 [mattst88]:
- Rebase.
- #define GL_ARB_texture_query_lod to 1.
- Remove comma after ir_lod in ir.h for MSVC.
- Handled ir_lod in ir_hv_accept.cpp, ir_rvalue_visitor.cpp,
opt_tree_grafting.cpp.
- Rename textureQueryLOD to textureQueryLod, see
https://www.khronos.org/bugzilla/show_bug.cgi?id=821
- Fix ir_reader of (lod ...).
v3 [mattst88]:
- Rename textureQueryLod to textureQueryLOD, pending resolution of
Khronos 821.
- Add ir_lod case to ir_to_mesa.cpp.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch makes the following search-and-replace changes:
gl_frag_attrib -> gl_varying_slot
FRAG_ATTRIB_* -> VARYING_SLOT_*
FRAG_BIT_* -> VARYING_BIT_*
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
V2: - emit `sample` parameter properly for multisample texelFetch()
- fix spurious whitespace change
- introduce a new opcode ir_txf_ms rather than overloading the
existing ir_txf further. This makes doing the right thing in
the driver somewhat simpler.
V3: - fix weird whitespace
V4: - don't forget to include the new opcode in tex_opcode_strs[]
(thanks Kenneth for spotting this)
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
[V2] Reviewed-by: Eric Anholt <eric@anholt.net>
[V2] Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Many GPUs have an instruction to do linear interpolation which is more
efficient than simply performing the algebra necessary (two multiplies,
an add, and a subtract).
Pattern matching or peepholing this is more desirable, but can be
tricky. By using an opcode, we can at least make shaders which use the
mix() built-in get the more efficient behavior.
Currently, all consumers lower ir_triop_lrp. Subsequent patches will
actually generate different code.
v2 [mattst88]:
- Add LRP_TO_ARITH flag to ir_to_mesa.cpp. Will be removed in a
subsequent patch and ir_triop_lrp translated directly.
v3 [mattst88]:
- Move changes from the next patch to opt_algebraic.cpp to accept
3-src operations.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we had separate constructors for one, two, and four operand
expressions. This patch consolidates them into a single constructor
which uses NULL default parameters.
The unary and binary operator constructors had assertions to verify that
the caller supplied the correct number of operands for the expression,
but the four-operand version did not. Since get_num_operands for
ir_quadop_vector returns the number of vector_elements, we can safely
add that without breaking the semantics of ir_quadop_vector.
This also paves the way for expressions with three operands. Currently,
none can be constructed since get_num_operands() never returns 3.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
| |
For each function {pack,unpack}{Snorm,Unorm}4x8, add a corresponding
opcode to enum ir_expression_operation. Validate the new opcodes in
ir_validate.cpp.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
|
|
|
|
|
|
|
|
|
| |
v2: A previous patch contained a spurious hunk that removed an
assignment to ir_variable::uniform_block. That hunk was moved to this
patch. Suggested by Carl Worth.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
| |
In ir_expression's constructor, the cases for {bit,logic}_{and,or,xor}
failed to handle the case when both operands were vectors.
Note: This is a candidate for the stable branches.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Replace tabs with spaces. According to docs/devinfo.html, Mesa's
indetation style is:
indent -br -i3 -npcs --no-tabs infile.c -o outfile.c
This patch prevents whitespace weirdness in the next patch.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For each function {pack,unpack}{Snorm,Unorm,Half}2x16, add a corresponding
opcode to enum ir_expression_operation. Validate the new opcodes in
ir_validate.cpp.
Also, add opcodes for scalarized variants of the Half2x16 functions. (The
code generator for the i965 fragment shader requires that all vector
operations be scalarized. A lowering pass, to be added later, will
scalarize the Half2x16 functions).
v2: Fix assertion message in ir_to_mesa [for idr].
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Tuner <mattst88@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch replaces the three ir_variable_mode enums:
- ir_var_in
- ir_var_out
- ir_var_inout
with the following five:
- ir_var_shader_in
- ir_var_shader_out
- ir_var_function_in
- ir_var_function_out
- ir_var_function_inout
This eliminates a frustrating ambiguity: it used to be impossible to
tell whether an ir_var_{in,out} variable was a shader in/out or a
function in/out without seeing where the variable was declared in the
IR. This complicated some optimization and lowering passes, and would
have become a problem for implementing varying structs.
In the lisp-style serialization of GLSL IR to strings performed by
ir_print_visitor.cpp and ir_reader.cpp, I've retained the names "in",
"out", and "inout" for function parameters, to avoid introducing code
churn to the src/glsl/builtins/ir/ directory.
Note: a couple of comments in the code seemed to indicate that we were
planning for a possible future in which geometry shaders could have
shader-scope inout variables. Our GLSL grammar rejects shader-scope
inout variables, and I've been unable to find any evidence in the GLSL
standards documents (or extensions) that this will ever be allowed, so
I've eliminated these comments.
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, the location of each varying is recorded in ir_variable as
a multiple of the size of a vec4. In order to pack varyings, we need
to be able to record, e.g. that a vec2 is stored in the second half of
a varying slot rather than the first half.
This patch introduces a field ir_variable::location_frac, which
represents the offset within a vec4 where a varying's value is stored.
Varyings that are not subject to packing will always have a
location_frac value of zero.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
|
|
|
|
| |
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
|
|
|
|
|
|
|
|
| |
No reason for this to be global from what I can see
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Drivers will probably want to be able to take UBO references in a
shader like:
uniform ubo1 {
float a;
float b;
float c;
float d;
}
void main() {
gl_FragColor = vec4(a, b, c, d);
}
and generate a single aligned vec4 load out of the UBO. For intel,
this involves recognizing the shared offset of the aligned loads and
CSEing them out. Obviously that involves breaking things down to
loads from an offset from a particular UBO first. Thus, the driver
doesn't want to see
variable_ref(ir_variable("a")),
and even more so does it not want to see
array_ref(record_ref(variable_ref(ir_variable("a")),
"field1"), variable_ref(ir_variable("i"))).
where a.field1[i] is a row_major matrix.
Instead, we're going to make a lowering pass to break UBO references
down to expressions that are obvious to codegen, and amenable to
merging through CSE.
v2: Fix some partial thoughts in the ir_binop comment (review by Kenneth)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
| |
We're going to need this structure to cross-validate the uniform
blocks between shader stages, since unused ir_variables might get
dropped. It's also the place we store the RowMajor qualifier, which
is not part of the GLSL type (since that would cause a bunch of type
equality checks to fail).
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we performed conversions from float->uint by a two step
process: float->int->uint. However, on platforms that use saturating
conversions (e.g. i965), this didn't work, because if the source value
was larger than the maximum representable int (0x7fffffff), then
converting it to an int would clamp it to 0x7fffffff.
This patch just adds the new opcode; further patches will adapt
optimization passes and back-ends to use it, and then finally the
ast_to_hir logic will be modified to emit the new opcode.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
| |
Determines whether it's a basis vector, i.e., a vector with one element
equal to 1 and all other elements equal to 0.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
| |
Signed-off-by: Olivier Galibert <galibert@pobox.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
| |
The opcodes are bitcast_f2u, bitcast_f2i, bitcast_i2f and bitcast_u2f.
Signed-off-by: Olivier Galibert <galibert@pobox.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
| |
This points to the object with the function body, allowing us to map
from a built-in prototype to the actual body with IR code to execute.
Signed-off-by: Olivier Galibert <galibert@pobox.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- copy_masked_offset copies part of a constant into another,
assign-like.
- copy_offset copies a constant into (a subset of) another,
funcall-return like.
These methods are to be used to trace through assignments and function
calls when computing a constant expression.
Signed-off-by: Olivier Galibert <galibert@pobox.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net> [v1]
|
|
|
|
|
|
| |
Signed-off-by: Olivier Galibert <galibert@pobox.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net> [v1]
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, set_callee() performed some assertions about the type of the
ir_call; protecting the bare pointer ensured these checks would be run.
However, ir_call no longer has a type, so the getter and setter methods
don't actually do anything useful. Remove them in favor of accessing
callee directly, as is done with most other fields in our IR.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Aside from ir_call, our IR is cleanly split into two classes:
- Statements (typeless; used for side effects, control flow)
- Values (deeply nestable, pure, typed expression trees)
Unfortunately, ir_call confused all this:
- For void functions, we placed ir_call directly in the instruction
stream, treating it as an untyped statement. Yet, it was a subclass
of ir_rvalue, and no other ir_rvalue could be used in this way.
- For functions with a return value, ir_call could be placed in
arbitrary expression trees. While this fit naturally with the source
language, it meant that expressions might not be pure, making it
difficult to transform and optimize them. To combat this, we always
emitted ir_call directly in the RHS of an ir_assignment, only using
a temporary variable in expression trees. Many passes relied on this
assumption; the acos and atan built-ins violated it.
This patch makes ir_call a statement (ir_instruction) rather than a
value (ir_rvalue). Non-void calls now take a ir_dereference of a
variable, and store the return value there---effectively a call and
assignment rolled into one. They cannot be embedded in expressions.
All expression trees are now pure, without exception.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
|