summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/freedreno/ir3/ir3_group.c
Commit message (Collapse)AuthorAgeFilesLines
* freedreno/ir3: relax restriction in groupingRob Clark2016-04-241-3/+5
| | | | | | | | | | Currently we were two restrictive, and would insert an output move in cases like: MOV OUT[0], IN[0].xyzw Loosen the restriction to allow the current instruction to appear in the neighbor list but only at it's current possition. Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/ir3: fix grouping issue w/ reverse swizzlesRob Clark2016-04-181-1/+17
| | | | | | | | | | | | | | | | | | | | | | | When we have something like: MOV OUT[n], IN[m].wzyx the existing grouping code was missing a potential conflict. Due to input needing to be sequential scalar regs, we have: IN: x <-> y <-> z <-> w which would be grouped to: OUT: w <-> z2 <-> y2 <-> x (where the 2 denotes a copy/mov) but that can't actually work. We need to realize that x and w are already in the same chain, not just that they aren't both already in new chain being built. With this fixed, we probably no longer need the hack from f68f6c0. Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/ir3: drop unused instr category argRob Clark2016-04-041-1/+1
| | | | | | No longer used, so drop the extra arg to ir3_instr_create() Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/ir3: remove ir3_instruction::categoryRob Clark2016-04-041-4/+3
| | | | Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/ir3: 'keeps' need neighbors found tooRob Clark2015-08-121-0/+5
| | | | | | | | | This shows up with a glamor shader, which does a TXF and uses the result for conditional kill. Before we wouldn't group the fanin (collect) neighbors which need to be allocated adjacently at RA, resulting in badness. Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/ir3: block reshuffling and loops!Rob Clark2015-06-211-2/+5
| | | | | | | | | | | | | | | | This shuffles things around to allow the shader to have multiple basic blocks. We drop the entire CFG structure from nir and just preserve the blocks. At scheduling we know whether to schedule conditional branches or unconditional jumps at the end of the block based on the # of block successors. (Dropping jumps to the following instruction, etc.) One slight complication is that variables (load_var/store_var, ie. arrays) are not in SSA form, so we have to figure out where to put the phi's ourself. For this, we use the predecessor set information from nir_block. (We could perhaps use NIR's dominance frontier information to help with this?) Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/ir3: simplify find_neighbors stop conditionRob Clark2015-06-211-17/+1
| | | | Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/ir3: move inputs/outputs to shaderRob Clark2015-06-211-15/+21
| | | | | | | | | These belong in the shader, rather than the block. Mostly a lot of churn and nothing too interesting. But splitting this out from the rest of ir3_block reshuffling to cut down the noise in the later patch. Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/ir3: remove tgsi f/eRob Clark2015-06-211-22/+20
| | | | | | Also remove ir3_flatten which was only used by tgsi f/e. Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/ir3: more builder helpersRob Clark2015-06-211-17/+6
| | | | | | | | Use ir3_MOV() builder in a couple of spots, rather than open-coding the instruction construction. Also add ir3_NOP() builder and use that instead of open coding. Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/ir3: handle const/immed/abs/neg in cpRob Clark2015-04-051-6/+0
| | | | | | | | | | | | | Be smarter about propagating copies from const or immed, or with abs/neg modifiers. Also, realize that absneg.s and absneg.f are really "fancy" mov instructions. This opens up the possibility to remove more copies. It helps the TGSI frontend a bit, but will be really needed for the NIR f/e which builds everything up in SSA form (ie. will *always* insert a mov from const or immediate). Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/ir3: helpful iterator macrosRob Clark2015-03-081-6/+3
| | | | | | | | | I remembered that we are using c99.. which makes some sugary iterator macros easier. So introduce iterator macros to iterate all src registers and all SSA src instructions. The _n variants also return the src #, since there are a handful of places that need this. Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/ir3: fix failed assert in groupingRob Clark2015-03-081-27/+44
| | | | | | | | | | | | | | | | | | | | | Turns out there are scenarios where we need to insert mov's in "front" of an input. Triggered by shaders like: VERT DCL IN[0] DCL IN[1] DCL OUT[0], POSITION DCL OUT[1], GENERIC[9] DCL SAMP[0] DCL TEMP[0], LOCAL 0: MOV TEMP[0].xy, IN[1].xyyy 1: MOV TEMP[0].w, IN[1].wwww 2: TXF TEMP[0], TEMP[0], SAMP[0], 1D_ARRAY 3: MOV OUT[1], TEMP[0] 4: MOV OUT[0], IN[0] 5: END Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/ir3: handle "holes" in inputsRob Clark2015-01-131-1/+31
| | | | | | | | | | | | If, for example, only the x/y/w components of in.xyzw are actually used, we still need to have a group of four registers and assign all four components. The hardware can't write in.xy and in.w to discontiguous registers. To handle this, pad with a dummy NOP instruction, to keep the neighbor chain contiguous. This fixes a problem noticed with firefox OMTC. Signed-off-by: Rob Clark <robclark@freedesktop.org>
* freedreno/ir3: simplify RARob Clark2015-01-071-0/+228
Group inputs/outputs, in addition to fanin/fanout, as they must also exist in sequential scalar registers. This lets us simplify RA by working in terms of neighbor groups. NOTE: has the slight problem that it can't optimize out mov's for things like: MOV OUT[n], IN[m] To avoid this, instead of trying to figure out what mov's we can eliminate, we first remove all mov's prior to grouping, and then re-insert mov's as needed while grouping inputs/outputs/fanins. Eventually we'd prefer the frontend to not insert extra mov's in the first place (so we don't have to bother removing them). This is the plan for an eventual NIR based frontend, so separate out the instr grouping (which will still be needed for NIR frontend) from the mov elimination (which won't). Signed-off-by: Rob Clark <robclark@freedesktop.org>