SPA 5.0:
PRET{.NOINC}
#OffsetS24
{&req_6}
{?sched}
;
PRET{.NOINC}
c[#BankU05][#AddrU16]
{&req_6}
{?sched}
;
// This form is not patchable and is deprecated
.NOINC: push is not counted against the ApiCallLimit maximum call nesting depth.
.test: { .F, .LT, .EQ, .LE, .GT, .NE, .GE, .NUM,                               Signed numeric tests
         .NAN, .LTU, .EQU, .LEU, .GTU, .NEU, .GEU, .T,                         Signed or Unordered tests
         .OFF, .LO, .SFF, .LS, .HI, .SFT, .HS, .OFT,                           Unsigned integer tests
         .CSM_TA, .CSM_TR, .CSM_MX, .FCSM_TA, .FCSM_TR, .FCSM_MX, .RLE, .RGT } Clip State Machine tests
{@{!}Pg}
RET
{CC.test}
{&req_6}
{?sched>=?WAIT5}
;
Unconditional prepare-for-return.
This instruction is intended to be used in conjunction with BRX or JMX to support indirect calls. PRET does everything a CAL does, except the actual control transfer. Specifically, PRET pushes a return address, optionally subject to call nesting limits.
PRET uses a relative (signed) PC target address. The target address is in bytes, not instructions or words, and is relative to the PC of the instruction immediately following the PRET.
When a PRET is executed, the .NOINC suffix controls whether it is counted towards the ApiCallLimit maximum call nesting level (it is counted by default; .NOINC suppresses the count). If the call limit is exceeded, PRET records the call depth overflow in a special state bit and does not update the stack.
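As a rough illustration of the address arithmetic described above, the following Python sketch sign-extends a 24-bit offset (the #OffsetS24 field) and adds it to the PC of the instruction following the PRET. The instruction size is not stated in this section, so `instr_bytes` is a hypothetical parameter, and the function names are invented for illustration only.

```python
def sign_extend(value: int, bits: int = 24) -> int:
    """Interpret the low `bits` bits of `value` as a two's-complement integer."""
    mask = (1 << bits) - 1
    value &= mask
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


def pret_return_address(pret_pc: int, offset_s24: int, instr_bytes: int) -> int:
    """Compute the byte address a PRET at `pret_pc` would push.

    `instr_bytes` is an assumption: the encoding size is not given here.
    """
    next_pc = pret_pc + instr_bytes           # PC of the instruction after PRET
    return next_pc + sign_extend(offset_s24)  # signed byte offset from there
```

With a hypothetical 8-byte encoding, `pret_return_address(0x100, 0xFFFFF8, 8)` yields `0x100`, because the field `0xFFFFF8` sign-extends to -8 bytes.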
A subsequent BRX.LMT or JMX.LMT should check this state bit, which will cause the branch to be converted into a NOP when the bit is set. See the BRX/JMX description for details.
Note that there is only one special state bit per warp, shared by all threads in the warp, so it is critical that there be no warp divergence between the PRET and the subsequent BRX/JMX. In general, there should be no intervening branch-type instructions at all between a PRET and the subsequent branch, which trivially satisfies the no-divergence requirement.
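The interaction between PRET, the call limit, and a following BRX.LMT can be sketched as a small behavioral model in Python. This is not a definitive description of the hardware: the class and function names are hypothetical, the exact comparison used to decide that the limit is "exceeded" is an assumption, and whether the state bit is cleared by the branch is left unmodeled (see the BRX/JMX description).

```python
from dataclasses import dataclass, field


@dataclass
class WarpState:
    api_call_limit: int                 # models ApiCallLimit
    call_depth: int = 0
    overflow: bool = False              # the single per-warp special state bit
    stack: list = field(default_factory=list)


def pret(warp: WarpState, return_addr: int, noinc: bool = False) -> None:
    """Push a return address, subject to the nesting limit unless .NOINC."""
    if not noinc:
        # Assumption: the limit is exceeded once call_depth has reached it.
        if warp.call_depth >= warp.api_call_limit:
            warp.overflow = True        # record the overflow ...
            return                      # ... and leave the stack untouched
        warp.call_depth += 1
    warp.stack.append(return_addr)


def brx_lmt(warp: WarpState, pc_next: int, target: int) -> int:
    """Return the next PC: the target, or fall through (NOP) on overflow."""
    return pc_next if warp.overflow else target
```

Because the model keeps one `overflow` flag per `WarpState`, it also shows why divergence between the PRET and the branch is a problem: every thread in the warp observes the same bit.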
Conditional return from subroutine.
The return condition is the logical AND of a predicate and a condition code test. To return on the predicate alone, use CC.TRUE. To return based only on CC, use PT as the predicate.
Subroutines can return from within nested control flow, including loops and if-then-elses.
If the matching PRET or CAL incremented the call depth counter towards a maximum user call depth, the RET will decrement the call depth counter when it pops the return address from the stack.
A RET on an empty stack, e.g. without a matching CAL, will act like an EXIT and terminate the program. This includes checking for any barriers waiting on all threads. See EXIT for full details.
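The RET behavior described above (condition check, call depth bookkeeping, and the empty-stack case) can be sketched in Python as follows. This is a minimal model under stated assumptions: all names are hypothetical, storing a `counted` flag with each stack entry is just one way to remember whether the matching PRET or CAL incremented the counter, and the barrier checks performed by EXIT are not modeled.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class WarpState:
    call_depth: int = 0
    exited: bool = False
    # Each entry is (return_addr, counted). Keeping `counted` per entry is an
    # assumption made for this sketch, not something the text specifies.
    stack: list = field(default_factory=list)


def ret(warp: WarpState, pc_next: int,
        predicate: bool = True, cc_test: bool = True) -> Optional[int]:
    """Return the next PC, or None if the program terminates."""
    if not (predicate and cc_test):
        return pc_next                  # condition false: no return taken
    if not warp.stack:
        warp.exited = True              # empty stack: behaves like EXIT
        return None
    return_addr, counted = warp.stack.pop()
    if counted:
        warp.call_depth -= 1            # undo the matching increment
    return return_addr
```

Passing `predicate=True` corresponds to using PT, and passing `cc_test=True` corresponds to using CC.TRUE, so the default call models an unconditional `RET;`.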
PRET MONGO;
BRX.LMT R0;
MONGO:
FMAD ...
RET;
RET CC.LT;