SPA 5.0:
SETCRSPTR
Ra
{&req_6}
{&rdN}
{?sched>=?WAIT5}
;
Control flow operations use a per-warp, hardware-managed stack of call/return tokens. SETCRSPTR overwrites the current warp's stack pointer state with the value in Ra.
Cached stack entries in the SM are not written back before the stack pointer state is updated. The stack cache should first be flushed using the CCTLL.CRS.WBALL instruction.
The warp's stack pointer is state shared among all threads of the warp: if multiple threads execute SETCRSPTR within a warp, HW arbitrarily selects one thread to execute the instruction.
The value of Ra must have the following format:
bits     field
--------------------------
16: 0    curPhysStackDepth
22:17    reserved
30:23    curApiCallDepth
31:31    KillFutureBranch
'curPhysStackDepth' is the number of tokens currently on the stack, rounded up to the next multiple of 4. For example, if the instruction stream consisted solely of 125 SSY instructions, then SETCRSPTR should use a value of 'curPhysStackDepth' of 128. The value of 'curPhysStackDepth' is independent of the state of the physical stack cache that the SM may implement. If the value of 'curPhysStackDepth' does not correspond to the correct number of entries on the CRS, then future UnwindStack() may not be performed correctly. Moreover, 'curPhysStackDepth' may be clamped, as described below.
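The rounding rule for 'curPhysStackDepth' can be illustrated with a short sketch. The helper name below is hypothetical and not part of any toolchain. It assumes that a token count that is already a multiple of 4 is left unchanged, which the text above does not state explicitly.

```python
def round_up_to_multiple_of_4(token_count: int) -> int:
    """Round a CRS token count up to a multiple of 4.

    Assumption: a count that is already a multiple of 4 is returned as is.
    """
    if token_count < 0:
        raise ValueError("token count must be non-negative")
    # Add 3, then clear the two low bits.
    return (token_count + 3) & ~3


# The example from the text: 125 SSY tokens give a depth of 128.
print(round_up_to_multiple_of_4(125))
```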
'curApiCallDepth' is the current depth of the API-visible call-stack, in entries. 'curApiCallDepth' is updated by the non-.NOINC variants of CAL, JCAL, and PRET. If the value of 'curApiCallDepth' does not correspond to the correct number of entries on the CRS, then future incrementing stack operations may fail or future UnwindStack() may not be performed correctly.
'KillFutureBranch' is the state of the result of the limit check performed by the non-.NOINC variant of PRET. This state is only used to control subsequent BRA.LMT or JMP.LMT instructions. If the value of 'KillFutureBranch' does not correspond to the correct state of the CRS, then future API-limited branches may fail or future UnwindStack() may not be performed correctly.
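As an illustration of the Ra layout, the following sketch packs and unpacks the three fields at the bit positions given in the table. The type and function names are hypothetical. Writing the reserved bits 22:17 as zero is an assumption, since the text does not say what value they must hold.

```python
from typing import NamedTuple

PHYS_DEPTH_BITS = 17   # bits 16:0
API_DEPTH_SHIFT = 23   # bits 30:23
API_DEPTH_BITS = 8
KILL_SHIFT = 31        # bit 31


class CrsPointer(NamedTuple):
    """Hypothetical container for the fields of the Ra operand."""
    cur_phys_stack_depth: int
    cur_api_call_depth: int
    kill_future_branch: bool


def pack_crs_pointer(p: CrsPointer) -> int:
    """Build a 32-bit Ra value; reserved bits 22:17 are left zero (assumption)."""
    if not 0 <= p.cur_phys_stack_depth < (1 << PHYS_DEPTH_BITS):
        raise ValueError("curPhysStackDepth does not fit in bits 16:0")
    if not 0 <= p.cur_api_call_depth < (1 << API_DEPTH_BITS):
        raise ValueError("curApiCallDepth does not fit in bits 30:23")
    return (
        p.cur_phys_stack_depth
        | (p.cur_api_call_depth << API_DEPTH_SHIFT)
        | (int(p.kill_future_branch) << KILL_SHIFT)
    )


def unpack_crs_pointer(ra: int) -> CrsPointer:
    """Split a 32-bit Ra value into its fields, ignoring the reserved bits."""
    return CrsPointer(
        cur_phys_stack_depth=ra & ((1 << PHYS_DEPTH_BITS) - 1),
        cur_api_call_depth=(ra >> API_DEPTH_SHIFT) & ((1 << API_DEPTH_BITS) - 1),
        kill_future_branch=bool((ra >> KILL_SHIFT) & 1),
    )


value = pack_crs_pointer(CrsPointer(128, 3, True))
print(hex(value))
print(unpack_crs_pointer(value))
```

A helper of this kind is mainly useful for inspecting or adjusting a value previously obtained from GETCRSPTR; it does not model any hardware behavior.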
In general, the value provided to SETCRSPTR should be the same as the one returned by GETCRSPTR. Using a different value should be done with care, and the stack content in memory should be adjusted to reflect the new values.
SETCRSPTR should not be used when no backing CRS was allocated in local memory. The behavior in such situations is UNPREDICTABLE.
If SETCRSPTR is used within the trap handler, the value of 'curPhysStackDepth' is clamped to the CRS allocation size. If SETCRSPTR is executed within user mode, the value of 'curPhysStackDepth' is clamped into the trap handler reserved space, 16 entries from the end of the allocation. Clamping into the reserved space guarantees space for a subsequent trap.
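The clamping rule can be sketched as follows. This is one reading of the paragraph above, not a description of the actual hardware logic: it assumes the CRS allocation size is expressed in stack entries, and that in user mode the limit is the allocation size minus the 16 reserved entries. The function and parameter names are hypothetical.

```python
TRAP_RESERVED_ENTRIES = 16  # entries reserved for the trap handler, per the text


def clamp_phys_stack_depth(depth: int, allocation_entries: int,
                           in_trap_handler: bool) -> int:
    """Clamp curPhysStackDepth under the assumed interpretation.

    Assumption: in the trap handler the limit is the full allocation;
    in user mode the limit stops 16 entries before the end of it.
    """
    if in_trap_handler:
        limit = allocation_entries
    else:
        limit = max(allocation_entries - TRAP_RESERVED_ENTRIES, 0)
    return min(depth, limit)


# With a hypothetical 1024-entry allocation:
print(clamp_phys_stack_depth(2000, 1024, in_trap_handler=True))
print(clamp_phys_stack_depth(2000, 1024, in_trap_handler=False))
```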
SOFTWARE NOTES:
SETCRSPTR R0;