BRA : Relative Branch
BRX : Relative Branch Indirect
JMP : Absolute Jump
JMX : Absolute Jump Indirect

Format:

SPA 5.0:
   Relative branches:
        {@{!}Pg}   BRA{.U}{.LMT}   {CC.test,}#ImmS24               {&req_6}   {?sched>=?WAIT5}   ;   
        {@{!}Pg}   BRA{.U}{.LMT}   {CC.test,}c[#ImmU05][#ImmU16]   {&req_6}   {?sched>=?WAIT5}   ;   // This form is not patchable and is deprecated.   
        {@{!}Pg}   BRX{.LMT}       {CC.test,}Ra + #ImmS24          {&req_6}   {?sched>=?WAIT5}   ;   

   Absolute Jumps:
        {@{!}Pg}   JMP{.U}{.LMT}   {CC.test,}#ImmU32               {&req_6}   {?sched>=?WAIT5}   ;   
        {@{!}Pg}   JMP{.U}{.LMT}   {CC.test,}c[#ImmU05][#ImmU16]   {&req_6}   {?sched>=?WAIT5}   ;   
        {@{!}Pg}   JMX{.LMT}       {CC.test,}Ra + #ImmS32          {&req_6}   {?sched>=?WAIT5}   ;   

.U      Unanimous condition, branch/jump is only taken if all active
        threads in the warp agree on taking the branch.

.LMT    ApiCallLimit check (limit already reached by previous PRET, see the RET/PRET opcode page for more details.)

.test:  { .F,      .LT,     .EQ,     .LE,      .GT,      .NE,      .GE,   .NUM,     Signed numeric tests
          .NAN,    .LTU,    .EQU,    .LEU,     .GTU,     .NEU,     .GEU,  .T*,      Signed or Unordered tests 
          .OFF,    .LO,     .SFF,    .LS,      .HI,      .SFT,     .HS,   .OFT,     Unsigned integer tests
          .CSM_TA, .CSM_TR, .CSM_MX, .FCSM_TA, .FCSM_TR, .FCSM_MX, .RLE,  .RGT  }   Clip State Machine tests
        If no condition code test is specified, CC.TRUE is assumed.

Description:

Conditional control flow.

BRA/BRX compute a target PC address on a per-thread basis from a PC-relative signed offset operand and a signed immediate offset, then jump to the target PC if the branch condition evaluates to true. Note that the value in Ra is treated as a signed value. The target (offset) address is specified as an offset in bytes, not instructions or words, relative to the PC of the next instruction within the address range specified by the 40b virtual base address (PROGRAM_BASE).

JMP/JMX compute a target PC address on a per-thread basis from an absolute (unsigned) address operand and a signed immediate offset. Note that the value in Ra is treated as an unsigned value. The target address is specified in bytes, not instructions or words, and is an absolute offset within the address range specified by the 40b virtual base address (PROGRAM_BASE).

The branch/jump condition is based on a predicate AND condition code bits. To branch/jump only on the predicate, use CC.TRUE. To branch/jump only on CC, use PT for the predicate.
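
The condition rule above, combined with the .U modifier from the Format section, can be sketched in Python. This is a minimal illustrative model, not the hardware implementation: the function names and the list-of-booleans warp representation are assumptions made for this example, and treating .U as "no thread branches unless every active thread wants to" is an interpretation of the .U description.

```python
def thread_takes_branch(pred: bool, cc_test: bool = True) -> bool:
    """Per-thread condition: predicate AND condition code test.

    cc_test defaults to True, mirroring the rule that CC.TRUE is
    assumed when no condition code test is specified.
    """
    return pred and cc_test


def warp_branch_mask(preds, cc_tests, active, unanimous=False):
    """Return, per thread, whether the branch/jump is taken.

    preds, cc_tests, active are equal-length lists of booleans
    (hypothetical representation of a warp). With unanimous=True
    (the .U modifier), the branch is taken only if all active
    threads agree on taking it.
    """
    want = [a and thread_takes_branch(p, c)
            for p, c, a in zip(preds, cc_tests, active)]
    if unanimous:
        all_agree = all(w for w, a in zip(want, active) if a)
        if not all_agree:
            return [False] * len(want)
    return want
```

Passing True for every predicate models "PT for the predicate" (branch on CC only), and leaving cc_test at its default models CC.TRUE (branch on the predicate only).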

Branch/Jump target operands can be:
  (1) Immediate                         // BRA and JMP 
      - Immediate lsb0 and lsb1 must be 0
      - S24 (24 bit signed)   immediate for BRA 
      - U32 (32 bit unsigned) immediate for JMP
  (2) Immediate Address Constant        // BRA and JMP
      The interpretation of constant is as follows:
       JMP: As U32 (with a range of 0..4GB) absolute target PC.
       BRA: As S32 (with a range of +/- 0..2GB) PC-relative byte offset.
  (3) Register + Immediate              // BRX and JMX
      - Ra is in bytes (Ra+Imm may then sum to a word-aligned value) and is interpreted as:
       JMX: As U32 (with a range of 0..4GB) absolute target PC.
       BRX: As S32 (with a range of +/- 0..2GB) PC-relative byte offset.
      - immediate is signed and in bytes
        - Even when Ra=Rz the immediate is still signed, which is different from most
          opcodes that use the register + offset semantics.
        - S24 (24 bit signed) immediate for BRX 
        - S32 (32 bit signed) immediate for JMX
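
For operand form (1), the stated constraints (S24 for BRA, U32 for JMP, lsb0 and lsb1 equal to 0) can be checked with a short Python sketch. The helper names are hypothetical and exist only for this illustration; an actual assembler may report such errors differently.

```python
def is_valid_bra_imm(offset: int) -> bool:
    """True if offset fits a 24-bit signed immediate with bits 0 and 1 clear."""
    in_range = -(1 << 23) <= offset < (1 << 23)
    return in_range and (offset & 0x3) == 0


def is_valid_jmp_imm(address: int) -> bool:
    """True if address fits a 32-bit unsigned immediate with bits 0 and 1 clear."""
    in_range = 0 <= address < (1 << 32)
    return in_range and (address & 0x3) == 0
```

For instance, the immediates 0x28 (BRA) and 0x128000 (JMP) used in the Examples section pass these checks, while 0x2A would fail because bit 1 is set.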

The target PC computation, (Ra + imm), (PC + Ra + imm), or (PC + const), is done with "infinite precision";
the final result is then checked to not overflow the range 0..4GB.
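
The target PC computation can be modeled in Python, whose integers are unbounded and so match the "infinite precision" rule directly. This is a sketch under stated assumptions: the function names are hypothetical, next_pc stands for the PC of the next instruction expressed as a byte offset, and raising OverflowError is only a placeholder, since this page does not specify what happens when the range check fails.

```python
RANGE_LIMIT = 1 << 32  # target must lie in 0..4GB


def to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def check_range(target: int) -> int:
    """Range check applied to the final result only (placeholder behavior)."""
    if not 0 <= target < RANGE_LIMIT:
        raise OverflowError(f"target PC {target:#x} outside 0..4GB")
    return target


def bra_target(next_pc: int, offset: int, bits: int = 24) -> int:
    """BRA: PC + offset; bits=24 for the immediate form, 32 for the constant form."""
    return check_range(next_pc + to_signed(offset, bits))


def brx_target(next_pc: int, ra: int, imm_s24: int) -> int:
    """BRX: PC + Ra + imm, with Ra treated as S32 and imm as S24."""
    return check_range(next_pc + to_signed(ra, 32) + to_signed(imm_s24, 24))


def jmx_target(ra: int, imm_s32: int) -> int:
    """JMX: Ra + imm, with Ra treated as U32 and imm as S32."""
    return check_range((ra & 0xFFFFFFFF) + to_signed(imm_s32, 32))
```

Because intermediate sums are never wrapped, a case such as Ra = 0xFFFFFFF0 with an immediate of 0x20 in jmx_target is reported as out of range rather than silently wrapping to 0x10.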

Additional Information:

A branch/jump op cannot push the SYNC token that is needed to synchronize potentially divergent control flow (such as an if-then-else construct). To push a SYNC token, use an SSY instruction prior to the branch/jump, and a NOP.S (SYNC) at the end of the potentially divergent code. See the SSY opcode page for more information.

The .LMT option allows CAL nesting limits to be checked when a branch is used in conjunction with PRET to perform conditional subroutine calls. If the .LMT option is specified, the per-warp LMT state bit is checked before executing the branch, and if set, the branch instruction is converted into a NOP. The LMT state bit is left intact by the branch. For more details refer to the PRET description.
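
The .LMT behavior described above can be summarized as a small Python sketch. The function and parameter names are hypothetical, and the model covers only what this page states: the branch becomes a NOP when the LMT state bit is set, and the bit itself is not modified.

```python
def execute_branch(lmt_option: bool, lmt_bit: bool, condition: bool):
    """Return (branch_taken, lmt_bit_after) for one warp.

    lmt_option: True if the instruction carries the .LMT modifier.
    lmt_bit:    per-warp LMT state bit before the instruction.
    condition:  result of the branch condition evaluation.
    """
    if lmt_option and lmt_bit:
        # Branch is converted into a NOP; the LMT bit is left intact.
        return False, lmt_bit
    return condition, lmt_bit
```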

Examples:

BRA  REL:0x28;                     // defaults to CC.TRUE
BRA  CC.EQ,c[2][0x48];
BRX  CC.GE,R5+0x128;
JMP  CC.LT,0x128000;
JMP  CC.EQ,c[2][0x48];
JMP  ABS:0xCA80;
JMX  CC.GE,R5+0x128000;

// pseudo-code example
if (CC.LT)
  R6 = R0;
else
  R6 = R1;
R7 = R6*R6;

// in assembler this could be:
  SSY    LABEL0;
  BRA    CC.GE,LAB_ELSE;          // or GEU if fp comparison
LAB_IF:
  MOV    R6,R0;
  SYNC;
LAB_ELSE:
  MOV    R6,R1;
  SYNC;
LABEL0:
  MUL    R7,R6,R6;                // sync happens before MUL
