SPA 5.0:
{@{!}Pg}
VABSDIFF4{.dfmt}{.safmt.sbfmt}{.SAT}{.red}{.lanemask}
Rd{.CC},Ra{.asel4},Sb{.bsel4},Rc
{&req_6}
{?sched}
;
.dfmt: { .UD* }
.safmt: { .U8, .S8* }
.sbfmt: { .U8, .S8* }
.red: { .SIMD_MRG*, .ACC, .INVALID[6] } // reduction operation
.lanemask: { .xyzw*, ... } // lane mask selects bytes for merge/reduction
.asel4: { .3210*, ... } // a input byte permute from (Ra,Sb)
.bsel4: { .7654*, ... } // b input byte permute from (Ra,Sb)
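The default selectors suggest that .asel4 and .bsel4 index into an 8-byte pool built from (Ra,Sb). A minimal sketch of that reading follows. It assumes pool bytes 0-3 come from Ra and bytes 4-7 from Sb, and that a selector is written most significant lane first; the helper name `select_bytes` is hypothetical, and the selector values elided in the lists above are not filled in here.

```python
def select_bytes(ra, sb, sel):
    """Pick four bytes from the assumed 8-byte pool formed by (Ra, Sb).

    Assumption: pool bytes 0-3 are Ra (least significant byte first) and
    pool bytes 4-7 are Sb. `sel` is a 4-digit string such as "3210",
    written most significant lane first.
    """
    pool = ((sb & 0xFFFFFFFF) << 32) | (ra & 0xFFFFFFFF)
    # Walk the selector from its last digit so lanes[0] is the lowest lane.
    return [(pool >> (8 * int(digit))) & 0xFF for digit in reversed(sel)]


ra, sb = 0x44332211, 0x88776655
a_lanes = select_bytes(ra, sb, "3210")  # default .asel4: the four bytes of Ra
b_lanes = select_bytes(ra, sb, "7654")  # default .bsel4: the four bytes of Sb
```

Under these assumptions the defaults reduce to "A is Ra, B is Sb", which matches the per-lane |A - B| description.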
4-way SIMD parallel absolute difference with subsequent reduction operation.
Sources (Ra, Rc) and destination (Rd) are all 32b data registers.
The following source Sb is allowed:
Sb(register) Sb(#IMM08)
The 4-way SIMD parallel operation |A - B| is performed first.
Both inputs are first promoted to S09 (based on their individual .S8 or .U8 format). The absolute difference is then computed, producing a SIMD U09 result. These results are promoted to SIMD S10 values to match the data pipe of the other quad SIMD instructions.
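The per-lane step can be modeled in a few lines. This is a sketch rather than the hardware definition: the names `to_lane` and `absdiff4` are hypothetical, and lanes are passed as lists of four byte values with lane 0 first.

```python
def to_lane(byte, fmt):
    """Interpret an 8-bit value as .U8 or .S8; either way it fits in S09."""
    byte &= 0xFF
    if fmt == "S8" and byte >= 0x80:
        return byte - 0x100  # sign-extend a negative S8 value
    return byte


def absdiff4(a_bytes, b_bytes, safmt="S8", sbfmt="S8"):
    """Per-lane |A - B| with each input read in its own format.

    The largest possible value is |255 - (-128)| = 383, which is why the
    intermediate result needs 9 unsigned bits (U09).
    """
    return [abs(to_lane(a, safmt) - to_lane(b, sbfmt))
            for a, b in zip(a_bytes, b_bytes)]


same_fmt = absdiff4([0x80, 0x7F, 0x00, 0xFF], [0x7F, 0x80, 0x00, 0x00])
mixed_fmt = absdiff4([0xFF], [0x80], safmt="U8", sbfmt="S8")
```

The mixed-format call exercises the worst case, where an unsigned 255 meets a signed -128.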
The condition code is optionally set, based on the final 32-bit result in Rd. CF and OF are derived from the adder of the final reduction operation, so they are only valid if the reduction operation is .ACC.
ZF = (result==0) ? 1 : 0
SF = result[31]
CF = if (.red==.ACC) Carry is defined as the carry-out of the msb adder.
     if (.red!=.ACC) 0
OF = if (.red==.ACC) Overflow is defined as the XOR of the resulting sign bit and the real sign bit.
     if (.red!=.ACC) 0
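The flag rules can be illustrated with a small model. Several points here are assumptions and not taken from the text above: that .ACC computes Rd = Rc + (sum of the four lane results) wrapped to 32 bits, that the carry is taken from the exact sum, and that the "real sign" is the sign of the exact sum with Rc read as a signed 32-bit value. The function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def acc_with_flags(rc, lanes):
    """Assumed .ACC model: Rd = Rc + sum(lanes), wrapped to 32 bits."""
    rc_u = rc & MASK32
    lane_sum = sum(lanes)          # each lane is a non-negative U09 value
    total = rc_u + lane_sum        # exact sum, before wrapping
    rd = total & MASK32
    zf = int(rd == 0)
    sf = rd >> 31
    cf = int(total > MASK32)       # carry out of bit 31
    # Assumption: Rc is read as signed when determining the "real" sign.
    rc_s = rc_u - (1 << 32) if rc_u >> 31 else rc_u
    real_sign = int(rc_s + lane_sum < 0)
    of = sf ^ real_sign
    return rd, {"ZF": zf, "SF": sf, "CF": cf, "OF": of}


def flags_non_acc(rd):
    """For any reduction other than .ACC, CF and OF are forced to 0."""
    rd &= MASK32
    return {"ZF": int(rd == 0), "SF": rd >> 31, "CF": 0, "OF": 0}


wrap_rd, wrap_cc = acc_with_flags(0xFFFFFFFF, [1, 0, 0, 0])
ovf_rd, ovf_cc = acc_with_flags(0x7FFFFFFF, [1, 0, 0, 0])
```

In the first call the sum wraps to zero, so ZF and CF are set while OF stays clear. In the second the result's sign bit flips without a carry, so SF and OF are set.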
VABSDIFF4.S8.U8.ACC R0, R1, R2, R3;
VABSDIFF4.UD.U8.U8.ACC R0, R1, R2, R3;