SPA 5.0:
{@{!}Pg}
FFMA{.fmz}{.rnd}{.SAT}
Rd{.CC},{-}Ra,{-}Sb, {-}Sc
{&req_6}
{?sched}
;
.fmz: { < NULL >*, .FTZ, .FMZ, INVALID }
.fmz controls denorm flush and multiply mode.
  < NULL >: Denorms supported. No special handling of 0. This is the default.
  .FTZ: Flush input/output denorms to sign-preserving zero.
  .FMZ: Flush input/output denorms to sign-preserving zero AND, if either source is 0.0, force the product to +0.0 (even if the other source is infinity or NaN), regardless of the input signs. The 0.0 test is done after the input denorm flush.
.rnd: { .RN*, .RM, .RP, .RZ }
  .RN - Round to nearest even.
  .RM - Round towards -Infinity.
  .RP - Round towards +Infinity.
  .RZ - Round towards 0.
.SAT: Saturate the output to [+0.0f, 1.0f], with NaN converted to +0.0f.
.CC: Write condition codes.
FFMA allows the following sources Sb,Sc:
  Sb(register), Sc(register)
  Sb(constant with immediate address), Sc(register)
  Sb(#Imm20<<12), Sc(register)
  Sb(register), Sc(constant with immediate address)
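The .fmz and .SAT rules above can be sketched in Python as a rough behavioral model. This is an illustration under stated assumptions, not the hardware algorithm: the names `ffma_model`, `flush_denorm`, `saturate`, and `to_f32` are hypothetical, only .RN is modeled, .CC is ignored, and applying .SAT after the output flush is an assumed ordering. The sum is formed in fp64 before the final rounding to fp32, so rare double-rounding cases may differ from a true fused operation.

```python
import math
import struct

FLT_MIN_NORMAL = 2.0 ** -126  # smallest normal fp32 magnitude


def to_f32(x: float) -> float:
    """Round an fp64 value to the nearest fp32 value (overflow goes to infinity)."""
    try:
        return struct.unpack("<f", struct.pack("<f", x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)


def flush_denorm(x: float) -> float:
    """Replace an fp32 denormal with a zero of the same sign."""
    if x != 0.0 and abs(x) < FLT_MIN_NORMAL:
        return math.copysign(0.0, x)
    return x


def saturate(x: float) -> float:
    """.SAT: clamp to [+0.0, 1.0]; NaN and -0.0 both become +0.0."""
    if math.isnan(x) or x <= 0.0:
        return 0.0
    return min(x, 1.0)


def ffma_model(a: float, b: float, c: float, fmz: str = "", sat: bool = False) -> float:
    """Hypothetical model of FFMA{.fmz}{.SAT} on fp32-valued inputs, .RN only."""
    if fmz in ("FTZ", "FMZ"):
        # Input denorm flush happens before the 0.0 test used by .FMZ.
        a, b, c = flush_denorm(a), flush_denorm(b), flush_denorm(c)
    if fmz == "FMZ" and (a == 0.0 or b == 0.0):
        product = 0.0  # forced to +0.0, even for 0 * infinity or 0 * NaN
    else:
        product = a * b  # exact in fp64 when a and b are finite fp32 values
    result = to_f32(product + c)
    if fmz in ("FTZ", "FMZ"):
        result = flush_denorm(result)  # output denorm flush
    return saturate(result) if sat else result
```

For example, `ffma_model(0.0, math.inf, 1.0)` yields NaN, while the same call with `fmz="FMZ"` yields 1.0 because the product is forced to +0.0 before the add.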
{@{!}Pg}
FFMA32I{.fmz}{.SAT}
Rd{.CC},{-}Ra, #Imm32, {-}Rd
{&req_6}
{?sched}
; // Rc must be the same as Rd
For FFMA32I, .rnd defaults to .RN.
The product of Ra and Sb is computed to infinite precision, and Sc is then added with enough precision to guarantee that, after rounding, the result is identical to that of an infinitely precise add followed by a single rounding. The result is rounded to single precision (fp32) using the .rnd rounding mode.
See the IEEE-754 2008 standard, section 5.4.1 for details.
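The effect of the single rounding can be illustrated with a small Python sketch that contrasts a fused evaluation with a separate multiply and add. The operand values and the helper `to_f32` are chosen for illustration only. Exact rational arithmetic stands in for the infinitely precise intermediate; in this particular case the exact result is representable in fp32, so converting it through fp64 introduces no extra rounding.

```python
import struct
from fractions import Fraction


def to_f32(x: float) -> float:
    """Round an fp64 value to the nearest fp32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# All three operands are exactly representable in fp32.
a = 1.0 + 2.0 ** -23
b = 1.0 - 2.0 ** -23
c = -1.0

# Fused: exact product and sum, then one rounding to fp32.
exact = Fraction(a) * Fraction(b) + Fraction(c)  # equals -2**-46
fused = to_f32(float(exact))

# Unfused: the product is rounded to fp32 first (1 - 2**-46 rounds to 1.0),
# and the low-order bits are lost before the add.
unfused = to_f32(to_f32(a * b) + c)

print("fused:  ", fused)
print("unfused:", unfused)
```

The fused result keeps the small residual -2**-46, whereas the unfused sequence returns 0.0.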
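Fp32 operations support all 4 required IEEE-754 2008 rounding modes: roundTiesToEven (.RN), roundTowardNegative (.RM), roundTowardPositive (.RP), and roundTowardZero (.RZ).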
The optional IEEE rounding mode roundTiesToAway is not supported.
See the IEEE-754 2008 specification, Section 4.3.3.
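As a rough illustration of how the four rounding modes act on the exact FMA result, the following Python sketch rounds an exact rational value onto the fp32 grid. The names `round_to_f32` and `ffma_rounded` are hypothetical, and the model is deliberately incomplete: it assumes finite inputs and does not model overflow to infinity, NaN, or the sign of an exact-zero result.

```python
from fractions import Fraction


def round_to_f32(q: Fraction, mode: str = "RN") -> float:
    """Round an exact rational to fp32 under .RN, .RM, .RP, or .RZ (sketch only)."""
    if q == 0:
        return 0.0
    sign = -1.0 if q < 0 else 1.0
    m = abs(q)
    # Find e with 2**e <= m < 2**(e + 1); the bit-length estimate is at most one too high.
    e = m.numerator.bit_length() - m.denominator.bit_length()
    if Fraction(2) ** e > m:
        e -= 1
    e = max(e, -126)  # denormals use the spacing of the smallest normal binade
    ulp = Fraction(2) ** (e - 23)
    scaled = m / ulp
    lo = scaled.numerator // scaled.denominator  # magnitude truncated to the fp32 grid
    rem = scaled - lo
    if mode == "RN":
        up = rem > Fraction(1, 2) or (rem == Fraction(1, 2) and lo % 2 == 1)
    elif mode == "RZ":
        up = False
    elif mode == "RP":
        up = rem > 0 and sign > 0
    elif mode == "RM":
        up = rem > 0 and sign < 0
    else:
        raise ValueError(f"unknown rounding mode: {mode}")
    return sign * float((lo + up) * ulp)


def ffma_rounded(a: float, b: float, c: float, mode: str = "RN") -> float:
    """Exact a*b + c for finite inputs, followed by a single rounding to fp32."""
    return round_to_f32(Fraction(a) * Fraction(b) + Fraction(c), mode)
```

With `a = b = 1.0` and `c = 2.0 ** -30`, the exact sum lies just above 1.0: .RN, .RZ, and .RM return 1.0, while .RP returns the next fp32 value, 1.0 + 2**-23.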
The chosen NaN behavior for fp32 operations differs from that of fp64 operations. It also differs from Intel's x87/SSE behavior, but is still IEEE-754 2008 compliant when .FMZ is not used: the standard allows canonicalization of the NaN result to be implementation-defined.
See the IEEE-754 2008 specification, Section 6.2.
FFMA R0,R1,R2,R3;
FFMA32I R0,R1,0x3d000000,R0;
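To read the immediate in the FFMA32I example, the 32-bit pattern can be reinterpreted as an fp32 value. The Python sketch below does this with the standard `struct` module. The helper names are hypothetical, and the treatment of `#Imm20<<12` as the upper 20 bits of an fp32 bit pattern (low 12 bits zero) is an assumption drawn from the operand notation.

```python
import struct


def bits_to_f32(bits: int) -> float:
    """Reinterpret a 32-bit pattern as an fp32 value."""
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))[0]


def imm20_to_f32(imm20: int) -> float:
    """Assumed reading of #Imm20<<12: the immediate supplies the top 20 bits."""
    return bits_to_f32((imm20 & 0xFFFFF) << 12)


# FFMA32I R0,R1,0x3d000000,R0 uses this value as the multiplier for R1,
# with R0 as both addend and destination.
scale = bits_to_f32(0x3D000000)
print(scale)  # 0.03125
```

Under that reading, `imm20_to_f32(0x3F800)` gives 1.0, since 0x3F800 << 12 is 0x3F800000.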