HMUL2

SPA 5.3:

        {@{!}Pg}   HMUL2{.ofmt}{.fmz}{.SAT}   Rd, {-}{|}Ra{|}{.iswz}, {-}{|}Rb{|}{.iswz}                             {&req_6}   {&rdN}   {&wrN}   {?sched}   ;                       
        {@{!}Pg}   HMUL2{.ofmt}{.fmz}{.SAT}   Rd, {-}{|}Ra{|}{.iswz}, {-}{|}c[#BankU05][#AddrU16]{|}                 {&req_6}   {&rdN}   {&wrN}   {?sched}   ;                       
        {@{!}Pg}   HMUL2{.ofmt}{.fmz}{.SAT}   Rd, {-}{|}Ra{|}{.iswz}, {{-}{|}#Immfp10H1{|}}, {{-}{|}#Immfp10H0{|}}   {&req_6}   {&rdN}   {&wrN}   {?sched}   ;// Imm order: H1, H0   
        {@{!}Pg}   HMUL2_32I{.fmz}{.SAT}      Rd, Ra{.iswz}, {{-}{|}#Immfp16H1{|}}, {{-}{|}#Immfp16H0{|}}            {&req_6}   {&rdN}   {&wrN}   {?sched}   ;// Imm order: H1, H0   

 .fmz:       { < NULL >*, .FTZ, .FMZ, INVALID } 
             .fmz controls denorm flush and multiply mode.
               < NULL >: Denorms supported. No special handling of 0.
                         This is default.
               .FTZ:     Flush input/output denorms to sign-preserving zero.
               .FMZ:     Flush input/output denorms to sign-preserving zero AND
                         if either source is 0.0, the product is forced to +0.0
                         (even if other source is infinity or NaN), regardless
                         of the input signs. The 0.0 test is done after input
                         denorm flush.
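
The .fmz rules above can be modelled per lane with a small Python sketch.
This is an illustration only, not the hardware implementation: the names
hmul_lane, flush_denorm and round_to_fp16 are hypothetical, round-to-nearest
is assumed for the product, and overflow is modelled as infinity.

```python
import math
import struct

FP16_MIN_NORMAL = 2.0 ** -14  # smallest normal fp16 magnitude


def flush_denorm(x: float) -> float:
    """Replace an fp16 denorm with a zero of the same sign."""
    if x != 0.0 and abs(x) < FP16_MIN_NORMAL:
        return math.copysign(0.0, x)
    return x


def round_to_fp16(x: float) -> float:
    """Round to the nearest fp16 value (rounding mode is an assumption)."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct rejects finite values beyond the fp16 range; model as infinity.
        return math.copysign(math.inf, x)


def hmul_lane(a: float, b: float, fmz: str = "") -> float:
    """Model one fp16 lane of the multiply; fmz is "", "FTZ" or "FMZ"."""
    flush = fmz in ("FTZ", "FMZ")
    if flush:
        a, b = flush_denorm(a), flush_denorm(b)
    # .FMZ: the 0.0 test runs after the input flush and forces +0.0,
    # even when the other source is infinity or NaN.
    if fmz == "FMZ" and (a == 0.0 or b == 0.0):
        return 0.0
    result = round_to_fp16(a * b)
    return flush_denorm(result) if flush else result
```

With these definitions, hmul_lane(0.0, math.inf) yields NaN in the default
mode, while hmul_lane(0.0, math.inf, "FMZ") yields +0.0.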

 .SAT:       Saturate output to the inclusive range [+0.0 .. 1.0]
             (NaN is converted to +0.0)
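
A minimal Python sketch of the .SAT clamp follows. The function name
saturate is hypothetical, and mapping -0.0 to +0.0 is an assumption drawn
from the stated lower bound of +0.0.

```python
import math


def saturate(x: float) -> float:
    """Clamp a lane result to [+0.0, 1.0]; NaN becomes +0.0."""
    if math.isnan(x) or x <= 0.0:
        # Covers NaN, negative values, -infinity and (by assumption) -0.0.
        return 0.0
    if x >= 1.0:
        # Covers +infinity as well.
        return 1.0
    return x
```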

 .ofmt:      { .F16_V2*, .F32, .MRG_H0, .MRG_H1 } 
             Output format.
             .F16_V2:    Outputs two 16-bit floating point numbers, packed in the 32-bit output.
             .F32:       Computes a single 16-bit floating point result (simd lane 0) and 
                         outputs it as a single 32-bit floating point number. 
                         This mode flushes fp16 denorm results to zero prior to the conversion. 
                         Inputs are unaffected.
             .MRG_H0:    Generates a single 16-bit floating point number (simd lane 0), and
                         writes it to bottom 16-bits of Rd. The upper bits of
                         Rd are not modified.
             .MRG_H1:    Generates a single 16-bit floating point number (simd lane 1), and
                         writes it to upper 16-bits of Rd. The lower bits of
                         Rd are not modified.
                         Note: .MRG_H0/1 modifier support is restricted. See below.
                         Supports .MRG_H0, .MRG_H1 
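
The effect of each .ofmt value on Rd can be sketched in Python as below.
This is an illustration under stated assumptions: write_rd is a hypothetical
helper, lane results are passed as raw fp16 bit patterns, .F16_V2 is assumed
to place lane 1 in the upper half and lane 0 in the lower half, and the .F32
denorm flush is assumed to preserve the sign.

```python
import math
import struct


def write_rd(old_rd: int, lane0: int, lane1: int, ofmt: str = "F16_V2") -> int:
    """Return the new 32-bit Rd value for the given fp16 lane bit patterns."""
    if ofmt == "F16_V2":
        return ((lane1 & 0xFFFF) << 16) | (lane0 & 0xFFFF)
    if ofmt == "MRG_H0":
        # Lane 0 goes to the bottom 16 bits; upper bits of Rd are kept.
        return (old_rd & 0xFFFF0000) | (lane0 & 0xFFFF)
    if ofmt == "MRG_H1":
        # Lane 1 goes to the upper 16 bits; lower bits of Rd are kept.
        return (old_rd & 0x0000FFFF) | ((lane1 & 0xFFFF) << 16)
    if ofmt == "F32":
        value = struct.unpack("<e", struct.pack("<H", lane0 & 0xFFFF))[0]
        # fp16 denorm results are flushed to zero before the conversion.
        if value != 0.0 and abs(value) < 2.0 ** -14:
            value = math.copysign(0.0, value)
        return struct.unpack("<I", struct.pack("<f", value))[0]
    raise ValueError(f"unknown ofmt: {ofmt}")
```

For example, with old_rd = 0xDEADBEEF, lane0 = 0x3C00 (1.0) and
lane1 = 0x4000 (2.0), .MRG_H0 gives 0xDEAD3C00 and .MRG_H1 gives 0x4000BEEF.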

 .iswz:      { .H1_H0*, .F32, .H0_H0, .H1_H1 } 
             Input format.
             .H1_H0:     Input is a set of two 16-bit floating point numbers.
             .F32:       Input is a single 32-bit floating point number that
                         will be converted to a 16-bit floating point number
                         and replicated to both halves of the SIMD operation.
                         The conversion will round towards 0 (truncation).
                         Any denorms generated in the FP32 -> FP16 conversion process will flush to 0. 
                         Denorms can still be generated from the operation itself.
             .H0_H0:     Input is a single 16-bit floating point number in the
                         lower 16-bits of a 32-bit register, and is replicated
                         to both halves of the SIMD operation.
             .H1_H1:     Input is a single 16-bit floating point number in the
                         upper 16-bits of a 32-bit register, and is replicated
                         to both halves of the SIMD operation.
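
The .iswz selection can be sketched in Python as follows. The helper names
are hypothetical and several details are assumptions rather than statements
from this specification: lanes are returned as (lane 0, lane 1) fp16 bit
patterns, .H1_H0 feeds the lower half to lane 0 and the upper half to lane 1,
FP32 values too large for fp16 truncate to the largest finite fp16 value,
NaN payloads are dropped, and flushed denorms keep their sign.

```python
def f32_bits_to_f16_bits_rz(bits32: int) -> int:
    """Convert an fp32 bit pattern to fp16, rounding toward zero."""
    sign = (bits32 >> 16) & 0x8000
    exp = (bits32 >> 23) & 0xFF
    man = bits32 & 0x7FFFFF
    if exp == 0xFF:
        # Infinity stays infinity; NaN becomes a quiet NaN (assumption).
        return sign | (0x7E00 if man else 0x7C00)
    exp16 = exp - 127 + 15
    if exp16 >= 0x1F:
        return sign | 0x7BFF  # truncation never rounds up to infinity
    if exp16 <= 0:
        return sign           # result would be an fp16 denorm: flush to 0
    return sign | (exp16 << 10) | (man >> 13)  # drop low mantissa bits


def select_lanes(reg: int, iswz: str = "H1_H0") -> tuple:
    """Return the (lane 0, lane 1) fp16 inputs taken from a 32-bit source."""
    lo, hi = reg & 0xFFFF, (reg >> 16) & 0xFFFF
    if iswz == "H1_H0":
        return lo, hi
    if iswz == "H0_H0":
        return lo, lo
    if iswz == "H1_H1":
        return hi, hi
    if iswz == "F32":
        half = f32_bits_to_f16_bits_rz(reg & 0xFFFFFFFF)
        return half, half
    raise ValueError(f"unknown iswz: {iswz}")
```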

immfp16H0   fp16 immediate in {sign, exp[4:0], mant[9:0]} format.
immfp16H1   fp16 immediate in {sign, exp[4:0], mant[9:0]} format. 

immfp10H0   Most significant 10 bits of fp16 immediate.
immfp10H1   Most significant 10 bits of fp16 immediate.
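
The relationship between an fp16 value and its 10-bit immediate can be
sketched in Python as below. The helper names are hypothetical, and the
sketch assumes the six mantissa bits that are not encoded are treated as
zero when the immediate is expanded.

```python
import struct


def fp16_bits(value: float) -> int:
    """Return the fp16 bit pattern {sign, exp[4:0], mant[9:0]} of value."""
    return struct.unpack("<H", struct.pack("<e", value))[0]


def to_immfp10(value: float) -> tuple:
    """Return (10-bit immediate, True if no mantissa bits were lost)."""
    bits = fp16_bits(value)
    return bits >> 6, (bits & 0x3F) == 0


def from_immfp10(imm10: int) -> float:
    """Expand a 10-bit immediate, assuming the dropped low bits are zero."""
    bits = (imm10 & 0x3FF) << 6
    return struct.unpack("<e", struct.pack("<H", bits))[0]
```

Under that assumption, 1.0 (0x3C00) maps to the immediate 0x0F0 exactly,
while a value such as 1.03125 (0x3C20) does not fit in ten bits.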


For HMUL2 with an immediate "Sb" operand, the .iswz field is not encoded and 
behavior defaults to .H1_H0 for the immediate operand. Absolute values and
negates are also not encoded and default to false. SASS can support absolute
values and negates when the immediate is enclosed in curly braces, e.g. {-1.0}
or {|-19.5|}, and encodes the appropriate immediates.

For HMUL2 with a constant "Sb" operand, .iswz is not encoded and 
behavior defaults to .F32 for the constant reference operand. The absolute
value for the constant operand also defaults to false.

For HMUL2_32I:
    - .ofmt is not encoded and behavior defaults to .F16_V2.
    - .iswz is specified only for the Ra operand. For other source operands,
      .iswz is not encoded and behavior defaults to .H1_H0.
    - Absolute values and negates for the immediate operand are not encoded and
      behavior defaults to false. 
      Note: SASS can support absolute values and negates when the immediate is
      enclosed in curly braces (e.g. {-1.0} or {|-19.5|}) and encodes the
      appropriate immediates.

The signs of operands Ra and Rb/c[][] are encoded together in a single sign bit. 
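
One sign bit is sufficient for a multiply because negating both sources
leaves the product unchanged. The Python sketch below illustrates this. It
assumes the shared bit is the exclusive-or of the two source negates, which
this specification does not state explicitly; shared_negate is a
hypothetical name.

```python
def shared_negate(neg_a: bool, neg_b: bool) -> bool:
    """Combine two source negates into one product negate (assumed XOR)."""
    return neg_a != neg_b


def product(a: float, b: float, neg_a: bool, neg_b: bool) -> float:
    """Multiply with per-source negates, for comparison with the shared bit."""
    return (-a if neg_a else a) * (-b if neg_b else b)
```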

For immediate operand forms:
    - The sign of both immediates must be the same. 
    - The absolute operator must be present or absent on both immediates together.
    - Negate on Ra can be supported by inverting the immediates.
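
The three immediate-form rules above can be sketched as a small Python
check. ImmOperand, effective and fold_ra_negate are hypothetical names used
only for illustration, and the sketch works on Python floats rather than on
encoded immediates.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ImmOperand:
    value: float
    negate: bool = False    # written as {-x}
    absolute: bool = False  # written as {|x|}


def effective(op: ImmOperand) -> float:
    """Apply the absolute operator first, then the negate."""
    v = abs(op.value) if op.absolute else op.value
    return -v if op.negate else v


def fold_ra_negate(neg_ra: bool, h1: ImmOperand, h0: ImmOperand) -> tuple:
    """Return the (H1, H0) values to encode, folding a negate on Ra into them."""
    if h1.absolute != h0.absolute:
        raise ValueError("absolute operator must be on both immediates or neither")
    e1, e0 = effective(h1), effective(h0)
    if math.copysign(1.0, e1) != math.copysign(1.0, e0):
        raise ValueError("both immediates must have the same sign")
    # -Ra * imm equals Ra * -imm, so invert both immediates instead.
    return (-e1, -e0) if neg_ra else (e1, e0)
```

For example, fold_ra_negate(True, ImmOperand(1.0), ImmOperand(2.0)) returns
(-1.0, -2.0), while mixing a positive and a negative immediate raises
ValueError.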

HMUL2 : FP16 SIMD Multiply

Format:

Description:

IEEE Rounding Modes

NaN Behavior:

Examples: