TLD4S : Texture Load 4 with scalar/non-vec4 source/destinations

Format

SPA 5.0:
        {@{!}Pg}   TLD4S{.F16}.comp{.toff}{.DC}{.NODEP}{.phase}   Rd1, Rd0, Ra, Rb, #tsPtrIdxU13   {&req_6}   {&rdN}   {&wrN}   {?sched}   ;   

  .comp:   TLD4S will return only a single component of texture.  This field is used to select which component of 
           a multi-component texture is returned for this texture fetch.  The component selected is the real, 
           post-swizzle component (for example, TLD4.R would return values from the red component, just like a TEX.R
           instruction).  If the specified component is not present, zeroes will be returned.
               .R - Select the Red component of the texture
               .G - Select the Green component of the texture
               .B - Select the Blue component of the texture
               .A - Select the Alpha component of the texture

  .toff:   Programmable Texel Offset.
               < NONE >
               .AOFFI - _aoffimmi(u,v,w)  [DX10]   // 1 register required
                   ((w & 0x3f)<<16) | ((v & 0x3f)<<8) | (u & 0x3f)
                   Each 6b field is a 2's complement integer from -32 to +31.
                   AOFFI is not supported with CubeMap textures
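
The AOFFI packing formula above can be sketched as follows. This is a minimal illustration, not hardware or driver code: the function names pack_aoffi and unpack_aoffi are hypothetical, and the range check simply enforces the -32..+31 limit stated above.

```python
def pack_aoffi(u: int, v: int, w: int = 0) -> int:
    """Pack three signed texel offsets into the single AOFFI register value."""
    for name, value in (("u", u), ("v", v), ("w", w)):
        if not -32 <= value <= 31:
            raise ValueError(f"{name}={value} is outside the 6-bit range -32..31")
    # Same expression as in the text: each offset occupies a 6-bit field.
    return ((w & 0x3F) << 16) | ((v & 0x3F) << 8) | (u & 0x3F)


def unpack_aoffi(word: int) -> tuple[int, int, int]:
    """Recover (u, v, w) by sign-extending each 6-bit two's complement field."""
    def sext6(field: int) -> int:
        field &= 0x3F
        return field - 64 if field & 0x20 else field

    return sext6(word), sext6(word >> 8), sext6(word >> 16)
```

For instance, pack_aoffi(-1, 2) places 0x3f in the u field and 2 in the v field; unpack_aoffi reverses the packing.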

  .DC:     Depth comparison filter mode using reference value.
               RefVal                           // 1 register required
               Depth comparison filtering is not supported for 3D textures.

  .NODEP:  Indicates that there are no subsequent quad derivatives to be calculated.
           Threads that have been "killed" will be disabled to stop unnecessary texture fetches.

  .phase:  Allows control of the current warp's texture hash, used for scheduling.
               < NONE >
               .T - postfix increment of the 3 bit texture component of the hash.
               .P - postfix increment of the 5 bit phase component, and zero out the 3 bit texture component of the hash.
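
The effect of the .phase modifiers can be modelled roughly as below. This is a sketch under assumptions, not a description of the hardware: the TextureHash class is hypothetical, the two components are kept as separate fields because the text does not say how they are combined into the hash, and wraparound at the field width is assumed. "Postfix" is read as "the current instruction uses the value before the update".

```python
from dataclasses import dataclass


@dataclass
class TextureHash:
    phase: int = 0    # 5-bit phase component
    texture: int = 0  # 3-bit texture component

    def issue(self, modifier: str = "") -> tuple[int, int]:
        """Return the (phase, texture) pair seen by this instruction, then update."""
        used = (self.phase, self.texture)
        if modifier == ".T":
            # Assumed to wrap at 3 bits.
            self.texture = (self.texture + 1) & 0x7
        elif modifier == ".P":
            # Assumed to wrap at 5 bits; the texture component is zeroed.
            self.phase = (self.phase + 1) & 0x1F
            self.texture = 0
        elif modifier:
            raise ValueError(f"unknown phase modifier {modifier!r}")
        return used
```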

Immediate Inputs:

  #tsPtrIdxU13:
    This immediate index (word address) is used to fetch the packed header+sampler pointer entry from the constant cache.
    The bank from which it is fetched is determined by bundle state. The constant bank entry is a 32-bit structure of
    the form "samplerPtr[31:20] | headerPtr[19:0]".
    In SetSamplerBinding.ViaHeaderBinding (i.e. OGL) mode, the headerPtr would be used as the samplerPtr as well.
    Any header pointer greater than the one specified in SetTexHeaderPoolC.MaximumIndex will be regarded as an "invalid"
    texture (i.e. equivalent to BIND_GROUP_TEXTURE_HEADER_VALID_FALSE in Fermi).
    Any sampler pointer greater than the one specified in SetSamplerHeaderPoolC.MaximumIndex will be regarded as an
    "invalid" texture (i.e. equivalent to BIND_GROUP_TEXTURE_HEADER_VALID_FALSE in Fermi).

Neither Ra nor Rb can be RZ.
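
The constant bank entry layout and the validity rules given for #tsPtrIdxU13 can be sketched as follows. The helper names decode_ts_entry and is_valid_binding are hypothetical, and the max_header_index / max_sampler_index parameters stand in for the MaximumIndex values programmed through SetTexHeaderPoolC and SetSamplerHeaderPoolC.

```python
def decode_ts_entry(entry: int, via_header_binding: bool = False) -> tuple[int, int]:
    """Split a 32-bit constant bank entry into (header_ptr, sampler_ptr)."""
    header_ptr = entry & 0xFFFFF         # headerPtr[19:0]
    sampler_ptr = (entry >> 20) & 0xFFF  # samplerPtr[31:20]
    if via_header_binding:
        # SetSamplerBinding.ViaHeaderBinding: headerPtr doubles as samplerPtr.
        sampler_ptr = header_ptr
    return header_ptr, sampler_ptr


def is_valid_binding(header_ptr: int, sampler_ptr: int,
                     max_header_index: int, max_sampler_index: int) -> bool:
    """A pointer greater than its pool's MaximumIndex marks the texture invalid."""
    return header_ptr <= max_header_index and sampler_ptr <= max_sampler_index
```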

Implied Inputs:
    TLD4S maps to the TLD4 instruction with
        #paramA = 2D
        .wmask  = 0xf (implies all the samples are needed) and
        .NDV    = FALSE

.F16:  If specified, texture return data is in packed FP16 format. 
     Otherwise, the return data is in 32 bit format (fp32 or S/UINT32).
     Partial register writes do not occur: any unused portion of the return 
     register is written with the value 0.
     Note: .F16 modifier is not supported for integer textures in SPA 5.2.
     (return value is UNPREDICTABLE)

Rounding mode is controlled by a PRI: [SM]PRI_SM_TEXIO_CONTROL_FP16_ROUNDING_MODE.  It must be set to the same value
as PRI_TEX_F_DBG_FP16_ROUNDING_MODE.

Description

Texture fetch of the 4-texel bilerp footprint (but no filter) using a texture coordinate vector.

Only the bilerp footprint is fetched, and only from the finest mip-map level (level 0). The texture must be 2D. The four texel samples are placed into the Rd0 and Rd1 vectors in counterclockwise order starting at the lower left.

Legal instruction modifiers for TLD4S and the corresponding parameter packing in Ra and Rb are specified below.

    Legal modifier table
    .AOFFI   .DC    Ra            Rb
    -        -      s (scalar)    t (scalar)
    -        .DC    s,t (vec2)    dc (scalar)
    .AOFFI   -      s,t (vec2)    toff (scalar)
    .AOFFI   .DC    s,t (vec2)    toff,dc (vec2)
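
The legal modifier table can be restated as a small lookup. This is only an illustrative sketch; operand_layout is a hypothetical helper that returns the components carried in consecutive registers starting at Ra and Rb for a given modifier combination.

```python
def operand_layout(aoffi: bool, dc: bool) -> tuple[list[str], list[str]]:
    """Return (Ra components, Rb components) for the given modifiers."""
    if not aoffi and not dc:
        # No modifiers: both coordinates are scalars, split across Ra and Rb.
        return ["s"], ["t"]
    # With any modifier, Ra holds the coordinate pair as a vec2 ...
    ra = ["s", "t"]
    # ... and Rb holds the extra parameters, texel offset first.
    rb = []
    if aoffi:
        rb.append("toff")
    if dc:
        rb.append("dc")
    return ra, rb
```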

Ra/Rb must not be specified as RZ.

The results are written back as two vec2s, starting at Rd0 and Rd1. Registers Rd0 and Rd1 must both be aligned to vec2 boundaries.

Examples:

TLD4S.R R8, R10, R0, R5, 5;  // {R9, R8, R11, R10 }  = {Sample 3,2,1,0} 
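
The register assignment in the example above can be sketched as follows for the 32-bit (non-.F16) form. The mapping is inferred from that single example, the helper name sample_registers is hypothetical, and "aligned to a vec2 boundary" is assumed to mean an even register number.

```python
def sample_registers(rd1: int, rd0: int) -> dict[int, int]:
    """Map sample index -> destination register number, in operand order Rd1, Rd0."""
    for name, reg in (("Rd0", rd0), ("Rd1", rd1)):
        if reg % 2:
            # Assumption: vec2 alignment means an even register number.
            raise ValueError(f"{name}=R{reg} is not aligned to a vec2 boundary")
    # Samples 0 and 1 land in the Rd0 pair, samples 2 and 3 in the Rd1 pair.
    return {0: rd0, 1: rd0 + 1, 2: rd1, 3: rd1 + 1}
```

For the instruction above, sample_registers(8, 10) places samples 0..3 in R10, R11, R8, and R9.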
