TEX

SPA 5.0:
        {@{!}Pg}   TEX{.B}{.lod}{.AOFFI}{.DC}{.NDV}{.NODEP}{.phase}        Rd, Ra{, Rb}, #tsPtrIdxU13, #paramA{, #wmskU04}             {&req_6}   {&rdN}   {&wrN}   {?sched}   ;   
        {@{!}Pg}   TEX{.B}{.lod}{.AOFFI}{.DC}{.NDV}{.NODEP}{.phase}        Rd, Ra{, Rb}, #tidU08, #smpU05, #paramA{, #wmskU04}         {&req_6}   {&rdN}   {&wrN}   {?sched}   ;   

Two additional forms support tiled resources (sparse status predicate and LOD clamping):
        {@{!}Pg}   TEX{.B}{.lod}{.LC}{.AOFFI}{.DC}{.NDV}{.NODEP}{.phase}   {Ps,} Rd, Ra{, Rb}, #tsPtrIdxU13, #paramA{, #wmskU04}       {&req_6}   {&rdN}   {&wrN}   {?sched}   ;   
        {@{!}Pg}   TEX{.B}{.lod}{.LC}{.AOFFI}{.DC}{.NDV}{.NODEP}{.phase}   {Ps,} Rd, Ra{, Rb}, #tidU08, #smpU05, #paramA{, #wmskU04}   {&req_6}   {&rdN}   {&wrN}   {?sched}   ;   

.B:      Bindless mode, where the texture header pointer and sampler pointer are packed into a 32-bit register as:
         samplerPtr[31:20] | headerPtr[19:0]
         Data is sent via register Rb.
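The bindless word layout above can be sketched in a few lines of Python. This is only an illustrative model of the bit packing; the function names pack_bindless and unpack_bindless are hypothetical and not part of any toolchain.

```python
# Illustrative model of the .B register layout:
#   samplerPtr[31:20] | headerPtr[19:0]
# Function names are hypothetical, chosen for this sketch only.

HEADER_BITS = 20   # headerPtr occupies bits [19:0]
SAMPLER_BITS = 12  # samplerPtr occupies bits [31:20]


def pack_bindless(header_ptr: int, sampler_ptr: int) -> int:
    """Pack a header pointer and a sampler pointer into one 32-bit word."""
    if not 0 <= header_ptr < (1 << HEADER_BITS):
        raise ValueError("headerPtr does not fit in 20 bits")
    if not 0 <= sampler_ptr < (1 << SAMPLER_BITS):
        raise ValueError("samplerPtr does not fit in 12 bits")
    return (sampler_ptr << HEADER_BITS) | header_ptr


def unpack_bindless(word: int) -> tuple[int, int]:
    """Return (headerPtr, samplerPtr) from a packed 32-bit word."""
    word &= 0xFFFFFFFF
    return word & ((1 << HEADER_BITS) - 1), word >> HEADER_BITS
```

For example, pack_bindless(0x12345, 0xABC) places 0xABC in the upper 12 bits and 0x12345 in the lower 20 bits.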

.lod:    { .LZ, .LB, .LL, .LBA, .LLA } 
         Level of detail (LOD) adjust mode.
            < NONE >
            .LZ  - LOD level 0 (finest)       // no register required
            .LB  - LOD bias discrete          // 1 fp32 register required
            .LL  - LOD absolute discrete      // 1 fp32 register required
            .LBA - LOD bias averaged          // 1 fp32 register required (Tesla legacy mode)
            .LLA - LOD absolute averaged      // 1 fp32 register required (Tesla legacy mode)
         The "averaged" options allow the TEX pipe to average the LODs across the quad as a performance optimization.
         LOD level 0 actually selects the level set by textureHeader.resViewMinMapLevel.

.LC:     LOD clamp value for sparse textures.
            A 12-bit value in fixed-point u4.8 format, packed with the ARRAY index in the same register.

.AOFFI: Programmable Texture Offset.
            _aoffimmi(u,v,w)  [DX10]   // 1 register required
                ((w & 0xf)<<8) | ((v & 0xf)<<4) | (u & 0xf)
            Each 4b field is a 2's complement integer from -8 to +7.
            AOFFI is not supported with CubeMap textures.
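The offset packing formula above can be exercised with a small Python model. The encoder applies the formula from the text directly; the decoder and both function names are additions for illustration only.

```python
# Model of the AOFFI offset word:
#   ((w & 0xf) << 8) | ((v & 0xf) << 4) | (u & 0xf)
# Each field is a 4-bit two's complement integer in [-8, +7].
# Function names are hypothetical.

def encode_aoffi(u: int, v: int = 0, w: int = 0) -> int:
    """Pack three signed texel offsets into the 12-bit AOFFI field."""
    for name, value in (("u", u), ("v", v), ("w", w)):
        if not -8 <= value <= 7:
            raise ValueError(f"{name} offset {value} is outside -8..+7")
    return ((w & 0xF) << 8) | ((v & 0xF) << 4) | (u & 0xF)


def decode_aoffi(word: int) -> tuple[int, int, int]:
    """Recover (u, v, w) by sign-extending each 4-bit field."""
    def sext4(nibble: int) -> int:
        return nibble - 16 if nibble & 0x8 else nibble

    return (sext4(word & 0xF),
            sext4((word >> 4) & 0xF),
            sext4((word >> 8) & 0xF))
```

For instance, offsets (u, v, w) = (-1, 2, -8) pack to 0x82F: -1 becomes 0xF, 2 stays 0x2, and -8 becomes 0x8.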

.DC:     Depth comparison filter mode using reference value.
            RefVal                           // 1 fp32 register required
            Depth Comparison filter is not supported by 3D textures.
            For TEX and TEX.LZ, the .DC option will force a depth comparison filter mode regardless of the sampler state.
            For TEX.LB, TEX.LL, TEX.LBA, and TEX.LLA, the .DC option will not force a depth comparison filter mode
            if the sampler state does not enable depth comparison.

.NDV:    Forces the TEX to be treated as non-divergent even though the quad may be divergent.
            This does not promote inactive threads; it only forces the quad to be treated as non-divergent even though
            some of its threads might be inactive.  To activate disabled threads in a quad, SAM must be used.
            Only the active mask and shader type are used to determine whether a quad of threads is divergent.

.NODEP:  Indicates that there are no subsequent quad derivatives to be calculated.
         Threads that have been "killed" will be disabled to stop unnecessary texture fetches.

.phase:  { .T, .P }
         Allows control of the current warp's texture hash, which is used for scheduling.
             < NONE >
             .T - postfix increment of the 3 bit texture component of the hash.
             .P - postfix increment of the 5 bit phase component, and zero out the 3 bit texture component of the hash.


#tsPtrIdxU13:
         This immediate index (a word address) is used to fetch the packed header+sampler pointer entry from the constant cache.  The bank from which
         it is fetched is determined by bundle state. The constant bank entry is a 32-bit structure of the form
         "samplerPtr[31:20] | headerPtr[19:0]"
         Note: Ignored if the .B option is used.
         In SetSamplerBinding.ViaHeaderBinding (i.e. OGL) mode, the headerPtr is used as the samplerPtr as well.
         Any header pointer greater than the one specified in SetTexHeaderPoolC.MaximumIndex will be regarded as an "invalid"
         texture (i.e. equivalent to BIND_GROUP_TEXTURE_HEADER_VALID_FALSE in Fermi).
         Any sampler pointer greater than the one specified in SetSamplerHeaderPoolC.MaximumIndex will likewise be regarded as an "invalid"
         texture (i.e. equivalent to BIND_GROUP_TEXTURE_HEADER_VALID_FALSE in Fermi).

#tidU08, #smpU05:
         This is the Fermi-compatible form of tsPtrIdxU13, which allows legacy apps/traces to run. SASS will
         transform these operands into tsPtrIdxU13 as follows:
         #tsPtrIdxU13 = {#smpU05, #tidU08}
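Reading the braces as a bit concatenation with the first element in the most significant position (an assumption; the text does not spell out the notation), the transformation can be sketched as follows. The function name is hypothetical.

```python
# Sketch of the legacy operand transform:
#   #tsPtrIdxU13 = {#smpU05, #tidU08}
# ASSUMPTION: {a, b} denotes concatenation with `a` in the upper bits,
# so smp occupies bits [12:8] and tid occupies bits [7:0].

def legacy_to_ts_ptr_idx(tid: int, smp: int) -> int:
    """Combine an 8-bit texture ID and a 5-bit sampler ID into a 13-bit index."""
    if not 0 <= tid <= 0xFF:
        raise ValueError("tid does not fit in 8 bits")
    if not 0 <= smp <= 0x1F:
        raise ValueError("smp does not fit in 5 bits")
    return (smp << 8) | tid
```

Under this reading, tid = 0x12 and smp = 0x1F give the index 0x1F12.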

#paramA: source coordinate description.

*Valid paramA specifiers for TEX*
parameter	Coordinate Registers implied
1D	s
2D	s,t
3D	s,t,r
CUBE	s,t,r
ARRAY_1D	a,s
ARRAY_2D	a,s,t
RESERVED	// for ARRAY_3D
ARRAY_CUBE	a,s,t,r

           s, t, r are fp32;
           a is a U16 integer.


#wmskU04: destination write mask (decimated contiguous writes)
         Allows for write masking the returning data writes via a bit enable
         for each of R,G,B,A. A four-vector is always returned from TEX.
         #wmskU04 defaults to 0xf.
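One plausible reading of "decimated contiguous writes" is that only the enabled components are written, and that they land in consecutive registers starting at Rd. The sketch below models that reading. The bit assignment (bit 0 = R through bit 3 = A) and the compaction behavior are assumptions of this sketch, not statements from the text, and the function name is hypothetical.

```python
# Sketch of destination register assignment under a write mask.
# ASSUMPTIONS: bit 0 enables R, bit 1 G, bit 2 B, bit 3 A, and enabled
# components are written to consecutive registers starting at Rd.

def dest_registers(rd: int, wmsk: int = 0xF) -> dict[str, int]:
    """Map each enabled component to the register index that receives it."""
    if not 0 <= wmsk <= 0xF:
        raise ValueError("write mask must fit in 4 bits")
    assignment = {}
    next_reg = rd
    for bit, component in enumerate("RGBA"):
        if wmsk & (1 << bit):
            assignment[component] = next_reg
            next_reg += 1
    return assignment
```

With Rd = R4 and a mask of 0b1010, this model writes G to R4 and A to R5, while the default mask 0xf fills R4 through R7.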

Ps:
         Predicate that returns the sparse tile status. It indicates that the surface access touches a page marked as sparse (valid, not mapped).

Reg	parameter	format
Ra+0	(.LC) : {LodClamp[27:16] | array[15:0]}	{fixed point u4.8 | u16}
	!(.LC) : array[15:0]	u32
Ra+1	s	fp32
Ra+2	t	fp32
Ra+3	r	fp32
Rb+0	SamplerPtr | HeaderPtr	u32
Rb+1	LOD	fp32
Rb+2	toff[11:0]	u32
Rb+3	DC	fp32

TEX : Texture Fetch

Format

Description

Additional Information:

Sanitation of Texture Input Coordinates:

Notes On Status Bits Sent To Texture Pipe:

Examples: