SPA 5.0:
{@{!}Pg}
TXD{.B}{.LC}{.AOFFI}{.NODEP}{.phase}
{Ps,} Rd, Ra, Rb, #tsPtrIdxU13, #paramA{, #wmskU04}
{&req_6}
{&rdN}
{&wrN}
{?sched}
;
.B: Bindless mode, where the texture header pointer and sampler pointer are packed into a 32 bit register as:
samplerPtr[31:20] | headerPtr[19:0]
Data is sent via register Ra.
.LC: LOD Clamp value for Sparse Textures. A 12 bit (u4.8 format) value, packed with the ARRAY index in the same register.
.AOFFI: Programmable Texel Offset.
_aoffimmi(u,v,w) [DX10] // 1 register required
((v & 0xf)<<4) | (u & 0xf)
Each 4b field is a 2's complement integer from -8 to +7.
.NODEP: Indicates that there are no subsequent quad derivatives to be calculated. Threads that have been "killed" will be disabled to stop unnecessary texture fetches.
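The .B handle layout and the .AOFFI offset encoding described above can be sketched in a few lines of Python. This is only an illustration of the bit layouts as stated in the text; the function names `pack_bindless_handle` and `pack_aoffi_2d` are hypothetical and not part of any toolchain.

```python
def pack_bindless_handle(sampler_ptr: int, header_ptr: int) -> int:
    """Pack samplerPtr[31:20] | headerPtr[19:0] into one 32-bit word (.B mode)."""
    if not 0 <= sampler_ptr < (1 << 12):
        raise ValueError("samplerPtr must fit in 12 bits")
    if not 0 <= header_ptr < (1 << 20):
        raise ValueError("headerPtr must fit in 20 bits")
    return (sampler_ptr << 20) | header_ptr


def pack_aoffi_2d(u: int, v: int) -> int:
    """Encode ((v & 0xf) << 4) | (u & 0xf); each offset is a 4-bit
    two's complement value in the range -8..+7."""
    for name, value in (("u", u), ("v", v)):
        if not -8 <= value <= 7:
            raise ValueError(f"{name} offset must be in [-8, 7]")
    return ((v & 0xF) << 4) | (u & 0xF)


if __name__ == "__main__":
    print(hex(pack_bindless_handle(0x123, 0x45678)))
    print(hex(pack_aoffi_2d(-1, 2)))
```

Masking with 0xf is what turns a negative Python integer into its 4-bit two's complement pattern, so an offset of -1 becomes the nibble 0xf.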
{@{!}Pg}
TXD{.B}{.LC}{.AOFFI}{.NODEP}{.phase}
{Ps,} Rd, Ra, Rb, #tidU08, #smpU05, #paramA{, #wmskU04}
{&req_6}
{&rdN}
{&wrN}
{?sched}
;
.phase: Allows control of the current warp's texture hash, used for scheduling.
< NONE >
.T - postfix increment of the 3 bit texture component of the hash.
.P - postfix increment of the 5 bit phase component, and zero out the 3 bit texture component of the hash.
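As a rough model of the .phase options, the sketch below treats the hash as two separate fields and applies the update after the instruction has used the current value (the "postfix" behaviour). The `TexHash` type and `issue_txd` function are hypothetical names, and wrap-around of each field at its bit width is an assumption the text does not state.

```python
from typing import NamedTuple, Tuple


class TexHash(NamedTuple):
    phase: int    # 5-bit phase component
    texture: int  # 3-bit texture component


def issue_txd(h: TexHash, phase_opt: str = "") -> Tuple[TexHash, TexHash]:
    """Return (hash used by this instruction, hash left for the next one)."""
    used = h  # postfix: the instruction sees the value before the increment
    if phase_opt == "":
        nxt = h
    elif phase_opt == ".T":
        # Increment the 3-bit texture component (wrap is an assumption).
        nxt = TexHash(h.phase, (h.texture + 1) & 0x7)
    elif phase_opt == ".P":
        # Increment the 5-bit phase component and zero the texture component.
        nxt = TexHash((h.phase + 1) & 0x1F, 0)
    else:
        raise ValueError(f"unknown phase option: {phase_opt!r}")
    return used, nxt
```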
Ps:
Predicate returning sparse tile status. Indicates that the surface access is happening to a page marked as sparse (not valid).
Immediate Inputs:
#tsPtrIdxU13:
This immediate index (word address) is used to fetch the packed header+sampler pointer entry from the constant cache. The bank from
which it is fetched is determined by bundle state. The constant bank entry is a 32 bit structure of the form
"samplerPtr[31:20] | headerPtr[19:0]".
Note: Ignored if .B option is used.
In SetSamplerBinding.ViaHeaderBinding (i.e. OGL) mode, the headerPtr would be used as the samplerPtr as well.
Any header pointer greater than the one specified in SetTexHeaderPoolC.MaximumIndex will be regarded as an "invalid"
texture (i.e. equivalent to BIND_GROUP_TEXTURE_HEADER_VALID_FALSE in Fermi).
Any sampler pointer greater than the one specified in SetSamplerHeaderPoolC.MaximumIndex will be regarded as an
"invalid" texture (i.e. equivalent to BIND_GROUP_TEXTURE_HEADER_VALID_FALSE in Fermi).
#tidU08, #smpU05:
This is the "almost" Fermi-compatible specification of tsPtrIdxU13 which allows running of legacy apps/traces
where sass will transform these into tsPtrIdxU13 as follows:
tsPtrIdxU13 = {#smpU05, #tidU08}
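The legacy-to-unified index mapping above can be illustrated as a bit concatenation. The sketch assumes the brace notation means concatenation with the first element in the most significant bits (so #smpU05 occupies bits [12:8] and #tidU08 bits [7:0]); the function name `legacy_to_ts_ptr_idx` is hypothetical.

```python
def legacy_to_ts_ptr_idx(tid: int, smp: int) -> int:
    """Form tsPtrIdxU13 = {smpU05, tidU08}.

    Assumes smp lands in bits [12:8] and tid in bits [7:0].
    """
    if not 0 <= tid < (1 << 8):
        raise ValueError("tid must fit in 8 bits")
    if not 0 <= smp < (1 << 5):
        raise ValueError("smp must fit in 5 bits")
    return (smp << 8) | tid


if __name__ == "__main__":
    # Matches the operands of the example "TXD R0,R2,R4,5,2D,0xf" only if
    # the lone immediate 5 there is read as a texture id with sampler 0.
    print(hex(legacy_to_ts_ptr_idx(tid=5, smp=0)))
```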
#paramA: source coordinate description.
parameter | Coordinate Registers implied |
---|---|
1D | s |
2D | s,t |
RESERVED | // for 3D |
RESERVED | // for CUBE |
ARRAY_1D | a,s |
ARRAY_2D | a,s,t |
RESERVED | // for ARRAY_3D |
RESERVED | // for ARRAY_CUBE |
s,t,r are fp32,
a is U16 integer
If the source coordinate description does not match the texture type of the texture header, zeroes will be returned. The array specifiers can be freely used with non-array textures (and the opposite holds as well), provided the number of coordinates (1D,2D) matches.
#wmskU04: destination write mask (decimated contiguous writes)
Allows for write masking the returning data writes via a bit enable for each of R,G,B,A. A four-vector is always returned from TXD. #wmskU04 defaults to 0xf.
Neither Ra nor Rb can be RZ.
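One way to read "decimated contiguous writes" for #wmskU04 is that only the enabled components are written, packed into consecutive registers starting at Rd. The sketch below models that reading. Both the packing behaviour and the bit order (bit 0 = R through bit 3 = A) are assumptions, and `destination_registers` is a hypothetical helper name.

```python
COMPONENTS = ("R", "G", "B", "A")  # assumed order: bit 0 = R ... bit 3 = A


def destination_registers(rd: int, wmsk: int = 0xF) -> dict:
    """Map each enabled component to the register index it would land in.

    Enabled components are packed contiguously starting at Rd;
    disabled components are skipped and consume no register.
    """
    if not 0 <= wmsk <= 0xF:
        raise ValueError("wmsk must fit in 4 bits")
    mapping = {}
    next_reg = rd
    for bit, name in enumerate(COMPONENTS):
        if wmsk & (1 << bit):
            mapping[name] = next_reg
            next_reg += 1
    return mapping


if __name__ == "__main__":
    print(destination_registers(0))          # default mask 0xf
    print(destination_registers(0, 0b1010))  # only G and A enabled
```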
Texture fetch using a texture coordinate vector and derivatives.
Note: TXD hardware does not support CUBE and 3D. These must still be emulated by SHFL/TEX.
The parameter assignment in register Ra/Rb is as follows:
Reg | parameter | format |
---|---|---|
Ra+0 | { SamplerPtr[31:20] | HeaderPtr[19:0] } | u32 |
Ra+1 | s | fp32 |
Ra+2 | t | fp32 |
Ra+3 | (.LC) ? { LodClamp[31:20] | toff[19:12] | array[11:0] } : { toff[27:16] | array[15:0] } | u32 |
Rb+0 | dsdx | fp32 |
Rb+1 | dsdy | fp32 |
Rb+2 | dtdx | fp32 |
Rb+3 | dtdy | fp32 |
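The two layouts of the Ra+3 word in the table above can be sketched as follows. The field positions come directly from the table; the helper names `encode_lod_clamp` and `pack_ra3`, and the round-to-nearest conversion to u4.8, are assumptions made for illustration.

```python
from typing import Optional


def encode_lod_clamp(lod: float) -> int:
    """Convert a LOD value to unsigned 4.8 fixed point (12 bits).

    Round-to-nearest is an assumption; the text only gives the format.
    """
    raw = int(round(lod * 256))
    if not 0 <= raw < (1 << 12):
        raise ValueError("LOD clamp out of u4.8 range")
    return raw


def pack_ra3(array: int, toff: int = 0, lod_clamp: Optional[int] = None) -> int:
    """Build the Ra+3 word.

    With .LC:    LodClamp[31:20] | toff[19:12] | array[11:0]
    Without .LC: toff[27:16] | array[15:0]
    """
    if lod_clamp is not None:
        if not 0 <= lod_clamp < (1 << 12):
            raise ValueError("LodClamp must fit in 12 bits")
        if not 0 <= toff < (1 << 8):
            raise ValueError("toff must fit in 8 bits with .LC")
        if not 0 <= array < (1 << 12):
            raise ValueError("array must fit in 12 bits with .LC")
        return (lod_clamp << 20) | (toff << 12) | array
    if not 0 <= toff < (1 << 12):
        raise ValueError("toff must fit in 12 bits")
    if not 0 <= array < (1 << 16):
        raise ValueError("array must fit in 16 bits")
    return (toff << 16) | array
```

Note that the array index loses four bits when .LC is present, so an index that is legal without .LC may not be representable with it.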
The texture parameter source registers Ra/Rb and the destination (result) register Rd have alignment restrictions based on the number of scalar registers being read/written. Specifically,
Some input texture values will be sanitized before being used.
Corresponds to these DX ops:
sample_d
Notes about equivalency with TEX instruction:
TXD R0,R2,R4,5,2D,0xf;
For OpenGL, it is necessary to premultiply the dy values by SR18 to cancel the effects of the origin aware DDY expansion (documented in FSWZ).
S2R R8, SR18;
FMUL R5, R5, R8;
FMUL R7, R7, R8;
TXD R0,R2,R4,5,2D,0xf;