SPA 5.0:
{@{!}Pg}
TEXS{.F16}{.lod}{.DC}{.NODEP}{.phase}
RZ, Rd0, Ra{, Rb}, #tsPtrIdxU13, #paramA, wmsk2C
{&req_6}
{&rdN}
{&wrN}
{?sched}
;
.lod: { .LZ, .LL } LOD adjust mode.
< NONE >
.LZ - LOD level 0 (finest) // no register required
.LL - LOD absolute discrete // 1 fp32 register required
The LOD level is actually relative to textureHeader.resViewMinMapLevel.

.NODEP: Indicates that there are no subsequent quad derivatives to be calculated. Threads that have been "killed" will be disabled to stop unnecessary texture fetches.

.DC: Depth comparison filter mode using a reference value.
RefVal // 1 fp32 register required
The depth comparison filter is not supported by 3D textures. For TEXS and TEXS.LZ, the .DC option forces a depth comparison filter mode regardless of the sampler state. For TEXS.LL, if the sampler state does not enable depth comparison, the .DC option does not force a depth comparison filter mode.

.phase: { .T, .P } Controls the current warp's texture hash, which is used for scheduling. Phasing is explained in Texture Phasing.
< NONE >
.T - postfix increment of the 3-bit texture component of the hash.
.P - postfix increment of the 5-bit phase component, and zero out the 3-bit texture component of the hash.

#tsPtrIdxU13: This immediate index (word address) is used to fetch the packed header+sampler pointer entry from the constant cache. The bank from which it is fetched is determined by bundle state. The constant bank entry is a 32-bit structure of the form "samplerPtr[31:20] | headerPtr[19:0]". In SetSamplerBinding.ViaHeaderBinding (i.e. OGL) mode, the headerPtr is used as the samplerPtr as well. Any header pointer greater than the one specified in SetTexHeaderPoolC.MaximumIndex is regarded as an "invalid" texture (i.e. equivalent to BIND_GROUP_TEXTURE_HEADER_VALID_FALSE in Fermi). Any sampler pointer greater than the one specified in SetSamplerHeaderPoolC.MaximumIndex is likewise regarded as an "invalid" texture (i.e. equivalent to BIND_GROUP_TEXTURE_HEADER_VALID_FALSE in Fermi).

wmsk2C : {R, G, B, A, RG, RA, GA, BA} // destination write mask for up to 2 component writeback.
wmsk34C: {RGB, RGA, GBA, RBA, RGBA*} // destination write mask for 3 or 4 component writeback.

.F16: If specified, texture return data is in packed F16 format. Otherwise, the return data is in 32-bit format (fp32 or S/UINT32). Partial register writes do not occur: any unused portion of the return register is written with the value 0. Note: the .F16 modifier is not supported for integer textures in SPA 5.2 (UNPREDICTABLE).
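The packed constant-bank entry described under #tsPtrIdxU13 can be split with two masks and a shift. The Python sketch below is illustrative only: `unpack_ts_entry`, `TexBinding`, and the parameters `max_header_index`, `max_sampler_index`, and `via_header_binding` are hypothetical names standing in for SetTexHeaderPoolC.MaximumIndex, SetSamplerHeaderPoolC.MaximumIndex, and SetSamplerBinding.ViaHeaderBinding. Applying the sampler limit to the reused header pointer in ViaHeaderBinding mode is an assumption, not something the text above states.

```python
from typing import NamedTuple


class TexBinding(NamedTuple):
    header_ptr: int
    sampler_ptr: int
    valid: bool


def unpack_ts_entry(entry, max_header_index, max_sampler_index,
                    via_header_binding=False):
    """Split a 32-bit entry of the form samplerPtr[31:20] | headerPtr[19:0]."""
    entry &= 0xFFFFFFFF
    header_ptr = entry & 0xFFFFF           # bits [19:0]
    sampler_ptr = (entry >> 20) & 0xFFF    # bits [31:20]
    if via_header_binding:
        # ViaHeaderBinding (OGL) mode: headerPtr doubles as samplerPtr.
        sampler_ptr = header_ptr
    # Pointers above the configured maximum index mark the texture "invalid".
    # ASSUMPTION: in ViaHeaderBinding mode the sampler limit is still checked.
    valid = (header_ptr <= max_header_index
             and sampler_ptr <= max_sampler_index)
    return TexBinding(header_ptr, sampler_ptr, valid)
```

For example, the entry 0x12300456 splits into headerPtr 0x456 and samplerPtr 0x123.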
{@{!}Pg}
TEXS{.F16}{.lod}{.DC}{.NODEP}{.phase}
Rd1, Rd0, Ra{, Rb}, #tsPtrIdxU13, #paramA{, wmsk34C}
{&req_6}
{&rdN}
{&wrN}
{?sched}
;
#paramA: source coordinate description.
parameter | Coordinate Registers implied |
---|---|
1D | s |
2D | s,t |
3D | s,t,r |
CUBE | s,t,r |
RESERVED | // for ARRAY_1D |
ARRAY_2D | a,s,t |
RESERVED | // for ARRAY_3D |
RESERVED | // for ARRAY_CUBE |
s,t,r are fp32, a is U16 integer
Not all combinations of .lod, .DC, and #paramA are allowed. See the encoding table in the Description, below.
Rounding mode is controlled by a PRI: [SM]PRI_SM_TEXIO_CONTROL_FP16_ROUNDING_MODE. It must be set to the same value as PRI_TEX_F_DBG_FP16_ROUNDING_MODE.
Texture fetch using texture coordinates/parameters stored in registers Ra and Rb. The return data is written back to registers Rd0 and Rd1 according to the wmsk2C/wmsk34C specification. The legal instruction modifiers for TEXS and the corresponding parameter packing in Ra and Rb are specified below.
#paramA | .DC | .lod | encoding | Ra-packing | Ra-size | Rb | Rb-size |
---|---|---|---|---|---|---|---|
1D | - | .LZ | 0 | s | scalar | must be RZ | none |
2D | - | <NONE> | 1 | s | scalar | t | scalar |
2D | - | .LZ | 2 | s | scalar | t | scalar |
2D | - | .LL | 3 | s,t | vec2 | lod | scalar |
2D | .DC | <NONE> | 4 | s,t | vec2 | dc | scalar |
2D | .DC | .LL | 5 | s,t | vec2 | lod,dc | vec2 |
2D | .DC | .LZ | 6 | s,t | vec2 | dc | scalar |
ARRAY_2D | - | <NONE> | 7 | array,s | vec2 | t | scalar |
ARRAY_2D | - | .LZ | 8 | array,s | vec2 | t | scalar |
ARRAY_2D | .DC | .LZ | 9 | array,s | vec2 | t,dc | vec2 |
3D | - | <NONE> | 10 | s,t | vec2 | r | scalar |
3D | - | .LZ | 11 | s,t | vec2 | r | scalar |
CUBE | - | <NONE> | 12 | s,t | vec2 | r | scalar |
CUBE | - | .LL | 13 | s,t | vec2 | r,lod | vec2 |
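The encoding table above lends itself to a lookup keyed on (#paramA, .DC, .lod). The Python sketch below transcribes the table and reports which source registers a given TEXS form reads. It is a hedged illustration, not an assembler: `LEGAL` and `source_registers` are hypothetical names, registers are modeled as plain integers with `None` standing for RZ, and vec2 operands are assumed to occupy consecutive registers starting at the named one. Alignment restrictions are not modeled.

```python
# (paramA, .DC present, .lod) -> (encoding, Ra register count, Rb register count)
# Transcribed from the table above; .lod is None when no LOD modifier is given.
LEGAL = {
    ("1D", False, "LZ"): (0, 1, 0),
    ("2D", False, None): (1, 1, 1),
    ("2D", False, "LZ"): (2, 1, 1),
    ("2D", False, "LL"): (3, 2, 1),
    ("2D", True, None): (4, 2, 1),
    ("2D", True, "LL"): (5, 2, 2),
    ("2D", True, "LZ"): (6, 2, 1),
    ("ARRAY_2D", False, None): (7, 2, 1),
    ("ARRAY_2D", False, "LZ"): (8, 2, 1),
    ("ARRAY_2D", True, "LZ"): (9, 2, 2),
    ("3D", False, None): (10, 2, 1),
    ("3D", False, "LZ"): (11, 2, 1),
    ("CUBE", False, None): (12, 2, 1),
    ("CUBE", False, "LL"): (13, 2, 2),
}


def source_registers(param_a, ra, rb=None, dc=False, lod=None):
    """Return (encoding, register numbers read); raise ValueError if illegal."""
    key = (param_a, dc, lod)
    if key not in LEGAL:
        raise ValueError(f"illegal TEXS combination: {key}")
    encoding, ra_size, rb_size = LEGAL[key]
    if rb_size == 0 and rb is not None:
        raise ValueError("Rb must be RZ for this combination")
    if rb_size > 0 and rb is None:
        raise ValueError("Rb is required for this combination")
    # ASSUMPTION: a vec2 operand spans the named register and the next one.
    regs = list(range(ra, ra + ra_size))
    if rb_size:
        regs += list(range(rb, rb + rb_size))
    return encoding, regs
```

Under these assumptions, `source_registers("2D", 4, 9, lod="LL")` reports encoding 3 and reads of R4, R5, and R9, while a combination missing from the table, such as 3D with .LL, raises `ValueError`.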
Destination register packing without .F16 (32-bit return data):
Rd1 | wmsk | wmsk encoding | Rd0-size | Rd0-packing | Rd1-size | Rd1-packing |
---|---|---|---|---|---|---|
RZ | R | 0 | scalar | Rd0+0 = R component | none | must be RZ |
RZ | G | 1 | scalar | Rd0+0 = G component | none | must be RZ |
RZ | B | 2 | scalar | Rd0+0 = B component | none | must be RZ |
RZ | A | 3 | scalar | Rd0+0 = A component | none | must be RZ |
RZ | RG | 4 | vec2 | Rd0+0 = R component, Rd0+1 = G component | none | must be RZ |
RZ | RA | 5 | vec2 | Rd0+0 = R component, Rd0+1 = A component | none | must be RZ |
RZ | GA | 6 | vec2 | Rd0+0 = G component, Rd0+1 = A component | none | must be RZ |
RZ | BA | 7 | vec2 | Rd0+0 = B component, Rd0+1 = A component | none | must be RZ |
non-RZ | RGB | 0 | vec2 | Rd0+0 = R component, Rd0+1 = G component | scalar | Rd1+0 = B component |
non-RZ | RGA | 1 | vec2 | Rd0+0 = R component, Rd0+1 = G component | scalar | Rd1+0 = A component |
non-RZ | RBA | 2 | vec2 | Rd0+0 = R component, Rd0+1 = B component | scalar | Rd1+0 = A component |
non-RZ | GBA | 3 | vec2 | Rd0+0 = G component, Rd0+1 = B component | scalar | Rd1+0 = A component |
non-RZ | RGBA | 4 | vec2 | Rd0+0 = R component, Rd0+1 = G component | vec2 | Rd1+0 = B component, Rd1+1 = A component |
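The 32-bit writeback table above follows one pattern: the first two selected components go to Rd0 and Rd0+1, and any remaining components go to Rd1 and Rd1+1. The Python sketch below encodes that reading of the table. `WMSK_2C`, `WMSK_34C`, and `dest_layout` are hypothetical names, registers are plain integers, and `None` stands for RZ.

```python
# List position equals the wmsk encoding given in the table above.
WMSK_2C = ["R", "G", "B", "A", "RG", "RA", "GA", "BA"]   # Rd1 must be RZ
WMSK_34C = ["RGB", "RGA", "RBA", "GBA", "RGBA"]          # Rd1 must be non-RZ


def dest_layout(wmsk, rd0, rd1=None):
    """Return (wmsk encoding, {register number: component}) for 32-bit data."""
    if wmsk in WMSK_2C:
        if rd1 is not None:
            raise ValueError("Rd1 must be RZ for 1- and 2-component masks")
        return WMSK_2C.index(wmsk), {rd0 + i: c for i, c in enumerate(wmsk)}
    if wmsk in WMSK_34C:
        if rd1 is None:
            raise ValueError("Rd1 must be non-RZ for 3- and 4-component masks")
        layout = {rd0: wmsk[0], rd0 + 1: wmsk[1]}
        # Third (and fourth) component land in Rd1 (and Rd1+1).
        layout.update({rd1 + i: c for i, c in enumerate(wmsk[2:])})
        return WMSK_34C.index(wmsk), layout
    raise ValueError(f"unknown write mask: {wmsk}")
```

For instance, `dest_layout("RGBA", rd0=2, rd1=0)` places R and G in R2 and R3, and B and A in R0 and R1.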
Destination register packing with .F16 (packed 16-bit return data):
Rd1 | wmsk | wmsk encoding | Rd0-size | Rd0-packing | Rd1-size | Rd1-packing |
---|---|---|---|---|---|---|
RZ | R | 0 | scalar | Rd0[15:0] = R component, Rd0[31:16] = 0 | none | must be RZ |
RZ | G | 1 | scalar | Rd0[15:0] = G component, Rd0[31:16] = 0 | none | must be RZ |
RZ | B | 2 | scalar | Rd0[15:0] = B component, Rd0[31:16] = 0 | none | must be RZ |
RZ | A | 3 | scalar | Rd0[15:0] = A component, Rd0[31:16] = 0 | none | must be RZ |
RZ | RG | 4 | scalar | Rd0[15:0] = R component, Rd0[31:16] = G component | none | must be RZ |
RZ | RA | 5 | scalar | Rd0[15:0] = R component, Rd0[31:16] = A component | none | must be RZ |
RZ | GA | 6 | scalar | Rd0[15:0] = G component, Rd0[31:16] = A component | none | must be RZ |
RZ | BA | 7 | scalar | Rd0[15:0] = B component, Rd0[31:16] = A component | none | must be RZ |
non-RZ | RGB | 0 | scalar | Rd0[15:0] = R component, Rd0[31:16] = G component | scalar | Rd1[15:0] = B component, Rd1[31:16] = 0 |
non-RZ | RGA | 1 | scalar | Rd0[15:0] = R component, Rd0[31:16] = G component | scalar | Rd1[15:0] = A component, Rd1[31:16] = 0 |
non-RZ | RBA | 2 | scalar | Rd0[15:0] = R component, Rd0[31:16] = B component | scalar | Rd1[15:0] = A component, Rd1[31:16] = 0 |
non-RZ | GBA | 3 | scalar | Rd0[15:0] = G component, Rd0[31:16] = B component | scalar | Rd1[15:0] = A component, Rd1[31:16] = 0 |
non-RZ | RGBA | 4 | scalar | Rd0[15:0] = R component, Rd0[31:16] = G component | scalar | Rd1[15:0] = B component, Rd1[31:16] = A component |
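The packed layout in the table above puts the first component of each pair in bits [15:0] and the second in bits [31:16], with an unused upper half written as 0. The Python sketch below reproduces that packing for a texel given as floats. `f16_bits` and `pack_f16_dest` are hypothetical helper names. The conversion uses Python's `struct` "e" format, whose round-to-nearest-even behavior is an assumption here, since the hardware rounding mode is set by the PRI mentioned earlier.

```python
import struct


def f16_bits(x):
    """IEEE 754 binary16 bit pattern of x (struct rounds to nearest even)."""
    return struct.unpack("<H", struct.pack("<e", x))[0]


def pack_f16_dest(wmsk, texel):
    """Return (Rd0, Rd1) 32-bit values for .F16; Rd1 is None if not written.

    texel maps 'R', 'G', 'B', 'A' to Python floats.
    """
    halves = [f16_bits(texel[c]) for c in wmsk]
    # No partial register writes: pad an unused upper half with 0.
    halves += [0] * (-len(halves) % 2)
    words = [lo | (hi << 16) for lo, hi in zip(halves[0::2], halves[1::2])]
    return words[0], (words[1] if len(words) > 1 else None)
```

With a texel of R = 1.0, G = 0.5, B = 2.0, A = -1.0, the RGB mask yields Rd0 = 0x38003C00 and Rd1 = 0x00004000.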
The texture parameter source registers Ra/Rb and the destination (result) registers Rd0/Rd1 have alignment restrictions based on the number of scalar registers being read/written. Specifically,
Some input texture values are sanitized before being used; see Additional Information for more details.
TEXS corresponds to these DX ops:
sample = TEXS
sample_d = TXD/sw emulated // emulate this via SAM/SWZ/TEX/RAM
sample_l = TEXS.LL // lod supplied
sample_c = TEXS.DC // depth comparison filter with reference value
sample_lz = TEXS.LZ // lod level 0 (finest)
Texture input coordinates go through a sanitation step before being used in texture calculations.
TEXS RZ, R0, R19, R29, 0x1, 2D, RG; # reads R19 & R29; writes R0 & R1
TEXS.LL R0, R2, R4, R9, 0x3, 2D, RGBA; # reads R4, R5, & R9; writes R0, R1, R2, & R3
TEXS.DC R4, R0, R8, R19, 0x2, 2D, RGBA; # reads R8, R9, & R19; writes R4, R5, R0, & R1