AST : Attribute Store

Format:

SPA 5.0:
       Immediate attribute stores:
        {@{!}Pg}   AST{.P}{.sz}    a[#ImmU10],Rb,{Rc}   {&req_6}   {&rdN}   {?sched}   ;   

       Indexed patch attribute stores:
        {@{!}Pg}   AST.P{.sz}      a[Ra + #ImmS11],Rb   {&req_6}   {&rdN}   {?sched}   ;   

       Indexed VTG (Vertex,Tess,Geom) attribute stores:
        {@{!}Pg}   AST.PHYS{.sz}   a[Ra],Rb,{Rc}        {&req_6}   {&rdN}   {?sched}   ;   

 .P:      Store patch attributes
 .sz:     { .32*, .64, .96, .128 } 
 .PHYS:   Indexed mode for VTG attribute stores that uses the physical attribute number determined via AL2P
	  .PHYS is encoded as .P=0 and Ra!=RZ and imm=0
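The three forms differ only in the .P bit, whether Ra is RZ, and the immediate. The sketch below restates that selection in Python as an illustration, not as the hardware decoder. The function name `classify_ast_mode`, the mode strings, and the register number used for RZ are all hypothetical; this section does not give the RZ encoding.

```python
# Assumption: 255 stands in for RZ here; the actual encoding of RZ
# is not stated in this section.
RZ = 255


def classify_ast_mode(p: bool, ra: int, imm: int) -> str:
    """Return which AST form a (.P, Ra, immediate) combination selects."""
    if p:
        # Patch stores: a[Ra + #ImmS11], or an immediate-only address
        # when Ra is RZ or not specified.
        return "patch-indexed" if ra != RZ else "patch-immediate"
    if ra == RZ:
        # Immediate attribute store: a[#ImmU10].
        return "attribute-immediate"
    if imm == 0:
        # .PHYS is encoded as .P=0 and Ra!=RZ and imm=0.
        return "attribute-phys"
    # .P=0 with Ra!=RZ and a non-zero immediate is not a described form.
    return "invalid"
```

For instance, `classify_ast_mode(False, 0, 0)` selects the .PHYS form, matching the encoding note above.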

Description:

Store VTG (Vertex,Tess,Geom) attribute(s).

This instruction performs stores to the patch section or attribute section (of the ISBE structure), or to the attribute FIFOs of the geometry shader FIFOs (GSFIFO).

Attribute writes:

When performing attribute writes, the attribute address is specified in one of two ways:

  • as an unsigned 10b immediate (Ra is RZ or not specified), in which case it is interpreted as a logical attribute byte address, or
  • in Ra, when Ra is specified and is not RZ. The immediate must then be zero, and the value in Ra is the al2pResult_vtg_t data structure returned by the AL2P instruction.

    Store data is taken from the register specified in Rb.

    Normal Geometry shaders need to specify an additional register Rc that contains the current state of the hw state machine managing each thread's attribute fifo. This state machine must be initialized to 0 at the beginning of the Geometry Shader program. The shader program should not attempt to change the contents of the state machine register, as it is deemed an opaque value. Since this state machine indirectly specifies the vertex offset to write the attribute data to, HW will kill writes that could damage another thread's data. However, modifying the GS state machine is extremely dangerous and should never be done by SW other than through the OUT instruction.

    IMPORTANT: When a GS program terminates, the SM hw will auto-generate a final OUT instruction that will source R0 as the final state machine. Any GS program must have R0 up-to-date with the current state machine whenever the thread can possibly terminate. If this is not done, GS output will be lost. It is strongly recommended that all GS programs just reserve R0 as the state machine register. The update of the state machine register happens via OUT instruction.

    Vertex and Tessellation shaders do not need to specify Rc and do not use Rc.

    Patch Attributes writes:

    For patch attribute writes, the attribute address within the patch is specified as the byte address Ra + #ImmS11. If Ra is not specified or Ra is RZ, the input address is an unsigned 10b immediate byte address.

    Patch Attributes only exist for TI shader outputs. An attribute store can be directed to the patch attribute section by using the .P suffix. This per-patch data is only stored once per patch, so if multiple threads write to the same destination, only one can win (which thread wins is deterministic for AST, but may vary between architectures, and should not be relied upon). Stores to the per-patch area are not protected by an OMAP, instead presenting a directly addressable buffer for attributes. Attributes written by TI shaders can only be read by that TI shader (using ALD.O) and the TS shaders.

    TESSELLATION_LOD_* attributes are a special case of patch attributes (as seen on the Patch Attribute Address Map). They must be written to a fixed location, and are laid out for the maximum case (quads). If triangle/line tessellation is being performed instead, some of these TESSELLATION_LOD_* attributes are no longer needed. When in Triangle/Line tessellation mode, values written to the additional attributes will be ignored by TG, meaning SW can safely use them as generic patch attributes (despite their names).

  • Additional Information:

    An output BMAP is formed from the IMAP of the next shader, the OMAP of the current shader, and the address of the attribute.

    For the most common usage cases, the BMAP Definition is simply:

       output BMAP = (OMAP from current stage & (IMAP from next stage | ST_REQ from current stage))
    

    For an attribute address, the following table describes the behavior:

      ----------------------------------------------
       output BMAP       Store Result
      ----------------------------------------------
           0             Silently Discarded
           1             Store OK
      ----------------------------------------------
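The common-case BMAP rule and the table above can be restated as a small Python sketch. It assumes one mask bit per 32b attribute, indexed by byte address / 4; that bit layout is an assumption for illustration and is not given in this section. The function names are hypothetical.

```python
def output_bmap(omap: int, imap_next: int, st_req: int) -> int:
    """Common-case rule: OMAP & (IMAP of next stage | ST_REQ)."""
    return omap & (imap_next | st_req)


def store_is_kept(byte_addr: int, omap: int, imap_next: int, st_req: int) -> bool:
    """True if a 32b store to byte_addr lands, False if it is silently discarded."""
    # Out-of-range writes are silently discarded. Treating address 1024
    # itself as out of range (256 attributes * 4 bytes) is an assumption.
    if not 0 <= byte_addr < 1024:
        return False
    # Assumption: mask bit i corresponds to the 32b attribute at byte address 4*i.
    attr_index = byte_addr // 4
    return bool((output_bmap(omap, imap_next, st_req) >> attr_index) & 1)
```

With omap = 0b0110, imap_next = 0b0100 and st_req = 0b0001, only the attribute at byte address 8 has its output BMAP bit set, so only a store to that address is kept.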
    

    Writing attribute addresses above 1024 (256*4) or below 0 will result in the write being silently discarded.

    .sz controls the size of stored data. Supported options are 32, 64, 96, and 128 bits, read from consecutive data registers.

    Each attribute is assumed to be a 32b scalar. When doing a vector store, AST will access up to 4 consecutive logical attribute addresses.

    Alignment applies to both register and attribute index. LSB bits are dropped for alignment:

       .32      forces (Ra+#ImmS11) 2 lsb to 00.
       .64      forces Rb register address[0:0] to  0 and (Ra+#ImmS11) 3 lsb to 000.
       .96/.128 forces Rb register address[1:0] to 00 and (Ra+#ImmS11) 4 lsb to 0000.
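The alignment rules above amount to masking off low bits of the data register number and of the byte address. A minimal Python sketch, with the hypothetical names `ALIGN_LSBS` and `align_ast_operands`, shows the effect:

```python
# size in bits -> (register-number lsbs dropped, byte-address lsbs dropped),
# taken directly from the alignment rules listed above.
ALIGN_LSBS = {32: (0, 2), 64: (1, 3), 96: (2, 4), 128: (2, 4)}


def align_ast_operands(size_bits: int, reg_index: int, byte_addr: int) -> tuple[int, int]:
    """Return (aligned data register number, aligned attribute byte address)."""
    reg_lsbs, addr_lsbs = ALIGN_LSBS[size_bits]
    aligned_reg = reg_index & ~((1 << reg_lsbs) - 1)
    aligned_addr = byte_addr & ~((1 << addr_lsbs) - 1)
    return aligned_reg, aligned_addr
```

For example, a .128 store naming data register R5 at byte address 0x47 behaves as if it named R4 and address 0x40.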
    

    Examples:

    AST.128     a[64   ],R1;
    AST.P.64    a[R0-16],R1;
    AST.PHYS.32 a[R0   ],R1, R2;
    
    using AL2P to do indexed AST:
    AL2P.O.128      R0,  R1, -32;
    AST.PHYS.128    a[R0], R2;
    
