Sun Microsystems, Inc.

E.6 UltraSPARC and VIS Instruction Set Extensions

This section describes extensions that require SPARC-V9. The extensions support enhanced graphics functionality and improved memory access efficiency.


Note - SPARC-V9 instruction set extensions used in executables may not be portable to other SPARC-V9 systems.


E.6.1 Graphics Data Formats

The overhead of converting to and from floating-point arithmetic is high, so the graphics instructions are optimized for short-integer arithmetic. Image components are 8 or 16 bits. Intermediate results are 16 or 32 bits.

E.6.2 Eight-bit Format

A 32-bit word contains a pixel of four unsigned 8-bit integers. The integers represent image intensity values (α, G, B, R). Both band-interleaved images (the color components of a point are stored together) and band-sequential images (all values of one color component are stored together) are supported.

E.6.3 Fixed Data Formats

A 64-bit word contains four 16-bit signed fixed-point values. This is the fixed 16-bit data format.

A 64-bit word contains two 32-bit signed fixed-point values. This is the fixed 32-bit data format.

An intermediate format of fixed data values provides enough precision and dynamic range for filtering and simple image computations on pixel values. Pixel multiplication converts pixel data to fixed data. Pack instructions convert fixed data back to pixel data by clipping and truncating to an 8-bit unsigned value. The FPACKFIX instruction supports conversion from 32-bit fixed to 16-bit fixed. Rounding is done by adding one to the rounding bit position. Use floating-point data for complex calculations that need more precision or dynamic range.
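The fixed 16-bit layout and the clip-and-truncate step can be illustrated with a small Python model. This is only a sketch: the function names are hypothetical, the lane order (first value in the most significant 16 bits) and the `frac_bits` parameter are assumptions, and the GSR scale factor that the hardware pack instructions apply is left out.

```python
def pack_fixed16(values):
    """Pack four signed 16-bit values into one 64-bit word.

    Assumption: the first value lands in the most significant 16 bits.
    """
    if len(values) != 4:
        raise ValueError("expected exactly four values")
    word = 0
    for v in values:
        if not -0x8000 <= v <= 0x7FFF:
            raise ValueError("value does not fit in 16 signed bits")
        word = (word << 16) | (v & 0xFFFF)
    return word


def unpack_fixed16(word):
    """Split a 64-bit word back into four signed 16-bit values."""
    values = []
    for shift in (48, 32, 16, 0):
        v = (word >> shift) & 0xFFFF
        values.append(v - 0x10000 if v & 0x8000 else v)
    return values


def fixed_to_pixel(value, frac_bits):
    """Truncate the fractional bits, then clip to an unsigned 8-bit pixel."""
    truncated = value >> frac_bits  # arithmetic shift drops the fraction
    return max(0, min(255, truncated))
```

Negative fixed values clip to 0 and values above the pixel range clip to 255, which is the behavior the paragraph above describes for the pack step.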

E.6.4 SHUTDOWN Instruction

All outstanding transactions are completed before the SHUTDOWN instruction completes.

Table E-13

| SPARC | Mnemonic | Argument List | Description |
|---|---|---|---|
| SHUTDOWN | shutdown | | shutdown to enter power down mode |

E.6.5 Graphics Status Register (GSR)

Use the RDASR and WRASR instructions with ASR 0x13 to access the Graphics Status Register.

Table E-14

| SPARC | Mnemonic | Argument List | Description |
|---|---|---|---|
| RDASR | rdasr | %gsr, regrd | read GSR |
| WRASR | wrasr | regrs1, reg_or_imm, %gsr | write GSR |

E.6.6 Graphics Instructions

Unless otherwise specified, floating-point registers contain all instruction operands. There are 32 double-precision registers. Single-precision floating-point registers contain the pixel values, and double-precision floating-point registers contain the fixed values.

The graphics instruction set is mapped to the opcode space reserved for the Implementation-Dependent Instruction 1 (IMPDEP1) instructions.

Partitioned add/subtract instructions perform two 32-bit or four 16-bit partitioned adds or subtracts between the corresponding fixed-point values of the source operands.

Table E-15

| SPARC | Mnemonic | Argument List | Description |
|---|---|---|---|
| FPADD16 | fpadd16 | fregrs1, fregrs2, fregrd | four 16-bit add |
| FPADD16S | fpadd16s | fregrs1, fregrs2, fregrd | two 16-bit add |
| FPADD32 | fpadd32 | fregrs1, fregrs2, fregrd | two 32-bit add |
| FPADD32S | fpadd32s | fregrs1, fregrs2, fregrd | one 32-bit add |
| FPSUB16 | fpsub16 | fregrs1, fregrs2, fregrd | four 16-bit subtract |
| FPSUB16S | fpsub16s | fregrs1, fregrs2, fregrd | two 16-bit subtract |
| FPSUB32 | fpsub32 | fregrs1, fregrs2, fregrd | two 32-bit subtract |
| FPSUB32S | fpsub32s | fregrs1, fregrs2, fregrd | one 32-bit subtract |
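As an illustration of what "partitioned" means, the following Python sketch models a few of these operations on plain integers. The function names mirror the mnemonics but are hypothetical helpers, not an API; the sketch assumes each lane wraps around on overflow (no saturation) and that no carry crosses a lane boundary.

```python
def _partitioned(op, a, b, lane_bits, total_bits=64):
    """Apply op lane by lane; each lane result is reduced modulo 2**lane_bits."""
    lane_mask = (1 << lane_bits) - 1
    result = 0
    for shift in range(0, total_bits, lane_bits):
        x = (a >> shift) & lane_mask
        y = (b >> shift) & lane_mask
        result |= (op(x, y) & lane_mask) << shift
    return result


def fpadd16(a, b):
    """Four independent 16-bit adds on 64-bit operands."""
    return _partitioned(lambda x, y: x + y, a, b, 16)


def fpsub16(a, b):
    """Four independent 16-bit subtracts on 64-bit operands."""
    return _partitioned(lambda x, y: x - y, a, b, 16)


def fpadd32(a, b):
    """Two independent 32-bit adds on 64-bit operands."""
    return _partitioned(lambda x, y: x + y, a, b, 32)


def fpadd16s(a, b):
    """Two independent 16-bit adds on 32-bit (single-precision) operands."""
    return _partitioned(lambda x, y: x + y, a, b, 16, total_bits=32)
```

For example, `fpadd16(0x0001FFFF80007FFF, 0x0001000180000001)` yields `0x0002000000008000`: the lane holding `0xFFFF + 0x0001` wraps to zero without disturbing its neighbor.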

Pack instructions convert to a lower-precision fixed or pixel format.

Table E-16

| SPARC | Mnemonic | Argument List | Description |
|---|---|---|---|
| FPACK16 | fpack16 | fregrs2, fregrd | four 16-bit packs |
| FPACK32 | fpack32 | fregrs1, fregrs2, fregrd | two 32-bit packs |
| FPACKFIX | fpackfix | fregrs2, fregrd | four 16-bit packs |
| FEXPAND | fexpand | fregrs2, fregrd | four 16-bit expands |
| FPMERGE | fpmerge | fregrs1, fregrs2, fregrd | two 32-bit merges |

Partitioned multiply instructions have the following variations.

Table E-17

| SPARC | Mnemonic | Argument List | Description |
|---|---|---|---|
| FMUL8x16 | fmul8x16 | fregrs1, fregrs2, fregrd | 8x16-bit partition |
| FMUL8x16AU | fmul8x16au | fregrs1, fregrs2, fregrd | 8x16-bit upper partition |
| FMUL8x16AL | fmul8x16al | fregrs1, fregrs2, fregrd | 8x16-bit lower partition |
| FMUL8SUx16 | fmul8sux16 | fregrs1, fregrs2, fregrd | upper 8x16-bit partition |
| FMUL8ULx16 | fmul8ulx16 | fregrs1, fregrs2, fregrd | lower unsigned 8x16-bit partition |
| FMULD8SUx16 | fmuld8sux16 | fregrs1, fregrs2, fregrd | upper 8x16-bit partition |
| FMULD8ULx16 | fmuld8ulx16 | fregrs1, fregrs2, fregrd | lower unsigned 8x16-bit partition |
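The basic 8x16 multiply can be modeled roughly as follows. This Python sketch is an assumption-laden illustration, not a definition of the instruction: it assumes each unsigned 8-bit pixel is multiplied by the signed 16-bit value in the same lane, that the 24-bit product is rounded by adding one at the rounding bit position (bit 7), and that the upper 16 bits are kept. The function name is a hypothetical helper.

```python
def fmul8x16(pixels, fixed):
    """Model of a four-lane 8x16 multiply.

    pixels: 32-bit word holding four unsigned 8-bit values.
    fixed:  64-bit word holding four signed 16-bit values.
    Returns a 64-bit word of four 16-bit results (same lane order).
    """
    result = 0
    for lane in range(4):
        p = (pixels >> (8 * lane)) & 0xFF
        c = (fixed >> (16 * lane)) & 0xFFFF
        if c & 0x8000:            # reinterpret the lane as signed
            c -= 0x10000
        product = p * c           # fits in 24 signed bits
        rounded = (product + 0x80) >> 8   # round, keep the upper 16 bits
        result |= (rounded & 0xFFFF) << (16 * lane)
    return result
```

Under these assumptions, multiplying by `0x0100` in every lane converts pixel values to fixed values of the same magnitude, which matches the earlier statement that pixel multiplication is used to convert pixel data to fixed data.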

Alignment instructions have the following variations.

Table E-18

| SPARC | Mnemonic | Argument List | Description |
|---|---|---|---|
| ALIGNADDRESS | alignaddr | regrs1, regrs2, regrd | find misaligned data access address |
| ALIGNADDRESS_LITTLE | alignaddrl | regrs1, regrs2, regrd | same as above, but little-endian |
| FALIGNDATA | faligndata | fregrs1, fregrs2, fregrd | do misaligned data, data alignment |
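The way the big-endian alignment pair cooperates can be sketched in Python. The functions below are hypothetical models: `alignaddr` is assumed to return the 8-byte-aligned address together with the byte offset that the hardware would record in the GSR, and `faligndata` is assumed to extract eight bytes from the concatenation of its two operands starting at that offset.

```python
MASK64 = (1 << 64) - 1


def alignaddr(rs1, rs2):
    """Return (aligned_address, offset) for the address rs1 + rs2.

    The offset (low three bits) stands in for the GSR alignment field.
    """
    addr = (rs1 + rs2) & MASK64
    return addr & ~0x7, addr & 0x7


def faligndata(rs1, rs2, offset):
    """Extract 8 bytes from rs1:rs2, starting 'offset' bytes into rs1."""
    if not 0 <= offset <= 7:
        raise ValueError("offset must be in 0..7")
    combined = (rs1 << 64) | rs2          # 128-bit concatenation
    return (combined >> ((8 - offset) * 8)) & MASK64
```

A typical use is to load the two aligned doublewords that straddle a misaligned address and then combine them: with an offset of 3, `faligndata(0x0011223344556677, 0x8899AABBCCDDEEFF, 3)` gives `0x33445566778899AA`.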

Logical operate instructions perform one of sixteen 64-bit logical operations between rs1 and rs2 (in the standard 64-bit version).

Table E-19

| SPARC | Mnemonic | Argument List | Description |
|---|---|---|---|
| FZERO | fzero | fregrd | zero fill |
| FZEROS | fzeros | fregrd | zero fill, single precision |
| FONE | fone | fregrd | one fill |
| FONES | fones | fregrd | one fill, single precision |
| FSRC1 | fsrc1 | fregrs1, fregrd | copy src1 |
| FSRC1S | fsrc1s | fregrs1, fregrd | copy src1, single precision |
| FSRC2 | fsrc2 | fregrs2, fregrd | copy src2 |
| FSRC2S | fsrc2s | fregrs2, fregrd | copy src2, single precision |
| FNOT1 | fnot1 | fregrs1, fregrd | negate src1, 1's complement |
| FNOT1S | fnot1s | fregrs1, fregrd | same as above, single precision |
| FNOT2 | fnot2 | fregrs2, fregrd | negate src2, 1's complement |
| FNOT2S | fnot2s | fregrs2, fregrd | same as above, single precision |
| FOR | for | fregrs1, fregrs2, fregrd | logical OR |
| FORS | fors | fregrs1, fregrs2, fregrd | logical OR, single precision |
| FNOR | fnor | fregrs1, fregrs2, fregrd | logical NOR |
| FNORS | fnors | fregrs1, fregrs2, fregrd | logical NOR, single precision |
| FAND | fand | fregrs1, fregrs2, fregrd | logical AND |
| FANDS | fands | fregrs1, fregrs2, fregrd | logical AND, single precision |
| FNAND | fnand | fregrs1, fregrs2, fregrd | logical NAND |
| FNANDS | fnands | fregrs1, fregrs2, fregrd | logical NAND, single precision |
| FXOR | fxor | fregrs1, fregrs2, fregrd | logical XOR |
| FXORS | fxors | fregrs1, fregrs2, fregrd | logical XOR, single precision |
| FXNOR | fxnor | fregrs1, fregrs2, fregrd | logical XNOR |
| FXNORS | fxnors | fregrs1, fregrs2, fregrd | logical XNOR, single precision |
| FORNOT1 | fornot1 | fregrs1, fregrs2, fregrd | negated src1 OR src2 |
| FORNOT1S | fornot1s | fregrs1, fregrs2, fregrd | same as above, single precision |
| FORNOT2 | fornot2 | fregrs1, fregrs2, fregrd | src1 OR negated src2 |
| FORNOT2S | fornot2s | fregrs1, fregrs2, fregrd | same as above, single precision |
| FANDNOT1 | fandnot1 | fregrs1, fregrs2, fregrd | negated src1 AND src2 |
| FANDNOT1S | fandnot1s | fregrs1, fregrs2, fregrd | same as above, single precision |
| FANDNOT2 | fandnot2 | fregrs1, fregrs2, fregrd | src1 AND negated src2 |
| FANDNOT2S | fandnot2s | fregrs1, fregrs2, fregrd | same as above, single precision |
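The numeric suffix in names such as FANDNOT1 and FORNOT2 indicates which source is negated. A minimal Python sketch of four of the operations follows; the helper names and the `mask` parameter (used here to switch between the 64-bit and the 32-bit single-precision width) are assumptions made for illustration.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1


def fandnot1(rs1, rs2, mask=MASK64):
    """Negated src1 AND src2."""
    return (~rs1 & rs2) & mask


def fandnot2(rs1, rs2, mask=MASK64):
    """src1 AND negated src2."""
    return (rs1 & ~rs2) & mask


def fornot1(rs1, rs2, mask=MASK64):
    """Negated src1 OR src2."""
    return (~rs1 | rs2) & mask


def fxnor(rs1, rs2, mask=MASK64):
    """Logical XNOR: bits are set where src1 and src2 agree."""
    return ~(rs1 ^ rs2) & mask


def fandnot1s(rs1, rs2):
    """Single-precision (32-bit) form of fandnot1."""
    return fandnot1(rs1, rs2, MASK32)
```

Masking the result keeps Python's unbounded integers within the register width, so `fornot1(0, 0)` produces all 64 bits set rather than a negative number.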

Pixel compare instructions compare the fixed-point values in rs1 and rs2 (two 32-bit or four 16-bit values).

Table E-20

| SPARC | Mnemonic | Argument List | Description |
|---|---|---|---|
| FCMPGT16 | fcmpgt16 | fregrs1, fregrs2, regrd | 4 16-bit compare, set rd if src1 > src2 |
| FCMPGT32 | fcmpgt32 | fregrs1, fregrs2, regrd | 2 32-bit compare, set rd if src1 > src2 |
| FCMPLE16 | fcmple16 | fregrs1, fregrs2, regrd | 4 16-bit compare, set rd if src1 <= src2 |
| FCMPLE32 | fcmple32 | fregrs1, fregrs2, regrd | 2 32-bit compare, set rd if src1 <= src2 |
| FCMPNE16 | fcmpne16 | fregrs1, fregrs2, regrd | 4 16-bit compare, set rd if src1 != src2 |
| FCMPNE32 | fcmpne32 | fregrs1, fregrs2, regrd | 2 32-bit compare, set rd if src1 != src2 |
| FCMPEQ16 | fcmpeq16 | fregrs1, fregrs2, regrd | 4 16-bit compare, set rd if src1 = src2 |
| FCMPEQ32 | fcmpeq32 | fregrs1, fregrs2, regrd | 2 32-bit compare, set rd if src1 = src2 |
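Because the destination is an integer register, the result of a pixel compare is a small bit mask with one bit per lane. The Python sketch below models the 16-bit forms. It assumes signed comparison and assumes that bit 0 of the mask corresponds to the least significant lane; the function names are hypothetical.

```python
import operator


def _signed_lanes16(word):
    """Split a 64-bit word into four signed 16-bit lanes, least significant first."""
    lanes = []
    for shift in range(0, 64, 16):
        v = (word >> shift) & 0xFFFF
        lanes.append(v - 0x10000 if v & 0x8000 else v)
    return lanes


def _compare16(op, rs1, rs2):
    """Build a 4-bit mask; bit i is set when op holds for lane i."""
    mask = 0
    for i, (a, b) in enumerate(zip(_signed_lanes16(rs1), _signed_lanes16(rs2))):
        if op(a, b):
            mask |= 1 << i
    return mask


def fcmpgt16(rs1, rs2):
    return _compare16(operator.gt, rs1, rs2)


def fcmple16(rs1, rs2):
    return _compare16(operator.le, rs1, rs2)


def fcmpne16(rs1, rs2):
    return _compare16(operator.ne, rs1, rs2)


def fcmpeq16(rs1, rs2):
    return _compare16(operator.eq, rs1, rs2)
```

A mask built this way has the shape that a partial store expects for its mask operand, so a compare followed by a partial store can write only the lanes that passed the test.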

Edge handling instructions handle the boundary conditions for parallel pixel scan line loops.

Table E-21

| SPARC | Mnemonic | Argument List | Description |
|---|---|---|---|
| EDGE8 | edge8 | regrs1, regrs2, regrd | 8 8-bit edge boundary processing |
| EDGE8L | edge8l | regrs1, regrs2, regrd | same as above, little-endian |
| EDGE16 | edge16 | regrs1, regrs2, regrd | 4 16-bit edge boundary processing |
| EDGE16L | edge16l | regrs1, regrs2, regrd | same as above, little-endian |
| EDGE32 | edge32 | regrs1, regrs2, regrd | 2 32-bit edge boundary processing |
| EDGE32L | edge32l | regrs1, regrs2, regrd | same as above, little-endian |

Pixel component distance instructions are used for motion estimation in video compression algorithms.

Table E-22

| SPARC | Mnemonic | Argument List | Description |
|---|---|---|---|
| PDIST | pdist | fregrs1, fregrs2, fregrd | 8 8-bit components, distance between |
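The distance computed here is a sum of absolute differences, the usual block-matching metric in motion estimation. The following Python sketch is a model under stated assumptions: the eight components are treated as unsigned bytes, and the sum is added to the previous value of the destination register, which is passed in as `rd`. The function name is a hypothetical helper.

```python
MASK64 = (1 << 64) - 1


def pdist(rs1, rs2, rd=0):
    """Accumulate the sum of absolute differences of eight unsigned bytes."""
    total = rd
    for shift in range(0, 64, 8):
        a = (rs1 >> shift) & 0xFF
        b = (rs2 >> shift) & 0xFF
        total += abs(a - b)
    return total & MASK64
```

Passing the running total back in as `rd` lets a loop accumulate the distance over a whole block, eight pixels at a time.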

The three-dimensional array addressing instructions convert three-dimensional fixed-point addresses (in rs1) to a blocked-byte address. The result is stored in rd.

Table E-23

| SPARC | Mnemonic | Argument List | Description |
|---|---|---|---|
| ARRAY8 | array8 | regrs1, regrs2, regrd | convert 8-bit 3-D address to blocked byte address |
| ARRAY16 | array16 | regrs1, regrs2, regrd | same as above, but 16-bit |
| ARRAY32 | array32 | regrs1, regrs2, regrd | same as above, but 32-bit |

E.6.7 Memory Access Instructions

These memory access instructions are part of the SPARC-V9 instruction set extensions.

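Table E-24

Argument list (all entries): stda fregrd, [fregrs1] regmask, imm_asi

| SPARC | imm_asi | Description |
|---|---|---|
| STDFA | ASI_PST8_P | eight 8-bit conditional stores to primary address space |
| STDFA | ASI_PST8_S | eight 8-bit conditional stores to secondary address space |
| STDFA | ASI_PST8_PL | eight 8-bit conditional stores to primary address space, little endian |
| STDFA | ASI_PST8_SL | eight 8-bit conditional stores to secondary address space, little endian |
| STDFA | ASI_PST16_P | four 16-bit conditional stores to primary address space |
| STDFA | ASI_PST16_S | four 16-bit conditional stores to secondary address space |
| STDFA | ASI_PST16_PL | four 16-bit conditional stores to primary address space, little endian |
| STDFA | ASI_PST16_SL | four 16-bit conditional stores to secondary address space, little endian |
| STDFA | ASI_PST32_P | two 32-bit conditional stores to primary address space |
| STDFA | ASI_PST32_S | two 32-bit conditional stores to secondary address space |
| STDFA | ASI_PST32_PL | two 32-bit conditional stores to primary address space, little endian |
| STDFA | ASI_PST32_SL | two 32-bit conditional stores to secondary address space, little endian |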


Note - To select a partial store instruction, use one of the partial store ASIs with the STDA instruction.
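The effect of an 8-bit partial store can be pictured with a short Python model. This is a sketch only: memory is represented by a `bytearray`, the big-endian (non-"L") ASI is assumed, and the mapping of mask bits to bytes (bit 7 selects the most significant byte, stored at the lowest address) is an assumption made for illustration. The function name is hypothetical.

```python
def partial_store8(memory, addr, data, mask):
    """Conditionally store the eight bytes of a 64-bit value.

    memory: bytearray standing in for the target address space.
    addr:   index of the first (8-byte aligned) destination byte.
    data:   64-bit value to store, big-endian byte order.
    mask:   8-bit mask; a set bit enables the store of one byte.
    """
    if addr % 8 != 0:
        raise ValueError("destination must be 8-byte aligned")
    for i in range(8):
        if mask & (0x80 >> i):                       # byte i enabled?
            memory[addr + i] = (data >> (8 * (7 - i))) & 0xFF
```

With a mask of `0b10000001`, only the first and last bytes of the doubleword are written; the six bytes in between keep their previous contents.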


Table E-25

Argument list forms (all entries):

LDDFA: ldda [reg_addr] imm_asi, fregrd
LDDFA: ldda [reg_plus_imm] %asi, fregrd
STDFA: stda fregrd, [reg_addr] imm_asi
STDFA: stda fregrd, [reg_plus_imm] %asi

| SPARC | imm_asi | Description |
|---|---|---|
| LDDFA, STDFA | ASI_FL8_P | 8-bit load/store from/to primary address space |
| LDDFA, STDFA | ASI_FL8_S | 8-bit load/store from/to secondary address space |
| LDDFA, STDFA | ASI_FL8_PL | 8-bit load/store from/to primary address space, little endian |
| LDDFA, STDFA | ASI_FL8_SL | 8-bit load/store from/to secondary address space, little endian |
| LDDFA, STDFA | ASI_FL16_P | 16-bit load/store from/to primary address space |
| LDDFA, STDFA | ASI_FL16_S | 16-bit load/store from/to secondary address space |
| LDDFA, STDFA | ASI_FL16_PL | 16-bit load/store from/to primary address space, little endian |
| LDDFA, STDFA | ASI_FL16_SL | 16-bit load/store from/to secondary address space, little endian |


Note - To select a short floating-point load and store instruction, use one of the short ASIs with the LDDA and STDA instructions.


Table E-26

Argument list forms:

LDDA: [reg_addr] imm_asi, regrd
LDDA: [reg_plus_imm] %asi, regrd
LDDFA: ldda [reg_addr] imm_asi, fregrd
LDDFA: ldda [reg_plus_imm] %asi, fregrd
STDFA: stda fregrd, [reg_addr] imm_asi
STDFA: stda fregrd, [reg_plus_imm] %asi

| SPARC | imm_asi | Description |
|---|---|---|
| LDDA | ASI_NUCLEUS_QUAD_LDD | 128-bit atomic load |
| LDDA | ASI_NUCLEUS_QUAD_LDD_L | 128-bit atomic load, little endian |
| LDDFA, STDFA | ASI_BLK_AIUP | 64-byte block load/store from/to primary address space, user privilege |
| LDDFA, STDFA | ASI_BLK_AIUS | 64-byte block load/store from/to secondary address space, user privilege |
| LDDFA, STDFA | ASI_BLK_AIUPL | 64-byte block load/store from/to primary address space, user privilege, little endian |
| LDDFA, STDFA | ASI_BLK_AIUSL | 64-byte block load/store from/to secondary address space, user privilege, little endian |
| LDDFA, STDFA | ASI_BLK_P | 64-byte block load/store from/to primary address space |
| LDDFA, STDFA | ASI_BLK_S | 64-byte block load/store from/to secondary address space |
| LDDFA, STDFA | ASI_BLK_PL | 64-byte block load/store from/to primary address space, little endian |
| LDDFA, STDFA | ASI_BLK_SL | 64-byte block load/store from/to secondary address space, little endian |
| LDDFA, STDFA | ASI_BLK_COMMIT_P | 64-byte block commit store to primary address space |
| LDDFA, STDFA | ASI_BLK_COMMIT_S | 64-byte block commit store to secondary address space |


Note - To select a block load and store instruction, use one of the block transfer ASIs with the LDDA and STDA instructions.


 
 
 