PSHUFB test piece.

Started by hutch--, January 27, 2011, 04:01:54 AM

hutch--

Needed to see the capacity of PSHUFB. Looks like it will really hurry up BSWAP-style byte swapping for streaming data. It appears to be a very useful instruction with many applications. NOTE that it requires SSSE3 (Supplemental SSE3), not just SSE3.


IF 0  ; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
                      Build this template with "CONSOLE ASSEMBLE AND LINK"
                                 Computer must be SSSE3 capable
ENDIF ; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    include \masm32\include\masm32rt.inc

    .data
      align 16
      shflmask dd  03020100h,07060504h,0B0A0908h,0F0E0D0Ch
      pshmsk dd shflmask

      align 16
      shflmask2 dd 00010203h,04050607h,08090A0Bh,0C0D0E0Fh
      pshmsk2 dd shflmask2

      align 16
      mytest db "32107654BA98FECD",0,0,0,0      ; data to shuffle
      pmtst dd mytest

      align 16
      mybuff db 20 dup (0)                      ; output buffer for result
      pmbuf dd mybuff

    .code

start:
   
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    call main
    inkey
    exit

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

main proc

    mov eax, pshmsk2            ; load the shuffle mask address
    movdqa xmm2, [eax]          ; copy it into XMM2

    mov eax, pmtst              ; load the data address in EAX
    movdqa xmm1, [eax]          ; load the data into XMM1

    pshufb xmm1, xmm2           ; shuffle bytes to order in XMM2

    mov eax, pmbuf              ; load output buffer address
    movdqa [eax], xmm1          ; copy 16 byte result to it

    print pmbuf,13,10

    ret

main endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

end start
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php
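
For anyone who wants to check what the mask in the listing above does without an SSSE3 machine, here is a minimal scalar sketch of PSHUFB in C. It is only a model of the instruction's documented behaviour, not hutch's code. The names `pshufb_model`, `dd_to_bytes` and `shuffle_demo` are made up for illustration. The data and `shflmask2` values are copied from the listing, and the `dd` values are laid out little-endian as they would be in memory on x86.

```c
#include <stdint.h>
#include <string.h>

/* Scalar model of PSHUFB on one 16-byte register (hypothetical helper):
   if bit 7 of a mask byte is set the result byte is 0,
   otherwise result[i] = src[mask[i] & 0x0F]. */
static void pshufb_model(uint8_t dst[16], const uint8_t src[16],
                         const uint8_t mask[16])
{
    uint8_t tmp[16];                       /* temp so dst may alias src */
    for (int i = 0; i < 16; i++)
        tmp[i] = (mask[i] & 0x80) ? 0 : src[mask[i] & 0x0F];
    memcpy(dst, tmp, 16);
}

/* Lay out four 32-bit values as little-endian bytes, the way
   MASM's "dd" directive stores them in memory on x86. */
static void dd_to_bytes(uint8_t out[16], const uint32_t dd[4])
{
    for (int i = 0; i < 4; i++)
        for (int b = 0; b < 4; b++)
            out[4 * i + b] = (uint8_t)(dd[i] >> (8 * b));
}

/* Runs the same shuffle as the listing and writes the 16 result
   bytes plus a terminating NUL into out. */
static void shuffle_demo(char out[17])
{
    static const uint32_t shflmask2[4] =
        { 0x00010203u, 0x04050607u, 0x08090A0Bu, 0x0C0D0E0Fu };
    static const char mytest[] = "32107654BA98FECD";
    uint8_t mask[16], res[16];

    dd_to_bytes(mask, shflmask2);          /* mask bytes: 3,2,1,0,7,6,5,4,... */
    pshufb_model(res, (const uint8_t *)mytest, mask);
    memcpy(out, res, 16);
    out[16] = '\0';
}
```

With this model, `shuffle_demo` reverses the bytes inside each 4-byte group, so the test string comes out as `0123456789ABDCEF`. The last group reads `DCEF` because the input ends in `FECD`.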

GregL

hutch,

I had to add .XMM to get it to assemble with MASM 10.0.


hutch--

That makes sense. I built this with ML version 9.0, just typed it in, and it worked.

sinsi

My new you beaut AMD processor doesn't support SSSE3 so I get an invalid opcode error for pshufb :(
Light travels faster than sound, that's why some people seem bright until you hear them.
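
A program can avoid the invalid opcode error by checking for SSSE3 before it executes PSHUFB. CPUID leaf 1 reports SSSE3 in bit 9 of ECX, while plain SSE3 is bit 0, which is why an SSE3-capable CPU can still fault here. The sketch below is in C rather than MASM and assumes a GCC or Clang toolchain that provides `<cpuid.h>`. The function names are hypothetical, and on other compilers or non-x86 targets `cpu_has_ssse3` simply reports that it cannot tell.

```c
#include <stdint.h>

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>              /* GCC/Clang helper for the CPUID instruction */
#define HAVE_GNU_CPUID 1
#endif

/* Decode the feature bits returned in ECX by CPUID leaf 1. */
static int ecx_has_sse3(uint32_t ecx)  { return (int)(ecx & 1u); }        /* bit 0 */
static int ecx_has_ssse3(uint32_t ecx) { return (int)((ecx >> 9) & 1u); } /* bit 9 */

/* Returns 1 if SSSE3 is reported, 0 if it is not,
   and -1 if this build has no way to query CPUID. */
static int cpu_has_ssse3(void)
{
#ifdef HAVE_GNU_CPUID
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;               /* leaf 1 not available */
    return ecx_has_ssse3(ecx);
#else
    return -1;
#endif
}
```

A caller would branch on `cpu_has_ssse3() == 1` and fall back to a non-PSHUFB path otherwise.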

hutch--

What a shame. Is there another AMD-specific instruction that will do the same or a similar task?

sinsi

Maybe in SSE5? It seems that Intel and AMD are going their own ways again.

This was interesting: http://abinstein.blogspot.com/2007/09/amds-latest-x86-extension-sse5-part-2.html
AMD make logical instruction encodings to allow for expansion whereas Intel squeeze them in any old how  :lol

frktons

Mind is like a parachute. You know what to do in order to use it :-)