The MASM Forum Archive 2004 to 2012
Draakie
Alignment for SSE - MOVAPS
« on: January 11, 2008, 05:27:12 AM »

Hi again,

This one's for the API mnemonic grinder types. SSE provides the MOVAPS instruction, which moves data aligned on a
16-byte boundary to or from an XMM register. If the data is not aligned, an exception is generated. The alternative
is MOVUPS. This I accept and understand. However, the data sets I would like to access are allocated via the API
call GlobalAlloc. Obviously I'm missing something: how do I ensure that memory is allocated on a 16-byte boundary?

I would dearly like to make use of the faster MOVAPS instruction to gain those extra cycles.

Thanks
Draakie
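
Whether a given pointer is safe for MOVAPS comes down to its low four bits being zero. A minimal sketch in C (chosen so it compiles outside MASM32; is_aligned16 is an illustrative name, not a Windows or CRT function):

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if p lies on a 16-byte boundary (the MOVAPS requirement), else 0.
   is_aligned16 is a hypothetical helper name used only for this sketch. */
int is_aligned16(const void *p)
{
    /* A 16-byte aligned address has its four lowest bits clear. */
    return ((uintptr_t)p & 15u) == 0;
}
```

The same test in assembly would be a `test` of the pointer against 0Fh; a debug build could run it before every MOVAPS loop.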



Does this code make me look bloated ? (wink)
Rockoon
« Reply #1 on: January 11, 2008, 06:30:12 AM »

I believe GlobalAlloc only gives you a guarantee of 8-byte alignment.

The mission, should you choose to accept it, is to stop using GlobalAlloc, or to use GlobalAlloc to allocate a lot of memory all at once and then manage that memory pool yourself.

(You shouldn't be using GlobalAlloc to allocate only 16 bytes, regardless of your alignment needs.)
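
One way to read "manage that memory pool yourself" is a simple bump allocator: grab one large block, align its start once, and hand out pieces rounded up to multiples of 16. The C sketch below is only an illustration under assumptions: malloc stands in for GlobalAlloc, and Pool16, pool16_init, pool16_alloc and pool16_destroy are hypothetical names. Individual pieces cannot be freed; the whole pool is released at once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical pool: one big block carved into 16-byte aligned pieces. */
typedef struct {
    void    *base;  /* pointer returned by malloc, needed for free() */
    uint8_t *next;  /* next free address, always 16-byte aligned */
    uint8_t *end;   /* one past the last usable byte */
} Pool16;

/* Allocate the big block once. Returns 1 on success, 0 on failure. */
int pool16_init(Pool16 *pool, size_t capacity)
{
    assert(pool != NULL);
    if (capacity > SIZE_MAX - 15) return 0;
    /* Over-allocate by 15 bytes so an aligned start always fits. */
    pool->base = malloc(capacity + 15);
    if (pool->base == NULL) return 0;
    pool->next = (uint8_t *)(((uintptr_t)pool->base + 15) & ~(uintptr_t)15);
    pool->end  = pool->next + capacity;
    return 1;
}

/* Hand out the next piece; sizes are rounded up to a multiple of 16
   so that every returned pointer stays 16-byte aligned. */
void *pool16_alloc(Pool16 *pool, size_t size)
{
    size_t rounded = (size + 15) & ~(size_t)15;
    void *p;
    assert(pool != NULL);
    if (rounded < size || rounded > (size_t)(pool->end - pool->next))
        return NULL;  /* overflow, or pool exhausted */
    p = pool->next;
    pool->next += rounded;
    return p;
}

/* Release the whole pool using the original, unaligned pointer. */
void pool16_destroy(Pool16 *pool)
{
    assert(pool != NULL);
    free(pool->base);
    pool->base = NULL;
    pool->next = NULL;
    pool->end  = NULL;
}
```

Because every piece starts on a 16-byte boundary, arrays of x, y, z, w REAL4 vertices taken from such a pool can be read with MOVAPS.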

When C++ compilers can be coerced to emit rcl and rcr, I *might* consider using one.
Draakie
« Reply #2 on: January 11, 2008, 07:12:33 AM »

Thanks Rockoon,

But no, I'm not just allocating 16 bytes; it's more like multiple kilobytes (vertex data with the structure x_REAL4, y_REAL4, z_REAL4, w_REAL4).
So what should I be using to get guaranteed 16-byte alignment, HeapAlloc? And when you say "manage the memory pool yourself",
what exactly do you mean?

Draakie
hutch--
« Reply #3 on: January 11, 2008, 07:24:52 AM »

Draakie,

Use the alignment macro from the MASM32 macros on any memory you like. It handles any power of two you give it.

Regards,



Download site for MASM32
http://www.masm32.com
Synfire
« Reply #4 on: January 11, 2008, 07:29:42 AM »

msvcrt.lib contains _aligned_malloc() and _aligned_free() which can be used for just this purpose. If they are not defined in msvcrt.inc, then the prototypes would be:

Code:
_aligned_malloc PROTO C _Size:DWORD, _Alignment:DWORD
_aligned_free PROTO C _Memory:DWORD

Use these just like you would the normal malloc/free, except when you call _aligned_malloc you have an extra argument which allows you to specify the alignment (should be self explanatory really). Hope this helps.
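
For comparison, here is how the same pair of calls looks from C. This is a sketch under assumptions: alloc16 and free16 are hypothetical wrapper names, and on non-Windows compilers the C11 function aligned_alloc is swapped in for _aligned_malloc so the example builds elsewhere. Note that the argument order differs (_aligned_malloc takes size then alignment, aligned_alloc takes alignment then size) and that memory from _aligned_malloc must be released with _aligned_free, not free.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#if defined(_MSC_VER) || defined(__MINGW32__)
#include <malloc.h>  /* _aligned_malloc, _aligned_free */
#endif

/* Hypothetical wrapper: returns a 16-byte aligned block, or NULL on failure. */
void *alloc16(size_t size)
{
#if defined(_MSC_VER) || defined(__MINGW32__)
    return _aligned_malloc(size, 16);
#else
    /* Assumption: C11 aligned_alloc as a stand-in off Windows. The size is
       rounded up to a multiple of 16, which C11 requires of this call. */
    return aligned_alloc(16, (size + 15) & ~(size_t)15);
#endif
}

/* Hypothetical wrapper: releases a block obtained from alloc16. */
void free16(void *p)
{
#if defined(_MSC_VER) || defined(__MINGW32__)
    _aligned_free(p);
#else
    free(p);
#endif
}
```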

Draakie
« Reply #5 on: January 11, 2008, 08:21:18 AM »

OH YES THAT HELPS !

ta Hutch and SynFire.
asmfan
« Reply #6 on: January 11, 2008, 09:09:23 AM »

You can use VirtualAlloc: you'll get 4 KB (page) alignment, or 64 KB alignment for the base address of a newly reserved region.
If you need many small chunks, use HeapAlloc with size = NeededSize + (Alignment - 1) and then adjust the returned pointer as follows: pointer = pointer + (Alignment - 1); pointer = pointer AND (-Alignment). Alignment must be a power of 2. In two's complement, "-Alignment" is a bit mask: for example, "-2" = 111..110b. Applying that mask to a pointer rounds it down to a multiple of Alignment, so we add (Alignment - 1) before applying the mask. This ensures that the new aligned start address lies inside the allocated block, at or above the pointer returned by HeapAlloc, and is aligned at the same time.
I hope this explains the binary basics of alignment.
P.S. Be sure to store the original (unaligned) pointer so that you can free the memory correctly.
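
The add-then-mask arithmetic described above can be sketched in C as follows. This is an illustration under assumptions, not the poster's code: malloc stands in for HeapAlloc, and align_up, AlignedBlock and aligned_block_alloc are hypothetical names. The raw pointer is kept next to the aligned one because only the raw pointer may be passed to free.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Round addr up to the next multiple of alignment (a power of two).
   (0 - alignment) is the "-Alignment" mask: all ones above the low bits. */
uintptr_t align_up(uintptr_t addr, uintptr_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (addr + (alignment - 1)) & ((uintptr_t)0 - alignment);
}

/* Hypothetical pair of pointers: raw is what gets freed, aligned is what gets used. */
typedef struct {
    void *raw;
    void *aligned;
} AlignedBlock;

/* Over-allocate by (alignment - 1) bytes, then round the pointer up. */
AlignedBlock aligned_block_alloc(size_t size, size_t alignment)
{
    AlignedBlock b = { NULL, NULL };
    b.raw = malloc(size + (alignment - 1));
    if (b.raw != NULL)
        b.aligned = (void *)align_up((uintptr_t)b.raw, (uintptr_t)alignment);
    return b;
}
```

For example, with Alignment = 16 an address ending in 1h through 0Fh moves up to the next address ending in 0h, while an address that already ends in 0h is left unchanged.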

Russia is a weird place
NightWare
« Reply #7 on: January 11, 2008, 11:31:49 PM »

Quote from: Draakie
Obviously I'm missing something.....how do I ensure memory assignment at a 16 byte border.

Or with something like this:
Code:
ALIGN 16
;
; Allocate a memory block and store its pointer (the block is aligned on 16 bytes).
; Note: the _Informations_Memoire_ structure must, of course, have been created and initialized beforehand.
; Since this allocates a memory block, it must be freed at the end of the program with LibererBlocMemoire.
;
; Usage:
; mov eax,{size of the memory block to create (in bytes)}
; mov esi,{OFFSET (address) of an _Informations_Memoire_ structure}
; mov edi,{OFFSET (address) of a _Bloc_Memoire_ structure}
; call AllouerBlocMemoire
;
; Returns:
; eax = address of the aligned memory block
; and the fields of the _Bloc_Memoire_ structure are filled in
;
AllouerBlocMemoire PROC
push ecx ;; save ecx
push edx ;; save edx

mov (_Bloc_Memoire_ PTR [edi])._Taille_Du_Bloc_,eax ;; store eax in x._Taille_Du_Bloc_
add eax,000000010h ;; add 16 to the size (to align the memory block on 16 bytes)
invoke HeapAlloc,(_Informations_Memoire_ PTR [esi])._Instance_,HEAP_NO_SERIALIZE or HEAP_ZERO_MEMORY,eax ;; the size in bytes of the memory block to create
mov (_Bloc_Memoire_ PTR [edi])._Pointeur_A_Liberer_,eax ;; save eax in x._Pointeur_A_Liberer_ (the pointer to free)
and eax,0FFFFFFF0h ;; ) needed to align the memory block on 16 bytes
add eax,000000010h ;; )
mov (_Bloc_Memoire_ PTR [edi])._Pointeur_,eax ;; save eax in x._Pointeur_ (the aligned pointer)

pop edx ;; restore edx
pop ecx ;; restore ecx
ret ;; return (leave the procedure)
AllouerBlocMemoire ENDP
Draakie
« Reply #8 on: January 14, 2008, 05:25:46 AM »

My French is non-existent. Is there someone willing to translate the above into English?
Draakie

PS: NightWare, you have posted a couple of SSE routines (zero memory fill etc.) in French.
Besides ToutenASM and a couple of others, please remember that all your valuable comments
are lost on us poor English-second-language folk.
daydreamer
« Reply #9 on: January 14, 2008, 09:39:29 AM »

I also use that kind of solution.
It's simple and fast: clearing the low four bits with AND eax,0FFFFFFF0h ensures 16-byte alignment.
AND with a suitable mask can also be useful to get a wraparound effect inside reserved memory, instead of much slower boundary checks or getting a GPF.
For example, I use it when I need code that tiles a 1024x1024 texture.
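
The wraparound trick works because 1024 is a power of two, so AND with 1023 behaves like a modulo. A minimal C sketch, assuming a row-major 1024x1024 texture (TEX_SIZE, TEX_MASK and tiled_index are hypothetical names for illustration):

```c
#include <assert.h>
#include <stdint.h>

#define TEX_SIZE 1024u             /* texture width and height, a power of two */
#define TEX_MASK (TEX_SIZE - 1u)   /* 3FFh: keeps only the low 10 bits */

/* Index of texel (x, y) in a row-major 1024x1024 texture. Coordinates
   outside 0..1023 wrap around instead of reading past the texture. */
uint32_t tiled_index(uint32_t x, uint32_t y)
{
    return (y & TEX_MASK) * TEX_SIZE + (x & TEX_MASK);
}
```

No comparison or branch is needed, and the resulting index can never leave the 1024*1024 texel range.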
NightWare
« Reply #10 on: January 14, 2008, 10:16:08 PM »

Draakie,
The instructions are not in English? OK, I'm going to make an effort this time...

Code:
mov eax,Size ; size of the memory to allocate
add eax,000000010h ; add 16 to the size
invoke HeapAlloc,MemInstance,HEAP_NO_SERIALIZE or HEAP_ZERO_MEMORY,eax ; allocate
mov PointerToFree,eax ; the pointer you need to free the memory block
; here is the alignment
and eax,0FFFFFFF0h ; clear the low 4 bits (subtracts 0 to 15)
add eax,000000010h ; add 16
mov PointerToUse,eax ; the 16-byte aligned pointer

Note: it's also possible to use
Code:
add eax,000000011h ; add 16+1 to the size
when you need to load a text file, to ensure there is a final 0 (so you can use your string routines on the text file in memory).

daydreamer,
the natural choice of the asm coder...
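
The AND-then-ADD sequence in the code above differs slightly from the add-then-mask form: it always moves the pointer forward by 1 to 16 bytes, which is why the block is allocated with 16 extra bytes rather than 15. A small C sketch of the two roundings (round_up_minimal and round_up_always are hypothetical names):

```c
#include <assert.h>
#include <stdint.h>

/* Add-then-mask: moves forward 0..15 bytes, needs size + 15. */
uintptr_t round_up_minimal(uintptr_t p)
{
    return (p + 15) & ~(uintptr_t)15;
}

/* Mask-then-add, as in the AND/ADD pair above: moves forward 1..16 bytes,
   so an already aligned pointer still advances by 16 and size + 16 is needed. */
uintptr_t round_up_always(uintptr_t p)
{
    return (p & ~(uintptr_t)15) + 16;
}
```

Both results are 16-byte aligned and stay inside the over-allocated block; the second form just gives up as many as 16 bytes when the heap pointer was already aligned.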
Draakie
« Reply #11 on: January 15, 2008, 05:10:22 AM »

Thanks loads, NightWare, the effort is appreciated. Yup, the code is in English (well, your labels were not obvious at all),
but sometimes the comments speak the proverbial thousand words. Please remember this thread isn't just for me,
but also for those newbies to SSE who may now benefit from your and daydreamer's infinite wisdom.

Draakie
NightWare
« Reply #12 on: January 15, 2008, 10:19:31 PM »

Quote from: Draakie
the effort is appreciated. Yup, the code is in English (well, your labels were not obvious at all)

In fact it wasn't a big effort; I just needed to translate the algo a bit. I'd forgotten that I use my own structures to speed things up... and it's true it was quite unreadable for someone who doesn't speak French.