I am sitting here trying to code a fast Mersenne Twister, but I'm having second thoughts about whether there is a generator better suited to what I want it for: procedural textures
so can you more experienced old guys help me with advice on choosing one?
1: it must be good enough to generate procedural textures
2: it would be good if I could find a BAD kind of random generator, bad in the sense that for a specific seed it gives a pattern that is easily reversed
Hi,
The standard easy random number generator is the linear
congruential pseudo-random generator.
Ran(n+1) = Ran(n) * number + (other number)
With proper choices of the numbers, you get reasonable
sequences.
Quote: 1: it must be good enough to generate procedural textures
Try it, it should be okay.
Quote: 2: it would be good if I could find a BAD kind of random generator, bad in the sense that for a specific seed it gives a pattern that is easily reversed
That can be done with this kind of generator. If I
understand you correctly. The least significant bits
are not very random, and the multiplier and constant
can be determined.
Some suggested numbers.
; - - - - - -
; "Random number routine" from Knuth/Numerical Recipes
Rand32 DD 31415926 ; a la K, Use zero to test NR results
RandA DD 1664525 ; As per NR
RandC DD 1013904223 ; As per NR
RAND32 = RAND32 * RANDA + RANDC
HTH,
Steve N.
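For readers who want to experiment outside of assembly, here is a minimal Python sketch of the recurrence above, using the Numerical Recipes constants from the listing. The function names are made up for illustration, and the 32-bit wraparound is emulated with a mask. It also illustrates both points from the reply: the lowest bit simply alternates, and the multiplier and constant can be recovered from three consecutive outputs (this relies on both constants being odd, which makes the first difference invertible modulo 2^32).

```python
MASK32 = 0xFFFFFFFF
RAND_A = 1664525        # multiplier, as in the listing
RAND_C = 1013904223     # increment, as in the listing

def lcg_next(x, a=RAND_A, c=RAND_C):
    """One step of x = x * a + c with 32-bit wraparound."""
    return (x * a + c) & MASK32

def lcg_outputs(seed, count):
    """Return the first `count` values produced after `seed`."""
    out = []
    x = seed
    for _ in range(count):
        x = lcg_next(x)
        out.append(x)
    return out

def recover_constants(x0, x1, x2):
    """Solve for (a, c) from three consecutive outputs.

    x2 - x1 = a * (x1 - x0) mod 2^32, so a follows from a modular inverse.
    Needs Python 3.8+ for pow(d, -1, m) and an odd difference x1 - x0.
    """
    d = (x1 - x0) & MASK32
    a = ((x2 - x1) * pow(d, -1, 1 << 32)) & MASK32
    c = (x1 - a * x0) & MASK32
    return a, c

values = lcg_outputs(0, 8)
print([hex(v) for v in values[:2]])      # starts 0x3c6ef35f, 0x47502932
print([v & 1 for v in values])           # lowest bit: 1, 0, 1, 0, ...
print(recover_constants(*values[:3]))    # (1664525, 1013904223)
```

So for a texture the high bits are the ones to use, and for the "easily reversed" requirement this kind of generator seems to fit.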
i like that code, Steve
i suggest initializing the Rand32 seed with a reading from QueryPerformanceCounter or something
so that the sequence has a new starting place each time a program is run
i have even used QueryPerformanceCounter, itself, to generate pseudo-random values
Hi Dave,
Yes, I use a RandInit routine to read a time of day
to start with different seeds. Starting with the same
one helps in comparing two routines like different sorts.
It can also help in debugging. If a texture needs to be
repeatable, then specifying a particular value may be
handy. The values I gave were just to follow the
"Book Code" for a good start.
Cheers,
Steve N.
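The trade-off described above (a fixed seed for repeatable results, a clock reading for a fresh start) can be sketched in a few lines of Python. This is only an illustration: `clock_seed` is a hypothetical stand-in for a time-of-day or QueryPerformanceCounter reading, and the generator is the same LCG with the book constants.

```python
import time

MASK32 = 0xFFFFFFFF

def lcg_sequence(seed, count):
    """Return `count` values of the 32-bit LCG started from `seed`."""
    x = seed & MASK32
    out = []
    for _ in range(count):
        x = (x * 1664525 + 1013904223) & MASK32
        out.append(x)
    return out

def clock_seed():
    """Stand-in for a time-of-day or performance-counter reading."""
    return time.perf_counter_ns() & MASK32

# Fixed seed: the same "texture" every run, handy for debugging and comparisons.
print(lcg_sequence(31415926, 3) == lcg_sequence(31415926, 3))   # True

# Clock seed: a different starting place each time the program is run.
print(lcg_sequence(clock_seed(), 3))
```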
try this :
.DATA
ALIGN 4
Rand_Seed DWORD ?
.CODE
ALIGN 16
;
; Initialize random numbers
;
; syntax :
; call InitRandNum
;
; Return:
; eax = number of ticks since the beginning
;
InitRandNum PROC
push ecx
push edx
invoke GetTickCount
test eax,eax
jnz Label1
mov eax,123456789
Label1: mov Rand_Seed,eax
pop edx
pop ecx
ret
InitRandNum ENDP
ALIGN 16
;
; generate a random number between 0 and max value - 1
;
; syntax :
; mov ebx,Range
; call GetRandNum
;
; return :
; eax = the random number
;
GetRandNum PROC
push edx
mov eax,987654321
mul Rand_Seed
bswap eax
xor eax,078787878h
jnz Label1
mov eax,123456789
Label1: mov Rand_Seed,eax
cmp ebx,1
je Label2
mul ebx
mov eax,edx
Label2:
pop edx
ret
GetRandNum ENDP
change the values/mask if needed...
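As a reading aid, here is a rough Python transcription of what GetRandNum appears to do: multiply, byte-swap, xor with a mask, replace a zero result with a fixed constant, then scale by taking the high half of a 64-bit product. The function names are invented for the sketch, and it has not been checked against the assembled routine.

```python
MASK32 = 0xFFFFFFFF

def bswap32(v):
    """Reverse the byte order of a 32-bit value, like the bswap instruction."""
    return int.from_bytes(v.to_bytes(4, "little"), "big")

def next_seed(seed):
    v = (987654321 * seed) & MASK32      # mul leaves the low 32 bits in eax
    v = bswap32(v) ^ 0x78787878
    return v if v != 0 else 123456789    # same fallback constant as the asm

def get_rand_num(seed, limit):
    """Return (new_seed, value).

    For limit != 1 the value is the high 32 bits of seed * limit, which lies
    in [0, limit). For limit == 1 the asm skips the scaling and returns the
    raw 32-bit value, so this sketch does the same.
    """
    seed = next_seed(seed)
    if limit == 1:
        return seed, seed
    return seed, (seed * limit) >> 32

seed = 123456789
for _ in range(5):
    seed, value = get_rand_num(seed, 100)
    print(value)
```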
Another from Knuth, I use rdtsc as the seed.
; Constants for the random number generator
MP equ 2147483647 ; A Mersenne prime
AA equ 48271 ; This does well in the spectral test
QQ equ 44488 ; MP / AA
RR equ 3399 ; MP % AA; It is important that RR < QQ
; Seed the simple random number generator
rand_seed proc USES edx
push edx
rdtsc
.if sdword ptr eax < 0 ; if seed -ve
add eax, MP ; make +ve
.endif
mov RSeed, eax ; don't need the value in edx
pop edx
ret
rand_seed endp
align 4
; Taken from Knuth
; RSeed = AA * (RSeed % QQ) - RR * (RSeed / QQ)
random proc USES esi ebx ecx edx edi
mov ecx, QQ
mov esi, RR
mov edi, AA
xor edx, edx
mov eax, RSeed
div ecx
; Quotient in eax, Remainder in edx
mov ebx, edx ; save remainder in ebx
mul esi ; eax = RR * (RSeed / QQ)
xchg eax, ebx ; save above result in ebx and get RSeed % QQ into eax
mul edi ; eax = AA * (RSeed % QQ)
sub eax, ebx ; eax = AA * (RSeed % QQ) - RR * (RSeed / QQ)
.if sdword ptr eax < 0 ; is eax -ve
add eax, MP ; make eax +ve
.endif
mov RSeed, eax ; Save result as new seed
ret ; and return the random number
random endp
Bruce
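The trick in the routine above is commonly known as Schrage's method: it computes AA * RSeed mod MP without ever needing a product wider than 32 bits. A minimal Python sketch, assuming the seed is in the range 1 to MP-1 (the function name is made up for illustration), shows that it agrees with the direct computation:

```python
MP = 2147483647          # 2^31 - 1
AA = 48271
QQ = MP // AA            # 44488
RR = MP % AA             # 3399

def schrage_next(seed):
    """Compute (AA * seed) % MP using only small intermediate values."""
    hi, lo = divmod(seed, QQ)        # seed / QQ and seed % QQ
    t = AA * lo - RR * hi            # always fits in a signed 32-bit register
    return t + MP if t < 0 else t

for s in (1, 2, 12345, MP - 1):
    print(schrage_next(s) == (AA * s) % MP)   # True each time
```

Note that a seed of 0 stays at 0 forever, so a seeding routine should avoid it.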
i was searching for this kind of code too, this post was helpful for me too :P thx you guys
lol - guess i'll toss my hat in the ring, too
;--------------------------------------------------------------------------
Randm PROC
;Random ASCII character generator
;generates numbers 0-9, letters a-z and A-Z
;Call With: Nothing
; Returns: eax = al = pseudo-random ASCII character
INVOKE Sleep,0
sub esp,8
INVOKE QueryPerformanceCounter,esp
pop eax
pop edx
xor eax,edx
xor edx,edx
mov ecx,62
div ecx
xchg eax,edx
add eax,30h ;'0'
cmp al,39h ;'9'
jbe Randm0
add eax,7
cmp al,5Ah ;'Z'
jbe Randm0
add eax,6
Randm0: ret
Randm ENDP
;--------------------------------------------------------------------------
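The character mapping at the end of Randm is easy to get wrong, so here is a small Python sketch of just that part. It assumes the input has already been reduced to 0..61, as the `div ecx` with ecx = 62 does in the asm; the function name is hypothetical. Note the order it produces is digits, then upper case, then lower case.

```python
def to_alnum(r):
    """Map r in 0..61 onto '0'-'9', 'A'-'Z', 'a'-'z'."""
    c = 0x30 + r          # start at '0'
    if c > 0x39:          # past '9': skip the 7 characters up to 'A'
        c += 7
    if c > 0x5A:          # past 'Z': skip the 6 characters up to 'a'
        c += 6
    return chr(c)

print("".join(to_alnum(r) for r in range(62)))
```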
I don't actually know, but I would guess that procedural textures do not require high-quality random numbers, so I'm assuming that execution speed is the most important characteristic. George Marsaglia posted C implementations of eight random number generators here (http://www.ciphersbyritter.com/NEWS4/RANDC.HTM). Guessing that the quality of the bits in the last half of the 32-bit number it generates would be unlikely to affect the intended use, I selected CONG. I'm also guessing that the range of each generated number needs to be controllable, so I added a parameter and code to do this, scaling the return value to the range [0, base). I tested the cycle counts for several other generators for comparison. I don't have an implementation of Mersenne Twister to test, but I seem to recall that the implementations I have tested were not at all fast.
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
include \masm32\include\masm32rt.inc
.686
include \masm32\macros\timers.asm
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
.data
counts dd 10 dup(0)
.code
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
;--------------------------------------------------------
; This is an asm implementation of a simple congruential
; generator by George Marsaglia.
;--------------------------------------------------------
.data
cong_seed dd 380116160
.code
OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE
align 4
cong proc
mov eax, cong_seed
mov ecx, 69069
mul ecx
add eax, 1234567
mov cong_seed, eax
ret
cong endp
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
;----------------------------------------------------------
; This is the above generator with a base parameter added,
; performing the mod operation with a multiply.
;----------------------------------------------------------
align 4
congb proc base:DWORD
mov eax, cong_seed
mov ecx, 69069
mul ecx
add eax, 1234567
mov cong_seed, eax
mov ecx, [esp+4]
mul ecx
mov eax, edx
ret 4
congb endp
; prologue/epilogue generation stays disabled here, because congb2 below also reads its argument from [esp+4]
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
;----------------------------------------------------------
; This is the above generator with a base parameter added,
; performing the mod operation with a division.
;----------------------------------------------------------
align 4
congb2 proc base:DWORD
mov eax, cong_seed
mov ecx, 69069
mul ecx
xor edx, edx
add eax, 1234567
mov cong_seed, eax
div dword ptr [esp+4]
mov eax, edx
ret 4
congb2 endp
OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
start:
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
;----------------------
; A quick visual test.
;----------------------
mov ebx, 1000
.WHILE ebx
invoke congb, 0100h
print uhex$(eax),9
dec ebx
.ENDW
print chr$(13,10)
inkey
print chr$(13,10)
;-----------------------------------
; A quick test of the distribution.
;-----------------------------------
mov ebx, 100000000
.WHILE ebx
invoke congb, 10
inc DWORD PTR counts[eax*4]
dec ebx
.ENDW
print ustr$(counts),13,10
print ustr$(counts+4),13,10
print ustr$(counts+8),13,10
print ustr$(counts+12),13,10
print ustr$(counts+16),13,10
print ustr$(counts+20),13,10
print ustr$(counts+24),13,10
print ustr$(counts+28),13,10
print ustr$(counts+32),13,10
print ustr$(counts+36),13,10
print chr$(13,10)
inkey
print chr$(13,10)
invoke Sleep, 3000
counter_begin 1000, HIGH_PRIORITY_CLASS
invoke nrandom, 100
counter_end
print ustr$(eax)," cycles, nrandom",13,10
counter_begin 1000, HIGH_PRIORITY_CLASS
invoke crt_rand
counter_end
print ustr$(eax)," cycles, crt_rand",13,10
counter_begin 1000, HIGH_PRIORITY_CLASS
invoke cong
counter_end
print ustr$(eax)," cycles, cong",13,10
counter_begin 1000, HIGH_PRIORITY_CLASS
invoke congb, 100
counter_end
print ustr$(eax)," cycles, congb",13,10
counter_begin 1000, HIGH_PRIORITY_CLASS
invoke congb2, 100
counter_end
print ustr$(eax)," cycles, congb2",13,10,13,10
inkey "Press any key to exit..."
exit
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
end start
The cycle counts running on a P3:
86 cycles, nrandom
60 cycles, crt_rand
5 cycles, cong
8 cycles, congb
30 cycles, congb2
Edit: The attachment contains the test app above, along with a separate app that does a "scatterplot" test.
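The difference between congb and congb2 is only in how the 32-bit value is reduced to [0, base). A minimal Python sketch (function names invented here, 32-bit wraparound emulated with a mask) shows the two reductions side by side. Besides being slower in the timings above, the division version keeps the low bits of the state, which for this kind of generator are the weak ones:

```python
MASK32 = 0xFFFFFFFF

def cong_next(seed):
    """Marsaglia's CONG step with 32-bit wraparound."""
    return (69069 * seed + 1234567) & MASK32

def scale_mul(x, base):
    """High half of the 64-bit product, like mul ecx / mov eax, edx."""
    return (x * base) >> 32

def scale_div(x, base):
    """Remainder, like div / mov eax, edx."""
    return x % base

def sample(scale, base, count, seed=380116160):
    out = []
    for _ in range(count):
        seed = cong_next(seed)
        out.append(scale(seed, base))
    return out

print(sample(scale_mul, 2, 16))   # driven by the top bit of the state
print(sample(scale_div, 2, 16))   # strictly alternates 1, 0, 1, 0, ...
```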
in cong, is the 'xor edx, edx' superfluous ?
Quote: in cong, is the 'xor edx, edx' superfluous ?
Yes it is, and also for congb, but not for congb2. Thanks for pointing that out. The code I started with was essentially that for congb2, and I failed to notice that the xor edx, edx was not necessary for the others.
thanks everyone
here is my first attempt; I'll see if I can find some suitable numbers for 16-bit. maybe pmulhw is better suited?
seed dw 31415
.code
mov ax,seed ;seed
mov ebx,56789
mov edx,263
movd MM0,eax
movd MM1,ebx
movd MM2,edx
pmullw MM0,MM2
paddw MM0,MM1
movd eax,MM0 ;new seed
mov seed,ax
why not make "seed" a 32-bit value ?
or....
movzx eax,word ptr seed
of course, you ARE trying to get random numbers - lol
Quote: it would be good if I could find a BAD kind of random generator, bad in the sense that for a specific seed it gives a pattern that is easily reversed
All PRNGs do this, unless you are using a truly random source for input such as radioactive decay.
Have a look at the Alternating Step Generator LFSR.
You could either seed it randomly or with known values. You must "tap" in specific places though based on the length of the LFSR in order to get the maximum period, which is all combinations except zero (zero would always produce zero output).
http://en.wikipedia.org/wiki/Alternating_step_generator
Best regards,
Astro.
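To make the point about taps and maximum period concrete, here is a minimal Python sketch of a single 16-bit Fibonacci LFSR. This is not the alternating step generator itself (that combines three LFSRs), just one building block. The tap set used (bits 16, 14, 13, 11) is a widely published maximal-length choice; the start value is arbitrary and the function names are made up for the sketch.

```python
def lfsr16_step(state):
    """One shift of a 16-bit Fibonacci LFSR with taps 16, 14, 13, 11."""
    bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return (state >> 1) | (bit << 15)

def lfsr16_period(start=0xACE1):
    """Count steps until the register returns to its start value."""
    state = lfsr16_step(start)
    steps = 1
    while state != start:
        state = lfsr16_step(state)
        steps += 1
    return steps

print(lfsr16_period())      # 65535: every state except zero
print(lfsr16_step(0))       # 0: the all-zero state is stuck
```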
daydreamer,
The returned values look OK, and over a small range (10) the distribution appears uniform, but over a larger range (100) the distribution contains gaps, with half of the values missing, and this problem is very apparent in the scatter plot (size the window to see the effect at different ranges). I get 19 cycles on my P3.
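One property of the 16-bit attempt (multiplier 263, increment 56789) that can be checked directly is its cycle length. Whether this is the cause of the gaps described above is an assumption on my part, not something established in the thread. For a power-of-two modulus, a full period needs the multiplier minus one to be divisible by 4, and 262 is not. A short Python sketch, with invented function names:

```python
def lcg16_next(x):
    """The 16-bit recurrence from the first attempt, with word wraparound."""
    return (x * 263 + 56789) & 0xFFFF

def cycle_length(seed):
    """Number of steps before the sequence returns to `seed`."""
    x = lcg16_next(seed)
    steps = 1
    while x != seed:
        x = lcg16_next(x)
        steps += 1
    return steps

def low_two_bits(seed, count):
    """Set of values taken by the two lowest bits over `count` outputs."""
    seen = set()
    x = seed
    for _ in range(count):
        x = lcg16_next(x)
        seen.add(x & 3)
    return seen

print(cycle_length(31415))          # shorter than 65536
print(low_two_bits(31415, 1000))    # only two of the four possible values
```

A multiplier of the form 4k + 1 (with an odd increment) would at least give the full 65536-step cycle, though the low bits would still be weak.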
thanks Michael
the latest version runs 4 different kinds of random generators in parallel; with a cpuid check for SSE2 caps it can easily be changed to generate 8 numbers in parallel
I also think it's faster to split it into an initialization proc and a random macro
seed dw 31415 ;seeds
seed2 dw 14142
seed3 dw 15915
seed4 dw 28159
c1 dw 12345 ;constants for adding
c2 dw 6789
c3 dw 54321
c4 dw 9876
c5 dd 69069 ;constants for multiplying
c6 dd 262262
; note: pmullw works on 16-bit lanes, so c5/c6 are read as the four word multipliers 0DCDh, 1, 76h, 4
.code
lea ebx,seed ;initialize random generator
movq MM2,[ebx+16] ; should be placed separately in an initialization proc
movq MM1,[ebx+8] ;ddinit
;proc ddrandom or make it a macro
movq MM0,[ebx]
pmullw MM0,MM2
paddw MM0,MM1
;new seeds
movq [ebx],MM0
Hi *.*:
D:\MASM32\BIN>test
000000CA 000000DA 00000098 00000088 000000C5
0000007A 00000029 00000023 00000001 000000D6
000000D0 000000D2 0000004D 0000005C 000000A7
00000085 0000008A 00000052 0000006A 000000CD
000000E1 000000E4 00000077 00000019 000000ED
000000D5 00000066 000000EB 0000002A 0000004F
0000001C 0000009C 0000001F 00000078 00000073
00000077 00000017 000000D5 000000EB 00000053
0000005A 00000093 000000FA 000000E7 0000003C
00000044 000000E1 00000083 0000007C 000000B6
00000038 000000F6 000000B9 000000DA 0000007D
000000B4 00000000 000000F4 00000063 00000097
000000BB 0000007E 000000B3 000000CB 00000080
0000007E 000000E8 000000B3 00000087 0000003D
000000EF 000000ED 000000CB 0000008D 000000AC
000000B3 00000018 00000021 00000022 0000003A
0000008A 0000009C 0000006C 000000E4 00000048
00000053 000000E3 00000050 00000071 000000F4
0000001E 000000A3 0000008F 0000008A 000000A1
000000A1 00000007 00000085 00000072 0000005B
00000051 000000D1 000000E4 0000009C 000000FA
00000012 00000052 00000054 000000B6 00000098
0000009A 000000A3 00000002 000000A8 00000082
00000053 0000001B 00000018 0000007A 000000D1
00000056 0000006F 00000043 00000006 0000000D
00000002 00000077 00000090 000000B3 000000BC
0000000E 00000044 000000D3 00000086 0000003B
000000FF 000000EC 00000085 0000005B 0000001B
0000008E 000000A3 000000BB 0000002A 00000013
00000013 0000007C 000000B0 000000E4 000000C8
000000D7 00000011 00000029 00000011 00000084
000000EB 0000003E 000000CB 0000000B 0000007E
00000067 000000E6 0000003E 000000F6 00000090
00000026 0000002C 000000A7 00000010 0000005E
00000039 0000008E 000000BC 0000008F 0000003E
000000AE 000000D4 000000A8 000000DC 0000001F
00000079 000000D6 00000052 000000DD 000000C8
0000009B 00000095 000000B2 000000D1 0000001B
00000075 000000C5 000000EA 00000057 00000008
000000AE 00000031 0000004D 000000B0 0000001E
00000033 00000028 00000063 00000066 00000001
000000FE 0000001F 00000083 0000003C 000000C8
000000AB 00000074 000000D1 00000089 00000008
000000F5 00000075 000000E4 0000002F 000000EE
00000019 0000007F 00000013 00000032 000000E7
000000FA 000000F7 0000005C 00000081 0000006B
00000087 000000DD 0000008A 00000082 00000055
00000090 000000C1 000000E6 0000009C 00000071
00000028 00000066 00000076 0000003C 0000000B
000000FB 000000D9 0000000E 000000D4 000000D4
00000010 00000066 0000000B 000000D5 00000087
00000008 000000EC 0000004E 00000019 000000A5
0000006E 000000C5 00000033 00000053 000000AD
0000001E 0000008D 0000002E 00000004 00000091
00000039 000000B5 0000005A 0000000C 0000004B
000000B4 00000068 0000001C 00000069 000000D2
000000F1 00000068 000000F8 00000017 00000042
00000053 000000B7 000000FD 0000006C 00000065
000000FE 000000E1 0000000C 00000073 000000FA
000000E3 00000097 000000C6 0000008F 00000095
00000076 0000009B 000000BE 000000FD 00000076
000000D5 000000BE 000000CE 0000004D 000000AC
00000022 000000DC 00000039 000000B1 000000AD
0000002F 000000FC 0000007F 000000CC 0000003A
000000BE 000000E4 000000BE 0000001C 00000092
00000064 000000E8 000000E4 000000A8 00000052
000000E1 0000004E 00000068 000000F7 00000036
0000006F 00000005 00000071 000000E9 00000032
00000044 00000044 0000002F 000000BF 0000007D
000000D4 000000F8 00000011 0000001B 00000035
00000033 0000005E 0000001C 00000006 000000F3
00000016 0000005C 00000056 000000C9 000000A1
000000A5 000000B3 0000000F 00000018 000000CE
000000EB 000000F1 000000A1 00000073 00000025
00000067 00000039 000000F9 0000004E 00000082
000000FE 000000F4 0000000B 000000A5 00000056
00000026 000000A4 0000009F 0000007A 000000A6
00000099 000000BA 00000045 000000E9 000000DF
00000032 00000059 000000B4 000000EC 000000A8
000000E3 00000041 0000000F 00000082 000000A0
00000012 000000C8 00000058 000000F2 0000003E
00000008 000000CF 00000014 00000036 000000A4
00000086 000000C9 00000047 00000051 00000004
00000014 0000005F 00000000 000000B6 000000D1
000000DC 00000067 00000079 0000003F 00000062
00000013 0000003E 000000D5 000000FC 00000053
000000C1 000000EC 00000016 000000C7 00000060
0000001F 00000009 00000086 0000008E 0000006C
00000043 000000FE 00000070 00000068 0000004C
00000013 0000009C 00000026 0000002C 0000002F
0000000C 00000095 000000FD 0000002C 000000B7
0000007A 0000009E 00000055 0000006A 00000047
000000DF 000000F6 000000CA 00000078 00000097
00000022 000000A7 00000037 00000072 000000AE
000000FC 00000067 0000007C 0000008F 0000003B
00000068 000000A8 00000016 000000E5 0000001E
000000D9 000000B3 00000067 00000023 00000036
0000003D 000000CD 00000024 000000BC 000000F0
000000C4 000000C6 00000070 00000090 000000FB
000000AC 00000094 0000006D 000000DA 00000064
000000B7 0000003A 00000019 0000005C 000000A6
000000E0 000000C5 00000030 00000070 000000D9
000000A0 000000DE 00000015 000000A8 000000F5
000000BF 00000097 00000080 0000005B 0000004C
00000074 00000037 000000EE 00000015 000000D7
00000009 000000F9 00000065 000000EB 000000D1
00000027 00000084 00000039 000000B9 000000AA
00000091 000000EB 0000008F 000000EF 0000005A
0000005F 0000002F 000000D2 000000BC 00000072
000000C0 00000067 00000060 00000053 000000AC
000000C9 000000F6 000000E2 00000092 0000008A
00000028 000000DF 0000009F 000000AB 000000F2
00000064 00000008 00000044 0000002A 0000007D
0000008E 00000038 00000045 000000D9 0000004C
0000009E 000000B8 000000D4 00000042 000000C6
00000081 00000068 00000043 000000F4 000000B5
000000C2 0000004F 00000041 00000019 00000086
0000007B 0000003E 00000029 000000F9 000000E5
0000003C 00000018 000000A5 00000065 00000068
0000009D 0000002D 000000EF 0000000E 000000A9
00000063 000000DD 000000BB 0000004C 000000D1
0000008C 00000009 0000006C 00000050 0000000A
00000056 0000001C 0000005E 000000CA 000000B3
0000009E 00000042 00000082 000000B6 00000075
00000039 000000BF 0000001C 0000000D 0000003C
000000CA 0000004F 000000D6 00000016 000000E8
0000003F 000000E0 000000DD 0000001B 00000032
0000004D 000000FB 000000DC 0000006C 0000002F
0000006A 00000083 000000CA 000000AB 000000C8
00000050 000000CF 00000066 00000060 0000004B
000000A1 0000009E 00000058 00000067 000000FF
0000007E 00000099 000000B0 00000039 00000085
0000002A 00000057 0000009D 00000005 00000001
000000F3 000000F6 00000026 0000009D 00000058
00000004 000000EF 000000D8 000000B1 00000014
0000002A 000000C2 000000EC 0000000D 00000043
0000008A 0000008B 000000B2 00000068 000000B2
0000006C 000000DC 0000008C 00000012 00000023
000000EB 0000000B 0000007C 000000EE 000000C2
00000023 000000EE 00000067 000000FA 00000078
000000A6 00000075 0000002C 00000046 000000A2
00000029 00000078 00000015 00000034 00000078
000000A8 000000FB 000000D2 0000000E 000000A0
000000CF 00000047 000000AD 000000EF 000000C9
000000DF 000000C3 000000AC 0000007F 00000048
00000034 0000000A 0000003D 00000024 0000008A
00000039 00000016 000000A8 00000001 00000082
00000043 000000B2 00000017 000000C1 0000009C
00000029 000000A6 0000009E 0000004B 000000A9
000000B4 00000026 000000FE 000000BB 00000063
000000B7 0000003D 0000001E 000000C5 00000048
000000AB 000000EA 00000084 00000096 00000046
0000008A 00000087 000000AE 00000007 000000D9
0000001A 000000E5 00000067 000000D6 00000037
0000003D 000000A5 000000D0 000000FB 000000B1
000000E0 000000E9 00000070 00000003 00000057
00000057 00000015 000000F7 00000066 000000CD
000000F0 00000010 0000000E 000000E3 0000000C
0000004A 00000028 0000006A 0000009B 0000005B
00000086 00000013 00000034 0000007C 000000A8
000000C3 000000DC 000000BE 000000C7 0000007C
000000D0 00000068 0000005E 000000CC 00000048
000000FB 00000070 00000006 000000A1 000000B8
00000032 000000D4 000000A0 000000D8 000000DA
000000D8 000000CF 000000CE 000000C0 00000071
00000017 00000046 00000003 000000D4 00000018
000000B1 00000070 0000000D 000000EC 0000009A
0000004F 000000A5 00000050 0000000D 0000005B
000000F3 00000058 000000D3 00000012 0000005A
0000003A 00000044 00000005 000000F9 000000C7
000000F6 000000F6 00000094 000000A4 00000090
000000DE 000000B2 00000060 00000077 00000022
00000040 000000DF 00000031 0000009C 00000031
000000E2 000000AA 0000003D 000000E7 0000007D
000000B0 00000063 00000062 000000D0 00000057
0000003B 00000087 00000064 0000002C 000000B7
000000EB 000000E7 00000049 000000FC 00000003
00000036 000000C0 00000047 000000A1 000000EE
000000FA 000000B9 0000001B 000000FB 000000EE
00000080 0000008D 000000CB 0000004E 00000046
000000FF 0000002B 00000022 000000F4 00000080
000000F1 00000057 0000001B 00000064 000000FB
00000043 00000095 0000003A 0000001A 0000002F
0000003A 00000036 00000003 00000052 000000B5
00000019 00000050 000000E2 000000D4 000000D2
0000002F 0000001C 000000E0 00000063 0000004A
00000067 000000D2 000000D2 00000003 000000FA
00000082 00000019 00000014 000000A4 00000019
0000006F 0000002A 000000A0 0000005E 0000005A
0000000B 0000008C 000000B3 00000063 0000005D
000000FF 000000D4 00000057 00000027 00000043
000000B3 00000070 00000092 00000036 000000F7
000000DC 00000095 0000005B 000000FD 00000068
00000081 00000093 00000073 0000005D 000000A4
00000017 000000CE 000000C6 00000074 0000009E
00000046 000000A5 000000FA 000000C9 0000001D
000000E3 0000006C 00000043 00000045 000000FE
00000027 00000037 000000C9 000000AA 00000091
000000FA 00000072 00000000 00000076 00000014
00000098 000000EA 000000F7 0000000B 00000022
Press any key to continue ...
9997486
10003516
10000750
10003332
9999134
9997625
9998169
9997968
10000096
10001924
Press any key to continue ...
84 cycles, nrandom
31 cycles, crt_rand
3 cycles, cong
2 cycles, congb
41 cycles, congb2
Press any key to exit...
Regards: herge
Quote from: MichaelW: I don't have an implementation of mersenne twister to test
Mersenne Twister:
CONST SECTION
N equ 624
M equ 397
MN4 equ -908
TEMPERING_MASK_B equ 9d2c5680h
TEMPERING_MASK_C equ 0efc60000h
UM equ 80000000h
LM equ 7fffffffh
DATA SECTION
MTI dd (N+1)
MC dd 69069
MATRIX dd 0
dd 9908b0dfh
MT dd 2496 dup (?)
CODE SECTION
Randomize FRAME Seed
uses edi
pushad
lea edi,MT
mov eax,[Seed]
mov [edi],eax
mov ecx,N
add edi,4
:
mul D[MC]
stosd
dec ecx
jnz <
mov D[MTI],N
popad
ret
ENDF
RandM FRAME limit
uses edi,ebx,esi
push 0
lea edi,MT
cmp D[MTI],N
jb >>L1
cmp D[MTI],N+1
jnz >L2
rdtsc
push eax ; Generate a new seed
call Randomize
L2:
mov esi,edi
L3:
mov eax,[esi]
and eax,UM
mov ebx,[esi+4]
and ebx,LM
or eax,ebx
mov ecx,eax
shr eax,1
mov edx,esi
add edx,(M*4)
mov ebx,[edx]
xor eax,ebx
and ecx,1
xor eax,[MATRIX+ecx*4]
mov [esi],eax
add esi,4
inc D[esp]
cmp D[esp],(N-M)
jnz <L3
L4:
mov eax,[esi]
and eax,UM
mov ebx,[esi+4]
and ebx,LM
or eax,ebx
mov ecx,eax
shr eax,1
mov edx,esi
add edx,MN4
mov ebx,[edx]
xor eax,ebx
and ecx,1
xor eax,[MATRIX+ecx*4]
mov [esi],eax
add esi,4
inc D[esp]
cmp D[esp],(N-1)
jnz <L4
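; last element: y = (MT[N-1] & UM) | (MT[0] & LM)
mov eax,[esi]
and eax,UM
mov ebx,[edi]
and ebx,LM
or eax,ebx
mov ecx,eax
shr eax,1
mov edx,edi
add edx,(M-1)*4
mov ebx,[edx]
xor eax,ebx
and ecx,1
xor eax,[MATRIX+ecx*4]
mov [esi],eax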
mov D[MTI],0
L1:
mov esi,edi
mov eax,[MTI]
inc D[MTI]
shl eax,2
add esi,eax
mov eax,[esi]
mov ebx,eax
shr eax,11
xor ebx,eax
mov eax,ebx
shl eax,7
and eax,TEMPERING_MASK_B
xor ebx,eax
mov eax,ebx
shl eax,15
and eax,TEMPERING_MASK_C
xor ebx,eax
mov eax,ebx
shr eax,18
xor eax,ebx
xor edx,edx
div D[limit]
mov eax,edx
pop ebx
ret
ENDF
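For cross-checking the structure of the listing above (state refresh in three stretches, then tempering), here is a compact Python sketch of MT19937. Be aware of one deliberate difference: it uses the later reference initializer (multiplier 1812433253) instead of the multiply-by-69069 seeding in Randomize, and it does not apply the final `div` by limit, so its outputs are not expected to match the asm. The names are chosen for the sketch only.

```python
N, M = 624, 397
MATRIX_A = 0x9908B0DF
UPPER_MASK, LOWER_MASK = 0x80000000, 0x7FFFFFFF

def mt_seed(seed):
    """Fill the state array with the 2002 reference initializer."""
    mt = [seed & 0xFFFFFFFF]
    for i in range(1, N):
        prev = mt[-1]
        mt.append((1812433253 * (prev ^ (prev >> 30)) + i) & 0xFFFFFFFF)
    return mt

def mt_twist(mt):
    """Regenerate all N words in place (the L3/L4 loops plus the last word)."""
    for i in range(N):
        y = (mt[i] & UPPER_MASK) | (mt[(i + 1) % N] & LOWER_MASK)
        mt[i] = mt[(i + M) % N] ^ (y >> 1) ^ (MATRIX_A if y & 1 else 0)

def mt_temper(y):
    """Tempering: the shift/mask sequence after label L1 in the listing."""
    y ^= y >> 11
    y ^= (y << 7) & 0x9D2C5680
    y ^= (y << 15) & 0xEFC60000
    y ^= y >> 18
    return y & 0xFFFFFFFF

def mt_stream(seed):
    """Yield tempered 32-bit outputs forever."""
    mt = mt_seed(seed)
    while True:
        mt_twist(mt)
        for word in mt:
            yield mt_temper(word)

gen = mt_stream(5489)
print(next(gen))    # 3499211612 with the reference seed 5489
```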
What do you think of my Perlin noise generator:
macro SSENOISE x,y
{
pmullw y,dqword[prime2]
paddw y,x
movdqa x,y
psraw x,13
pxor y,x
movdqa x,y
pmullw y,y
pmullw y,dqword[prime4]
paddw y,dqword[prime3]
pmullw x,y
paddw x,dqword[prime1]
pand x,dqword[masks]
}
macro SSESETUP ix,fx,iy,fy,f
{
;rcx=x
;rdx=y
pinsrw fy,edx,0
pshufb fy,dqword[byteshuffle]
ror edx,16
pinsrw iy,edx,0
pshufb iy,dqword[byteshuffle]
rol edx,16
mov eax,ecx
repeat 8
pinsrw fx,eax,%-1
ror eax,16
pinsrw ix,eax,%-1
rol eax,16
add eax,f
end repeat
}
macro SSELERP a,b,n
{
psubw b,a
pmulhw b,n
psllw b,1
paddw a,b
}
macro SSEINTERPOLATENOISE2
{
;x=xmm0:xmm1
;y=xmm2:xmm3 ;4
psrlw xmm1,1 ;4
psrlw xmm3,1 ;4
movdqa xmm4,xmm0 ;5
movdqa xmm5,xmm2 ;6
SSENOISE xmm4,xmm5 ;5
movdqa xmm5,xmm0 ;6
movdqa xmm6,xmm2 ;7
paddw xmm5,dqword[sseone] ;7
SSENOISE xmm5,xmm6 ;6
SSELERP xmm4,xmm5,xmm1 ;5
movdqa xmm5,xmm0 ;6
movdqa xmm6,xmm2 ;7
paddw xmm6,dqword[sseone] ;7
SSENOISE xmm5,xmm6 ;6
movdqa xmm6,xmm0 ;7
movdqa xmm7,xmm2 ;8
paddw xmm6,dqword[sseone] ;8
paddw xmm7,dqword[sseone] ;8
SSENOISE xmm6,xmm7 ;7
SSELERP xmm5,xmm6,xmm1 ;6
SSELERP xmm4,xmm5,xmm3 ;5
}
struct PerlinData
Persistance dd ?
Octave dd ?
Frequency dd ?
FrequencyX8 dd ?
ends
section '.data' data readable writeable
align 16
masks dw 8 dup(07fffh)
prime1 dw 8 dup(53)
prime2 dw 8 dup (838)
prime3 dw 8 dup (1949)
prime4 dw 8 dup (35671)
sseone dw 1,1,1,1,1,1,1,1
ssethree dw 8 dup(0c000h)
byteshuffle dw 100h,100h,100h,100h,100h,100h,100h,100h
gPerlin PerlinData
section '.code' code readable executable
proc SSEPerlinNoise2D uses rdi rsi rbx, x,y,pData
pxor xmm9,xmm9
xor r9,r9
mov r9d,[r8+8]
mov edi,dword[r8+4]
mov ebx,0ffffh ;amplitude
.repeat
SSESETUP xmm0,xmm1,xmm2,xmm3,r9d
SSESCURVE xmm1,xmm4
SSESCURVE xmm3,xmm4
SSEINTERPOLATENOISE2
pinsrw xmm8,ebx,0
pshufb xmm8,dqword[byteshuffle]
pmulhuw xmm4,xmm8
paddusw xmm9,xmm4
shl ecx,1
shl edx,1
shl r9,1
imul ebx,dword[r8]
shr ebx,16
dec edi
.until ZERO?
movdqa xmm0,xmm9
ret
endp
it's all fasm x64 code but you get the idea, SSEPerlinNoise2D returns 8 16-bit values
it's surprisingly fast as well
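For anyone who wants to follow the macros without an assembler at hand, here is a scalar Python sketch of what one 16-bit lane of SSENOISE and SSELERP appears to compute, using the constants from the data section. It is a reading of the listing, not the author's reference code: the function names are invented, the 16-bit wraparound and the signed behaviour of psraw/pmulhw are emulated by hand, and SSESCURVE is left out because its definition is not shown.

```python
def s16(v):
    """Interpret the low 16 bits of v as a signed word."""
    v &= 0xFFFF
    return v - 0x10000 if v & 0x8000 else v

def noise16(x, y):
    """One lane of SSENOISE: integer hash of (x, y) into 0..7FFFh."""
    t = (y * 838 + x) & 0xFFFF                 # pmullw prime2 / paddw
    n = t ^ ((s16(t) >> 13) & 0xFFFF)          # psraw 13 / pxor
    h = (n * n * 35671 + 1949) & 0xFFFF        # pmullw y,y / prime4 / prime3
    return (n * h + 53) & 0x7FFF               # pmullw / prime1 / pand masks

def lerp16(a, b, frac15):
    """One lane of SSELERP: a + (b - a) * frac15 / 32768, frac15 in 0..7FFFh."""
    d = s16(b - a)                             # psubw
    return (a + 2 * ((d * frac15) >> 16)) & 0xFFFF   # pmulhw / psllw 1 / paddw

print(noise16(0, 0))               # 53
print(noise16(1, 0))               # 4905
print(lerp16(0, 1000, 0x4000))     # 500, halfway between 0 and 1000
```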
thanks Edgar for showing me Mersenne Twister, but it's too big for my purpose of making a demo
Damos, nice, thanks. now I can take a look at a working noise generator in assembler instead of HLL code
:U