News:

MASM32 SDK Description, downloads and other helpful links
MASM32.com New Forum Link
masmforum WebSite

Random Generator

Started by Neil, June 17, 2009, 12:44:13 PM

Previous topic - Next topic

FORTRANS

#60
Quote from: Antariy on October 30, 2010, 10:36:41 PM
So, in short: Axrand is bad, or it is good? If bytes generated by it is good - it shold by good in any usage, probably?
I waits the answer!  :bg

Hi Alex,

   Well, you tested it with ENT, and ENT thinks it good.
Chi value of 25% should be okay.  Knuth uses 10% for
a "Almost Suspect" generator, 5% for "Suspect", and 1%
for "Reject".  But the standard for testing random numbers
is the "Diehard" suite of tests.

Cheers,

Steve N.

Edit:

   It should be noted that Knuth ran multiple tests with small
sample sizes and some erratic behavior must be expected.
the results posted here with ENT have much, much larger
sample sizes.  So the cut off would be larger.

SRN

jj2007

Hi Alex,

Do you agree that I add AxRand to the next version of the MasmBasic library?

I would implement it as

Rand MACRO range  ; based on AxRand by Antariy, returns eax in the range 0...range-1
...
push range
push 0 ; mode is range
call AxRand
Exitm <eax>
ENDM


Rnd MACRO dest ; based on AxRand by Antariy, returns a REAL8 in the range 0...1
...
push dest
push 1 ; mode is offset to destination, or leave result in ST(0)
call AxRandR8
ifb <dest>
   Exitm <>  ; result in ST(0)
else
   Exitm <eax>  ; result is pointer to destination
endif
ENDM


What do you think? It's not only the shortest and fastest random generator ever seen, it seems also a real good one according to ENT...

Antariy

Quote from: FORTRANS on October 30, 2010, 10:55:32 PM
Quote from: Antariy on October 30, 2010, 10:36:41 PM
So, in short: Axrand is bad, or it is good? If bytes generated by it is good - it shold by good in any usage, probably?
I waits the answer!  :bg
   Well, you tested it with ENT, and ENT thinks it good.
Chi value of 25% should be okay.  Knuth uses 10% for
a "Almost Suspect" generator, 5% for "Suspect", and 1%
for "Reject".  But the standard for testing random numbers
is the "Diehard" suite of tests.

I think - if bytes generated by algo is good, than any thing constructed from them *must* be good - no any other way. If 4 subsequently bytes is goodly-random, then DWORD, which is contain them - cannot be bad at point of randomness.

Michael already test Axrand with Diehard, as I recall, and he says that results good. I does not build the testing program from sources yet.



Alex

Antariy

Quote from: jj2007 on October 30, 2010, 11:15:32 PM
Do you agree that I add AxRand to the next version of the MasmBasic library?
.............
What do you think? It's not only the shortest and fastest random generator ever seen, it seems also a real good one according to ENT...

I'm not contrary to using of Axrand  :bg



Alex

jj2007

Quote from: Antariy on October 30, 2010, 11:22:55 PM
Quote from: jj2007 on October 30, 2010, 11:15:32 PM
Do you agree that I add AxRand to the next version of the MasmBasic library?
.............
What do you think? It's not only the shortest and fastest random generator ever seen, it seems also a real good one according to ENT...

I'm not contrary to using of Axrand  :bg


Thanks a lot, Alex :U

In the meantime, I have made small changes to optimise the algo, and they seem to work in terms of speed and size, at least for my archaic CPU...
The attachment is close to the "Full_testing_of_Axrand.zip" posted above. Here are my timings:
Intel(R) Celeron(R) M CPU        420  @ 1.60GHz (SSE3)
10      cycles for Axrand
10      cycles for Axrand2
7       cycles for Axrand3
7       cycles for Rand()

37       bytes for Axrand
30       bytes for Axrand2
31       bytes for Axrand3
29       bytes for Rand()
Add 7-10 bytes per call


Regarding size: Calling the macro twice in a program costs 58 bytes, while using twice the Axrand3 version, and assuming it's 16-byte aligned, costs 32+2*10=52 bytes.

jj2007

Quote from: Antariy on October 30, 2010, 11:21:21 PM
I think - if bytes generated by algo is good, than any thing constructed from them *must* be good - no any other way. If 4 subsequently bytes is goodly-random, then DWORD, which is contain them - cannot be bad at point of randomness.

Axrand seems to be highly random, but independently of that you have raised an interesting question here: Imagine the low bytes are being calculated from two perfectly random high bytes - would the dword be perfectly random?

Try this in the last attachment, axrand.asm:

Quotemov eax, edx   ; return random number
      shr edx, 16
      not edx
      mov ax, dx

FORTRANS

Quote from: jj2007 on October 31, 2010, 04:41:33 PM
you have raised an interesting question here: Imagine the low bytes are being calculated from two perfectly random high bytes - would the dword be perfectly random?

Hi,

   As I pointed out (or thought I had) in Reply #53 and the
attached program, you can put some hideous stuff in the low
word and still get a good random number.  While what you
suggest should not hurt in practice, you should note that it
changes the generator from having 4,294,967,296 distinct
outputs to one having only 65,536 possible numbers.  As your
change does not affect the period, it remains 4,294,967,296
long before it starts to repeat.  That is assuming I properly
understand his algorithm.  So as far as I can see, there will
be no practical difference, unless you really need more than
65,536 distinct outputs.

HTH,

Steve N.

jj2007

#67
Quote from: FORTRANS on October 31, 2010, 05:11:33 PM
So as far as I can see, there will be no practical difference

Chi square distribution ... would exceed this value 5.00 percent of the times., i.e. "suspect"

But what you write is obviously correct - the # of distinct values is reduced to word size.

Here is a new version of the modified algo - a bit shorter, the macro now weighs in at 23 bytes (25 for ranges>127).

Test for #0 byte
Entropy = 7.999208 bits per byte.
would exceed this value 25.00 percent of the times.
Serial correlation coefficient is 0.001851 (totally uncorrelated = 0.0).
Test for #1 byte
Entropy = 7.999241 bits per byte.
would exceed this value 50.00 percent of the times.
Serial correlation coefficient is -0.000120 (totally uncorrelated = 0.0).
Test for #2 byte
Entropy = 7.999224 bits per byte.
would exceed this value 50.00 percent of the times.
Serial correlation coefficient is -0.000343 (totally uncorrelated = 0.0).
Test for #3 byte
Entropy = 7.999208 bits per byte.
would exceed this value 25.00 percent of the times.
Serial correlation coefficient is 0.002037 (totally uncorrelated = 0.0).
Test for full DWORD
Entropy = 7.999803 bits per byte.
would exceed this value 25.00 percent of the times.
Serial correlation coefficient is 0.003151 (totally uncorrelated = 0.0).

FORTRANS

#68
Quote from: jj2007 on October 31, 2010, 05:29:15 PM
Quote from: FORTRANS on October 31, 2010, 05:11:33 PM
So as far as I can see, there will be no practical difference

Chi square distribution ... would exceed this value 5.00 percent of the times., i.e. "suspect"


Hi jj2007,

   You cannot use ENT on a DWORD stream and expect the
correct result for the random numbers involved.  But yes, that
is an expected result from ENT as the two words are now
correlated.  And it is indeed nice that Axrand is not affected
by that inconsistency.

   Working on a Chi-Squared test program using "Numerical
Recipes" subroutines and have duplicated some of the results
in Knuth's AoCP.  And it is working with the LCG as well

Regards,

Steve N.

Edit: removed comment about Numerical Recipes" data not
found.
SRN

dedndave

furthermore, you cannot take bytes from a dword generator in any form and test them in a byte-oriented test program
don't make me come over there and re-write the test code - lol

jj2007

Quote from: dedndave on October 31, 2010, 08:50:00 PM
furthermore, you cannot take bytes from a dword generator in any form and test them in a byte-oriented test program

Shouldn't be a problem is the range is -1, i.e. all elements of a dword enjoy the same probability to appear...

Quote
don't make me come over there and re-write the test code - lol

Come over, Dave, but I warn you it's raining cats and dogs right now :green2

dedndave

i may get over that way, one of these days
better keep a few beers nice and frosty - us yanks like em cold

Antariy

Quote from: dedndave on October 31, 2010, 08:50:00 PM
furthermore, you cannot take bytes from a dword generator in any form and test them in a byte-oriented test program
don't make me come over there and re-write the test code - lol

Dave, this is not dword oriented algo (in full meaning of hard-coded result, as C/C++ generator) - it have range specifier.
Well, now I do test, replace

invoke Axrand,-1

to

invoke Axrand,256


To force algo to generate output values in rage of the BYTE only (0-255).

So, there are results which I get for the lower byte (other bytes are 0 - as must be):

#############################################################
Test for #0 byte
Entropy = 7.999183 bits per byte.

Optimum compression would reduce the size
of this 250000 byte file by 0 percent.

Chi square distribution for 250000 samples is 282.86, and randomly
would exceed this value 25.00 percent of the times.

Arithmetic mean value of data bytes is 127.3206 (127.5 = random).
Monte Carlo value for Pi is 3.147506360 (error 0.19 percent).
Serial correlation coefficient is -0.001592 (totally uncorrelated = 0.0).



So, you don't needed to write new test :P - results still good, even with generation of the BYTEs, not DWORDs  :bg

NOTE: test maded with generation of 250,000 of BYTEs - and results are showed above. So, this is probably very good generator, because it generate good stream which is in ~1000 times bigger than the required range :bg
And note: there is "dirty low byte"  :bg



Alex

FORTRANS

Hi,

   For what it is worth, I coded up one of the tests suggested
by Knuth in his chapter on random numbers.  The test is a
simulation of throwing two dice.  I took the chi-squared routines
from "Numerical Recipes" and fed it some data.  So I am posting
some results.  The first three were from Knuth's discussion to
verify it was working.  And then fed it some data from RandLCG,
RAND, and AxRand.

   The results were not what I expected.  It seems too sensitive
to variation?  Does anyone else have this coded up to compare
against?  Suggestions for another set of data to test?

Regards,

Steve N.


Knuth sample (144 real dice throws).
Number of bins =  11, Constraint =   1
Expected events    2.   4.  10.  12.  22.  29.  21.  15.  14.   9.   6.
Observed events    4.   8.  12.  16.  20.  24.  20.  16.  12.   8.   4.
  DF, CHSQ, PROB    10.00    7.15    0.71

Knuth example #1 (Too high variance.)
Number of bins =  11, Constraint =   1
Expected events    4.  10.  10.  13.  20.  18.  18.  11.  13.  14.  13.
Observed events    4.   8.  12.  16.  20.  24.  20.  16.  12.   8.   4.
  DF, CHSQ, PROB    10.00   29.49    0.10

Knuth example #2 (Too low variance.)
Number of bins =  11, Constraint =   1
Expected events    3.   7.  11.  15.  19.  24.  21.  17.  13.   9.   5.
Observed events    4.   8.  12.  16.  20.  24.  20.  16.  12.   8.   4.
  DF, CHSQ, PROB    10.00    1.14   99.97

RANLCG Random number was not working as planned during debug and used
a fixed seed.

WFL386 RANLCG 103kb
Number of bins =  11, Constraint =   1
Observed events   42.  84. 131. 167. 187. 221. 221. 152. 109.  90.  36.
Expected events   40.  80. 120. 160. 200. 240. 200. 160. 120.  80.  40.
  DF, CHSQ, PROB    10.00    9.23   51.07

RMFORT RANLCG 36kb
Number of bins =  11, Constraint =   1
Observed events   42.  84. 131. 167. 187. 221. 221. 152. 109.  90.  36.
Expected events   40.  80. 120. 160. 200. 240. 200. 160. 120.  80.  40.
DF, CHSQ, PROB    10.00    9.23   51.07

WFL386 RAND (ACM) 103kb (not warmed up)
Number of bins =  11, Constraint =   1
Observed events   44.  78. 124. 173. 199. 226. 202. 154. 129.  76.  35.
Expected events   40.  80. 120. 160. 200. 240. 200. 160. 120.  80.  40.
  DF, CHSQ, PROB    10.00    4.21   93.76

WFL386 RAND (ACM) 103kb (warmed upish)
Number of bins =  11, Constraint =   1
Observed events   42.  69. 115. 153. 206. 268. 188. 152. 113.  79.  55.
Expected events   40.  80. 120. 160. 200. 240. 200. 160. 120.  80.  40.
  DF, CHSQ, PROB    10.00   12.74   23.86


RANLCG (fixed) 1440 events to compare with above runs.
  EOF
Roll of Dice Simulation, Chi-Squared Test
Number of bins =  11, Constraint =   1, Number of events =  1440
Observed evnt   47.   82.   92.  169.  171.  258.  209.  171.  125.   79.   37.
Expected evnt   40.   80.  120.  160.  200.  240.  200.  160.  120.   80.   40.
  DF, CHSQ, PROB    10.00   15.48   11.56

RANLCG max events.
Roll of Dice Simulation, Chi-Squared Test
Number of bins =  11, Constraint =   1, Number of events =100000
Observed evnt 2758. 5559. 8481.11053.13894.16801.13804.11032. 8223. 5581. 2814.
Expected evnt 2778. 5556. 8333.11111.13889.16667.13889.11111. 8333. 5556. 2778.
  DF, CHSQ, PROB    10.00    7.28   69.88

RANLCG (Allow for random seed.) 1440 events to compare with above runs.
  EOF
Roll of Dice Simulation, Chi-Squared Test
Number of bins =  11, Constraint =   1, Number of events =  1440
Observed evnt   34.   85.  104.  163.  188.  238.  209.  156.  146.   87.   30.
Expected evnt   40.   80.  120.  160.  200.  240.  200.  160.  120.   80.   40.
  DF, CHSQ, PROB    10.00   13.39   20.27

AxRand DWORD limited to first 100,000 x2 DWORDS.
10/31/2010  01:13p           1,000,000 rand_dword.bin
Roll of Dice Simulation, Chi-Squared Test
Number of bins =  11, Constraint =   1, Number of events =100000
Observed evnt 2777. 5584. 8279.11167.13793.16660.13855.11232. 8306. 5526. 2821.
Expected evnt 2778. 5556. 8333.11111.13889.16667.13889.11111. 8333. 5556. 2778.
  DF, CHSQ, PROB    10.00    3.76   95.74

Antariy

Hi, Steve!

Do you have the Diehard compiled with good Fortran compiler?

I have downloaded sources for C some days ago (when Michael made the test), but tried to build the program yesterday only.

I have builded it with different versions of MSVC (6 and 10), with PellesC, but program seems to work badly. Floating point results output says about errors or NaNs. Sources is ported not cleanly, too - compilers have many "asks" to code.
So, I'll ask you for the EXE which is compiled with Fortran compiler, as I understand - it is the original language on which is written that test.



Alex