The MASM Forum Archive 2004 to 2012
Author Topic: FPU status word  (Read 25913 times)
jj2007
Re: FPU status word
« Reply #15 on: February 03, 2012, 05:27:24 PM »

Quote
You are still limited to only a few branching options, as most of the x86 branch instructions are unusable.

There should only be a few, actually. The FPU holds by default REAL10 values, and when you compare two of them, there are three options:
- bigger
- equal
- smaller
All the rest makes sense only in a reg32 context (carry, unsigned etc). Correct me if I am wrong.
By the way, you can always use
Code:
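.data
num1 REAL8 123.456
num2 REAL8 123.455
.code
fld num1
fld num2                ; ST(0) = num2, ST(1) = num1
push eax                ; reserve a DWORD on the stack
fistp dword ptr [esp]   ; store num2 rounded to an integer, pop the FPU stack
pop eax                 ; eax = integer part of num2 (the fraction is lost)
push edx                ; reserve another DWORD
fistp dword ptr [esp]   ; store num1 rounded to an integer, pop the FPU stack
pop edx                 ; edx = integer part of num1
cmp eax, edx            ; ordinary signed integer comparison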
to do your reg32-style conditional jumps...

dedndave
Re: FPU status word
« Reply #16 on: February 03, 2012, 05:28:43 PM »

you must understand that the CPU has several instructions that are context-dependent
for the FPU, comparisons are always signed - i.e., there is only one context
jj2007
Re: FPU status word
« Reply #17 on: February 03, 2012, 05:42:58 PM »

Complete example, since we are in The Campus:

Code:
include \masm32\include\masm32rt.inc
.686
.data
num5 REAL8 123.5
num4 REAL8 123.4
num4e REAL8 123.4
num3 REAL8 123.3

.code
start: fld num4
fld num5
print "num5 is "
fcomi st, st(1)
.if Zero?
print "equal to num4", 13, 10
.elseif Carry?
print "lower than num4", 13, 10
.else
print "higher than num4", 13, 10
.endif
fstp st
fstp st

fld num4
fld num3
print "num3 is "
fcomi st, st(1)
.if Zero?
print "equal to num4", 13, 10
.elseif Carry?
print "lower than num4", 13, 10
.else
print "higher than num4", 13, 10
.endif
fstp st
fstp st

fld num4e
fld num4
print "num4 is "
fcomi st, st(1)
.if Zero?
print "equal to num4e", 13, 10
.elseif Carry?
print "lower than num4e", 13, 10
.else
print "higher than num4e", 13, 10
.endif
fstp st
fstp st

inkey " ", 13, 10
exit
end start

Watch out for precision problems: "equal" means that all 80 bits are equal. MasmBasic users may use the low, medium, high and top precision flags.

Code:
include \masm32\MasmBasic\MasmBasic.inc   ; download
.data
num5   REAL8 123.5
num4   REAL8 123.4

   Init

   Fcmp num4, num5, low
   .if Carry?
      Print Str$("num4 at %f is lower than num5\n", num4)
   .elseif Zero?
      Print Str$("num4 at %f is equal to num5\n", num4)
   .else
      Print Str$("num4 at %f is higher than num5\n", num4)
   .endif

   Fcmp num4, num5, medium
   .if Carry?
      Print Str$("num4 at %f is lower than num5\n", num4)
   .elseif Zero?
      Print Str$("num4 at %f is equal to num5\n", num4)
   .else
      Print Str$("num4 at %f is higher than num5\n", num4)
   .endif

   Inkey
   Exit
end start

Output:
num4 at 123.4000 is equal to num5
num4 at 123.4000 is lower than num5

dedndave
Re: FPU status word
« Reply #18 on: February 03, 2012, 06:36:03 PM »

Jochen's example brings something to mind....

comparing floating point values is really not a simple subject
things like precision, rounding, and epsilon can come into play
it depends entirely on the application
if you are comparing values of currency, you will use different code than if you are calculating pixels to fill a circle

on that note, i did a little googling and came across this article that you may find helpful

http://www.cprogramming.com/tutorial/floating_point/understanding_floating_point_representation.html
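
One way to act on the epsilon point is to compare the absolute difference of two values against a tolerance instead of testing for exact equality. What follows is only a minimal sketch under assumptions: the names val_a, val_b, val_c and eps and the tolerance 1.0e-9 are made up for illustration, and the right tolerance depends entirely on the application. It assumes the MASM32 package (masm32rt.inc with its print, inkey and exit macros), as used earlier in this thread.

```asm
include \masm32\include\masm32rt.inc
.686                          ; needed for fcomip

.data
val_a   REAL8 0.1
val_b   REAL8 0.2
val_c   REAL8 0.3             ; nominally equal to val_a + val_b
eps     REAL8 1.0e-9          ; hypothetical tolerance, application-specific

.code
start:
    fld   val_a
    fadd  val_b               ; ST(0) = val_a + val_b
    fsub  val_c               ; ST(0) = (val_a + val_b) - val_c
    fabs                      ; ST(0) = absolute difference
    fld   eps                 ; ST(0) = eps, ST(1) = difference
    fcomip st, st(1)          ; compare eps with the difference, pop eps
    fstp  st(0)               ; discard the difference (EFLAGS are not touched)
    jp    is_unordered        ; PF set: one operand was a NaN
    jae   close_enough        ; CF clear: eps >= difference
    print "different", 13, 10
    jmp   done
close_enough:
    print "equal within eps", 13, 10
    jmp   done
is_unordered:
    print "unordered (NaN)", 13, 10
done:
    inkey " ", 13, 10
    exit
end start
```

A fixed absolute tolerance like this suits values of similar magnitude; for values spanning many orders of magnitude a relative tolerance is usually the better choice.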
SteveAsm
Re: FPU status word
« Reply #19 on: February 03, 2012, 06:58:43 PM »

Quote
There should only be a few, actually.

This is what I mean:
- bigger
- equal
- smaller

Not all the normal jump instructions work based on those three conditions.
My application, based on the conditions, uses all of these:
  je, jne,... jl, jnl,... jg, jng,... jle, jnle,... jge, jnge,... jz, jnz


Quote
comparing floating point values is really not a simple subject.
things like precision, rounding, and epsilon can come into play.
it depends entirely on the application.

Yes, as JJ has pointed out with his example, it is complex and in no way simple.
raymond
Re: FPU status word
« Reply #20 on: February 03, 2012, 07:58:36 PM »

Quote
I don't want to appear antagonistic, Ray, but I did look at Dave's first link.

My apology if I sounded the wrong way. It was certainly not intended. I may just have assumed too much from your "Nowhere in my search results was that illustrated".

As for conditional jumps, the FPU always does signed comparisons but never modifies the SF sign flag; only the ZF zero flag and CF carry flag get modified. (The PF parity flag may also be modified but for a totally different reason than by the CPU.)

The jl and jg mnemonics (and their variants) which rely on the SF should therefore not be used after FPU comparisons. This still leaves a majority of the jxxx mnemonics relying only on the CF and ZF flags (or combinations) which can be used with FPU comparisons.
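
The same rule applies to the older fcom / fstsw / sahf sequence, sketched below for CPUs or assemblers without fcomi. sahf copies the status word's condition codes C0, C2 and C3 into CF, PF and ZF. jl and jg test SF together with OF; after sahf, SF merely receives the FPU busy bit and OF is left unchanged, so those jumps would test meaningless flags. The names num_a and num_b and the labels are made up for illustration, and the MASM32 package is assumed.

```asm
include \masm32\include\masm32rt.inc

.data
num_a   REAL8 123.5
num_b   REAL8 123.4

.code
start:
    fld   num_b
    fld   num_a               ; ST(0) = num_a, ST(1) = num_b
    fcompp                    ; compare ST(0) with ST(1), pop both
    fstsw ax                  ; status word -> AX (C0, C2, C3 land in AH)
    sahf                      ; AH -> flags: C0 -> CF, C2 -> PF, C3 -> ZF
    jp    is_unordered        ; check first: a NaN sets C0, C2 and C3 together
    ja    a_greater           ; CF = 0 and ZF = 0 (use instead of jg)
    jb    a_lower             ; CF = 1            (use instead of jl)
    print "num_a is equal to num_b", 13, 10       ; only ZF = 1 remains
    jmp   done
a_greater:
    print "num_a is higher than num_b", 13, 10
    jmp   done
a_lower:
    print "num_a is lower than num_b", 13, 10
    jmp   done
is_unordered:
    print "comparison was unordered (NaN)", 13, 10
done:
    inkey " ", 13, 10
    exit
end start
```

In short, the "unsigned" family (ja, jae, jb, jbe, je/jz, jne/jnz) is the one to use after an FPU comparison, even though the comparison itself is signed.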


When you assume something, you risk being wrong half the time
http://www.ray.masmcode.com
SteveAsm
Re: FPU status word
« Reply #21 on: February 03, 2012, 09:47:00 PM »

Quote
My apology if I sounded the wrong way. It was certainly not intended.

No, please accept my apologies.
Sometimes when I get frustrated, I tend to google search with blinders on.
I should have explained myself better.

Quote
The jl and jg mnemonics (and their variants) which rely on the SF should therefore not be used after FPU comparisons.
This still leaves a majority of the jxxx mnemonics relying only on the CF and ZF flags (or combinations) which can be used with FPU comparisons.

Okay, I was quite focused on using the jl, jg and variants.
Now I see which ones to stay away from.
The reference materials I have don't explain which groups of jxxx instructions can and can't be used with FPU comparisons.
Thanks
qWord
Re: FPU status word
« Reply #22 on: February 03, 2012, 09:56:48 PM »

Quote
The reference materials I have don't explain which groups of jxxx instructions can and can't be used with FPU comparisons.

you should use Intel's and AMD's documentation as reference:
Intel® 64 and IA-32 Architectures Software Developer Manuals
AMD: Developer Guides & Manuals
dedndave
Re: FPU status word
« Reply #23 on: February 04, 2012, 01:59:04 AM »

from Ray's tutorial, ch 7...
Code:
The following example hard-codes the instruction for comparing ST(0) to ST(2).

db   0dbh, 0f0h+2  ;encoding for fcomi st,st(2)
                   ;when not supported by the assembler
fwait              ;ensure the instruction is completed
jpe  error_handler ;the comparison was indeterminate
                   ;this condition should be verified first
                   ;then only two of the next three conditional jumps
                   ;should become necessary, in whatever order is preferred,
                   ;the third one being replaced by code to handle that case
ja   st0_greater   ;when all flags are 0
jb   st0_lower     ;only the CF flag would be set if no error
jz   both_equal    ;only the ZF flag would be set if no error
jj2007
Re: FPU status word
« Reply #24 on: February 07, 2012, 08:58:31 PM »

Currently playing with a new Fcmp routine with top, high, medium and low precision. It seems to work, but more tests are needed.

Fcmp sets Sign and Zero flags. Approximate precision (Real10/8/4/x):
top=19 digits, high=15, medium=7, low=4; default = medium, 7 digits

The table below stands for a comparison of 996 ... 1004 against 1000. # means equal. Source attached, requires the MasmBasic library.

Code:
Ref     1234.56789012345678      tp 19 hi 15 me 7  lo 4  default
28      996.000000000000000      <  <  <  <  <  <  <  <  <
27      999.000000000000000      <  <  <  <  <  <  #  #  <
26      999.900000000000000      <  <  <  <  <  <  #  #  <
25      999.990000000000000      <  <  <  <  <  <  #  #  <
24      999.999000000000000      <  <  <  <  #  #  #  #  #
23      999.999900000000000      <  <  <  <  #  #  #  #  #
22      999.999999000000000      <  <  <  <  #  #  #  #  #
21      999.999999990000000      <  <  <  <  #  #  #  #  #
20      999.999999999900000      <  <  <  <  #  #  #  #  #
19      999.999999999990000      <  <  #  #  #  #  #  #  #
18      999.999999999999000      <  <  #  #  #  #  #  #  #
17      999.999999999999900      <  <  #  #  #  #  #  #  #
16      999.999999999999990      <  <  #  #  #  #  #  #  #
15      999.999999999999999      <  <  #  #  #  #  #  #  #
14      1000.000000000000000     #  #  #  #  #  #  #  #  #
13      1000.00000000000000      >  >  #  #  #  #  #  #  #
12      1000.00000000000001      >  >  #  #  #  #  #  #  #
11      1000.00000000000010      >  >  #  #  #  #  #  #  #
10      1000.00000000000100      >  >  #  #  #  #  #  #  #
9       1000.00000000001000      >  >  #  #  #  #  #  #  #
8       1000.00000000010000      >  >  >  >  #  #  #  #  #
7       1000.00000001000000      >  >  >  >  #  #  #  #  #
6       1000.00000100000000      >  >  >  >  #  #  #  #  #
5       1000.00010000000000      >  >  >  >  #  #  #  #  #
4       1000.00100000000000      >  >  >  >  #  #  #  #  #
3       1000.01000000000000      >  >  >  >  >  >  #  #  >
2       1000.10000000000000      >  >  >  >  >  >  #  #  >
1       1001.00000000000000      >  >  >  >  >  >  #  #  >
0       1004.00000000000000      >  >  >  >  >  >  >  >  >
Ref     1234.56789012345678      tp 19 hi 15 me 7  lo 4  default

Comparing PI, high precision:
MyPI_hi         at 3.14159265358980000 is exact
MyPIexact       at 3.14159265358979324 is exact
MyPI_low        at 3.14159265358978000 is exact

Comparing PI, top precision:
MyPI_hi         at 3.14159265358980000 is higher than the real PI
MyPIexact       at 3.14159265358979324 is exact
MyPI_low        at 3.14159265358978000 is lower than the real PI

* MbFlexCmp.zip (13.71 KB - downloaded 330 times.)
« Last Edit: February 09, 2012, 01:07:01 AM by jj2007 »

jj2007
Floating point comparison: timings
« Reply #25 on: February 09, 2012, 08:56:16 AM »

Code:
Intel(R) Pentium(R) 4 CPU 3.40GHz (SSE3)
5       cycles for 10*cmp       5 ms    for 10000000 comparisons
1307    cycles for 10*Fcmp      389 ms  for 10000000 comparisons
5       cycles for 10*cmp       5 ms    for 10000000 comparisons
1317    cycles for 10*Fcmp      392 ms  for 10000000 comparisons

      REPEAT 2
         Fcmp v1, v2   ; Real4 vs Real8
         nop
         Fcmp eax, v3   ; reg32 vs Real10
         nop
         Fcmp v4, ecx   ; QWord vs reg32
         nop
         Fcmp xmm0, v5   ; xmm vs REAL4
         nop
         Fcmp eax, xmm1   ; reg32 vs xmm
         nop
      ENDM

* FcmpTimings.zip (14.86 KB - downloaded 291 times.)

raymond
Re: FPU status word
« Reply #26 on: February 09, 2012, 08:09:03 PM »

Quote
The FPU holds by default REAL10 values

Just noticed this in this thread. Let's clarify this a bit to prevent newbies from interpreting this wrongly.

FPU data registers are designed to hold REAL10 values, just as the CPU's general-purpose registers are designed to hold 32-bit values. The actual value in any of the FPU's data registers depends on what is loaded into them and/or under what conditions they have been modified.

At least under Windows, the FPU's precision control is set to REAL8 at the opening of any program. If the program needs REAL10, it must change the precision control before performing any operation.

To be more precise, the statement should thus have been:

The FPU by default holds values in the REAL10 format, but those values are not necessarily in the REAL10 precision.
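
The precision control lives in bits 8-9 of the FPU control word (00b = REAL4, 10b = REAL8, 11b = REAL10). As a minimal sketch, assuming the MASM32 package and a made-up variable name ctrl, a program can report the setting it was started with rather than take it for granted, and then switch to REAL10:

```asm
include \masm32\include\masm32rt.inc

.data?
ctrl    WORD ?                ; scratch copy of the FPU control word

.code
start:
    fstcw ctrl                ; save the current control word
    mov   ax, ctrl
    and   ax, 0300h           ; isolate the precision-control field (bits 8-9)
    .if ax == 0300h
        print "PC at startup: REAL10 (64-bit mantissa)", 13, 10
    .elseif ax == 0200h
        print "PC at startup: REAL8 (53-bit mantissa)", 13, 10
    .else
        print "PC at startup: REAL4 or reserved", 13, 10
    .endif
    or    ctrl, 0300h         ; PC = 11b selects the full 64-bit mantissa
    fldcw ctrl                ; later results are rounded to REAL10 precision
    inkey " ", 13, 10
    exit
end start
```

Which line gets printed depends on the operating system and the runtime that started the program, so no output is shown here.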
jj2007
Re: FPU status word
« Reply #27 on: February 09, 2012, 09:39:32 PM »

That is an interesting point, Raymond. So accordingly, if the FPU is set to REAL4 accuracy, and you use fldpi, the FPU holds a REAL4 crippled value of PI instead of 3.1415926535897932380 aka 4000C90FDAA22168C235h?

If that is the case, then Olly seems to have a bug, because it claims that the FPU always holds the same REAL10 value, irrespective of the precision at the time of loading.

Of course, if you use the Fcmp macro to compare a REAL4 with a REAL8 (e.g. xmm0) variable, the accuracy of the comparison depends on the weaker partner.

dedndave
Re: FPU status word
« Reply #28 on: February 10, 2012, 01:25:42 AM »

don't forget - FINIT sets it to real10
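
A quick way to see this, sketched under the assumption that the MASM32 package is installed (ctrl is a made-up name): FINIT loads the control word with 037Fh, which means all exceptions masked, round to nearest, and precision control 11b (REAL10).

```asm
include \masm32\include\masm32rt.inc

.data?
ctrl    WORD ?                ; receives the control word after the reset

.code
start:
    finit                     ; reset the FPU: control word becomes 037Fh
    fstcw ctrl
    .if ctrl == 037Fh
        print "control word is 037Fh: PC = 11b (REAL10), round to nearest", 13, 10
    .else
        print "unexpected control word", 13, 10
    .endif
    inkey " ", 13, 10
    exit
end start
```

Note that FINIT also empties the FPU register stack, so it belongs at the start of a routine, not in the middle of a calculation.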
raymond
Re: FPU status word
« Reply #29 on: February 10, 2012, 01:58:17 AM »

That is not what I stated. If you load one of the hard coded constants (such as pi) from the FPU, it will get loaded with its full REAL10 value. If you immediately save that value without modification as a REAL10, it will be saved with its full REAL10 precision, regardless of the precision control. You are saving an image of the data register without any conversion.

However, if you compute the value of 1/3 with the precision control set to REAL8, the data register will contain a truncated value in REAL10 format. And, even if you save it as a REAL10, the saved value will still be that truncated value in REAL10 format.

Thus, if you compute something (apart from a multiple of 1/2) with the precision control set to REAL8 and save it as a REAL10,
then you compute the identical something with the precision control set to REAL10 and also save it as a REAL10,
then compare those two values with REAL10 precision control, they will NOT be identical. They will not even be identical with the precision control set to REAL8 if you load the saved REAL10 values for comparison.
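
The 1/3 case can be tried out with a short program. This is only a sketch under assumptions: the names three, cw53, cw64, third53 and third64 are invented for the example, and the MASM32 package is assumed. It computes 1/3 once with the precision control set to REAL8 and once set to REAL10, saves both results as REAL10, and compares the saved values.

```asm
include \masm32\include\masm32rt.inc
.686                          ; needed for fcomip

.data
three   REAL8 3.0

.data?
cw53    WORD ?                ; control word with PC = REAL8
cw64    WORD ?                ; control word with PC = REAL10
third53 REAL10 ?              ; 1/3 computed with PC = REAL8
third64 REAL10 ?              ; 1/3 computed with PC = REAL10

.code
start:
    fstcw cw64
    or    cw64, 0300h         ; PC = 11b: 64-bit mantissa
    mov   ax, cw64
    and   ax, 0FEFFh          ; clear bit 8 -> PC = 10b: 53-bit mantissa
    mov   cw53, ax

    fldcw cw53
    fld1
    fdiv  three               ; 1/3 rounded to 53 bits
    fstp  third53             ; REAL10 format, but only 53 significant bits

    fldcw cw64
    fld1
    fdiv  three               ; 1/3 rounded to 64 bits
    fstp  third64

    fld   third53
    fld   third64             ; ST(0) = third64, ST(1) = third53
    fcomip st, st(1)          ; compare, pop third64
    fstp  st(0)               ; pop third53 (EFLAGS are not touched)
    .if Zero?
        print "identical", 13, 10
    .else
        print "not identical", 13, 10
    .endif
    inkey " ", 13, 10
    exit
end start
```

Since 1/3 has no finite binary representation, the 53-bit and 64-bit results differ in their low mantissa bits, so the "not identical" branch is the expected one.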