The MASM Forum Archive 2004 to 2012
Author Topic: why is "add" faster than "inc"  (Read 24750 times)
thomas_remkus
why is "add" faster than "inc"
« on: April 28, 2006, 08:17:49 PM »

I'm in a tight loop, doing basically nothing but adding numbers. When I use "inc eax" it's much slower than "add eax, 1". Why is this? The difference is significant.
arafel
Re: why is "add" faster than "inc"
« Reply #1 on: April 28, 2006, 08:33:46 PM »

It's faster only on PIV and above processors, because the inc instruction pays a penalty there for its partial flag update: it leaves CF unchanged, so it depends on the previous flag result. On PIII and below, "add reg, 1" has no such advantage.

(at least for Intel cpus, don't know if there is difference for AMD)
jdoe
Re: why is "add" faster than "inc"
« Reply #2 on: April 28, 2006, 09:51:16 PM »

Quote
(at least for Intel cpus, don't know if there is difference for AMD)

add/sub are faster than inc/dec even on AMD processor.
QvasiModo
Re: why is "add" faster than "inc"
« Reply #3 on: April 28, 2006, 10:19:26 PM »

It's because add changes all of the arithmetic registers, while inc changes only some of them - so the processor may have to wait before another arithmetic operation completes just to set the flags correctly, even when the calculations are completely unrelated.

For example, if I have this:
Code:
cmp eax,10h
add edx,1
the processor doesn't have to wait for the cmp instruction to complete to be able to execute the add instruction. But if I have this:
Code:
cmp eax,10h
inc edx
then the processor has to wait for cmp to know how the flags have to be set after executing inc.
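
To make the flag difference concrete, here are the same two pairs annotated with the status flags each instruction writes, as listed in the Intel and AMD instruction references. This is only an annotated sketch; whether the dependency costs anything depends on the processor.

```asm
cmp eax,10h   ; writes OF, SF, ZF, AF, PF, CF
add edx,1     ; writes OF, SF, ZF, AF, PF, CF: every flag from cmp is replaced

cmp eax,10h   ; writes OF, SF, ZF, AF, PF, CF
inc edx       ; writes OF, SF, ZF, AF, PF only: CF has to be carried over from cmp
```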
Ratch
Re: why is "add" faster than "inc"
« Reply #4 on: April 29, 2006, 12:04:18 AM »

 jdoe,

Quote
add/sub are faster than inc/dec even on AMD processor

     Both ADD and INC are DirectPath rather than VectorPath instructions according to the AMD Optimization Manual http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/22007.pdf .  Also, many optimization examples in the manual use INC under the right circumstances, i.e. no reading or writing the register immediately after modifying it.  I can't find any statement in the manual saying that an ADD is preferable to an INC on the AMD.

QvasiModo,

Quote
It's because add changes all of the arithmetic registers, while inc changes only some of them

     Both ADD and INC change only the register that they are coded to change.

Quote
For example, if I have this:

Code:
cmp eax,10h
add edx,1
the processor doesn't have to wait for the cmp instruction to complete to be able to execute the add instruction. But if I have this:

Code:
cmp eax,10h
inc edx
then the processor has to wait for cmp to know how the flags have to be set after executing inc

     Why?  In both cases the ADD in the first snippet and the INC in the second snippet are going to wipe away any flag settings of the CMP instruction.  Ratch
tenkey
Re: why is "add" faster than "inc"
« Reply #5 on: April 29, 2006, 12:39:39 AM »

I think QvasiModo is referring to the differences in flag settings.

Because ADD and CMP change the same set of flags, and INC and CMP don't, there may be a stall for creating the correct flag setting in the latter case.

The difference is CF. In multiprecision arithmetic, you would use INC/DEC for counting and updating addresses. You would need to save and restore CF if there were no increment/decrement instructions that left CF alone.
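
A minimal sketch of that multiprecision case, under assumptions of my own: esi points to the source number, edi to the destination, both stored as arrays of dwords with the least significant dword first, and ecx holds a nonzero dword count. The register roles and the label are hypothetical, not taken from anyone's code in this thread.

```asm
; Hypothetical setup: esi -> source, edi -> destination (dword arrays,
; least significant dword first), ecx = number of dwords, ecx > 0.
    clc                  ; start with no carry
addloop:
    mov eax,[esi]        ; mov does not touch the flags
    adc [edi],eax        ; dest += src + CF, and sets CF for the next dword
    lea esi,[esi+4]      ; lea advances the pointers without touching the flags
    lea edi,[edi+4]
    dec ecx              ; updates ZF for the loop test but leaves CF alone
    jnz addloop          ; CF from adc survives into the next iteration
```

Replacing "dec ecx" with "sub ecx,1" here would overwrite CF and break the carry chain between dwords.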

A programming language is low level when its programs require attention to the irrelevant.
Alan Perlis, Epigram #8
jdoe
Re: why is "add" faster than "inc"
« Reply #6 on: April 29, 2006, 12:52:11 AM »

Quote
jdoe,

Quote
add/sub are faster than inc/dec even on AMD processor

     Both ADD and INC are DirectPath rather than VectorPath instructions according to the AMD Optimization Manual http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/22007.pdf .  Also, many optimization examples in the manual use INC under the right circumstances, i.e. no reading or writing the register immediately after modifying it.  I can't find any statement in the manual saying that an ADD is preferable to an INC on the AMD.

I don't mind what was written or not. From the tests I did, using add/sub is at worst as fast as inc/dec, and sometimes faster. That is on my AMD Athlon 1800+, though.

If you hear on the radio that the sky is green today, would you believe it without going outside to see it for yourself?
Ratch
Re: why is "add" faster than "inc"
« Reply #7 on: April 29, 2006, 01:02:43 AM »

jdoe,

Quote
If you hear on the radio that the sky is green today, would you believe it without going outside to see it for yourself?

     From the radio?  No I certainly would not.  But if the one who made the sky said so, then I would believe it until I saw otherwise.  Check your timings again.  They can be tricky with insidious pitfalls.  Ratch
Ratch
Re: why is "add" faster than "inc"
« Reply #8 on: April 29, 2006, 01:10:44 AM »

tenkey,

Quote
Because ADD and CMP change the same set of flags, and INC and CMP don't, there may be a stall for creating the correct flag setting in the latter case.

     Again I point out, the CMPs in his example code are effectively NOPs.  The flags the CMPs set or clear are wiped out by the following ADD and INC instructions.  Ratch
hutch--
Re: why is "add" faster than "inc"
« Reply #9 on: April 29, 2006, 01:20:42 AM »

In the words of Intel from PIV manual 4,

Quote
The inc and dec instructions should always be avoided. Using add and sub instructions instead of inc and dec instructions avoid data dependence and improve performance.

This probably has something to do with why ADD and SUB are faster on later Intel hardware.

Regards,



Download site for MASM32
http://www.masm32.com
tenkey
Re: why is "add" faster than "inc"
« Reply #10 on: April 29, 2006, 07:41:40 AM »

Quote
tenkey,

Quote
Because ADD and CMP change the same set of flags, and INC and CMP don't, there may be a stall for creating the correct flag setting in the latter case.

     Again I point out, the CMPs in his example code are effectively NOPs.  The flags the CMPs set or clear are wiped out by the following ADD and INC instructions.  Ratch

Here is a demonstration of the INC instruction (DEC is similar). Predict what the following code will produce, then run it. Replace the INC with the equivalent ADD and see if there's a difference.

Code:
.386
.model flat, stdcall   ; memory model comes first, then the calling convention
option casemap :none   ; case sensitive

include \masm32\include\windows.inc
include \masm32\include\user32.inc
include \masm32\include\kernel32.inc

includelib \masm32\lib\kernel32.lib
includelib \masm32\lib\user32.lib

.data
caseclr db "CF is cleared by CMP, not set by INC.",0
case0   db "CF is not set by STC.",0
case1   db "CF is not cleared by CMP.",0
case2   db "CF is set by INC.",0
.code
_start:

stc        ; set CF
jnc carryclear_case0   ; error if CF is clear
mov ecx,-1
mov eax,7
cmp eax,3  ; 7 - 3 = 4, no carry (borrow)
jc carryset_case1   ; error if CF is set
; CF status is "clear"
inc ecx    ; FFFFFFFF + 1 = 0 w/carry - is CF set?
jc carryset_case2   ; find out!
carryclear:
invoke MessageBox,NULL,addr caseclr,addr caseclr,MB_OK
jmp quit
carryclear_case0:
invoke MessageBox,NULL,addr case0,addr case0,MB_OK
jmp quit
carryset_case1:
invoke MessageBox,NULL,addr case1,addr case1,MB_OK
jmp quit
carryset_case2:
invoke MessageBox,NULL,addr case2,addr case2,MB_OK
quit:
invoke ExitProcess,0

end _start
EduardoS
Re: why is "add" faster than "inc"
« Reply #11 on: April 29, 2006, 02:08:26 PM »

Quote
(at least for Intel cpus, don't know if there is difference for AMD)

add/sub are faster than inc/dec even on AMD processor.

Maybe under certain conditions, but generally not:
Code:
Press any key to start...
add 1 : 1019 clocks
add 2 : 1020 clocks
add 3 : 1020 clocks
add 4 : 1363 clocks
inc 1 : 1020 clocks
inc 2 : 1020 clocks
inc 3 : 1021 clocks
inc 4 : 1361 clocks
add/cmp : 1019 clocks
inc/cmp : 1019 clocks
Press any key to exit...

[attachment deleted by admin]
dsouza123
Re: why is "add" faster than "inc"
« Reply #12 on: April 29, 2006, 02:19:27 PM »

Athlon 1.2 GHz @ 1190 MHz
Windows XP SP2  512MB

Press any key to start...
add 1 : 1026 clocks
add 2 : 1027 clocks
add 3 : 1351 clocks
add 4 : 1802 clocks
inc 1 : 1027 clocks
inc 2 : 1025 clocks
inc 3 : 1028 clocks
inc 4 : 1373 clocks
add/cmp : 1026 clocks
inc/cmp : 1027 clocks
Press any key to exit...
Mark Jones
Re: why is "add" faster than "inc"
« Reply #13 on: April 29, 2006, 05:11:18 PM »

Quote
when I use "inc eax" it's much slower than "add eax, 1". why is this?

Code optimization is like working on a Sudoku puzzle or decrypting a cryptogram. It can be lots of fun, and also maddeningly annoying at the same time.

See Agner Fog's optimization guide: http://www.agner.org/assem/

"To deny our impulses... foolish; to revel in them, chaos." MCJ 2003.08
jdoe
Re: why is "add" faster than "inc"
« Reply #14 on: April 29, 2006, 06:26:10 PM »

Quote
But if the one who made the sky said so, then I would believe it until I saw otherwise.

You know how to answer.

OK, I give you one point about a little gain with INC/DEC on AMD in some circumstances, but I keep saying that generally, using ADD/SUB is as fast or faster. In other words, when writing optimized code, trying both is a good idea.
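
One way to try both, sketched under assumptions: the fragment below reads the time stamp counter around a short loop. The iteration count, the register choices, the label and the use of cpuid to serialize are all my own assumptions, not taken from the test program that was attached earlier in the thread. Swap the marked line between add and inc and compare the counts.

```asm
.586                    ; cpuid and rdtsc need .586 or later
; Fragment only. cpuid clobbers eax, ebx, ecx and edx, and this sketch also
; uses esi and edi, so save ebx, esi and edi first if it goes inside a
; stdcall procedure.
    xor eax,eax
    cpuid               ; serialize before reading the counter
    rdtsc               ; edx:eax = time stamp counter
    mov esi,eax         ; keep the low dword of the start count
    mov ecx,1000        ; hypothetical iteration count
    xor edi,edi         ; register being incremented
timeloop:
    add edi,1           ; <-- replace with "inc edi" for the second run
    sub ecx,1
    jnz timeloop
    xor eax,eax
    cpuid               ; serialize again
    rdtsc
    sub eax,esi         ; eax = elapsed clocks (low 32 bits), loop overhead included
```

Single readings are noisy, so run each variant several times and compare the lowest counts.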