
CPUID with AMD support test piece.

Started by hutch--, April 26, 2010, 03:18:28 PM

Rockoon


AuthenticAMD
Cannot Identify x86 Processor
Not Available
Not Available
n/a - INTEL Only
n/a - INTEL Only
Available
Available
Available
Not Available
Not Available
Available
Not Available


This is an AMD Phenom II X6 1055T "Thuban"

The specs taken from the Phenom II tech sheet:

Including support for SSE, SSE2, SSE3, SSE4a, ABM, MMX, 3DNow! technology and legacy x86

The Advanced Bit Manipulation (ABM) was introduced with K10 and consists of 2 instructions:

POPCNT - counts the number of bits set in a word.
LZCNT - counts the number of leading zero bits in a word.
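For anyone who wants to see what the two instructions compute without having a K10 at hand, here is a minimal portable C sketch for a 32-bit operand. The function names popcnt32 and lzcnt32 are made up for illustration, and this is only a software model of the results, not how the hardware does it and not anything taken from the test piece.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: models the POPCNT result for a 32-bit operand,
   i.e. the number of bits that are set. */
unsigned popcnt32(uint32_t x)
{
    unsigned n = 0;
    while (x != 0) {
        x &= x - 1;          /* clears the lowest set bit */
        ++n;
    }
    return n;
}

/* Hypothetical helper: models the LZCNT result for a 32-bit operand,
   i.e. the number of zero bits above the highest set bit.
   A zero operand gives the operand size, 32. */
unsigned lzcnt32(uint32_t x)
{
    unsigned n = 0;
    uint32_t mask = UINT32_C(0x80000000);
    while (mask != 0 && (x & mask) == 0) {
        ++n;
        mask >>= 1;
    }
    return n;
}

/* Small usage example. */
void abm_model_example(void)
{
    assert(popcnt32(UINT32_C(0x000000F0)) == 4);
    assert(lzcnt32(UINT32_C(0x00010000)) == 15);
    assert(lzcnt32(0) == 32);
}
```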

Your program says SSE4a is not available, and maybe it isn't, but the technical specs say that it is. I haven't tried the instructions.
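One way to cross-check the report is to look at the raw CPUID bits. On AMD, extended function 80000001h returns ABM in ECX bit 5 and SSE4A in ECX bit 6, with 3DNow! and its extensions in EDX bits 31 and 30; POPCNT is also reported in standard function 1, ECX bit 23. The C sketch below only decodes register values that you pass in. The struct and function names are made up for illustration, and how the registers are obtained is left open (with GCC or Clang, `__get_cpuid` from `<cpuid.h>` is one option). I have not seen the source of the test piece, so this is not a statement about how it does the detection.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the AMD extended feature flags of interest. */
typedef struct {
    bool abm;            /* LZCNT support: Fn8000_0001 ECX bit 5  */
    bool sse4a;          /* SSE4A:         Fn8000_0001 ECX bit 6  */
    bool threednow_ext;  /* 3DNow! ext.:   Fn8000_0001 EDX bit 30 */
    bool threednow;      /* 3DNow!:        Fn8000_0001 EDX bit 31 */
} amd_ext_features;

/* Decodes the ECX/EDX values returned by CPUID function 80000001h.
   The caller must first check that the maximum extended function
   (EAX of function 80000000h) is at least 80000001h. */
amd_ext_features decode_amd_ext_features(uint32_t ecx, uint32_t edx)
{
    amd_ext_features f;
    f.abm           = ((ecx >> 5)  & 1u) != 0;
    f.sse4a         = ((ecx >> 6)  & 1u) != 0;
    f.threednow_ext = ((edx >> 30) & 1u) != 0;
    f.threednow     = ((edx >> 31) & 1u) != 0;
    return f;
}

/* POPCNT is reported in standard function 1, ECX bit 23. */
bool has_popcnt(uint32_t fn1_ecx)
{
    return ((fn1_ecx >> 23) & 1u) != 0;
}

/* Usage example with made-up register values, not values read from a real CPU. */
void cpuid_decode_example(void)
{
    amd_ext_features f = decode_amd_ext_features(UINT32_C(0x00000060),
                                                 UINT32_C(0xC0000000));
    assert(f.abm && f.sse4a && f.threednow && f.threednow_ext);
    assert(has_popcnt(UINT32_C(1) << 23));
}
```

If the SSE4A bit comes back set on the Thuban while the test piece prints "Not Available", the test is probably reading a different function or bit.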
When C++ compilers can be coerced to emit rcl and rcr, I *might* consider using one.

Rockoon

Quote from: KeepingRealBusy on June 19, 2010, 03:59:48 AM
but running 32 bit mode (XP) so the 3DNow instructions are not available

??? 3DNow should work fine in 32-bit mode. It came out at a time when Intel was still dreaming of SSE.

jj2007

Quote from: Rockoon on June 19, 2010, 12:06:39 PM
The Advanced Bit Manipulation (ABM) was introduced with K10 and consists of 2 instructions:

POPCNT - counts the number of bits set in a word.
LZCNT - counts the number of leading zero bits in a word.

In case somebody is confused about the "word": popcnt is also available on Intel, and not limited to a word:

Quote
POPCNT r16, r/m16
POPCNT r32, r/m32
POPCNT r64, r/m64 (64-bit mode)
http://software.intel.com/file/17971/

Re LZCNT, does that do anything different than bsf?

Rockoon

Quote from: jj2007 on June 19, 2010, 12:26:41 PM
Re LZCNT, does that do anything different than bsf?

I'm pretty sure that they are functionally different in the degenerate case, where the operand is 0.

Oh, and I think you mean BSR, not BSF.
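To make the comparison concrete, here is a hedged C model of both instructions for 32-bit operands. BSR returns the bit index of the highest set bit and sets ZF when the source is 0, leaving the destination without a defined value. LZCNT returns the number of leading zero bits, and for a source of 0 it returns 32 and sets CF. So for a nonzero operand the two results are related by lzcnt = 31 - bsr. The type and function names are made up for illustration, and the zero-source BSR index in the model is only a placeholder.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type for the BSR model. */
typedef struct {
    bool     zf;     /* set when the source operand is 0 */
    uint32_t index;  /* bit index of the highest set bit; meaningful only if !zf */
} bsr_result;

/* Hypothetical result type for the LZCNT model. */
typedef struct {
    bool     cf;     /* set when the source operand is 0 */
    uint32_t count;  /* number of leading zero bits; 32 for a zero source */
} lzcnt_result;

bsr_result bsr32_model(uint32_t x)
{
    bsr_result r = { true, 0 };   /* index 0 is a placeholder for "not defined" */
    if (x == 0)
        return r;
    r.zf = false;
    r.index = 31;
    while ((x & (UINT32_C(1) << r.index)) == 0)
        --r.index;                /* scan down from bit 31 */
    return r;
}

lzcnt_result lzcnt32_model(uint32_t x)
{
    lzcnt_result r = { true, 32 };
    if (x == 0)
        return r;
    r.cf = false;
    r.count = 0;
    while ((x & (UINT32_C(0x80000000) >> r.count)) == 0)
        ++r.count;                /* count zeros from bit 31 downwards */
    return r;
}

/* Usage example: the two results mirror each other for nonzero input. */
void bsr_lzcnt_example(void)
{
    uint32_t x = UINT32_C(0x00001234);
    assert(lzcnt32_model(x).count == 31 - bsr32_model(x).index);
    assert(bsr32_model(0).zf);
    assert(lzcnt32_model(0).cf && lzcnt32_model(0).count == 32);
}
```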


The latencies are quite different as well. On AMD, LZCNT is supposedly 2 cycles, whereas BSR is still a vector-path instruction with 6 cycles of latency.

edit: whoops, it doesn't work on xmm regs

jj2007

Quote from: Rockoon on June 19, 2010, 01:00:08 PM
Quote from: jj2007 on June 19, 2010, 12:26:41 PM
Re LZCNT, does that do anything different than bsf?

I'm pretty sure that they are functionally different in the degenerate case, where the operand is 0.

Most probably yes (I can't test it). BSF/BSR set the zero flag for the null operand, which is certainly not the best solution.

Quote
Oh, and I think you mean BSR, not BSF.
:U

KeepingRealBusy

Quote from: Rockoon on June 19, 2010, 12:17:11 PM
Quote from: KeepingRealBusy on June 19, 2010, 03:59:48 AM
but running 32 bit mode (XP) so the 3DNow instructions are not available

??? 3DNow should work fine in 32-bit mode. It came out at a time when Intel was still dreaming of SSE.



You are absolutely correct, I misread the PDF.

Dave.