Emitting hex codes and endianness...

Started by ThoughtCriminal, June 29, 2005, 04:01:22 AM

ThoughtCriminal

I've written my first parser. It's very simple and just handles byte codes. Since I want to have COFF support early in the development process, it's a good tool for hex-coding obj files.

Anyway, my question is: do I need to worry about endianness?

Here is what I'm parsing:

ff,15,00,00,00,01,0f,0f,0f,0f,0f,0f,0f,0f,0f,0f

Here is the output:

00403030 FF 15 00 00 00 01 call        dword ptr ds:[1000000h]
00403036 ??               db          0fh 
00403037 ??               db          0fh 
00403038 ??               db          0fh 

In my text file, the value 01 is in the last position. In the disassembly's byte codes it is also last. In the mnemonic disassembly, the value 1 is in the first position.

I'm just putting the values in left to right. Do I need to do some kind of correction for endianness?

Thanks.

AeroASM

Intel uses little endian, which means that the bits within each byte are numbered from right to left (as normal), but the bytes within words, dwords, or larger values are stored in reverse order: the least significant ("rightmost") byte comes first in memory.

tenkey

Endian-ness refers to which way multibyte numeric values are stored in memory. For simplicity, this also includes "logical" data. It does not include "op codes". The result is that, for the Intel x86 platform, immediate data, jump displacements, address offsets, and WORD/DWORD data must be in little-endian form as AeroASM described - "rightmost" byte first.
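
As a rough illustration of the point above, here is a minimal Python sketch (Python is only an assumption here, the thread does not say what the parser is written in). It takes the six bytes from the original question, leaves the FF 15 opcode/ModRM pair in written order, and decodes the four bytes that follow as a little-endian dword. The variable names are made up for the example.

```python
# Bytes exactly as they appear in the text file and in the byte column of the listing.
raw = bytes.fromhex("ff 15 00 00 00 01")

opcode = raw[:2]     # FF 15: opcode and ModRM byte, kept in written order
operand = raw[2:6]   # the 4-byte address field that follows

# x86 stores the least significant byte first, so decode as little-endian.
address = int.from_bytes(operand, "little")

print(opcode.hex(" "), "->", f"call dword ptr ds:[{address:X}h]")
```

Read this way, the trailing 01 byte is the most significant byte of the address, which is why the disassembler shows 1000000h.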

There is no Endian-ness in registers, as a register holds a single unit of data that may be subdivided into smaller units of data.
A programming language is low level when its programs require attention to the irrelevant.
Alan Perlis, Epigram #8

ThoughtCriminal

Thank you. 

db 0ffh,15h,01,00,00,00
db 0ffh,15h,00,01,00,00
db 0ffh,15h,00,00,01,00
db 0ffh,15h,00,00,00,01

00401120 FF 15 01 00 00 00 call        dword ptr ds:[1]
00401126 FF 15 00 01 00 00 call        dword ptr ds:[100h]
0040112C FF 15 00 00 01 00 call        dword ptr ds:[10000h]
00401132 FF 15 00 00 00 01 call        dword ptr ds:[1000000h]
00401138 FF 15 00 00 00 01 call        dword ptr ds:[1000000h]
0040113E FF 15 00 00 01 00 call        dword ptr ds:[10000h]
00401144 FF 15 00 01 00 00 call        dword ptr ds:[100h]
0040114A FF 15 01 00 00 00 call        dword ptr ds:[1]

dw 015ffh
dd 01000000h
dw 015ffh
dd 00010000h
dw 015ffh
dd 00000100h
dw 015ffh
dd 00000001h

I was wondering why the opcode didn't need to be changed, and then I read your post more closely, tenkey.

From the above, it seems opcodes are best written out as bytes, while word and dword data are best written out in their own data size.

Writing call 00000001h and emitting it as a dword to get

0040114A FF 15 01 00 00 00 call        dword ptr ds:[1]

seems a lot easier than writing call 01000000h to get the same thing when emitting byte by byte.
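
A small sketch of that approach, again in Python as an assumed implementation language: the opcode bytes are emitted one at a time in written order, and the operand is emitted as a single dword that the emitter converts to little-endian. The function name `emit_call_indirect` is hypothetical and only covers the one instruction form used in this thread.

```python
def emit_call_indirect(address: int) -> bytes:
    """Hypothetical helper: encode `call dword ptr ds:[address]`."""
    opcode = bytes([0xFF, 0x15])             # opcode bytes, written in order
    operand = address.to_bytes(4, "little")  # dword, least significant byte first
    return opcode + operand

# The same four addresses used in the listing above.
for address in (0x00000001, 0x00000100, 0x00010000, 0x01000000):
    encoded = emit_call_indirect(address)
    print(encoded.hex(" ").upper(), f"call dword ptr ds:[{address:X}h]")
```

With the byte swapping kept inside the emitter, the input can carry addresses in their natural reading order.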



Randall Hyde

Quote from: tenkey on June 29, 2005, 09:07:56 PM
Endian-ness refers to which way multibyte numeric values are stored in memory. For simplicity, this also includes "logical" data. It does not include "op codes". The result is that, for the Intel x86 platform, immediate data, jump displacements, address offsets, and WORD/DWORD data must be in little-endian form as AeroASM described - "rightmost" byte first.

There is no Endian-ness in registers, as a register holds a single unit of data that may be subdivided into smaller units of data.

Well, technically, the fact that AL/AH, AX, and EAX all overlap means that you *do* have to consider endianness in x86 registers. I.e., AH would grab bits 0..7 if the x86 were a big-endian machine (keeping in mind that AH stands for "high order byte of accumulator").
Cheers,
Randy Hyde
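
For readers following along, the register overlap mentioned above can be modeled with plain integer masks. This is only a Python model with an arbitrary example value, not real register access: AL is bits 0..7 of EAX, AH is bits 8..15, and AX is bits 0..15. The last lines show how the same value would be laid out if EAX were stored to memory on x86.

```python
eax = 0x11223344          # arbitrary example value

ax = eax & 0xFFFF         # low 16 bits of EAX
al = eax & 0xFF           # bits 0..7
ah = (eax >> 8) & 0xFF    # bits 8..15

# Little-endian store: lowest address holds the least significant byte.
memory = eax.to_bytes(4, "little")

print(f"AL={al:02X} AH={ah:02X} AX={ax:04X}")
print("stored bytes:", memory.hex(" ").upper())
```

In this model the byte at the lowest address equals AL and the next one equals AH.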