BEAEngine

Started by FlySky, November 30, 2011, 06:25:05 PM


FlySky

I've been playing around with BEAEngine to disassemble a file.

I've got the BEAEngine project from the main website. Latest version being the 4.x one.

Now I am experiencing a small problem.

On the site there is an explanation which includes:

//BEAEngine Example
//0x89, 0x94, 0x88, 0x00, 0x20, 0x40, 0x00 
//mov dword ptr ds:[eax + ecx*4 + 402000h], edx
//MyDisasm.Instruction.Category == GENERAL_PURPOSE_INSTRUCTION + DATA_TRANSFER
//MyDisasm.Instruction.Opcode == 0x89
//MyDisasm.Instruction.Mnemonic == "mov "
//     
//MyDisasm.Argument1.ArgMnemonic == "eax + ecx*4 + 402000h"
//MyDisasm.Argument1.ArgType == MEMORY_TYPE
//MyDisasm.Argument1.ArgSize == 32
//MyDisasm.Argument1.AccessMode == WRITE
//MyDisasm.Argument1.Memory.BaseRegister == REG0
//MyDisasm.Argument1.Memory.IndexRegister == REG1
//MyDisasm.Argument1.Memory.Scale == 4
//MyDisasm.Argument1.Memory.Displacement == 0x402000
//MyDisasm.Argument1.SegmentReg == DSReg
//     
//MyDisasm.Argument2.ArgMnemonic == "edx"
//MyDisasm.Argument2.ArgType == REGISTER_TYPE + GENERAL_REG + REG2
//MyDisasm.Argument2.ArgSize == 32
//MyDisasm.Argument2.AccessMode == READ
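For reference, the base, index, scale and displacement values in the example above follow directly from the ModRM and SIB bytes. Below is a minimal C sketch that decodes just this one addressing form (mod = 2 with a SIB byte and a 32-bit displacement). The MemOperand type and the function name are made up for illustration and are not part of BeaEngine; special cases such as index = 4 (no index register) are not handled.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical result record; these names are NOT BeaEngine's. */
typedef struct {
    int mod;        /* ModRM bits 7..6 */
    int reg;        /* ModRM bits 5..3: register operand (2 = edx) */
    int rm;         /* ModRM bits 2..0: 4 means a SIB byte follows */
    int scale;      /* 1, 2, 4 or 8, derived from SIB bits 7..6 */
    int index;      /* SIB bits 5..3 (1 = ecx) */
    int base;       /* SIB bits 2..0 (0 = eax) */
    uint32_t disp;  /* 32-bit displacement, stored little-endian */
} MemOperand;

/* Decodes opcode + ModRM + SIB + disp32 (7 bytes in total).
   Returns 0 on success, -1 if the bytes do not use the
   mod = 2, rm = 4 form that this sketch understands. */
int decode_modrm_sib_disp32(const uint8_t *code, size_t len, MemOperand *out)
{
    if (len < 7) return -1;

    uint8_t modrm = code[1];
    uint8_t sib = code[2];

    out->mod = modrm >> 6;
    out->reg = (modrm >> 3) & 7;
    out->rm = modrm & 7;
    if (out->mod != 2 || out->rm != 4) return -1;

    out->scale = 1 << (sib >> 6);
    out->index = (sib >> 3) & 7;
    out->base = sib & 7;

    /* Assemble the displacement byte by byte, lowest byte first. */
    out->disp = (uint32_t)code[3]
              | ((uint32_t)code[4] << 8)
              | ((uint32_t)code[5] << 16)
              | ((uint32_t)code[6] << 24);
    return 0;
}
```

Feeding it the seven bytes from the example gives reg = 2 (edx), base = 0 (eax), index = 1 (ecx), scale = 4 and a displacement of 402000h, which matches the fields listed above.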

Now in my project I am disassembling the following instruction:
push dword ptr [esp+ebx*4+30721500h]

According to the example above, the result should be:
MyDisasm.Argument1.ArgMnemonic == "esp + ebx*4 + 30721500h", but IDA shows the following contents instead:

MyDisasm_Argument1_ArgMnemonic
data:00435180 MyDisasm_Argument1_ArgMnemonic db  10h
data:00435181 db    0
data:00435182 db    2
data:00435183 db    0
data:00435184 db    0
data:00435185 db    0
data:00435186 db    0
data:00435187 db    0
data:00435188 db    0
data:00435189 db    0
data:0043518A db    0

Another variable that should be filled in correctly:
MyDisasm.Argument1.Memory.Scale == 4 --> but instead IDA shows:

MyDisasm_Argument1_Memory_Scale dd 10h

The only thing it fills correctly is the CompleteString variable.

I've included BEAEngine like this:
#Include BeaEngineGoAsm32.inc
Disasm = BeaEngine.lib:Disasm

and I also link dynamically with: #DynamicLinkFile msvcrt.dll

Has anyone else experienced any of these issues?





donkey

Hi Flysky,

BEAEngine is a great disassembly lib and well worth the effort to learn to use. Beatrix (the author) is a member here and, though not active lately, you can try to send a PM or visit the dedicated forum on the BEAEngine website. For my own purposes I use a custom build of the library that Beatrix sent me; since it is a one-off, I have not attempted to update it (it serves my needs quite nicely as it is). Another, albeit more cumbersome, way to get disassembled code is to use the DbgHelp API. Since it is totally COM based, with event sinks etc., without a good knowledge of COM in assembly you might get bogged down in bug hunting more than producing usable code. I am pretty sure that I have posted code for both BEAEngine and DBGHelp disassembly, though I haven't looked around for it. My DBGHelp code can be found in the debug tools for RadAsm 3 in the appropriate subforum.

Edgar
"Ahhh, what an awful dream. Ones and zeroes everywhere...[shudder] and I thought I saw a two." -- Bender
"It was just a dream, Bender. There's no such thing as two". -- Fry
-- Futurama

Donkey's Stable

FlySky

hmm I can't edit my post.

It seems I was using an earlier version of BEAEngine.

Version 4.x from August this year gives me compile errors saying that it doesn't support/recognize MyDisasm.Eip.

-----------------------------

EDIT FIXED the problem.

Added the latest version of BEAEngine again.

As for the error: it seems the MyDisasm structure changed.

Use MyDisasm.EIP instead of MyDisasm.Eip

struct _Disasm {
   UIntPtr EIP;  /* in earlier versions this was Eip */
   UInt64 VirtualAddr;
   UIntPtr SecurityBlock;
   char CompleteInstr[INSTRUCT_LENGTH];
   UInt32 Archi;
   UInt64 Options;
   INSTRTYPE Instruction;
   ARGTYPE Argument1;
   ARGTYPE Argument2;
   ARGTYPE Argument3;
   PREFIXINFO Prefix;
   UInt32 Reserved_[40];
};

FlySky

I've cleaned my 'double posting message'.

As stated above, I fixed the BEAEngine problem; it was a pretty stupid mistake on my side.

Now I have one more question.

Is it possible to assemble a string into opcodes?

BEAEngine takes the opcodes and generates the string.
Is there any library which can do the same the other way around,
converting the string back into its opcodes? I haven't seen anything like this
in the BEAEngine documentation.


dedndave

i recall something about this from before
i think you can use JwAsm to generate one line of code
the forum search tool might be helpful

FlySky

I've found a couple of topics containing a bit of information.
Hutch provided two tools to convert a string to the opcodes (decimal presentation).

From the Jwasm website it seems to support MASM only?

dedndave

i seem to recall there is a command-line switch for JwAsm to assemble one line
maybe i am thinking of GoAsm   :P

FlySky

Not sure what you mean dedndave but I will continue searching for the solution.

FlySky

Quote from: donkey on November 30, 2011, 06:54:48 PM

Edgar, I've found your COM-based dbghelp functions. Indeed, it has become bug hunting instead of writing code that I could use.
IDebugControl also offers a method called Assemble.

So I coded an Assemble function:

invoke Assemble, offset AssemblyResult, offset FullRebuildPattern1  // This passes the addresses of AssemblyResult, a buffer to receive the translation, and FullRebuildPattern1, the string to convert.

Assemble FRAME Location, Instruction
   uses Ebx
   LOCAL status:%UINT_PTR
   LOCAL EndOffset:Q
   LOCAL pBuffer:%UINT_PTR
   LOCAL DisassemblySize:%UINT_PTR

   invoke CoTaskMemAlloc,1024
   mov [pBuffer], Eax

   invoke DebugCreate,offset IID_IDebugClient, offset pIDebugClient
   test Eax, Eax
   jnz >>.DBGCFAIL
   CoInvoke(pIDebugClient,IDebugClient.QueryInterface,offset IID_IDebugControl,offset pIDebugControl)
   test Eax, Eax
   jnz >>.QIFAIL

   invoke GetCurrentProcessId
   CoInvoke(pIDebugClient,IDebugClient.AttachProcess,0,0,eax,DEBUG_ATTACH_NONINVASIVE | DEBUG_ATTACH_NONINVASIVE_NO_SUSPEND)
   test Eax, Eax
   jnz >>.ATTACHFAIL
   CoInvoke(pIDebugControl,IDebugControl.WaitForEvent,DEBUG_WAIT_DEFAULT,INFINITE)

   mov Ebx, [Instruction] //Load instruction to convert
   mov Edx, [Location] //Load the location where Assembly should be stored
   :
   CoInvoke(pIDebugControl,IDebugControl.Assemble,edx,ebx,offset EndOffset) //Call function. After this call it always ends up with an error it can't determine.
   
   
   test Eax, Eax
   jnz >>.DISASMFAIL

   CoInvoke(pIDebugControl,IDebugControl.Release)
   CoInvoke(pIDebugClient,IDebugClient.Release)
   invoke CoTaskMemFree,[pBuffer]
   ret

   .DBGCFAIL
   invoke FormatMessage,FORMAT_MESSAGE_FROM_SYSTEM,NULL,rax,NULL,[pBuffer],1024,NULL
   invoke MessageBox,NULL,[pBuffer],"DebugCreate failed",0
   ret
   .QIFAIL
   invoke FormatMessage,FORMAT_MESSAGE_FROM_SYSTEM,NULL,rax,NULL,[pBuffer],1024,NULL
   invoke MessageBox,NULL,[pBuffer],"QueryInterface failed",0
   CoInvoke(pIDebugClient,IDebugClient.Release)
   ret
   .ATTACHFAIL
   invoke FormatMessage,FORMAT_MESSAGE_FROM_SYSTEM,NULL,rax,NULL,[pBuffer],1024,NULL
   invoke MessageBox,NULL,[pBuffer],"Attach to process failed",0
   CoInvoke(pIDebugControl,IDebugControl.Release)
   CoInvoke(pIDebugClient,IDebugClient.Release)
   ret
   .DISASMFAIL
   invoke FormatMessage,FORMAT_MESSAGE_FROM_SYSTEM,NULL,rax,NULL,[pBuffer],1024,NULL
   invoke MessageBox,0,[pBuffer],"Assemble failed",0
   CoInvoke(pIDebugControl,IDebugControl.Release)
   CoInvoke(pIDebugClient,IDebugClient.Release)
   ret
EndF

donkey

A quick look at the code shows that you are using 32 bit offsets for the Assemble method. As far as I know, the offsets should be 64 bit whether in 32 or 64 bit mode. Also, you will have a memory leak if you don't free the memory (pBuffer) using CoTaskMemFree in case of an error.

FlySky

Could the string I am passing to the Assemble function be causing the problem?
What I did was pretty simple. I created a string which holds
mov esi, [esp+4B0] --> Exactly like this.
In WinDBG you simply type A (Assemble) and type the instruction to assemble.
Not sure if WinDBG is doing anything to the input, like converting it or something.

As for the 32 bit or 64 bit offsets, you kind of lost me there. To be able to use 64 bit offsets, would I need
to have a 64 bit function as well?

donkey

A 64 bit offset needs to be passed as two 32 bit parameters. To do this, pass the 32 bit address in the first parameter and NULL in the second:

invoke SomeFunc, Addr64

would be

invoke SomeFunc, Addr32, 0

A good way to tell if you are not passing wide enough addresses is to check the value of ESP before and after the call; in 32 bit mode you will have a difference of 4 bytes. As for the string, I am not familiar with the function, but it requires a pointer to a NULL terminated ANSI string. I have found that pointers to strings in global or local memory can lead to threading problems with the debug API. I prefer to allocate memory using CoTaskMemAlloc and copy the string to that in order to get reliable results (in some cases it is the only way to get any result at all).
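To make the "two 32 bit parameters" idea concrete, here is a small C sketch of how one 64-bit value maps onto two 32-bit stack slots on a little-endian 32-bit target. The Arg64Slots type and both function names are invented for this illustration; nothing here comes from the debug API headers.

```c
#include <stdint.h>

/* Two 32-bit stack slots that together hold one 64-bit argument. */
typedef struct {
    uint32_t low;   /* first value listed in the invoke, lower address */
    uint32_t high;  /* second value listed; zero for a 32-bit address */
} Arg64Slots;

/* Splits a 64-bit value into the two halves that would be pushed. */
Arg64Slots split_u64(uint64_t value)
{
    Arg64Slots s;
    s.low = (uint32_t)(value & 0xFFFFFFFFu);
    s.high = (uint32_t)(value >> 32);
    return s;
}

/* Rebuilds the 64-bit value the callee reads from the two slots. */
uint64_t join_u64(Arg64Slots s)
{
    return ((uint64_t)s.high << 32) | s.low;
}
```

For a 32-bit address such as 00435180h the high half comes out as zero, which is why the call is written as Addr32 followed by 0.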

Edgar

FlySky

Still struggling, and here is what I have so far.

I used BEAEngine to disassemble a couple of instructions.
In the DATA section I created:

FullRebuildPattern1 DB 100 Dup (?) // To have a buffer which can store a string.
and a couple more variables which should help me rebuild a string.
RebuildComma   DB ',',0
RebuildSpace   DB ' ',0
RebuildPlus      DB '+',0
RebuildLe      DB '[',0
RebuildRe      DB ']',0

Now, after playing with a couple of the disassemblies BEAEngine returned, I decided to make my own string and see if I can
code a function which converts that string back to its opcodes, so that it becomes a real instruction the CPU understands.
I've been using the lstrcat function to build that string.

invoke lstrcat, offset FullRebuildPattern1, offset MovInstr
invoke lstrcat, offset FullRebuildPattern1, offset RebuildSpace
invoke lstrcat, offset FullRebuildPattern1, offset PopRebuildPattern1
invoke lstrcat, offset FullRebuildPattern1, offset RebuildComma
invoke lstrcat, offset FullRebuildPattern1, offset RebuildSpace
invoke lstrcat, offset FullRebuildPattern1, offset RebuildLe
invoke lstrcat, offset FullRebuildPattern1, offset EspInstr
invoke lstrcat, offset FullRebuildPattern1, offset RebuildPlus
invoke lstrcat, offset FullRebuildPattern1, offset EspResult
invoke lstrcat, offset FullRebuildPattern1, offset RebuildRe

The string created is:

mov esi, [esp+000004b0]
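As a side note, the same text can be produced in one step with a formatted print instead of a chain of concatenations. The C sketch below is only an illustration of that idea; the function name and its parameters are hypothetical and it is not a translation of the GoAsm code above.

```c
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

/* Formats "<mnemonic> <reg>, [<base>+<disp as 8 hex digits>]" into out.
   Returns 0 on success, -1 if the buffer is too small. */
int build_mem_operand_text(char *out, size_t out_size,
                           const char *mnemonic, const char *dest_reg,
                           const char *base_reg, uint32_t disp)
{
    /* %08x pads the displacement to eight lowercase hex digits. */
    int n = snprintf(out, out_size, "%s %s, [%s+%08x]",
                     mnemonic, dest_reg, base_reg, (unsigned)disp);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

Called with "mov", "esi", "esp" and 4B0h, it yields the same string as shown above. Unlike repeated lstrcat calls, the length check guards against overrunning the 100-byte buffer.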

The function I thought would be nice for recreating the opcodes from a given string is based on donkey's IDebugClient code.
There is a method called Assemble, documented as follows:

The Assemble and AssembleWide methods assemble a single processor instruction. The assembled instruction is placed in the target's memory.

HRESULT
  IDebugControl::Assemble(
    IN ULONG64  Offset,
    IN PCSTR  Instr,
    OUT PULONG64  EndOffset
    );

HRESULT
  IDebugControl4::AssembleWide(
    IN ULONG64  Offset,
    IN PCWSTR  Instr,
    OUT PULONG64  EndOffset
    );

#ifdef UNICODE
#define AssembleT AssembleWide
#else
#define AssembleT Assemble
#endif

Parameters

Offset: Specifies the location in the target's memory to place the assembled instruction.
Instr: Specifies the instruction to assemble. The instruction is assembled according to the target's effective processor type (returned by GetEffectiveProcessorType).
EndOffset: Receives the location in the target's memory immediately following the assembled instruction. EndOffset can be used when assembling multiple instructions.

Return Value

S_OK: The method was successful.
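The remark about EndOffset and multiple instructions describes a simple chaining pattern: the end offset from one call becomes the start offset of the next. The C sketch below shows only that pattern. stub_assemble is a hypothetical stand-in that knows three one-byte instructions and writes into a local array; it is not the debug engine and does not reflect how IDebugControl::Assemble works internally.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define FAKE_TARGET_SIZE 16
static uint8_t fake_target[FAKE_TARGET_SIZE];  /* stands in for target memory */

/* Hypothetical stand-in with the same parameter shape as Assemble.
   It only knows three one-byte instructions. Returns 0 on success. */
static int stub_assemble(uint64_t offset, const char *instr, uint64_t *end_offset)
{
    uint8_t opcode;
    if (strcmp(instr, "nop") == 0) opcode = 0x90;
    else if (strcmp(instr, "hlt") == 0) opcode = 0xF4;
    else if (strcmp(instr, "ret") == 0) opcode = 0xC3;
    else return -1;

    if (offset >= FAKE_TARGET_SIZE) return -1;
    fake_target[offset] = opcode;
    *end_offset = offset + 1;   /* location just past the instruction */
    return 0;
}

/* Assembles instrs[0..count-1] back to back, starting at start.
   On success *end holds the offset after the last instruction. */
int assemble_sequence(uint64_t start, const char *const *instrs,
                      size_t count, uint64_t *end)
{
    uint64_t offset = start;
    for (size_t i = 0; i < count; i++) {
        uint64_t next;
        if (stub_assemble(offset, instrs[i], &next) != 0) return -1;
        offset = next;          /* chain: end of one is start of the next */
    }
    *end = offset;
    return 0;
}
```

With the real interface the loop body would be the Assemble call itself, with the same hand-over of EndOffset to the next Offset.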

I coded a function called Assemble which takes 2 parameters.

invoke Assemble, offset AssemblyResult, offset FullRebuildPattern1

Parameter 1 is a pointer to a place which should receive the assembled code.
Defined in the data section as:

AssemblyResult   Db 100 Dup (?)

Parameter 2 is a pointer to the string to assemble.

The Assemble function is based on donkey's IDebugClient code posted on the forum.
Donkey suggested using 64 bit parameters, so I had to rework it a bit. It takes an ANSI string,
although I noticed when debugging dbgeng.dll and stepping through it that it automatically calls the MultiByteToWideChar API.

Assemble FRAME Location, Instruction
   uses Ebx
   LOCAL status:%UINT_PTR
   LOCAL EndOffset:Q
   LOCAL pBuffer:%UINT_PTR
   LOCAL DisassemblySize:%UINT_PTR
   LOCAL wfmt[MAX_PATH] :W
   
   invoke CoTaskMemAlloc,1024
   mov [pBuffer], Eax

   invoke DebugCreate,offset IID_IDebugClient, offset pIDebugClient  //Create the debug client object
   test Eax, Eax
   jnz >>.DBGCFAIL
   CoInvoke(pIDebugClient,IDebugClient.QueryInterface,offset IID_IDebugControl,offset pIDebugControl)   //Ask the client for its IDebugControl interface
   test Eax, Eax
   jnz >>.QIFAIL

   invoke GetCurrentProcessId    //Get current process ID -- it returns the ID of the process the code is being run from, which is correct.
   CoInvoke(pIDebugClient,IDebugClient.AttachProcess,0,0,eax,DEBUG_ATTACH_NONINVASIVE | DEBUG_ATTACH_NONINVASIVE_NO_SUSPEND) //Attach to process - everything succeeds up to here
   test Eax, Eax
   jnz >>.ATTACHFAIL
   CoInvoke(pIDebugControl,IDebugControl.WaitForEvent,DEBUG_WAIT_DEFAULT,INFINITE)  //Wait for an event

   :
   invoke MultiByteToWideChar,CP_ACP,NULL,[Instruction],-1,offset wfmt,MAX_PATH  //I decided to call the MultiByteToWideChar API in the function to convert the ANSI string to a wide string. The return value is 18, which counts wide characters rather than bytes.
   CoInvoke(pIDebugControl,IDebugControl.Assemble,offset wfmt, 0, [Instruction],0 ,offset EndOffset). // After this function it returns the value 80004005.
      
   test Eax, Eax
   jnz >>.DISASMFAIL

   CoInvoke(pIDebugControl,IDebugControl.Release)
   CoInvoke(pIDebugClient,IDebugClient.Release)
   invoke CoTaskMemFree,[pBuffer]
   ret

   .DBGCFAIL
   invoke FormatMessage,FORMAT_MESSAGE_FROM_SYSTEM,NULL,rax,NULL,[pBuffer],1024,NULL
   invoke MessageBox,NULL,[pBuffer],"DebugCreate failed",0
   ret
   .QIFAIL
   invoke FormatMessage,FORMAT_MESSAGE_FROM_SYSTEM,NULL,rax,NULL,[pBuffer],1024,NULL
   invoke MessageBox,NULL,[pBuffer],"QueryInterface failed",0
   CoInvoke(pIDebugClient,IDebugClient.Release)
   ret
   .ATTACHFAIL
   invoke FormatMessage,FORMAT_MESSAGE_FROM_SYSTEM,NULL,rax,NULL,[pBuffer],1024,NULL
   invoke MessageBox,NULL,[pBuffer],"Attach to process failed",0
   CoInvoke(pIDebugControl,IDebugControl.Release)
   CoInvoke(pIDebugClient,IDebugClient.Release)
   ret
   .DISASMFAIL
   invoke FormatMessage,FORMAT_MESSAGE_FROM_SYSTEM,NULL,rax,NULL,[pBuffer],1024,NULL
   invoke MessageBox,0,[pBuffer],"Assemble failed",0
   CoInvoke(pIDebugControl,IDebugControl.Release)
   CoInvoke(pIDebugClient,IDebugClient.Release)
   ret
EndF

donkey

HRESULT
  IDebugControl::Assemble(
    IN ULONG64  Offset,
    IN PCSTR  Instr,
    OUT PULONG64  EndOffset
    );

CoInvoke(pIDebugControl,IDebugControl.Assemble,offset wfmt, 0, [Instruction],0 ,offset EndOffset). // After this function it returns the value 80004005.


Only Offset and EndOffset are 64 bit; the Instr parameter is 32 bit according to the interface definition you posted. You have 64 bits for Offset and Instr and 32 bits for EndOffset (actually the function will see Instr normally and EndOffset as an unreasonably large number), so it returned an error. You also have a period at the end of the line, but that is probably ignored. Try this instead:

CoInvoke(pIDebugControl,IDebugControl.Assemble,offset wfmt, 0, [Instruction],offset EndOffset,0)
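For reference, the slot count implied by the quoted prototype can be tallied in a short C sketch. It assumes a 32-bit stdcall build in which every pointer occupies one 4-byte slot, including the hidden interface pointer and EndOffset (PULONG64 is a pointer to a 64-bit value, so under that assumption it takes a single slot). The type and field names are stand-ins for illustration and are not taken from the dbgeng headers.

```c
#include <stdint.h>
#include <stddef.h>

/* Stand-in for a pointer in a 32-bit build: modeled as 4 bytes. */
typedef uint32_t Ptr32;

/* Argument area the callee sees for Assemble(Offset, Instr, EndOffset),
   lowest address first, under the 32-bit assumption stated above. */
typedef struct {
    Ptr32 this_ptr;        /* hidden interface pointer */
    uint32_t offset_low;   /* ULONG64 Offset, low half */
    uint32_t offset_high;  /* ULONG64 Offset, high half */
    Ptr32 instr;           /* PCSTR Instr */
    Ptr32 end_offset;      /* PULONG64 EndOffset: a pointer, one slot */
} AssembleArgs32;

/* Total bytes of arguments on the stack for one call. */
size_t assemble_arg_bytes(void)
{
    return sizeof(AssembleArgs32);
}

/* The same total expressed as a number of 32-bit slots. */
size_t assemble_arg_dwords(void)
{
    return sizeof(AssembleArgs32) / sizeof(uint32_t);
}
```

Under these assumptions the call uses five slots in total: the interface pointer plus four explicit values. Comparing that figure with the ESP difference across the call is one way to check whether the number of values passed matches what the method expects.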

Edgar

FlySky

Edgar,
First of all thanks for all your help.
I tried what you suggested. The Assemble function still returns 80004005, which is the E_FAIL return code.
Could the created string be causing the error?