EXE Jump Tables

Started by dedndave, May 29, 2009, 05:51:54 PM


dedndave

do any of you experienced programmers have a simple method of eliminating the need for jump tables at the end of an EXE ?
i am guessing they are inserted by the assembler ?

jj2007

Good question indeed. They don't look very efficient. Maybe it's a noob question: how does the assembler "know" the memory location of GetStdHandle, i.e. 7C812F39? Is it guaranteed to be the same in all versions of Windows??

0040106A              .  6A F5                       push -0B  ; StdHandle = STD_OUTPUT_HANDLE
0040106C              .  E8 A7000000                 call <jmp.&kernel32.GetStdHandle>
...
00401118               $ FF25 74114000              jmp near dword ptr [<&kernel32.GetStdHandle>]
...
GetStdHandle            8BFF                         mov edi, edi  ; HANDLE kernel32.GetStdHandle(StdHandle)
7C812F3B                55                           push ebp
7C812F3C                8BEC                         mov ebp, esp
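
The bytes in the listing above can be decoded by hand. A minimal sketch in Python (the helper names are made up for illustration): an E8 call stores a 32-bit displacement relative to the next instruction, and an FF 25 jump stores the absolute address of the dword it reads its target from.

```python
import struct

def call_rel32_target(addr, code):
    """Target of a 5-byte E8 rel32 CALL located at addr."""
    if len(code) != 5 or code[0] != 0xE8:
        raise ValueError("expected a 5-byte E8 call")
    (rel,) = struct.unpack("<i", code[1:])
    # The displacement is counted from the end of the instruction.
    return (addr + 5 + rel) & 0xFFFFFFFF

def jmp_indirect_slot(code):
    """Address of the dword pointer read by a 6-byte FF 25 abs32 JMP."""
    if len(code) != 6 or code[:2] != b"\xFF\x25":
        raise ValueError("expected a 6-byte FF 25 jump")
    (slot,) = struct.unpack("<I", code[2:])
    return slot

# Bytes copied from the listing above.
stub = call_rel32_target(0x0040106C, bytes.fromhex("E8A7000000"))
slot = jmp_indirect_slot(bytes.fromhex("FF2574114000"))
print(f"call lands on stub at {stub:08X}, stub reads pointer at {slot:08X}")
```

So the call at 0040106C goes to the stub at 00401118, and the stub jumps through the pointer stored at 00401174. Neither instruction contains the kernel32 address itself.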

PBrennick

GetStdHandle, for example, has its address stored in a lookup table (the export table) in kernel32.dll, so even though there are differing versions of that DLL and the address can vary, it will always be found.

Paul

BogdanOntanu

Quote from: dedndave on May 29, 2009, 05:51:54 PM
do any of you experienced programmers have a simple method of eliminating the need for jump tables at the end of an EXE ?

Simple? Well ...NO. Possible? Yes.

See here for a possible solution: http://www.masm32.com/board/index.php?topic=6519.0;topicseen

The MS libs contain two kinds of API "glue" code: one that will generate a jump table and one that will call "directly".

The jump table is more efficient for many reasons.

AFAIK there are other older threads about this issue also.

Quote
i am guessing they are inserted by the assembler ?

No. Usually this jump table is generated and inserted by the linker when it links extern / API symbols from LIBs.

However my Sol_Asm assembler does generate the jump tables :D but this is an exception to confirm the above rule.


Ambition is a lame excuse for the ones not brave enough to be lazy.
http://www.oby.ro

dedndave

well - i read many of the other related threads - i did not see anything specific about eliminating the tables
much ado about why they are there, however
i seem to recall something someplace about eliminating them
and, yes, i can see how a program may load faster with the tables
once it has loaded, however, i would have to think it would be faster without them
as always, the best solution is probably a hybrid, where functions that are speed-critical are referenced without tables
and functions that are used several times, but are not speed-critical use the tables (similar to procs vs macros argument)

EDIT
@JJ - i don't think it matters if they are the same or not
perhaps running under one OS, they are one value and under a different OS, a different value
from what i gather, they are externals that are unresolved until run-time

surely, if you reference a function 100 different places in the code, the table may be a good way to go
that way, the OS only has to set the value one time when the program is loaded

BogdanOntanu

Quote from: jj2007 on May 29, 2009, 06:22:44 PM
Good question indeed.

In fact a relatively irrelevant question unless you are writing a compiler.

Quote
They don't look very efficient.

It depends on the "angle" of view.

- Each "direct" call is still "indirect" in fact from the CPU's point of view.
- Each "direct" call is 1 byte longer than a call through the jump table (6 bytes for FF15 versus 5 bytes for E8), and this adds up when you use a lot of API calls in your code.
- Each "direct" call needs its own relocation and more resolving by the linker, which makes assembly/linking slower and DLLs bigger.
- Indirect calls give you an extra central "hooking" location that can be useful with portable applications and other OSes (GOT/PLT-like / ready).
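
The size argument in the list above can be put into numbers. A rough sketch, assuming the usual 32-bit encodings (E8 rel32 is 5 bytes, FF 15 and FF 25 with an absolute operand are 6 bytes) and counting only code bytes for one imported function. The 4-byte IAT entry exists either way, and the function names here are illustrative.

```python
CALL_REL32 = 5     # E8 rel32: call into the jump table
JMP_INDIRECT = 6   # FF 25 abs32: one jump-table stub per imported function
CALL_INDIRECT = 6  # FF 15 abs32: call straight through the IAT slot

def table_bytes(call_sites):
    """Code bytes when every call site goes through one shared stub."""
    return CALL_REL32 * call_sites + JMP_INDIRECT

def direct_bytes(call_sites):
    """Code bytes when every call site is a call dword ptr [slot]."""
    return CALL_INDIRECT * call_sites

for n in (1, 6, 100):
    print(n, table_bytes(n), direct_bytes(n))
```

Under these assumptions the two schemes break even at six call sites per function. Below that the direct form is smaller, above it the table wins by one byte per extra call.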

Quote
Maybe it's a noob question: how does the assembler "know" the memory location of GetStdHandle, i.e. 7C812F39? Is it guaranteed to be the same in all versions of Windows??

It does not. If you generate an OBJ (the most common case), the assembler leaves this task to the linker. The linker uses the information in the LIBs to glue together a jump table or "direct" indirect calls. Both methods reference the IAT (Import Address Table) structure in the PE specification, and the final fixing is deferred to the OS loader.

The OS loader KNOWS where the API address is in the current OS version/ layout. The OS loader loads the PE executable and patches the IAT with the correct address and since the jump table or the "direct" calls are both referencing the IAT this "magically" and finally solves the problem.
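
A toy model of that last step, in Python. This is only a sketch: memory is a dict, the two export addresses are invented, and the slot address is the FF25 operand from the listing earlier in the thread. It shows the loader writing one value, and both call styles ending up at whatever that value is.

```python
IAT_SLOT = 0x00401174  # operand of the FF25 jump in the listing

def os_loader(memory, slot, dll_exports, name):
    """Write the resolved address of `name` into the IAT slot, once."""
    memory[slot] = dll_exports[name]

def call_through_stub(memory, slot):
    """call stub, then jmp dword ptr [slot]: control ends up at memory[slot]."""
    return memory[slot]

def call_direct(memory, slot):
    """call dword ptr [slot]: control also ends up at memory[slot]."""
    return memory[slot]

# Hypothetical export addresses for two different kernel32.dll builds.
BUILD_A = {"GetStdHandle": 0x70001000}
BUILD_B = {"GetStdHandle": 0x76002000}

def run(dll_exports):
    memory = {IAT_SLOT: 0}  # "cold" file: the slot holds no code address yet
    os_loader(memory, IAT_SLOT, dll_exports, "GetStdHandle")
    return call_through_stub(memory, IAT_SLOT), call_direct(memory, IAT_SLOT)

print([hex(a) for a in run(BUILD_A)])
print([hex(a) for a in run(BUILD_B)])
```

The code bytes of the EXE are the same in both runs. Only the contents of the slot differ.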

DLLs need additional relocation fixing, but everything else is the same.


Quote
0040106A              .  6A F5                       push -0B  ; StdHandle = STD_OUTPUT_HANDLE
0040106C              .  E8 A7000000                 call <jmp.&kernel32.GetStdHandle>
...


This "jmp.&kernel32.GetStdHandle" is in fact "jmp [iat.dll_01.function_01]" until the OS loader fills in the correct address, and Olly is kind enough to show you friendly names. However, Olly does this after the OS loader has done its job.

Check with a hex editor / disassembler to see that in the "cold" executable the values are different.

dedndave

i knew i saw it someplace - it was in Hutch's scrap-book of source code

the method was devised by EliCZ - and it does not look all that simple - lol
i may have a go, just as a learning experience

the first download on the page....
http://movsd.com/source.htm

BogdanOntanu

Quote from: dedndave on May 29, 2009, 06:38:33 PM
well - i read many of the other related threads - i did not see anything specific about eliminating the tables

The info is there somewhere... I do not recall the more detailed threads exactly but this kind of subject pops in and out periodically in the advanced sections.

Basically, if you import your API functions with specific names like __imp__ExitProcess@4, the linker selects glue code that does not generate a jump table (if the LIB provides such glue code). The exact details might be slightly different, but this is the idea.


Quote
and, yes, i can see how a program may load faster with the tables

There are fewer places to fix at link time (only the jump table) and fewer relocations at DLL load time, and with a lot of API calls (the common case) the jump table is slightly smaller than direct calls.

Quote
once it has loaded, however, i would have to think it would be faster without them

Oh well... debatable but anyway when you call an API "speed" is no longer of the essence.

Quote
as always, the best solution is probably a hybrid, where functions that are speed-critical are referenced without tables
and functions that are used several times, but are not speed-critical use the tables (similar to procs vs macros argument)

The jump table is generated ONLY for the API and not for your own functions. None of the API calls should be considered speed critical.


Quote
surely, if you reference a function 100 different places in the code, the table may be a good way to go
that way, the OS only has to set the value one time when the program is loaded

The OS loader only fixes one value anyway... and this value is the function address in the PE's IAT (not the jump table).

For DLLs there are indeed more relocations to fix (only if the DLL is relocated). One relocation fix is needed for each direct call in the code.
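
To make the relocation count concrete, here is a small Python sketch. The base addresses, the IAT slot offset and the helper name are all assumptions for illustration. A relocation fixup adds the load delta to a 32-bit absolute address embedded in the code: three direct calls carry three such addresses, while three relative calls plus one stub carry only one.

```python
import struct

def rebase(image, fixup_offsets, preferred_base, actual_base):
    """Add the load delta to each 32-bit absolute address at the given offsets."""
    delta = actual_base - preferred_base
    for off in fixup_offsets:
        (value,) = struct.unpack_from("<I", image, off)
        struct.pack_into("<I", image, off, (value + delta) & 0xFFFFFFFF)
    return image

PREFERRED = 0x10000000          # hypothetical preferred DLL base
ACTUAL = 0x20000000             # hypothetical base the loader had to use
IAT_SLOT = PREFERRED + 0x2000   # hypothetical IAT slot inside the DLL

# Three direct calls: FF 15 <abs32>, so one fixup per call site.
direct = bytearray(b"".join(b"\xFF\x15" + struct.pack("<I", IAT_SLOT)
                            for _ in range(3)))
direct_fixups = [2, 8, 14]

# Three E8 calls (relative, no fixups) to one FF 25 stub at offset 15.
table = bytearray()
for pos in (0, 5, 10):
    table += b"\xE8" + struct.pack("<i", 15 - (pos + 5))
table += b"\xFF\x25" + struct.pack("<I", IAT_SLOT)
table_fixups = [17]

rebase(direct, direct_fixups, PREFERRED, ACTUAL)
rebase(table, table_fixups, PREFERRED, ACTUAL)
print(len(direct_fixups), "fixups for direct calls,",
      len(table_fixups), "for the jump table")
```

The relative E8 calls need no fixup because they move together with the stub they point at.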

jj2007

Very nicely explained, thanxalot, Bogdan :U

Vortex

The trick to generate direct calls is based on the declaration of external symbols :

EXTERNDEF _imp__ExitProcess@4:PTR pr1
ExitProcess EQU <_imp__ExitProcess@4>

EXTERNDEF _imp__GetCommandLineA@0:PTR pr0
GetCommandLine EQU <_imp__GetCommandLineA@0>

EXTERNDEF _imp__GetModuleHandleA@4:PTR pr1
GetModuleHandle EQU <_imp__GetModuleHandleA@4>

EXTERNDEF _imp__CreateWindowExA@48:PTR pr12
CreateWindowEx EQU <_imp__CreateWindowExA@48>


pr0, pr1, pr2, pr3 etc. are defined in windows.inc

The creation of the EXTERNDEFs above is automated with Scan.exe
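
For anyone who wants to generate such declarations by hand, the naming pattern is regular enough to script. The Python below is only a sketch of the pattern visible in the declarations above, not a description of how Scan.exe works. It assumes stdcall functions whose arguments are all 4 bytes wide, and the function name externdef_pair is made up.

```python
def externdef_pair(export_name, arg_count, alias=None):
    """Build the EXTERNDEF/EQU pair for one import.

    Assumes every argument is a dword, so the stdcall suffix is
    4 * arg_count, and that a matching prN prototype type exists.
    """
    decorated = f"_imp__{export_name}@{4 * arg_count}"
    return [
        f"EXTERNDEF {decorated}:PTR pr{arg_count}",
        f"{alias or export_name} EQU <{decorated}>",
    ]

for line in externdef_pair("CreateWindowExA", 12, alias="CreateWindowEx"):
    print(line)
```

Functions with wider arguments would need their byte count supplied instead of being derived from the argument count.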

[attachment deleted by admin]

dedndave

ahhhh - cool
let me play with that one, too, Vortex - thank you

jj2007

Cool indeed, Vortex - thanks. So that is how the crt_ imp stuff was created.

If I understand correctly, placing the call in the jmp table makes sense for calls that are used more than a few times. So is there a good reason to place GetCommandLine and ExitProcess there?

dedndave

i think i will use a selective approach
there are very few instances where i want to reduce overhead as much as possible
these are primarily timing and thread synchronization related functions
i understand that system calls are inherently long-winded, but there is no reason i can't try to get the most out of it
other than that, the tables are probably a much better deal
i am even open to using both methods for a function if i think it is the best approach
for example, if i have a function in one critical spot - i want no table branch
i use the same function several other places - use the table for those

now, if i can just get access to the function-name strings for an error routine, i will be a happy camper - lol
i think i can figure that one out for myself

yes - thank you Bogdan and Vortex both
that is exactly what i was looking for Vortex - very simple
i like the growing window thingy too - lol

mitchi

Very interesting read Bogdan. I've learned a few things here!  :bg
What exactly are the relocations you talk about?

Vortex :

Nice tool, nice explanation. So it's really all about the symbols declared in the obj :)

dedndave

Jochen -

Quote
If I understand correctly, placing the call in the jmp table makes sense for calls that are used more than a few times. So is there a good reason to place GetCommandLine and ExitProcess there?

the answer is here, i suspect...

Quote
The creation of the EXTERNDEFs above is automated with Scan.exe

it is fairly simple to create them manually - or let the Scan.exe make the IMP file and remove the unwanted ones

btw - it seems imperative to use PoLink